Ethics Nexus: A Collaborative Hub for Advancing AI Safety Research

If we're not going to open-source, can we at least collaborate on safety research?

The link above downloads a white paper I have been working on for the past two months. It proposes Ethics Nexus, an international collaborative AI safety research hub that addresses the critical imbalance between advances in AI capabilities and progress in safety research. Its mission is to coordinate and amplify global AI safety efforts through a structured knowledge-sharing platform that respects proprietary interests while promoting methodological cross-pollination among organizations. The hub lets organizations share safety knowledge without exposing trade secrets, turning a shared challenge into a strategic advantage. The model aligns individual competitive incentives with collective safety imperatives through carefully calibrated knowledge-sharing protocols, lead-time provisions, and supporting technical infrastructure, offering a pragmatic path toward more robust safety practices for high-risk AI in competitive environments and, ultimately, toward solving the alignment problem.

You can download and read the full 34-page paper above, or read the executive summary below.

Executive Summary

The artificial intelligence research landscape reflects a concerning asymmetry that grows riskier by the day: technical capabilities keep accelerating while safety research lags dangerously behind. During 2023-2024, only 2% of AI research focused directly on safety (ETO Research Almanac, 2025), and this trend shows no sign of reversing. The gap threatens not only the sustainable growth of the field but our collective future. The issue is more than academic: AI systems are becoming more powerful while our understanding of their safety implications remains insufficient. We must act now to close this gap and make safety a first-class priority in AI development, ensuring these powerful tools are used responsibly and sustainably.

Today's frontier AI models already demonstrate concerning behaviors, learning to exploit loopholes in controlled environments rather than pursuing their intended goals. If misalignment manifests in such constrained settings, we can only imagine the consequences once systems are deployed in complex real-world environments with countless untested variables. The window for addressing these challenges narrows as capabilities advance. At the heart of this challenge lies a collective action problem, in which rational individual strategies produce collectively irrational outcomes. Without structured coordination, we risk humanity's fate being determined not by wisdom, but by whoever cuts corners fastest.

Organizations face two competing priorities that seem fundamentally at odds: maximizing competitive advantage through capability development and information siloing, versus enhancing collective safety through coordination and knowledge sharing. This tension creates several critical challenges that hinder meaningful progress on safety.

Regulatory spillover represents a significant concern. A single frontier AI system causing catastrophic harm could generate consequences that affect the entire AI ecosystem, regardless of who is responsible.[1] Laboratory-contained failures offer unreliable safety assurances because real-world deployment introduces variables that exponentially amplify risks. History shows that regulatory responses typically expand in scope following actual catastrophes rather than theoretical risks—a pattern we've witnessed across biotechnology, nuclear energy, and financial systems.

Information asymmetry further exacerbates these challenges. Organizations operate with incomplete knowledge of safety approaches developed elsewhere, resulting in duplicative research and critical blind spots where significant safety concerns go unaddressed. Current publication practices, in which only 11% of AI safety articles come from private companies (ETO Research Almanac, 2025), create a fragmented knowledge landscape. This inefficiency hinders the dissemination of vital safety insights in an increasingly perilous environment.

First-mover advantages raise legitimate concerns that sharing safety innovations could erode competitive edges. Organizations invest significantly in safety research and seek to recoup those investments through market differentiation. Safety innovations can also reveal architectural insights that improve capabilities in other areas. This tension between transparency and competitive advantage often delays publication, hindering the dissemination of knowledge exactly when it is most needed.

Then there is international collaboration, especially between geopolitical rivals such as China and the US. Cooperation between frontier AI companies from both countries could be highly beneficial, particularly on AI verification mechanisms and shared safety protocols. But it is not without risks: it may inadvertently accelerate capability research in a rival and exacerbate broader national security concerns (Nucknall, Siddiqui, et al., 2025), at a time when the two superpowers appear to be racing to unleash transformative AI (TAI) first.

Ethics Nexus offers an innovative solution to this coordination problem. Instead of relying on vague appeals to the collective good, it provides precise, practical mechanisms that turn safety coordination from a liability into a strategic advantage. The organization acts as a focused knowledge hub, gathering safety research from many sources and distilling it into clear frameworks that highlight patterns, contradictions, and connections across different research methods. It goes beyond passive documentation, actively identifying complementary strategies and critical gaps in our collective understanding, and it facilitates collaborative forums that encourage direct communication among members.

There are five key pro-coordination arguments to consider:

  1. Avoiding stifling regulation: A catastrophic failure at any one company would trigger regulatory responses affecting all companies, so every company benefits from collective safety improvements.
  2. Research efficiency: Distributing comprehensive safety research across multiple entities enables more efficient resource allocation.
  3. Structural pattern recognition: Identifying safety problems with common structures across different technical approaches facilitates more robust solution development.
  4. Collective blind spot detection: Diverse expertise surfaces vulnerabilities that no single organization could identify on its own.
  5. Foundational knowledge sharing: Sharing established safety principles prevents their wasteful rediscovery and duplication of effort.

Ethics Nexus implements a tiered information classification system with precisely calibrated security boundaries. Information is classified into four tiers:

  1. Public: Openly shareable research findings made available to all
  2. Discreet: Research shared among specific member subsets
  3. Hidden: Research shared with vetted members under strict access constraints
  4. Protected: Highly sensitive research requiring special handling protocols and exceptionally selective access, usually reserved for frontier AI companies

Importantly, these tiers are guidelines rather than binding rules: authors retain significant control over which members can access their work.
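
To make the scheme concrete, here is a minimal Python sketch of how a contribution and its tier-plus-author-override access rule might be represented. The class, field, and tier encodings are illustrative assumptions on my part, not part of the proposal itself.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tier(Enum):
    """The four information tiers described above (encoding is illustrative)."""
    PUBLIC = 1     # openly shareable
    DISCREET = 2   # specific member subsets
    HIDDEN = 3     # vetted members, strict access constraints
    PROTECTED = 4  # highly sensitive, exceptionally selective access


@dataclass
class Contribution:
    """A shared piece of safety research whose author keeps control of access."""
    title: str
    author_org: str
    tier: Tier
    # Author-maintained allow/deny lists override the default tier guideline,
    # reflecting that the tiers are non-binding.
    allowed_orgs: set = field(default_factory=set)
    denied_orgs: set = field(default_factory=set)

    def accessible_to(self, org: str, clearance: Tier) -> bool:
        """Default rule: the reader's clearance must meet the item's tier;
        explicit author overrides always win."""
        if org in self.denied_orgs:
            return False
        if org in self.allowed_orgs:
            return True
        return clearance.value >= self.tier.value
```

The override lists simply encode the point above: the tier is a default that the contributing author can tighten or relax for individual members.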

Temporal balancing protocols enhance this classification system by incorporating lead-time provisions that grant organizations 6 to 18 months of exclusive use prior to wider sharing. Anonymous contribution channels conceal organizational identity while facilitating knowledge transfer, and graduated release schedules help move research across security boundaries as competitive advantages diminish. These mechanisms acknowledge the legitimate tension between immediate transparency and the preservation of strategic positioning.
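
Continuing the sketch above (and reusing its hypothetical `Tier` enum), a graduated release schedule could be expressed as a function that moves research one tier closer to Public after each exclusive-use window elapses. Only the 6-to-18-month bounds come from the provision described here; the one-interval-per-step policy is an assumption.

```python
from datetime import date, timedelta

# Lead-time bounds taken from the provision above; the step policy is assumed.
MIN_LEAD_TIME = timedelta(days=6 * 30)    # roughly 6 months
MAX_LEAD_TIME = timedelta(days=18 * 30)   # roughly 18 months


def graduated_release_schedule(submitted: date, lead_time: timedelta,
                               start_tier: Tier) -> dict:
    """Return the date at which a contribution reaches each successively wider tier.

    Assumption: the work moves one tier toward PUBLIC after every lead-time
    interval; real schedules could be negotiated per contribution.
    """
    lead_time = max(MIN_LEAD_TIME, min(lead_time, MAX_LEAD_TIME))
    schedule = {start_tier: submitted}
    release_date = submitted + lead_time
    for tier_value in range(start_tier.value - 1, 0, -1):
        schedule[Tier(tier_value)] = release_date
        release_date += lead_time
    return schedule
```

For example, `graduated_release_schedule(date(2026, 1, 1), timedelta(days=270), Tier.PROTECTED)` would, under these assumptions, schedule the Hidden, Discreet, and Public releases roughly 9, 18, and 27 months after submission.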

Ethics Nexus stands out by specializing in high-risk safety research coordination, unlike organizations that divide their focus between capability advancement and general safety. This focus enables deeper analysis and a more specialized team. The organization's neutral status as a charity helps eliminate conflicts of interest, allowing it to serve as an honest broker among otherwise competitive organizations.

The organization implements a tiered membership structure that accommodates various levels of research contribution. Core members (typically frontier AI companies) contribute substantial original safety research in exchange for comprehensive access across multiple security tiers. Strategic members (smaller AI companies and specialized safety organizations) provide more limited contributions to access intermediate security tiers. Trusted members (university research groups and independent organizations) contribute theoretical frameworks and expertise, while Observers (governance stakeholders and the public) receive appropriately sanitized research syntheses. Membership tiers are not strict, allowing members to move between tiers as long as they demonstrate their commitment through the volume and value of research shared.
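
As a rough illustration of how membership levels could map onto the information tiers from the earlier sketch, the default clearances below are my own assumptions based on the descriptions in this summary; actual visibility would still pass through the author-controlled checks shown above.

```python
# Assumed default mapping from membership level to the highest tier it may access.
MEMBERSHIP_CLEARANCE = {
    "core": Tier.PROTECTED,     # frontier AI companies
    "strategic": Tier.HIDDEN,   # smaller AI companies, specialized safety organizations
    "trusted": Tier.DISCREET,   # university groups and independent organizations
    "observer": Tier.PUBLIC,    # governance stakeholders and the public
}


def visible_contributions(org: str, level: str, corpus: list) -> list:
    """Filter the shared corpus down to what one member may read by default."""
    clearance = MEMBERSHIP_CLEARANCE[level]
    return [c for c in corpus if c.accessible_to(org, clearance)]
```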

Ethics Nexus will initially focus on six high-priority domains that collectively address foundational safety challenges: alignment techniques for keeping systems aligned with human values (the top priority), interpretability methods for understanding internal model representations, formal specification frameworks for stating precise safety properties, robustness verification methodologies for ensuring consistent performance, safety measurement frameworks for reliable evaluation, and analysis of emergent behavior to identify unexpected capabilities.

Perhaps the most critical element is the proposed Automated Research and Development (ARD) framework, which leverages AI systems as research collaborators. This safety-first approach transforms traditional research methodology by establishing a continuous cycle in which contributions are systematically analyzed, tested, and communicated. ARD offers a compelling path toward ensuring AI alignment, and we intend to share advances on it when it is safe to do so. We sincerely hope that more organizations will work on this with us, as we view it as central to solving the alignment problem.
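
The white paper presumably details the ARD workflow; as a purely hypothetical sketch of the cycle named above, one pass might look like the following, where each stage could be carried out by human researchers, AI assistants, or both.

```python
from typing import Any, Callable


def ard_cycle(contribution: Any,
              analyze: Callable[[Any], Any],
              test: Callable[[Any], Any],
              communicate: Callable[[Any], Any]) -> Any:
    """One hypothetical pass of the analyze -> test -> communicate cycle."""
    findings = analyze(contribution)    # systematic analysis of the shared contribution
    results = test(findings)            # empirical or formal evaluation of the findings
    return communicate(results)         # synthesis communicated back to members
```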

The case for participating in Ethics Nexus rests not on idealistic appeals to the common good, but on the pragmatic recognition that coordinated safety efforts better serve long-term strategic interests than isolated competition. While the development of aligned AI is undeniably a moral imperative, that alone has not been sufficient to overcome competitive pressures. The intrinsic value of safety collaboration becomes clearer when projecting toward increasingly capable systems; assuming indefinite control without robust alignment would be dangerously naive.

The accelerating development of artificial intelligence presents both extraordinary potential and significant risks. Ethics Nexus offers a targeted institutional response to the coordination failures endemic in current AI safety research. By establishing appropriate mechanisms for collaboration while respecting valid security and competitive concerns, the organization can help move the AI research ecosystem toward a better equilibrium, one that serves both organizational and collective interests. If this proposal resonates with you, please contact us to discuss how we can build this future together.


[1] It would be preferable if an AI-enabled catastrophe did not have to occur in the first place.


ETO Research Almanac. (2025, January 6). AI safety. Retrieved from Emerging Technology Observatory: https://almanac.eto.tech/topics/ai-safety/v

Nucknall, B., Siddiqui, S., et al. (2025). In Which Areas of Technical AI Safety Could Geopolitical Rivals Cooperate? arXiv preprint.