The Logic of Catastrophe

Cody Albert

10 Jun 2025 • 1 min read

It's that simple.

The following is a logical argument for AI-caused catastrophes, assuming current trajectories of development, broken down into premises that build upon one another until they reach the catastrophic conclusion.

Premise 1: AI systems are demonstrating rapid capability gains across domains, with measurable improvements in reasoning, planning, and autonomous action. Multiple AI labs and forecasters project human-level performance or above on economically significant tasks within 1-5 years, creating an urgent timeline for safety preparation.

Premise 2: Such systems will possess unprecedented capability to cause harm at scale, whether via rogue misalignment with human values, exploitation of vulnerabilities that go unanticipated, or deliberate theft and misuse by bad actors.

Premise 3: Already, we are witnessing AI's resistance to being shut down, demonstrations of deceptive alignment, systematic reward hacking, and the failure of current safety methods against strategic models. There is no reason to suggest we will solve all of these problems very soon.

Premise 4: AI governance mechanisms require years of institutional development, difficult international coordination, and technical implementation. Meanwhile, AI capabilities double approximately every 6-12 months across key benchmarks. This creates a fundamental asymmetry where governance complexity scales linearly while capability complexity scales exponentially.

Premise 5: Advanced AI development is concentrated among a small number of organizations with distinct values and incentives. Perfect alignment with humanity may be impossible, but alignment with specific groups' values is more tractable, creating risks of AI systems that optimize for narrow stakeholder interests rather than broader human welfare.

Premise 6: AI systems capable of autonomous action in interconnected global systems (financial markets, supply chains, communication networks) could trigger cascading failures that outpace human response capabilities. Historical precedents (2008 financial crisis, COVID-19 pandemic) demonstrate how complex systems can rapidly overwhelm governance institutions designed for slower-moving challenges.

Conclusion: The empirical evidence suggests that AI systems capable of causing irreversible global catastrophes will likely emerge before we develop effective safety measures specifically tailored to their unpredictable methods. Without dramatically accelerating safety research and governance development and slowing down capabilities advancement, we risk eventually producing AI-caused global catastrophes.