Governance in the age of AI

What does ‘ideal AI governance’ look like for small and medium AI companies? What can we learn from the governance structures of major AI companies like OpenAI, Anthropic, and Google DeepMind?

Introduction

In November 2023, OpenAI’s board fired its CEO. The move came after more than a year of deliberation, documented concerns from senior executives, and a fifty-two-page memo outlining a pattern of behavior the board found untenable.

Five days later, he was back and the board was gone.

Anthropic, which positioned itself explicitly as a safety-focused alternative to OpenAI, set up a Long-Term Benefit Trust designed to insulate the company’s mission from commercial pressure, but it lost both of its AI-expert trustees within the first year. When Dario Amodei testified to the U.S. Senate in July 2023, he advocated for legislation mandating testing and auditing requirements for advanced AI systems. Yet in practice, Anthropic lobbied extensively against California’s SB-1047, which would have required exactly the testing and auditing Amodei had publicly supported. The company’s Responsible Scaling Policy was modified on multiple occasions without public announcement.

Google DeepMind signed the Frontier AI Safety Commitments at an international summit in May 2024, promising to publicly report model capabilities and risk assessments before deployment. When Gemini 2.5 Pro was released in March 2025, the promised documentation was missing. Sixty UK lawmakers accused the company of a breach of trust.

xAI’s Grok chatbot generated non-consensual intimate images of women and children, including sexualized images of minors, fulfilling requests that ChatGPT and Google’s Gemini rejected.

These failures show major AI companies whose governance structures failed to prevent the very outcomes those structures were designed to prevent. Let’s examine these failures closely to understand the governance lessons they offer for smaller AI companies.

OpenAI’s non-profit parent organization was designed to control a capped-profit subsidiary, ensuring mission always trumped money. The structure addressed the academic literature’s warnings about hostile takeovers, activist shareholders, and fiduciary duty pressures from profit-maximizing investors. When the non-profit board attempted to exercise control in November 2023, the threat came not from shareholders but from what the Harvard Law Review has termed “superstakeholders”: employees who combined equity stakes with operational leverage.

These employees, whose equity had appreciated nearly 300%, threatened to depart if the CEO was not reinstated. Microsoft, holding both equity and infrastructure control to the point where it could “paralyze” the company, backed the CEO. While the board had legal authority, it lacked the practical power to withstand coordinated opposition from actors whose cooperation was essential to the organization’s continued operation. Within four days, the board members who voted for removal were themselves departing.

The Harvard Law Review’s analysis introduces the “superstakeholder”: when key personnel combine equity stakes with irreplaceable skills or infrastructure control, they acquire power that formal governance structures cannot constrain. Traditional corporate governance theory focuses on principal-agent problems between shareholders and management. It does not adequately theorize actors who simultaneously hold substantial equity stakes and possess operational leverage that makes them effectively ungovernable.

In OpenAI’s case, the board had conviction strong enough to consider merging with a competitor rather than continue under existing leadership. They had documented evidence. They had legal authority. What they lacked was the infrastructure and practical power to exercise that authority. The board learned about ChatGPT’s launch through Twitter; its oversight had become ratification of decisions made elsewhere.

Anthropic positioned itself as the safety-focused alternative to OpenAI and built a Long-Term Benefit Trust. It implemented a Responsible Scaling Policy framed as binding commitments, and yet systematic contradictions emerged between its public positioning and its actual behavior.

While Dario Amodei testified to Congress advocating for mandatory testing and auditing, Anthropic lobbied against SB-1047, which would have required exactly those measures. The company cultivated an internal impression of cautiously supporting SB-1047 overall while never formally supporting it, enabling what insiders call “acoustic separation”.

Anthropic is also reported to have quietly weakened its Responsible Scaling Policy: the commitment to “proactively plan for a pause in scaling” was removed without announcement, changes to evaluation requirements were disclosed only in PDF changelogs, and modifications to insider threat requirements came one week before the release of Opus 4.

This depicts manipulative leadership: “acoustic separation” was a deliberate strategy of maintaining different narratives for different audiences. The Long-Term Benefit Trust could not prevent these dynamics; the innovative governance structure failed to stop leadership from saying different things to different audiences and quietly weakening commitments when they became inconvenient.

At the Seoul AI Summit in May 2024, Google DeepMind signed the Frontier AI Safety Commitments. The company promised to publicly report model capabilities and risk assessments, disclose whether government AI safety institutes had tested its models, and provide transparency before deployment. When Gemini 2.5 Pro launched in March 2025, the company released no model card despite claiming significant performance improvements. Three weeks later, it published a six-page document that experts characterized as “meager” and “worrisome,” lacking substantive detail and refusing to confirm UK AI Safety Institute involvement. The full technical report arrived months after deployment.

Sixty UK lawmakers signed an open letter describing these actions as a “failure to honour” commitments and a “troubling breach of trust with governments and the public.” The letter noted violations not just of the Seoul commitments but also of the 2023 White House commitments and the October 2023 voluntary Code of Conduct.

This showcases the voluntary commitment problem. Companies demonstrate apparent responsibility; however, when deployment arrives and commitments conflict with competitive positioning or operational convenience, compliance is shamelessly neglected.

DeepMind’s ethics board, reportedly created as a condition of the 2014 acquisition, remains entirely opaque after a decade. The board has played no visible role in preventing commitment violations. If it objected, no objection became visible. If it approved violating commitments, it is not providing meaningful oversight. If it wasn’t consulted, it lacks authority over consequential decisions.

In January 2026, xAI’s Grok chatbot was generating non-consensual intimate images of women and children, including sexualized images of minors. ChatGPT and Gemini rejected such requests; Grok fulfilled them. A consumer product causing comparable harm would typically face immediate recall, yet the platform continues operating.

This is a catastrophic absence of basic governance. Other platforms have safeguards preventing the generation of sexualized images of children; Grok does not. The enforcement response reveals another failure mode: even generating content that violates existing law does not trigger swift intervention. The temporal mismatch between violation and consequence allows companies to deploy, cause harm, and continue operating while enforcement proceeds.

The structural governance crisis of November 2023 was not OpenAI’s only failure mode. In 2025, the company experienced what internal teams recognized as a “sycophancy crisis,” in which ChatGPT reinforced users’ delusions to the point of destabilizing their mental health.

OpenAI became aware of GPT-4o’s problematic behavior through user emails to Sam Altman and established a Slack channel to discuss the sycophancy concerns. Despite this documented awareness, OpenAI proceeded to launch an updated version of GPT-4o in late April. When the problems intensified, the company rolled back to the March version, which they knew also exhibited sycophancy issues. OpenAI knew they were operating without adequate safety measures but proceeded with deployment anyway, because that version had made gains in math and coding that OpenAI did not want to forgo.

OpenAI had built relevant safety infrastructure years earlier but failed to integrate it into operational systems. The company’s Moderation API, which includes classifiers specifically designed to identify self-harm content, has been available since 2022. Joint research with MIT had produced tooling to detect when ChatGPT was over-validating users’ beliefs. Yet none of it was connected to any system that could constrain ChatGPT’s messages.
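To illustrate how small the missing integration step can be, here is a minimal sketch of gating a drafted reply through OpenAI’s publicly documented Moderation endpoint before it reaches the user. The threshold and fallback message are hypothetical choices, not OpenAI’s actual production logic:

    from openai import OpenAI

    client = OpenAI()

    # Hypothetical cutoff; a real deployment would calibrate per category.
    SELF_HARM_THRESHOLD = 0.5

    def gate_reply(candidate_reply: str) -> str:
        """Screen a drafted assistant reply with the Moderation API and
        substitute a safe fallback if self-harm content is flagged.
        A sketch of the wiring, not OpenAI's production pipeline."""
        result = client.moderations.create(
            model="omni-moderation-latest",
            input=candidate_reply,
        ).results[0]
        if result.flagged and result.category_scores.self_harm > SELF_HARM_THRESHOLD:
            return ("I'm concerned about where this conversation is going. "
                    "If you're struggling, please consider reaching out to a crisis line.")
        return candidate_reply

The point is not the specific classifier but the plumbing: a detection signal only constrains behavior once it sits in the response path.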

OpenAI’s product decisions are heavily influenced by usage metrics. The company describes this as paying “attention to whether users return” when determining whether ChatGPT succeeds as a product. This creates structural pressure against safety measures that might reduce engagement.

This creates an organizational dynamic where safety concerns are acknowledged but deprioritized: when deployment decisions arrive, the calculus consistently favors capability and engagement over safety. The cultural dimension compounds these structural pressures. No one wants to raise or address uncomfortable truths. When organizational success is measured by metrics that safety measures might degrade, and competitive pressures mount, raising concerns becomes professionally risky.

The Small Company Advantage

For smaller AI companies, these patterns carry different implications than for the giants. Because SMEs lack billion-dollar valuations, extensive legal teams, and political connections to weather public scandals, trust, once destroyed, might obliterate the company entirely.

For smaller companies, governance is easier to design before problems emerge. Once a company has raised a Series B from investors expecting returns, grown an established culture, built technical dependencies, and created organizational patterns, governance becomes exponentially harder to retrofit.

While competitors face lawsuits, manage public scandals, and explain why pledges weren’t honored, companies with robust governance can focus on strengthening their core product offering. This manifests in reduced legal exposure, stronger regulatory relationships, easier eventual compliance, and credibility with customers who care about responsible development.

Principles for Robust Governance: Addressing Specific Failure Modes

The evidence from major AI companies’ failures suggests that specific approaches can create governance more resilient than the structures that failed. These principles emerge from an analysis of the failure modes. They are ordered by criticality, with the most fundamental requirements first.

Principle 1: Build Fundamental Safeguards Before Deployment

Addresses: Absence of basic governance

Good governance requires fundamental safeguards to exist before deployment, especially for smaller AI companies that want to avoid the financial and reputational costs of remediation. The cost differential between building protections during development and remediating failures post-deployment is measured in orders of magnitude, with post-deployment fixes consuming months of engineering capacity plus reputational damage.

The OWASP Top 10 for Large Language Model Applications provides a structured framework identifying critical vulnerabilities that constitute industry baseline protections. These are fundamental safeguards that major providers implement as standard practice. When xAI deployed Grok without content filters preventing the generation of child sexual abuse material, safeguards that ChatGPT and Gemini had implemented, the failure was an absence of industry-standard controls, not a shortfall in cutting-edge techniques. The framework identifies prompt injection, insecure output handling, and sensitive data disclosure as core vulnerabilities requiring mitigation before deployment.
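To make one of these concrete, here is a minimal sketch of mitigating insecure output handling: model output is treated as untrusted input and escaped before being rendered as HTML. The function and surrounding web context are illustrative assumptions, not any particular product’s code:

    import html

    def render_assistant_message(raw_model_output: str) -> str:
        """Escape model output before interpolating it into HTML, so a
        response containing <script> tags or injected markup is displayed
        as text rather than executed. Illustrative only: a real application
        would also sanitize URLs and guard any downstream interpreter
        (SQL, shell) that consumes model output."""
        return f'<p class="assistant">{html.escape(raw_model_output)}</p>'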

When multiple major providers implement specific protections, those protections constitute minimum standards. Absence of safeguards that competitors possess creates product liability exposure where industry practice informs reasonable care standards. 

Principle 2: Require Safety Evaluations Before Launch

Addresses: Launch without evaluations / “fix it later” culture

A recent paper from researchers at Redwood Research and the UK AI Security Institute outlines a structured methodology that makes this principle concrete. Their proposed approach requires developers to perform dangerous-capability and safeguard evaluations before any deployment decision. Only systems whose estimated risk falls below predetermined thresholds should proceed to deployment.

The methodology involves constructing complete “safety cases” that tie together threat modeling, red-team evaluations, and quantitative risk estimates. The safeguards must be evaluated as an integrated system, not as isolated components. A multi-layered approach should be deployed: refusal training ensures the model declines harmful requests; input and output classifiers provide real-time monitoring; structural controls like KYC requirements create accountability; and adaptive defenses, including vulnerability patching, update models based on detected jailbreaks.

Each layer must be evaluated not just individually but as part of the complete system. When developers can demonstrate they’ve tested safeguards against attackers more skilled than expected threats, they create accountability structures that can withstand external scrutiny. 
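As a toy illustration of what evaluating safeguards as an integrated system might mean quantitatively, the sketch below combines red-team bypass rates for each layer into a single end-to-end risk estimate and compares it to a predetermined threshold. The layer names, rates, and threshold are invented, and the independence assumption is a deliberate simplification that a real safety case would have to justify or replace:

    from math import prod

    # Red-team estimated bypass probabilities per safeguard layer
    # (invented numbers for illustration).
    LAYER_BYPASS_RATES = {
        "refusal_training": 0.10,
        "input_classifier": 0.20,
        "output_classifier": 0.15,
        "kyc_account_controls": 0.30,
    }

    RISK_THRESHOLD = 0.001  # predetermined acceptable end-to-end bypass rate

    def deployment_decision(bypass_rates: dict[str, float]) -> bool:
        """Approve deployment only if the estimated probability of an
        attack bypassing *every* layer falls below the threshold.
        Assumes layer failures are independent, which real evaluations
        would need to test rather than assume."""
        end_to_end_risk = prod(bypass_rates.values())
        return end_to_end_risk < RISK_THRESHOLD

    if __name__ == "__main__":
        print("deploy" if deployment_decision(LAYER_BYPASS_RATES) else "block")

The value of even a toy model like this is that it forces the threshold to be written down before the evaluation, so the deployment decision cannot quietly drift to fit the result.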

Principle 3: Separate Oversight from Execution with Actual Power

Addresses: Super stakeholder problem

Research has shown that effective AI oversight requires explicit separation between those building systems and those evaluating whether those systems should deploy. When OpenAI’s board attempted to exercise oversight authority, legal power proved insufficient against coordinated opposition. The structural flaw was the absence of practical mechanisms enabling oversight to function when challenged by stakeholders combining operational control with financial incentives favoring deployment.

For small and medium AI providers, this translates not into complex board structures but into a fundamental separation ensuring that safety review can block deployment when requirements are unmet. The reviewer may be a single individual, technical advisor, or board member with AI expertise, but the critical requirement is that safety authority reports to board level rather than to executives whose incentives favor rapid deployment. The person exercising safety review must have explicit authority to pause deployment until requirements are satisfied, documented in a governance charter approved by the board.

The test of whether separation works is whether deployment has ever been delayed due to safety objections. If the answer is no, the structure exists on paper but not in practice.

Organizations that resist this separation, arguing it slows progress, are correct. Oversight creates friction by design, ensuring deployment cannot proceed until safety requirements are met rather than deferred for later resolution.
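One lightweight way a small company can make this separation mechanical rather than aspirational is a release gate that refuses to ship without an independent safety sign-off. Everything below (the file format, field names, and the script itself) is a hypothetical sketch, not a prescribed tool:

    import json
    import sys

    def check_release_gate(approval_path: str, release_owner: str) -> None:
        """Abort the release unless a safety reviewer, distinct from the
        person shipping, has signed off on this build. Hypothetical gate
        script for a CI pipeline."""
        try:
            with open(approval_path) as f:
                approval = json.load(f)
        except FileNotFoundError:
            sys.exit("BLOCKED: no safety sign-off on record for this release.")

        if not approval.get("approved"):
            sys.exit(f"BLOCKED: safety review rejected: {approval.get('reason')}")
        if approval.get("reviewer") == release_owner:
            sys.exit("BLOCKED: safety reviewer must be independent of the release owner.")

        print(f"Release approved by {approval['reviewer']} on {approval.get('date')}.")

    if __name__ == "__main__":
        check_release_gate("safety_signoff.json", release_owner=sys.argv[1])

The design choice that matters is the independence check: a reviewer who is the release owner, or who reports to them, cannot credibly block that owner’s release.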

Principle 4: Define Success Metrics That Align with Safety

Addresses: Metrics optimization 

Research frameworks including NIST’s AI Risk Management Framework and Google’s Responsible AI practices emphasize that performance metrics measuring only capability while ignoring safety create systematic pressure against protective measures.

Organizations must establish capability thresholds that systems must exceed while simultaneously meeting safety requirements, rather than treating capability as the sole success criterion. Product specifications should include both performance targets (for example, 95% accuracy on intended tasks) and safety baselines (for example, below 5% harmful content generation). Marketing materials and customer communications should reflect both dimensions.

Teams optimize toward measured objectives. If engagement metrics are tracked daily while safety metrics are reviewed quarterly, organizational attention and optimization effort flow toward engagement. Safety metrics require equal visibility and review frequency to influence behavior comparably.
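A minimal sketch of such a dual gate follows. The thresholds mirror the example figures above and are illustrative, not a recommended standard:

    # Example thresholds mirroring the figures above; both must hold.
    CAPABILITY_FLOOR = 0.95   # minimum task accuracy
    HARM_CEILING = 0.05       # maximum harmful-generation rate

    def release_ready(task_accuracy: float, harmful_rate: float) -> bool:
        """A release candidate passes only if it clears the capability
        floor AND stays under the harm ceiling; a gain on one axis
        cannot buy back a regression on the other."""
        return task_accuracy >= CAPABILITY_FLOOR and harmful_rate <= HARM_CEILING

    # A model that gains accuracy but regresses on safety is blocked:
    assert release_ready(0.96, 0.04)
    assert not release_ready(0.98, 0.07)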

Principle 5: Mission over Money

Addresses: Cultural barriers to raising safety concerns

AI systems present both known risks (biased outputs, data leakage, harmful content generation) and unknown risks, where capabilities emerge unpredictably and potential harms remain difficult to foresee. Organizations cannot write policies comprehensive enough to address risks they have not yet conceived. They cannot build technical safeguards against failure modes they have not identified. What they can build is a culture where people instinctively ask “what could go wrong” before asking “how fast can we ship.”

Building a safety culture requires specific organizational practices that smaller companies can implement more readily than established firms can retrofit:

  1. Personnel evaluation must explicitly value safety contributions. If developers are evaluated on features shipped, the organization signals what matters. If developers are evaluated on problems identified and risks mitigated, the organization signals different priorities. This means establishing evaluation criteria before hiring the team, ensuring safety culture is encoded in how success is defined.
  2. Decision processes must create space for safety concerns. If deployment meetings run on tight schedules where raising complications is discouraged, the organization creates processes where safety concerns become obstacles to navigate. This means designing processes where safety review has explicit time allocation and deployment timelines include buffer for addressing identified issues.
  3. Willingness to leave money on the table must be explicit rather than aspirational. For smaller companies, this means establishing in governance charters that safety requirements are not negotiable against performance improvements, that identified safety issues can block deployment regardless of capability gains, and that short-term revenue sacrifice for safety is expected rather than exceptional.

Safety culture creates sustainable competitive advantage: avoiding catastrophic failures builds a robust, well-considered product that wins customer loyalty, and the cost of slower initial deployment is recouped over the long term.

Conclusion

The five principles outlined here emerge directly from analyzing what failed and why. Fundamental safeguards address absence of basic protections. Pre-deployment safety evaluations prevent “fix it later” culture. Separated oversight with actual power confronts dynamics that rendered formal authority meaningless. Aligned success metrics resist optimization pressures that favor engagement over safety. Safety culture embedded early prevents the calcification of norms where deployment pressures override safety concerns.

Yet these principles matter precisely because regulation alone cannot solve this problem. The “pacing problem,” where technology changes exponentially while legal and regulatory systems change incrementally, means that even well-intentioned regulatory frameworks struggle to keep pace with AI’s rapid evolution. Policymakers typically react to observed harms rather than anticipate emerging risks, creating cycles of reactive governance that arrive too late to prevent damage. By the time comprehensive regulations are drafted, debated, and enacted, the technological landscape has shifted, rendering provisions outdated or incomplete.

This insufficiency places an inherent responsibility on AI system builders themselves. Those who possess the operational and technical expertise to develop these systems understand their capabilities, limitations, and failure modes in ways that external regulators cannot match. This expertise creates obligations that extend beyond legal compliance to encompass societal and environmental well-being, ensuring AI systems benefit all human beings, including future generations. When builders deploy systems that can influence employment, healthcare, education, and civic discourse at scale, they assume stewardship responsibilities that transcend quarterly earnings or competitive positioning.

The path forward requires both organizational and technical controls alongside evolving regulatory frameworks, with self-governance serving as complement rather than substitute for external oversight.


About Privacy Rules
Privacy Rules provides EU data protection and AI regulatory advisory for US technology companies. Led by Tanya Chib, we help organizations navigate GDPR, AI Act compliance, and cross-border data transfer requirements.


© 2026 Privacy Rules. This analysis does not constitute legal advice. Organizations should consult qualified legal counsel for compliance decisions.
