OpenAI and Anthropic have each signed unprecedented deals granting the US government early access to conduct safety testing on the companies’ flashiest new AI models before they’re released to the public.
According to a press release from the National Institute of Standards and Technology (NIST), the deal creates a “formal collaboration on AI safety research, testing, and evaluation” between the US Artificial Intelligence Safety Institute and both Anthropic and OpenAI.
Through the deal, the US AI Safety Institute will “receive access to major new models from each company prior to and following their public release.” This will help ensure that public safety doesn’t depend exclusively on how the companies “evaluate capabilities and safety risks, as well as methods to mitigate those risks,” NIST said, but also rests on collaborative research with the US government.
The US AI Safety Institute will also be collaborating with the UK AI Safety Institute when examining models to flag potential safety risks. Both groups will provide feedback to OpenAI and Anthropic “on potential safety improvements to their models.”
NIST said that the agreements also build on voluntary AI safety commitments that AI companies made to the Biden administration to evaluate models to detect risks.
Elizabeth Kelly, director of the US AI Safety Institute, called the agreements “an important milestone” to “help responsibly steward the future of AI.”
Anthropic co-founder: AI safety “crucial” to innovation
The announcement comes as California is poised to pass one of the country’s first AI safety bills, which would regulate how AI is developed and deployed in the state.
Among the most controversial aspects of the bill is a requirement that AI companies build in a “kill switch” to stop models from introducing “novel threats to public safety and security,” especially if the model is acting “with limited human oversight, intervention, or supervision.”
Critics say the bill overlooks existing safety risks from AI—like deepfakes and election misinformation—to prioritize the prevention of doomsday scenarios and could stifle AI innovation while providing little security today. They’ve urged California’s governor, Gavin Newsom, to veto the bill if it arrives at his desk, but it’s still unclear whether Newsom intends to sign it.
Anthropic was among the AI companies that cautiously supported California’s controversial AI bill, Reuters reported, with the company saying that after a late round of amendments, the potential benefits of the regulations likely outweigh the costs.
In a letter to Newsom last week, the company’s CEO, Dario Amodei, explained why Anthropic now supports the bill, Reuters reported. He wrote that although Anthropic isn’t certain about aspects of the bill that “seem concerning or ambiguous,” Anthropic’s “initial concerns about the bill potentially hindering innovation due to the rapidly evolving nature of the field have been greatly reduced” by recent changes to the bill.
OpenAI has notably joined critics opposing California’s AI safety bill and has been called out by whistleblowers for lobbying against it.
In a letter to the bill’s co-sponsor, California Senator Scott Wiener, OpenAI’s chief strategy officer, Jason Kwon, suggested that “the federal government should lead in regulating frontier AI models to account for implications to national security and competitiveness.”
The ChatGPT maker’s decision to strike a deal with the US AI Safety Institute seems in line with that thinking. As Kwon told Reuters, “We believe the institute has a critical role to play in defining US leadership in responsibly developing artificial intelligence and hope that our work together offers a framework that the rest of the world can build on.”
While some critics worry California’s AI safety bill will hamper innovation, Anthropic’s co-founder, Jack Clark, told Reuters today that “safe, trustworthy AI is crucial for the technology’s positive impact.” He confirmed that Anthropic’s “collaboration with the US AI Safety Institute” will leverage the government’s “wide expertise to rigorously test” Anthropic’s models “before widespread deployment.”
In NIST’s press release, Kelly agreed that “safety is essential to fueling breakthrough technological innovation.”
By directly collaborating with OpenAI and Anthropic, the US AI Safety Institute also plans to conduct its own research to help “advance the science of AI safety,” Kelly said.