U.S. expands AI model testing deals with Google DeepMind, Microsoft, xAI
Washington has extended its AI testing agreements to Google DeepMind, Microsoft and xAI, giving CAISI access to frontier models both before launch and after deployment.

The Commerce Department has turned a voluntary testing arrangement into something much closer to a standing federal review channel for frontier AI. On May 5, 2026, the Center for AI Standards and Innovation, housed inside the National Institute of Standards and Technology, said new agreements with Google DeepMind, Microsoft and xAI will let the U.S. government evaluate models before they are publicly released, assess them after deployment and pursue targeted research on frontier AI security risks. CAISI said it has already completed more than 40 evaluations, including reviews of state-of-the-art systems that have not yet been released, and described itself as industry's primary point of contact within the U.S. government for commercial AI testing and research.

What the government will actually look for is narrower and sharper than the broad promises that have often surrounded AI safety. CAISI said its evaluations are aimed at demonstrable national security risks, including cybersecurity, biosecurity and chemical weapons concerns, along with assessments of foreign AI systems, backdoors and other covert malicious behavior. The agency also said the deals support information-sharing, voluntary product improvements and work in classified environments, and that developers often provide models with reduced or removed safeguards so CAISI can test them thoroughly. Through the CAISI-convened TRAINS Taskforce, evaluators from across government can join the process and feed their findings back into the assessments.

The new framework builds directly on the Biden-era model, but it is more operational and more explicitly tied to national security. In August 2024, the U.S. AI Safety Institute announced agreements with OpenAI and Anthropic that gave the government access to major new models before and after public release. Those arrangements were grounded in the 2023 AI executive order, which directed federal agencies to manage safety and security risks, including dangerous biological materials, software vulnerabilities and foreign misuse. The earlier deals were framed as collaboration on safety research, testing and evaluation; the new CAISI agreements were negotiated under directives from Commerce Secretary Howard Lutnick and the Trump administration's America's AI Action Plan.

That shift matters because it tests whether voluntary oversight can harden into something closer to enforceable national policy without Congress writing a new AI law. By giving the largest labs a formal role in the testing regime, Washington is also letting them help shape the rules that may later constrain them. For now, the United States is relying on repeated access, classified evaluations and negotiated safeguards rather than mandatory pre-clearance, but the structure now exists for the government to demand far more than reassurance.