
Anthropic Launches Claude Code Auto Mode, Letting AI Self-Check Actions for Safety

Anthropic's Claude Code auto mode self-checks every action for safety and prompt-injection attacks before executing, with the company yet to disclose exactly how it decides what's risky.

Maria Santos · 3 min read
Source: techcrunch.com

Anthropic released auto mode for Claude Code on March 24, 2026, resolving one of the most persistent friction points in AI-assisted development: the choice between drowning in permission prompts or handing the model unchecked authority to act.

Anthropic describes the new mode as a middle ground between the default configuration, which requires user approval for every file write and command execution, and skipping permissions altogether, the approach developers have used to avoid constant interruptions. In auto mode, AI safeguards review each action Claude wants to take before it executes, checking that the model isn't doing something the user didn't request and scanning for signs of prompt injection attacks. If the guardrails deem an action safe, Claude Code proceeds; if not, the action is blocked.

Before each tool call runs, a classifier reviews it for potentially destructive behavior such as mass file deletion, exfiltration of sensitive data, or execution of malicious code. The check is not airtight in either direction: the classifier can still let risky actions through when user intent is ambiguous or Claude lacks sufficient context about its environment, and it produces false positives too, occasionally flagging benign commands.
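The gating flow described above can be sketched very loosely as a pre-execution check on each proposed command. To be clear, Anthropic has not published how its classifier works (it is presumably a trained model, not a rule list), so the patterns, function name, and risk categories below are invented purely for illustration:

```python
import re

# Hypothetical, rule-based stand-in for the (undisclosed) safety classifier.
# These patterns are invented examples of the risk categories the article
# mentions; they are NOT Anthropic's actual criteria.
RISKY_PATTERNS = [
    (re.compile(r"\brm\s+-rf\s+/"), "mass file deletion"),
    (re.compile(r"\bcurl\b.*\|\s*(sh|bash)\b"), "piping remote code into a shell"),
    (re.compile(r"\b(scp|nc)\b.*\.(env|pem|key)\b"), "possible credential exfiltration"),
]

def review_tool_call(command: str) -> tuple[bool, str]:
    """Return (allowed, reason) for a proposed shell command.

    Mirrors the article's description: each tool call is reviewed
    before it runs, and flagged actions are blocked.
    """
    for pattern, reason in RISKY_PATTERNS:
        if pattern.search(command):
            return False, f"blocked: {reason}"
    return True, "allowed"

print(review_tool_call("ls -la"))
print(review_tool_call("rm -rf / --no-preserve-root"))
```

A real classifier would weigh user intent and environmental context rather than match fixed strings, which is exactly why, as the article notes, it can err in both directions.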

The transparency gap is already drawing scrutiny. TechCrunch noted that Anthropic has not yet disclosed the specific criteria its safety layer uses to distinguish safe actions from risky ones, details developers will likely want to understand before deploying the feature broadly, and said it has reached out to the company for additional information.

On the rollout, Anthropic confirmed that Claude Teams users get auto mode immediately as a research preview, with Enterprise and API users to follow in the coming days. The feature currently works only with Claude Sonnet 4.6 and Opus 4.6, and Anthropic recommends running it in "isolated environments," sandboxed setups kept separate from production systems to limit the damage if something goes wrong. Auto mode may also slightly increase token consumption, costs, and latency on tool calls.

AI-generated illustration

Auto mode follows the earlier launch of Claude Code Review, Anthropic's automatic code reviewer designed to catch bugs before they hit the codebase, and Dispatch for Cowork, which allows users to send tasks to AI agents to handle work on their behalf.

Alongside auto mode, Anthropic separately introduced a fast mode for Claude Code. The feature works with both Claude Sonnet 4.6 and Opus 4.6, the latter announced in early February 2026 with improved coding and agentic task performance. Fast mode is designed to accelerate developer workflows, with Anthropic claiming it can process requests up to 2.5 times faster than standard Claude Code. The feature targets high-intensity development tasks, including debugging, reviewing large codebases, and generating complex scripts under tight deadlines, and is being rolled out first to internal users before gradually expanding to select external developers via the Claude Code platform and Anthropic's developer API.

The auto mode addition signals Anthropic's push toward genuinely autonomous coding agents, systems that can operate for extended periods without human intervention. Whether the safety layer is tight enough to earn that trust at scale remains, for now, an open question Anthropic has yet to answer publicly.
