Analysis

Rust-Powered ForgeCode Challenges Claude Code as Faster AI Harness

ForgeCode is winning attention by speeding up the orchestration layer, not the model. For Rust users, the real question is which harness gets out of the way fastest.

Nina Kowalski · 5 min read

The complaint that opened the race

Claude Code was useful right up until the wait started to feel like part of the job. Simple refactors could stall long enough to make the spinner feel like the main event, and that latency kept surfacing in team feedback even after Claude Code had been rolled out more broadly. ForgeCode enters that conversation with a very different pitch: not a smarter model, but a Rust-powered harness that makes the agent layer faster, more reliable, and easier to control.


That framing matters because the argument is no longer about which LLM sounds best in a demo. It is about which orchestration layer can keep up with real developer work, where tool use, recovery from bad outputs, shell integration, and context handling matter as much as the model behind the curtain.

Why ForgeCode is being treated as infrastructure, not just another agent

ForgeCode describes itself as the world’s top-ranked open-source coding harness, and its GitHub numbers show that the project is attracting real momentum, with about 6.5k stars and 1.3k forks. It is released under Apache 2.0, written in Rust, and designed to be model-agnostic rather than tied to one vendor’s stack. That means it can wrap LLMs through OpenRouter or direct API keys and run straight from the shell, which is exactly the kind of low-friction control Rust developers tend to care about.
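That model-agnostic, shell-first design can be pictured with a minimal sketch. To be clear, nothing below is ForgeCode's actual code: the endpoint URLs are the public OpenRouter and Anthropic API routes, and the environment-variable names are the conventional ones, not necessarily what ForgeCode reads.

```rust
use std::env;

/// Hypothetical provider selection: a model-agnostic harness can route
/// the same request through OpenRouter or a direct vendor API depending
/// on which credentials the shell environment already provides.
fn pick_endpoint() -> Option<(&'static str, String)> {
    // Conventional env var names, used here purely for illustration.
    if let Ok(key) = env::var("OPENROUTER_API_KEY") {
        return Some(("https://openrouter.ai/api/v1/chat/completions", key));
    }
    if let Ok(key) = env::var("ANTHROPIC_API_KEY") {
        return Some(("https://api.anthropic.com/v1/messages", key));
    }
    None
}

fn main() {
    match pick_endpoint() {
        Some((url, _key)) => println!("routing requests to {url}"),
        None => println!("no API key found in the environment"),
    }
}
```

The point of the sketch is the shape, not the details: the harness owns the routing decision, so swapping vendors is a shell-environment change rather than a product change.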

The built-in agent split is one of the clearest signs that ForgeCode is thinking like an operator’s tool, not a chatbot. It ships with forge for editing, sage for read-only research, and muse for planning, so the user does not have to treat every request the same way. That division of labor is the story here: the tool is trying to reduce failure modes by shaping how the model is used.
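Why a capability split reduces failure modes is easy to see in a toy sketch. The agent names mirror the article's forge/sage/muse division, but the code is hypothetical and says nothing about ForgeCode's real internals:

```rust
/// Toy model of capability-split agents: each agent's permissions are
/// fixed by its role, so a research query can never mutate the repo.
#[derive(Debug, Clone, Copy)]
enum Agent {
    Forge, // editing: may modify files
    Sage,  // research: read-only
    Muse,  // planning: produces a plan, touches nothing
}

impl Agent {
    fn can_write(self) -> bool {
        matches!(self, Agent::Forge)
    }
}

fn main() {
    for agent in [Agent::Forge, Agent::Sage, Agent::Muse] {
        println!("{agent:?} can write files: {}", agent.can_write());
    }
}
```

Encoding the role in the type, rather than in a prompt, means the "read-only" promise is enforced by the harness instead of requested from the model.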

ForgeCode’s own history also shows why the project has become a serious alternative. It launched in late January 2025, reached version 2.8.0 by April 2026, and its early-access demand was sharp enough to stand out on its own. In a July 27, 2025 pricing post, the team said signups jumped 17x and usage rose 10x in just a few days, while the heaviest users were making thousands of AI requests a day and sometimes spending more than $500 daily on inference.

The benchmark fight is about more than bragging rights

The benchmark story is where ForgeCode starts to look less like an interesting open-source project and more like a direct challenge to Claude Code. ForgeCode’s site says it leads TermBench 2.0, and its March 3, 2026 benchmark post said the system initially passed 25 percent of the tests before engineering changes pushed it to a state-of-the-art 78.4 percent with gemini-3.1-pro-preview.

That said, the benchmark narrative needs the right reading. TermBench 2.0 is a harder, better-verified version of Terminal-Bench, announced on November 7, 2025, but ForgeCode’s own numbers are still self-reported. They are real data points, just not neutral ones, which means the comparison is useful without being the final word.

The more grounding signal comes from independent SWE-bench Verified results, where ForgeCode still leads Claude Code, but by a much smaller margin. That narrower gap is important because it suggests ForgeCode is not winning only in a friendly benchmark setup. It is also showing strength where verification is tighter and the advantage is harder to inflate.

What Anthropic added to Claude Code, and why it still feels different

Claude Code is not standing still. Anthropic launched it on February 24, 2025 as a limited research preview alongside Claude 3.7 Sonnet, priced at $3 per million input tokens and $15 per million output tokens, with thinking tokens billed as output. That launch made Claude Code part of a broader push to turn Anthropic’s models into usable coding workflows, not just API endpoints.
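At those published rates, per-request cost is simple arithmetic. The rates below are the ones Anthropic announced; the token counts are invented for illustration:

```rust
/// Worked example of the published Claude 3.7 Sonnet pricing:
/// $3 per million input tokens, $15 per million output tokens
/// (thinking tokens count toward output).
fn request_cost_usd(input_tokens: u64, output_tokens: u64) -> f64 {
    const INPUT_PER_MILLION: f64 = 3.0;
    const OUTPUT_PER_MILLION: f64 = 15.0;
    input_tokens as f64 / 1e6 * INPUT_PER_MILLION
        + output_tokens as f64 / 1e6 * OUTPUT_PER_MILLION
}

fn main() {
    // A hypothetical refactor turn: 20k tokens of context in, 4k out.
    let cost = request_cost_usd(20_000, 4_000);
    println!("estimated cost: ${cost:.2}"); // $0.06 + $0.06 = $0.12
}
```

The asymmetry matters in practice: a harness that trims wasted output (and wasted thinking) saves five times what the same trimming saves on input.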

On May 22, 2025, Anthropic made Claude Code generally available and added background tasks plus VS Code and JetBrains integrations. Then on September 29, 2025, it pushed the product further with a native VS Code extension, terminal UX updates, and checkpoints for more autonomous operation. The progression is real, and it shows Anthropic tightening the workflow around the model.

Still, the competitive shape is different. Anthropic, under Dario Amodei, is shipping a model-led product that keeps adding orchestration features. ForgeCode is starting from the opposite direction, using Rust to build the control plane first and letting models plug in behind it.

Who actually benefits from the Rust-powered approach

The users most likely to feel the difference are the ones who live in iterative, tool-heavy workflows. If you spend your day asking an assistant to inspect code, rewrite files, plan a refactor, then go back and repair a bad patch, latency becomes a UX problem and recovery becomes a productivity problem. ForgeCode’s pitch is that a model-agnostic Rust harness can reduce both by controlling the path around the model, not just the quality of the prompt.

That also explains why this debate has spread so quickly through the Rust and agent-adjacent communities. A tool that can move between Claude, GPT, OpenAI’s o-series, Grok, DeepSeek, Gemini, and 300-plus models, while staying shell-first and open source, is not just another client. It is infrastructure that tries to make the whole category more usable.

There is also a direct cost story hiding underneath the performance story. Once top users are making thousands of requests a day and pushing inference bills past $500 daily, every wasted retry matters. Faster recovery, fewer dead ends, and better control over tool use stop being nice-to-haves and start looking like the difference between a sustainable workflow and a noisy one.
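A quick back-of-envelope makes the stakes concrete. The article reports only "thousands of requests" and "more than $500 daily," so the specific inputs below are assumptions, not measurements:

```rust
fn main() {
    // Assumed figures, chosen to be consistent with the reported
    // "thousands of requests" and ">$500/day" for top users.
    let requests_per_day = 5_000.0_f64;
    let avg_cost_per_request = 0.10; // $500 / 5,000 requests
    let retry_rate = 0.10; // assume 10% of turns are wasted retries

    let daily_spend = requests_per_day * avg_cost_per_request;
    let wasted = daily_spend * retry_rate;
    println!("daily spend ${daily_spend:.0}, of which ${wasted:.0} is retries");
}
```

Under those assumptions, a harness that halves the retry rate saves a heavy user on the order of $25 a day on inference alone, before counting the developer time spent waiting.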

The real contest is for the orchestration layer

ForgeCode is challenging Claude Code in the place that matters most to working developers: the layer between intent and execution. Claude Code still has the advantage of Anthropic’s model ecosystem and its growing IDE surface area, but ForgeCode is making a stronger case that Rust can turn the harness itself into the product.

That is the broader shift worth watching. AI coding tools are being judged less like demos and more like systems software, where latency, reliability, and control decide whether the agent feels clever or merely obedient. In that world, the Rust-powered harness is not a side story. It is the battlefield.
