OpenAI unveils Jalapeño AI chip with Broadcom for faster inference

OpenAI and Broadcom unveiled Jalapeño, OpenAI's first Intelligence Processor, built for LLM inference. OpenAI says early testing showed better performance per watt than current hardware.

Marcus Williams·6/24/2026·1 min read

Published 02:55 PM

OpenAI and Broadcom unveiled Jalapeño, a custom AI accelerator built for large language model inference, and OpenAI said the chip moved from design to production in nine months. The company described Jalapeño as its first Intelligence Processor, a sign that it wants more control over the hardware layer behind its models.

OpenAI said early testing showed substantially better performance per watt than current state-of-the-art hardware. Engineering samples are already running machine-learning workloads, including GPT-5.3-Codex-Spark, in the lab at production target frequency and power, a detail that points to the chip’s immediate role in inference rather than general-purpose computing.

The chip was presented to OpenAI chief executive Sam Altman and president Greg Brockman by Broadcom chief executive Hock Tan and semiconductor president Charlie Kawwas. OpenAI and Broadcom said Jalapeño is the first AI accelerator in a multi-generation compute platform they are building together, with the architecture designed to work across large language models as OpenAI pushes to reduce its dependence on existing chip suppliers.

The unveiling came alongside the companies’ broader plan to deploy 10 gigawatts of OpenAI-designed AI accelerators. That partnership, disclosed on October 13, 2025, targeted the start of rack deployments in the second half of 2026 and a full rollout by the end of 2029, putting Broadcom at the center of an infrastructure buildout that could reshape the economics of frontier AI.

Broadcom has argued that custom chips can be tuned across compute, memory, network I/O and packaging for customers’ workloads three, five and even 10 years out. For OpenAI, the move ties its model ambitions more tightly to the hardware stack, where power consumption, reliability and supply control increasingly determine how fast AI systems can scale.

This article was produced by Prism’s automated news system from verified source data, official records, and press releases, then run through automated quality and moderation checks before publishing. The system is built and supervised by the people who set the standards it runs under. Read our full AI policy.
