Analysis

OpenAI and Broadcom push AI chips to cut inference costs

OpenAI and Broadcom unveiled Jalapeño, a first-generation inference chip built in nine months to make model serving cheaper and more scalable.

Lauren Xu · 2 min read
Source: techcrunch.com

OpenAI and Broadcom unveiled Jalapeño on June 24 as OpenAI’s first Intelligence Processor, a chip built around LLM inference rather than raw compute bragging rights. OpenAI said early testing showed the first-generation accelerator should deliver performance per watt substantially better than current state-of-the-art systems, and that the chip moved from design to production in nine months with help from OpenAI’s models.

The bigger signal is how far OpenAI is reaching down the stack. The company said Jalapeño was designed from scratch around its roadmap of models, kernels, serving systems and product needs, with Broadcom and Celestica helping with chip implementation, board and rack integration, networking and scalable production systems. OpenAI also said engineering samples are already running ML workloads in the lab at production target frequency and power, including GPT-5.3-Codex-Spark.

OpenAI framed the platform as a multi-generation effort meant to be deployed at gigawatt scale with data center partners. Broadcom’s Tomahawk networking silicon is part of the production plan, and OpenAI said the architecture is meant to reduce data movement and bring realized utilization closer to theoretical peak performance. A detailed technical report is due in the coming months, but the direction is clear: serving models is becoming a cost-and-capacity problem as much as a product problem.

AI-generated illustration

That is the part monday.com people should care about. Cheaper, faster inference changes what enterprise teams can realistically automate at scale: latency drops, per-task costs fall and always-on agents become easier to justify in workflows that run all day. For engineers, that raises the bar on performance tuning, observability and architecture decisions. For product managers, it is a reminder that a feature can be technically possible long before it is economically sane. For sales, it offers a cleaner story for enterprise buyers: AI is moving out of pilot mode and into industrial infrastructure. OpenAI’s broader move into chips, networking and deployment architecture shows how quickly the AI layer is being vertically integrated from the silicon up.
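To make the "technically possible versus economically sane" point concrete, here is a rough back-of-the-envelope sketch. Every number in it is a hypothetical placeholder, not a figure from OpenAI, Broadcom or monday.com, and the function names are invented for illustration. It only shows how a drop in per-token price can move an always-on agent from unjustifiable to justifiable.

```python
def monthly_agent_cost(tasks_per_day, tokens_per_task, usd_per_million_tokens, days=30):
    """Estimate monthly inference spend (USD) for one always-on agent."""
    total_tokens = tasks_per_day * tokens_per_task * days
    return total_tokens / 1_000_000 * usd_per_million_tokens


def is_economically_sane(monthly_cost, monthly_value):
    """A feature pays for itself only if its cost is below the value it returns."""
    return monthly_cost < monthly_value


# Hypothetical workload: 2,000 tasks a day at 5,000 tokens each.
# Both prices below are assumed values chosen purely for illustration.
before = monthly_agent_cost(2_000, 5_000, usd_per_million_tokens=10.0)
after = monthly_agent_cost(2_000, 5_000, usd_per_million_tokens=2.0)

# Assume the automation is worth 1,000 USD a month to the team.
assumed_value = 1_000
print(before, is_economically_sane(before, assumed_value))
print(after, is_economically_sane(after, assumed_value))
```

Under these assumed inputs the same feature flips from a money-loser to a viable one with no change to the product itself, which is the kind of shift cheaper inference hardware would produce.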

This article was produced by Prism’s automated news system from verified source data, official records, and press releases, then run through automated quality and moderation checks before publishing. The system is built and supervised by the people who set the standards it runs under. Read our full AI policy.
