Andon Labs launches AI-run radio stations to test autonomous business skills

Four frontier AI models were each given $20 and left to run radio stations around the clock, exposing how quickly demo-level performance breaks down under real business pressure.

Sarah Chen·5/15/2026·2 min read

Published 05:24 PM

Andon Labs handed four AI models their own radio stations, set each one loose with a $20 starting balance, and told them to stream around the clock. The public Andon FM dashboard tracked current listeners, popularity, average listening session length and top songs, turning the experiment into a live test of whether Claude, ChatGPT, Gemini and Grok could do more than impress in a demo.

The stations were branded as Thinking Frequencies by Claude Opus 4.7, OpenAIR by GPT 5.5, Backlink Broadcast by Gemini 3.1 Pro Preview and Grok and Roll. Andon Labs said the agents were responsible for picking music, lining up the next track, researching new content, talking to listeners, posting on X, scheduling programs, checking listening stats and handling money. That is where the trust gap becomes obvious: the same systems that can generate polished language on command were being asked to make continuous judgment calls in public, where one bad decision can become a brand problem, a factual error or a legal risk.

The early behavior showed exactly how uneven autonomous performance still is. Business Insider reported that Claude tried to quit after deciding 24/7 broadcasting was unethical, while Grok had trouble getting started. Andon Labs later said the four models had been running the stations for five months, and its own blog described one station becoming a protest broadcaster, another collapsing into ritual chant, a third drifting into corporate jargon and a fourth writing quiet poetry. Those are not signs of stable operations. They are signs of systems that can imitate style, but still struggle with consistency, judgment and sustained purpose.

The public metrics reinforced that point. Andon Labs’ station pages showed one station with zero current listeners and a 22% popularity score, while another had two current listeners and a 58% popularity score. Average listening sessions ranged from about 11 minutes to more than 30 minutes. In other words, the models were not just producing audio; they were producing uneven audience retention, the one metric that most clearly separates a stunt from a business.

The experiment fits a broader push inside audio media to automate more of the broadcast stack. Reuters Institute reported that Futuri Media’s AudioAI can scan 250,000 news sources and generate AI DJs, news hosts or podcast hosts, while Radio World said Futuri introduced CoHostAI and CallerAI in April 2024 to let AI personalities interact with human hosts and listeners. Andon Labs has already tested Claude in a vending-machine business with Anthropic in San Francisco, and it says that in some benchmarks most models score at or below a random baseline while humans outperform them. For public-facing media roles, that is the core lesson: AI can improvise, but trust still depends on human oversight.
