
Roblox’s AI chat filters rewrite and block messages, players say

Roblox scans and alters chat with AI in real time, company materials show. Players report overblocking that can silence innocuous messages and break gameplay.

Dr. Elena Rodriguez·3/5/2026·3 min read

Published 05:13 PM



Roblox now scans and alters players’ chat in real time using machine learning models, company documents and technical accounts show, and users report the system sometimes scrubs entire conversations, even harmless words like “hi” and “hello.” The disconnect between Roblox’s safety pitch and gamers’ experience is prompting fresh scrutiny of how the platform balances child protection with everyday play.

Roblox’s public materials emphasize conservative, proactive moderation. “Text messages are scanned as they are typed using machine learning models trained to detect harassment, sexual content, hate speech and attempts to share personally identifiable information,” the company says, adding that “content that violates policy is blocked before it reaches other users rather than removed after reports are filed.” Roblox also describes Sentinel, an AI trained on human-reviewed conversations “to detect early signals of potential child endangerment, such as grooming.”

Technical accounts describe a high-scale, low-latency moderation stack. A ZenML case study says the platform processes an average of 6.1 billion chat messages and 1.1 million hours of voice communication per day for roughly 97.8 million daily active users, and must serve “over 750,000 requests per second for text filtering,” with models that make decisions in milliseconds. Experimental real-time interventions such as in-experience warnings and time-outs have yielded modest gains: ZenML and PYMNTS report a 5% reduction in filtered chat messages and a 6% decline in consequences from abuse reports.
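A back-of-the-envelope calculation puts those figures in proportion. The sketch below uses only the numbers quoted above from the ZenML case study; the variable names are illustrative, and the sources do not explain how the request rate relates to the daily message count, so the ratio is plain arithmetic and not an account of Roblox's architecture.

```python
# Figures quoted from the ZenML case study cited above.
MESSAGES_PER_DAY = 6.1e9
DAILY_ACTIVE_USERS = 97.8e6
FILTER_REQUESTS_PER_SECOND = 750_000  # quoted as "over 750,000", so a floor

SECONDS_PER_DAY = 24 * 60 * 60

# Average chat throughput if messages were spread evenly across the day.
avg_messages_per_second = MESSAGES_PER_DAY / SECONDS_PER_DAY

# Average number of chat messages sent per daily active user.
messages_per_user_per_day = MESSAGES_PER_DAY / DAILY_ACTIVE_USERS

# How many filter requests the quoted rate implies per average message.
requests_per_avg_message = FILTER_REQUESTS_PER_SECOND / avg_messages_per_second

print(f"{avg_messages_per_second:,.0f} messages per second on average")
print(f"{messages_per_user_per_day:.1f} messages per daily active user")
print(f"{requests_per_avg_message:.1f} filter requests per average message")
```

On those numbers, chat averages roughly 70,600 messages per second, or about 62 messages per daily user, while the quoted filtering rate is more than 10 times the average message rate. The sources do not say why; peak-hour traffic and checks that run while a player is still typing are possible contributors, but that is an assumption.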

An industry podcast and developer commentary add detail on the models and deployment. A May 9, 2024 podcast excerpt notes Roblox “adopted a technology called Bert and distill Bert” and optimized those models to run much more efficiently so “as you're typing we are checking what you're typing and we just don't let it through.” The podcast and a technical case study reference transformer-based models, quantization and distillation techniques, GPU serving, and multilingual support, though figures differ: one source says models were “originally” extended to 16 languages, while ZenML reports coverage across 28 languages.

That architecture is visible to players in ways the company may not intend. On Roblox’s DevForum, dozens of users describe a cascading censorship problem: after one message is tagged, subsequent messages are automatically censored for minutes, regardless of content. “The chat filter quite literally filters anything that is said, even words like ‘hi’, ‘hello’, etc,” one forum post says. Another user reported losing rounds in the social deduction game Among Us because they “are not allowed to communicate, you can’t make your statement and you might loose the round.” A forum thread posts raw examples of blocked output, including repeated lines of hashes and tokens such as “###############################”, “!clear”, “rip”, and “We love Roblox tags.”

Some forum participants speculate the system’s attention-based context linking may cause the cascade: “IIRC Roblox uses AI with attention context to infer connections between your messages, which means if one message gets censored the rest may be interpreted as continuations and get censored aswell,” a user wrote.
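The forum user's theory can be illustrated with a toy model. The sketch below is not Roblox's implementation: the blocklist, the three-message context size and the function names are all hypothetical, and a keyword check stands in for a learned classifier. It shows only how scoring each message together with recent history could let one flagged message taint the harmless ones that follow.

```python
from collections import deque

BLOCKLIST = {"badword"}  # hypothetical stand-in for a learned classifier
CONTEXT_SIZE = 3         # hypothetical number of prior messages considered


def violates(text: str) -> bool:
    """Toy classifier: flags text containing any blocklisted token."""
    return any(token in BLOCKLIST for token in text.lower().split())


def filter_chat(messages, context_size=CONTEXT_SIZE):
    """Return one displayed string per message, scoring each with its context."""
    history = deque(maxlen=context_size)  # most recent raw messages
    shown = []
    for message in messages:
        # Assumption: the model sees prior messages joined with the new one.
        scored_text = " ".join([*history, message])
        shown.append("#" * len(message) if violates(scored_text) else message)
        history.append(message)
    return shown


chat = ["hi", "badword", "hello", "hi", "ready?", "hello"]
for original, displayed in zip(chat, filter_chat(chat)):
    print(f"{original!r:>10} -> {displayed}")
```

In this toy version, the three messages after "badword" are replaced with hashes even though they are harmless, and the final "hello" passes once the flagged message has left the context window. Players describe cascades that last minutes, not a fixed number of messages, so a real system, if it behaves this way at all, may use time-based or other context rules.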

Roblox frames its approach as a safety-first trade-off. “We strive to make our systems as safe as possible by default, especially for our youngest users,” the company says, and acknowledges the difficulty of detecting subtle harms across conversational history. But the evidence in public documentation and player reports leaves open key questions: whether a newly reported “real-time chat rephrasing” rollout is distinct from existing typed-message scanning, how rephrasing versus blocking is implemented, and why age-gated experiences sometimes face strict filters. Those gaps matter for millions of daily players and for debates over automated moderation’s limits and transparency.
