Catalonia reveals AI search's multilingual retrieval and authority problem
Catalonia shows AI search can get the language right and the authority wrong, surfacing different realities before it ever writes an answer.

The real problem is authority, not translation
Catalonia is a brutal stress test for AI search because two official languages share the same geography, but not always the same institutions, habits, or sources of truth. That is what makes multilingual AI search so much more than a translation issue: the system is deciding which publishers, pages, and local perspectives count as canonical before it ever generates the answer.
That matters even more now that Google says AI Overviews are available in more than 200 countries and territories and more than 40 languages. An earlier stage of the rollout had already passed more than 100 countries and territories and more than 1 billion global users each month. Google also expanded AI Overviews across Europe, including Spain in Spanish and English, plus Belgium and Switzerland in multiple languages. In other words, this is not an edge case anymore. This is the default search layer for a huge chunk of the web.
Why Catalonia exposes the flaw
Ask the same question in Catalan and Spanish, and the result set can diverge in more than wording. Search behavior in Catalonia shows that language identification errors can reshape rankings, citations, and the final AI answer itself. If the system misreads the language, it can pull from the wrong corpus, favor the dominant-language version of a topic, or surface a different institutional perspective as if it were the only one that matters.
That is the authority problem in plain English. AI search does not just localize a result set; it chooses which sources are relevant enough to cite. In a place like Catalonia, that means the model is not only translating the question. It is quietly deciding whether Catalan institutions, Spanish-language publishers, or some blended regional framing gets treated as the default truth.
The language picture in Catalonia is not simple
Catalonia is not just bilingual in a tidy, textbook way. Idescat’s Survey on Language Uses of the Population 2023 shows a notable diversity of Catalan use across the region. In 2023, 32.6% of people aged 15 and over in Catalonia had Catalan as their habitual language, down from 36.1% in 2018.
The shift is even clearer when you look at identification. Catalan as a language of identification fell from 36.3% in 2018 to 30% in 2023. Over the same period, mixed usage grew: the share with both Catalan and Spanish as habitual languages rose from 7.4% to 9.4%, while other language combinations increased from 3% to 5.6%.
That is exactly why AI search can get this market wrong even when the translation looks fine. Catalonia is layered, not flat. Identity, official language, daily use, and publisher authority do not always line up neatly. Search systems that collapse those layers into one generic Spanish-language answer flatten the market in a way that does not match how people in the region actually live.
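To make the size of those shifts easier to compare, here is a minimal sketch that turns the survey figures quoted above into percentage-point and relative changes. The dictionary labels and the `change` helper are illustrative names, not anything published by Idescat, and the only inputs are the percentages cited in this article.

```python
# Shares quoted in the article, in percent: (2018, 2023).
# Labels are paraphrased from the text, not official Idescat categories.
SURVEY = {
    "Catalan as habitual language": (36.1, 32.6),
    "Catalan as language of identification": (36.3, 30.0),
    "Catalan and Spanish as habitual languages": (7.4, 9.4),
    "Other language combinations": (3.0, 5.6),
}


def change(before, after):
    """Return (percentage-point change, relative change in percent), rounded to 1 decimal."""
    points = round(after - before, 1)
    relative = round((after - before) / before * 100, 1)
    return points, relative


if __name__ == "__main__":
    for label, (y2018, y2023) in SURVEY.items():
        points, relative = change(y2018, y2023)
        print(f"{label}: {points:+.1f} points ({relative:+.1f}% relative)")
```

Read this way, the drop in identification (6.3 points) is almost twice the drop in habitual use (3.5 points), while the mixed and "other" categories grow from a much smaller base.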
Google’s own guidance tells you what good handling looks like
Google Search Central has been saying for years that multilingual sites need different URLs for different language versions, hreflang annotations, and page language that is obvious in the visible content. Google also says Search tries to find the right locale page for the searcher, and that some sites can be both multilingual and multi-regional.
That guidance matters because it shows the failure mode from the platform side. If the page language is unclear, the URLs are not distinct, or the regional intent is muddy, the retrieval layer has less to work with. The result can be a wrong-language default, a misplaced locale page, or an AI summary that cites the wrong version of the brand’s own story.
For multilingual sites, the practical lesson is blunt:
- Use separate URLs for each language version.
- Add hreflang so search systems can distinguish language and region.
- Make the visible language unmistakable on the page itself.
- Build pages as distinct market entities, not as copied translations.
- Add local proof, local terminology, and region-specific context so the system can recognize the boundary.
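The first two points can be sketched in a few lines. The example below is a rough illustration, not Google's tooling: it builds the `<link rel="alternate" hreflang="...">` tags for one page from a map of language codes to URLs. The `example.com` URLs, the `ALTERNATES` map, and the `hreflang_links` helper are all hypothetical; only the tag format and the `x-default` value come from Google's documented hreflang conventions.

```python
from html import escape

# Hypothetical language versions of a single page, each on its own URL.
# example.com and the paths are placeholders, not a real site.
ALTERNATES = {
    "ca-ES": "https://example.com/ca/preus",
    "es-ES": "https://example.com/es/precios",
    "en": "https://example.com/en/pricing",
}


def hreflang_links(alternates, default_lang=None):
    """Build <link rel="alternate"> tags for every language version of a page.

    The same full set, including the page's own entry, is meant to be placed
    in the <head> of each language version so the annotations are reciprocal.
    """
    tags = [
        '<link rel="alternate" hreflang="{}" href="{}" />'.format(
            escape(code, quote=True), escape(url, quote=True)
        )
        for code, url in alternates.items()
    ]
    if default_lang is not None:
        # x-default marks the fallback page for searchers matching no listed locale.
        tags.append(
            '<link rel="alternate" hreflang="x-default" href="{}" />'.format(
                escape(alternates[default_lang], quote=True)
            )
        )
    return tags


if __name__ == "__main__":
    for tag in hreflang_links(ALTERNATES, default_lang="en"):
        print(tag)
```

Annotations like these only tell a search system which versions exist. They do not supply the local proof and distinct framing from the last two points, which still have to live in the content itself.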
What this means for brands trying to be visible in mixed-language markets
The biggest mistake is assuming your translated page is enough. In AI search, a translated page without distinct authority signals can be treated like a weaker duplicate, not a separate market asset. That is especially dangerous in places with overlapping linguistic identities, where a larger-language corpus can swallow the smaller one if the system does not get a clean signal.
That is why Catalonia is such a useful proof point for brands operating in Spain, Quebec, Belgium, Switzerland, or any other mixed-language environment. You are not just competing for keywords. You are competing for which version of your market the model thinks is real enough to surface. If the AI sees one language as the default and the other as an afterthought, visibility erodes in ways traditional SEO reports can miss.
The safest strategy is to treat each language and region as its own authority layer. That means localized pages, distinct editorial framing, and content that proves you understand the market boundary, not just the language. When the model asks, “Which source should I trust here?”, your pages need to answer with structure, not just translated text.
Catalonia is the warning label
The story from Catalonia is not that AI search is bad at translation. It is that AI search can be good enough at translation while still being wrong about authority, relevance, and local truth. That is a more serious problem, because it changes which institutions get cited, which version of a market becomes canonical, and which audience the system thinks it is serving.
For multilingual brands, the lesson is simple and uncomfortable: visibility now depends on whether search understands your market as a real local ecosystem, not just a language variant. In Catalonia, that difference is the whole game.
This article was produced by Prism’s automated news system from verified source data, official records, and press releases, then run through automated quality and moderation checks before publishing. The system is built and supervised by the people who set the standards it runs under.