Language is not just a collection of labels for the world – it is the blueprint for how we think, feel and experience reality. As the cognitive scientist Prof Lera Boroditsky puts it, each language “provides its own cognitive toolkit and encapsulates the knowledge and worldview developed over thousands of years within a culture”.
This is especially true of the Arabic language. Its rich linguistic range and unique expression of emotion, thought and culture often have no parallel. However, Arabic – like many other languages – is contending with the rise of AI, an advanced technology that has made its way into our daily lives.
What happens when emerging tech and AI developments fail to account for the Arabic language – especially in commonly used models such as ChatGPT and Claude? As it stands, popular large language models and chatbots are trained predominantly on English content. According to data from Statista, less than 1 per cent of AI models are trained on Arabic content.
The result? Stilted, unnatural answers that lack nuance. As Mohammed Moneb Khaled, an Emirati Arabic AI researcher, told AramcoWorld magazine in June, models like ChatGPT tend to give Arabic responses that “sound unnatural, and literal translations do not have the same meaning”.
Important cultural subtext gets lost because the AI is essentially trying to force Arabic ideas into an English mould. This “AI translation” approach is not just a minor inconvenience, it can erase linguistic and cognitive diversity. So, as such technology becomes more ubiquitous, the absence of Arabic-native AI is not a mere oversight, it is a cultural crisis.
Why does this matter? Because AI is not just another tech product – it is poised to become our confidant, adviser and assistant. When AI cannot speak Arabic reliably and fluently, it does not just fail on a technical level, it fails culturally.
Consider the depth lost when AI translates the rich concept of “tarab” merely as “musical enchantment” or reduces the complex emotional resilience described by “sabr” to “patience” when it encompasses so much more – a sense of graceful perseverance and faith during hardship. These are not mere linguistic gaps, they are cultural chasms.
These words carry emotional and cultural weight built over centuries. They resist clean translation because they emerge from Arabic’s unique cultural context. When we lose these words, we lose worlds. Crucially, these linguistic nuances are not just vocabulary, they shape emotional processing. An Arabic speaker describing their depression might say they feel a sense of “ghurba” or invoke “sabr” as a coping mechanism.
Beyond miscommunication, there is a deeper issue of identity. Language carries identity, and repeatedly encountering technology that “prefers” English can send a subtle message that Arabic is second-class in the digital world. It is disheartening for Arab users – especially younger generations – to find that talking to Siri or Alexa in Arabic is far less effective than in English, or that an Arabic prompt to a chatbot yields a gibberish answer.
Over time, people using such technology might default to English, slowly estranging themselves from their mother tongue in professional or technical domains. This is how linguistic diversity fades – not by force, but by convenience and neglect.
We urgently need AI that thinks in Arabic – not as an act of ethnic pride, but as a necessity for a fair and rich digital future. Truly Arabic-native LLMs are essential for several reasons. First, an Arabic-native AI would uphold the nuances of Arabic identity instead of diluting them.
It would use Arabic proverbs, quote Al Mutanabbi’s poetry where relevant, and understand the subtext behind a phrase like “inshallah”. This affirms to users that their language – and by extension their culture – belongs in the digital age. It keeps Arabic alive and evolving on its own terms, preventing the scenario where future generations engage with technology only in English and let their Arabic atrophy. If Arabic is left behind, so are its speakers.
Second, if AI is to serve humanity it must encompass the full range of human thinking patterns. Arabic offers a fundamentally different structure – a right-to-left script, a root-based word system, and flexible word order – and has a treasure trove of classical texts on logic, law and philosophy. Training AI on these will introduce new perspectives into how the AI “thinks”. For instance, Arabic’s logical treatises and unique grammar, such as the dual form and grammatical gender, could broaden AI’s problem-solving approaches.
More broadly, each language an LLM masters makes generative-AI programs smarter and more versatile. Including Arabic is not just good for Arabs, it is good for AI and the world. It forces AI to learn concepts it might never encounter in monolingual training. Embracing global cognitive diversity ensures AI is not one-dimensional. Much as biodiversity strengthens an ecosystem, linguistic diversity strengthens our AI systems’ understanding, avoiding a one-size-fits-all intelligence.
Recent years have witnessed promising Arabic AI development across the region, led by pioneering institutions laying the groundwork for a more linguistically and culturally rooted digital future.
In the UAE, the Mohamed bin Zayed University of Artificial Intelligence’s Jais model, developed with US company Cerebras and Abu Dhabi tech group G42, is one of the world’s most advanced bilingual Arabic-English language models, trained to understand both Modern Standard Arabic and regional dialects.
In Doha, the Qatar Computing Research Institute has launched Fanar – an instruction-tuned Arabic model built entirely in-house, designed to handle culturally nuanced tasks and support the country’s digital priorities. Meanwhile, the Saudi Data and AI Authority, in partnership with IBM, has released ALLaM – an open-source Arabic model built to reflect national values and serve both public and private sectors.
These efforts are vital – not only technologically, but symbolically – affirming as they do that Arabic belongs at the centre of AI development, not at its margins. But to truly meet the scale of what is possible, we need sustained investment, collaborative research and infrastructure that treat Arabic not as a downstream use case, but as a primary driver of innovation.