
Building a more linguistically rich, inclusive internet: the AI opportunity 

Ram Mohan is Chief Strategy Officer at Identity Digital and an inductee to the Internet Hall of Fame… 

For all its reach, the Internet remains linguistically narrow. Just seven languages dominate its content. And when it comes to the large language models (LLMs) behind today’s generative AI, that field narrows further still, with English and Chinese at the center. The result is a digital divide where billions are disadvantaged simply because their languages are underrepresented online. Yet this critical moment also presents a remarkable opportunity for growth. 

When people can access the Internet in their own languages, they gain fairer access to education, healthcare, jobs, and civic and economic participation. Diverse languages online also safeguard unique histories and worldviews, expand the global pool of knowledge, and give marginalized communities a stronger voice. For technology, language diversity makes AI systems smarter and fairer while opening new markets and opportunities. In short, it ensures the digital world reflects the full spectrum of human experience, not just a narrow slice of it. With urgency and intentionality, we can harness AI to preserve linguistic diversity and expand inclusion.

But opportunities come hand in hand with warnings. Researchers have been sounding the alarm on AI’s threats to digital inclusion. Those excluded from AI tools face inequities in employment, education, and access to healthcare. More broadly, the Internet itself risks collapsing into what Stanford’s Sanmi Koyejo has called a “U.S.-centric culture blob.” AI systems are only as diverse as the data they are trained on, and right now that data is alarmingly limited. Even Llama 3, which Meta launched in 2024 and touted as its most capable large language model yet, drew just over 5% of its pretraining data from non-English sources, covering a little more than 30 languages. That’s progress, but against the backdrop of more than 7,000 living languages, it is also profoundly insufficient.

The trajectory is clear: without intervention, AI will accelerate the homogenization of the Internet. But it’s not too late to act. Earlier this year, I co-founded the Coalition on Digital Impact (CODI), a global alliance of technology leaders and advocacy organizations formed to support the critical work of last-step organizations advancing digital access. Our mission is to ensure that every person can navigate the Internet in their own language. Through education, awareness, and advocacy campaigns, we are working to elevate existing efforts, challenge technological barriers, and promote linguistic equity.

At CODI, we are hopeful about what AI can do. We’ve seen glimpses of its promise in live translation tools and AI-powered tutors that make language learning easier. But to unlock its true potential, we need to prioritize multilingual testing, inclusive data sets, and greater linguistic oversight in technology development. That’s why our first research project is focused on defining a Minimum Viable Dataset: one that can serve as an ethically sourced, culturally relevant, community-owned foundation for building more inclusive AI. This is a first step, but one we hope sparks a global movement.

As we reflect on those who built the Internet, we must also consider who will build its future, and how we can improve upon the tool that has changed the lives of so many. We’re calling on technologists to design with diversity in mind, on policymakers to set the guardrails, and on platforms to open access and invest in inclusion. Scaling projects like CODI’s requires collaborators around the world, and we invite you to join us in this effort. By building a more linguistically rich Internet, we can broaden online access and increase cultural preservation and exchange. Let’s work together to ensure future visionaries reflect the global knowledge base upon which the Internet was built. 

Ram Mohan is the Chairman and Founder of the Coalition on Digital Impact (CODI), an independent, global coalition created to empower global communities to access and navigate the Internet in their native languages.