India Advances AI Ambitions with Dual Strategy for Sovereignty

Bengaluru recently hosted Google’s annual I/O Connect event, highlighting India’s growing ambitions in artificial intelligence (AI). Over 1,800 developers attended, with discussions focusing on enhancing AI capabilities to address the country’s rich linguistic diversity. With 22 official languages and numerous dialects, India faces significant challenges in developing AI systems that can effectively operate within this multilingual context.
At the event, several startups showcased their solutions to these challenges. Notably, Sarvam AI presented Sarvam-Translate, a multilingual model adapted from Google’s open-source large language model, Gemma. CoRover introduced BharatGPT, a chatbot designed for public services, including applications for the Indian Railway Catering and Tourism Corporation (IRCTC). Google announced that Sarvam, Soket AI, and Gnani are among the startups building the next generation of AI models on top of Gemma. This approach highlights a dual-track strategy: while some startups are tasked with creating foundational models under the ₹10,300 crore IndiaAI Mission, others are leveraging existing technology to expedite deployment.
Building competitive AI models from scratch is resource-intensive, and India does not have the luxury of working in isolation. The country confronts challenges such as limited high-quality training datasets and a still-evolving computational infrastructure. To address immediate market demands, these startups are adopting a pragmatic approach, fine-tuning existing models to tackle real-world problems while laying the groundwork for future indigenous models.
The open-source initiative known as Project EKA, led by Soket AI in collaboration with IIT Gandhinagar, IIT Roorkee, and IISc Bangalore, exemplifies this layered strategy. The effort aims to develop a sovereign large language model, with plans for a 7-billion-parameter model within the next four to five months, followed by a larger 120-billion-parameter model within ten months. As co-founder Abhishek Upperwal explains, the project is designed to address four critical domains: agriculture, law, education, and defense. Each domain comes with a dataset strategy sourced from governmental bodies and public-sector use cases.
A key aspect of the EKA initiative is its independence from foreign infrastructures. Training will occur on India’s own GPU cloud, and the resulting models will be open-sourced for public use. Despite this commitment to sovereignty, the team recognizes the need for practicality; hence, they initially employ Gemma for deployments. Upperwal clarifies, “The idea is not to depend on Gemma forever. It’s to use what’s there today to bootstrap and switch to sovereign stacks when ready.”
CoRover’s BharatGPT illustrates a similar dual strategy. The platform is currently fine-tuned to deliver conversational AI services across various Indian languages to clients such as Bharat Electronics Ltd and the Life Insurance Corporation, meeting immediate needs while the company develops its own foundational models. CoRover founder Ankush Sabharwal emphasizes the importance of quick deployment and dataset creation, stating, “You begin with an open-source model. Then you fine-tune it, add language understanding, lower latency, and expand domain relevance.”
This strategy reflects a broader trend in India’s approach to AI, which technology policy expert Amlan Mohanty characterizes as an experiment in trade-offs. By utilizing models such as Gemma, Indian companies can quickly deploy solutions while still pursuing the long-term goal of autonomy, ensuring cultural representation and reducing dependency on geopolitical rivals.
The importance of local context in AI development cannot be overstated. For India, building its own AI capabilities is crucial not only for national pride but also for addressing pressing local challenges. For instance, consider a migrant worker from Bihar visiting a clinic in Maharashtra. If the doctor speaks Marathi and the AI tool explains medical findings in English, significant communication gaps arise. This situation exemplifies the need for AI tools that understand local languages and cultural nuances.
Fine-tuning open models enables Indian developers to meet these urgent needs while concurrently building the datasets necessary for a truly sovereign AI framework. This dual-track strategy may provide India with one of the fastest pathways to develop AI capabilities that accurately reflect local values and contexts.
The IndiaAI Mission represents a national response to growing geopolitical concerns. As AI systems become integral to sectors such as education, agriculture, and governance, reliance on foreign platforms poses risks of data exposure and loss of control. Recent events, such as Microsoft’s abrupt cessation of cloud services to Nayara Energy due to European Union sanctions, highlight the vulnerabilities tied to foreign tech providers.
India’s push for sovereign AI systems also aims to ensure that local values and regulatory frameworks are accurately represented. Most global AI models rely on datasets dominated by English and Western cultures, rendering them ill-equipped to address the complexities of India’s multilingual population and domain-specific requirements.
Mohanty emphasizes that sovereignty in AI involves both control over infrastructure and the ability to make choices about partnerships. “The more choice you have, the more sovereignty you have,” he states, underscoring the importance of strategic alliances in developing AI capabilities.
Despite the momentum, the lack of high-quality training data, especially in Indian languages, presents a significant obstacle. Many of India’s spoken languages have little to no digital presence, limiting the ability of AI systems to learn effectively. According to Manish Gupta, director of engineering at Google DeepMind India, internal assessments revealed that out of 125 spoken languages with over 100,000 speakers, 72 had virtually no digital footprint.
To combat this linguistic challenge, Google has partnered with the Indian Institute of Science to collect voice samples across numerous districts, capturing over 14,000 hours of speech data. The initiative aims to expand coverage to all 773 districts in India, with a focus on enhancing the quality of the data collected.
As India pursues its AI ambitions, the dual-track strategy of leveraging existing models while cultivating sovereign capabilities offers a potential roadmap for other nations in the Global South. By addressing local contexts and building inclusive AI tools, countries can navigate similar challenges without the benefit of extensive resources.
“Full-stack sovereignty in AI is a marathon, not a sprint,” Upperwal notes. As India prepares to launch its sovereign models, the question remains whether it can transition from reliance on open-source tools to complete independence. The future will hinge on the ability to develop a robust AI infrastructure that meets local needs while mitigating dependence on foreign technologies.