Tech Mahindra’s outgoing chief, CP Gurnani, shared a Project Indus update on X, highlighting the development of an LLM specifically designed for Hindi and its 37 dialects.
Retiring after 19 years at Tech Mahindra, he said, “I feel proud to share that the GenerativeAI project challenge I took up earlier has been successfully accomplished by our brilliant research team at Makers Lab.” Project Indus is in the beta testing phase within TechM. According to Gurnani, the pure Hindi LLM has 539 million parameters and was trained on 10 billion tokens of Hindi and its dialects.
“The model is probably the only one in the world that has all Hindi tokens and has been trained from the ground up. It will set the stage for the years to come as our prowess in deep tech,” he said.
“I now pass the baton over to Mohit Joshi, Nikhil Malhotra, and the team, as well as my talented Tech Mighties, who will take this one notch higher. Thank you for everything, team!” he added.
In October, Tech Mahindra announced its plan to release Project Indus by the end of December or early January. Introduced in August, the model is initially set to support 40 different Hindi dialects, with plans to add more languages and dialects in subsequent releases. Over the last two months, the 15-member Project Indus team has collected 1.2 terabytes of data in Hindi and related dialects.
Gurnani’s update follows a week that saw a slew of announcements from Indian companies and startups launching their own large language models (LLMs). These include Google-backed CoRover’s BharatGPT, Khosla Ventures-backed Sarvam.ai’s OpenHathi, Microsoft-backed Kissan AI’s Dhenu, and Ola’s Krutrim.