After launching the Generative AI Studio under its amplifAI0->∞ suite of AI offerings and solutions, Tech Mahindra chief executive and managing director CP Gurnani recently took to X to share that the company is the first major Indian IT firm to be working on its own large language model, called Project Indus.
The open-source large language model aims to speak over 40 Indic languages in the first phase, including Kinnauri, Kangri, Chambeli, Garhwali, Kumaoni, Jaunsari and more.
The “civilisational” initiative will be carried out by the Makers Lab of Tech Mahindra to develop India’s foundational model for various Indian languages, starting with Hindi. The project collects Hindi dialect speech data to train a language model using NLP algorithms. Contributors can anonymously submit short to extended speech samples with the option to delete recorded data.
According to the official website, mobile numbers are optionally collected for reference and gamification purposes, with encryption and a retention period of up to seven years. No personal information will be shared with third parties.
However, it is not yet clear whether Makers Lab will build the model from scratch or base it on an existing LLM such as GPT-4 or Llama 2, the way Stanford’s Alpaca and Vicuna-13B were built on top of an existing base model.
Unpacking the Gen AI Vows of Indian IT
When it comes to generative AI, the other Indian IT giants have also invested substantial funds. However, their enthusiasm for harnessing the full potential of generative AI has not yet produced any real use cases.
The Indian IT giants are forming partnerships to advance generative AI adoption. In an earlier interaction with AIM, Wipro CTO Subha Tatavarti said that the company has been working on generative AI for the past two years. Wipro teamed up with IIT Delhi to establish a Center of Excellence (CoE) as part of its USD 1 billion AI-driven innovation initiative in the Wipro ai360 ecosystem. The company also aims to combine Google Cloud’s generative AI with Wipro’s AI IP, business accelerators, and industry solutions.
Meanwhile, HCL partnered with Microsoft and AWS to enhance its generative AI efforts, while TCS is collaborating with Google Cloud to utilise generative AI offerings like Vertex AI and Generative AI App Builder. Infosys follows an “AI First” strategy, focusing on specialised AI models built from open-source LLMs and using them to accelerate clients’ AI initiatives through its Topaz framework, which encompasses generative AI-based services, solutions, and platforms.
No Moat, Only Fluff Talk
This is not the first time that we have seen IT companies jumping onto the bandwagon of something new that is trending in the tech ecosystem. When Meta introduced the metaverse with much fanfare, we saw a similar reaction.
TCS started the trend, followed by Tech Mahindra and Infosys, all announcing their metaverse-related products and services in February 2022. As part of its metaverse programme, TCS’ “themaTiCS” suite targets improving remote work experiences in a setting where only 25% of employees are in the office at any time. Following the lead, Infosys introduced the Metaverse Foundry, similar to Topaz, which offers ready-to-use templates and is said to have found 100 use cases for enterprises to embrace metaverse offerings spanning XR consulting, blockchain consulting, digital twins and more. Tech Mahindra also introduced TechMVerse, leveraging its 5G capabilities to deliver immersive experiences.
However, when Meta pulled the plug on its metaverse dream, Indian IT companies seemed to lose interest too.
This time, Indic languages are riding the generative AI wave. Building indigenous LLMs in India is a huge growth opportunity, considering that the country is home to 122 major languages and 1,599 other languages, along with 22 official languages, as per Census 2001.
At present, 58.8% of online content is in English, followed by Russian, Spanish and French. Forget about native languages like Garhwali or Kumaoni; even Hindi does not make it to the top ten, highlighting a significant shortage of local language content.
Project Bhashini was introduced in collaboration with Microsoft. Finance Minister Nirmala Sitharaman also introduced the National Language Translation Mission (NLTM) in the 2021-22 budget.
Project Indus is an important initiative in this direction. Gurnani urged speakers of different languages and dialects to contribute to making this project successful, as an LLM is only as good as the data it is trained on. So far, Tech Mahindra is the only company to act on its promises.
Choosing AI Upskilling Over Model Building
When it comes to building their own LLMs, most Indian IT companies are taking a different approach, focusing instead on upskilling their employees. This can be attributed to the fact that, historically, technology adoption has increased work volume, requiring more expertise and hands.
The heads of these companies, including K Krithivasan (TCS), Salil Parekh (Infosys) and C Vijayakumar (HCL), have highlighted generative AI as the focal point of the first quarter this year, with clients exploring its potential for enhancing productivity, content creation, and customer service.
However, while generative AI is seeing strong interest, clients cutting back on IT spending is a concern for HCL and Infosys and is weighing on their revenue growth forecasts, as per a report by The Register. Despite the short-term hype, executives believe AI will bring meaningful long-term benefits, although measuring its effectiveness remains challenging.
Tech Mahindra, Infosys, TCS, HCL and Wipro have expanded their partnerships with major tech players like AWS, Google and Microsoft to upskill employees in generative AI. Traditionally, India’s huge talent pool has been seen as an advantage for adopting technology quickly rather than building it in-house. And now, with the upskilling, the process will only get faster and smoother.