Tech Mahindra has teamed up with Indosat Ooredoo Hutchison to build ‘Garuda,’ a Large Language Model (LLM) to preserve Bahasa Indonesia, the official and national language of Indonesia, and its dialects.
Garuda will be built on the principles of Tech Mahindra’s indigenous LLM ‘Project Indus’, a foundational model designed to converse in a multitude of Indic languages and dialects.
The IT giant signed a Memorandum of Understanding (MoU) at Mobile World Congress (MWC) 2024.
As part of this partnership, Tech Mahindra will leverage its technology expertise to gather and curate data in the Indonesian language. This data will be used to pre-train a model that will be released as a conversational model for Indosat.
Garuda will be developed with 16 billion original Bahasa tokens and will have 1.2 billion parameters that shape the model’s understanding of the Bahasa language. These parameters will influence how the model processes input and formulates output.
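To put the two quoted figures in proportion, the short Python sketch below works out the ratio of tokens to parameters and a rough size for the model weights. The token and parameter counts come from the announcement; the 16-bit storage format and the function names are illustrative assumptions, not details Tech Mahindra or Indosat have disclosed.

```python
# Figures quoted in the announcement.
TRAINING_TOKENS = 16_000_000_000   # 16 billion Bahasa tokens
PARAMETERS = 1_200_000_000         # 1.2 billion parameters

# Assumption (not from the announcement): each weight is stored
# as a 16-bit float, i.e. 2 bytes per parameter.
BYTES_PER_PARAMETER = 2


def tokens_per_parameter(tokens: int, parameters: int) -> float:
    """Return how many tokens are available per model parameter."""
    return tokens / parameters


def weight_size_gb(parameters: int, bytes_per_parameter: int) -> float:
    """Return the approximate size of the weights in gigabytes (10^9 bytes)."""
    return parameters * bytes_per_parameter / 1e9


if __name__ == "__main__":
    ratio = tokens_per_parameter(TRAINING_TOKENS, PARAMETERS)
    size = weight_size_gb(PARAMETERS, BYTES_PER_PARAMETER)
    print(f"Tokens per parameter: {ratio:.1f}")
    print(f"Approximate weight size at 16-bit precision: {size:.1f} GB")
```

Under these assumptions, the corpus works out to roughly 13 tokens for every parameter, and the weights alone would take about 2.4 GB. The actual footprint depends on the precision and serving setup, which the announcement does not specify.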
A beta version of the Garuda model will be released for testing by Indosat and Bahasa Indonesia speakers. The model will be further improved using RLHF (Reinforcement Learning from Human Feedback) techniques to ensure its robustness for conversation. Additionally, any specialized use cases will be developed using the LIMA (Less is More for Alignment) method.
“The LLM market is expected to reach 40.8 billion USD by 2029. In this direction, the emergence of LLMs such as Garuda and Indus can enable people and enterprises to communicate online in their local dialects and languages, creating new opportunities in the digital world.
“We believe that the model will significantly promote Indonesia’s linguistic diversity and unlock new business opportunities for enterprises in the region,” said Harshvendra Soin, President – Asia Pacific and Japan Business, Tech Mahindra.