Amazon Unveils New AI Language Model that Beats GPT-3

The new language model outperformed OpenAI’s GPT-3 and Google’s PaLM on various NLP benchmarks

Amazon Alexa AI researchers recently unveiled the Alexa Teacher Model (AlexaTM 20B), which beats GPT-3 on NLP benchmarks. The 20-billion-parameter sequence-to-sequence (seq2seq) language model shows state-of-the-art (SOTA) few-shot learning capabilities. The model has not yet been released publicly.

Check out the GitHub repository here

Unlike OpenAI’s GPT-3 or Google’s PaLM, which are decoder-only models, AlexaTM 20B is a seq2seq model with both an encoder and a decoder, which allows better performance on machine translation (MT) and summarization.

Sequence-to-sequence models are a class of neural network architectures that map an input sequence to an output sequence. Originally built on recurrent neural networks, they are typically used for language problems such as machine translation, chatbots, question answering, and text summarisation.

With roughly one-eighth the number of parameters, Amazon's new language model outperformed GPT-3 on the SQuADv2 and SuperGLUE benchmarks. The multilingual model also performs strongly on few-shot MT tasks on the Flores-101 dataset, even for low-resource languages.
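The "one-eighth" figure can be checked with simple arithmetic. The sketch below assumes the publicly reported sizes of the largest GPT-3 and PaLM variants (175 billion and 540 billion parameters); those two numbers are not stated in this article, and the function name is purely illustrative.

```python
# Parameter counts in billions.
# AlexaTM 20B's size comes from the article; the GPT-3 (175B) and PaLM (540B)
# figures are assumptions based on the publicly reported largest variants.
PARAMS_BILLIONS = {"AlexaTM 20B": 20, "GPT-3": 175, "PaLM": 540}


def size_ratio(smaller: str, larger: str) -> float:
    """Return how many times more parameters `larger` has than `smaller`."""
    return PARAMS_BILLIONS[larger] / PARAMS_BILLIONS[smaller]


if __name__ == "__main__":
    for rival in ("GPT-3", "PaLM"):
        ratio = size_ratio("AlexaTM 20B", rival)
        print(f"{rival} has {ratio:.2f}x the parameters of AlexaTM 20B")
```

Under these assumptions, GPT-3 is 8.75 times the size of AlexaTM 20B, so "one-eighth" is a slight rounding in GPT-3's favour.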

On other benchmarks such as MLSum, AlexaTM outperformed all other models for 1-shot summarization in Spanish, German, and French, and for most language pairs on 1-shot MT tasks. For low-resource languages like Tamil, Telugu, and Marathi, the improvement was significant. On language pairs involving English, the model outperformed GPT-3 on MT tasks but came second to the larger PaLM model.
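For readers unfamiliar with the term, "1-shot" means the model sees a single worked example before the input it has to handle. The sketch below shows one generic way such a translation prompt could be assembled. The prompt layout and the `build_one_shot_prompt` helper are hypothetical and are not taken from Amazon's work; the format AlexaTM 20B actually uses may differ.

```python
def build_one_shot_prompt(
    example_src: str,
    example_tgt: str,
    query: str,
    src_lang: str = "English",
    tgt_lang: str = "German",
) -> str:
    """Assemble a 1-shot translation prompt: one solved example, then the query.

    The "Language: text" layout is an illustrative assumption, not the
    format used by any particular model.
    """
    demonstration = f"{src_lang}: {example_src}\n{tgt_lang}: {example_tgt}"
    # The prompt ends with the target-language label so the model
    # continues the text with its translation of the query.
    task = f"{src_lang}: {query}\n{tgt_lang}:"
    return f"{demonstration}\n\n{task}"


if __name__ == "__main__":
    prompt = build_one_shot_prompt(
        example_src="Good morning.",
        example_tgt="Guten Morgen.",
        query="Thank you very much.",
    )
    print(prompt)
```

A few-shot prompt would simply include several such demonstrations before the query instead of one.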

Saleh Soltan, senior applied scientist at Amazon, said that “the proposed style of pretraining enables seq2seq models that outperform much larger decoder-only LLMs across different tasks, both in a few-shot setting and fine-tuning.”

Mohit Pandey

Mohit dives deep into the AI world to bring out information in simple, explainable, and sometimes funny words.