UHG
Search
Close this search box.

You Don’t Mess with Google

‘Attention’ has returned to Google

Share

Illustration by Nikhil Kumar

The headline seems apt for Google at the moment. Google DeepMind’s latest model, Gemini 1.5 Pro’s experimental version, 0801, recently claimed the top spot in Arena, surpassing GPT-4o and Claude-3.5 with an impressive score of 1300. It also achieved first place on the Vision Leaderboard.

It excels in multilingual tasks and delivers a robust performance in technical areas like math, hard prompts, and coding. This version of Gemini 1.5 Pro is available for early testing and feedback in Google AI Studio and the Gemini API, where developers can try it out. 

“Very impressed with the extraction capabilities from images and PDFs. It does feel super close to what GPT-4o gives in terms of vision capabilities and what Claude 3.5 Sonnet gives me on code generation and PDF understanding/reasoning,” said Elvis Saravia, co-founder, DAIR.AI.

“The new Gemini model can execute code and is quite ‘sharp.’ We might see it become the first competitor to the Code Interpreter,” said Ethan Mollick, assistant professor at Wharton. He further said that, while experimenting with it for data analysis, some capabilities seemed odd, like the fact that it couldn’t access files unless they are uploaded each time. “Nevertheless, it seems promising so far,” he said.

Most recently, Google also upgraded its free tier Gemini model with 1.5 Flash, which brings improvements in quality and latency, with especially noticeable enhancements in reasoning and image understanding. The tech giant also expanded the context window in Gemini Advanced, quadrupling it to 32K tokens. 

Following in the footsteps of its competitor OpenAI, Google has announced improvements to Gemini 1.5 Flash, reducing input costs by up to 85% and output costs by up to 80%, effective from August 12, 2024.

Apart from focusing on the LLMs, Google entered the SLM scene as well by releasing Gemma 2. This new model outperformed GPT-3.5 on the Chatbot Arena leaderboard and is optimised for efficient deployment across a wide range of hardware, from edge devices to robust cloud environments. Leveraging NVIDIA’s TensorRT-LLM library optimisations, it supports deployments in data centres, local workstations, and edge AI applications. 

InflectionAI, AdeptAI, CharacterAI. Who’s next?

Google recently brought back Noam Shazeer, one of the original authors of the famous Transformers paper ‘Attention is All You Need’, who had left Google to create his own chatbot company, Character.AI. Surprisingly, Character.AI has more web traffic in the US than Gemini, according to Similarweb. 

This is similar to what Microsoft and Amazon have done with Inflection and Adept AI, respectively. Microsoft hired much of Inflection’s leadership team and employees, agreeing to pay a licensing fee of approximately $650 million. Meanwhile, Amazon hired Adept’s co-founder and CEO David Luan, along with several other co-founders and key team members.

“Absolutely delighted to welcome my longtime Google colleague Noam Shazeer back to Google!” said Jeff Dean, Google Deepmind’s chief scientist.

Similarly, others from Deepmind followed suit in welcoming Shazeer back. “Welcome back Noam Shazeer to Google! It’ll be a great time working together again since 2018. Let’s take Gemini which is #1 and continue expanding the limits of its capabilities,” said Dustin Tran, research scientist at Google Deepmind. 

In a new arrangement, Google will pay a licensing fee to Character.AI for its models. Around 30 of Character.AI’s 130 staff members, specialising in model training and voice AI, will join Google to advance its Gemini AI initiatives. 

Interestingly, Character.AI recently introduced Character Calls, allowing users to engage in two-way voice conversations with their favourite characters, like Michael Jackson, Sherlock Holmes, or Albert Einstein. “It’s like having a phone call with a friend,” the company said.

“So this also means that google is getting into the “ai companion” market to compete with openai’s “her” lol,” one user posted on X in response. 

Character.AI allows users to create their own chatbots, defining their personalities, traits, and backstories. These characters can be based on real people, fictional characters, or entirely new creations. 

This is similar to what Meta AI is doing by introducing customised AI characters across its social media platforms. Meta recently introduced AI Studio, a new platform that allows users to create, share, and discover custom AI characters without needing technical skills.

OpenAI recently rolled its advanced voice mode to a small group, which has been received well with the users finding several usecases for it. However, GPT-4o is still not multimodal yet. 

“Advanced voice mode on ChatGPT is very impressive, but it doesn’t yet have the full features enabled: no multimodal video or images in, no GPTs, no new image generator, no code interpreter or internet access,” said Mollick.

On the other hand, Google Deepmind’s latest models AlphaProof and AlphaGeometry 2 solved four out of six problems from this year’s International Mathematical Olympiad (IMO), achieving a score equivalent to a silver medalist in the competition. 

The word on the street is that OpenAI is also working on advanced reasoning capabilities for its next frontier model with ‘Project Strawberry’, which is most likely to be GPT-5. However, it appears that it is not coming out anytime soon.

“Little birdies told me Q* hasn’t been released yet as they aren’t happy with the latency and other little things they want to further optimise,” posted an insider who goes by the name Jimmy Apples on X.

“Not sure if it’s big patience or small patience.”

📣 Want to advertise in AIM? Book here

Picture of Siddharth Jindal

Siddharth Jindal

Siddharth is a media graduate who loves to explore tech through journalism and putting forward ideas worth pondering about in the era of artificial intelligence.
Related Posts
19th - 23rd Aug 2024
Generative AI Crash Course for Non-Techies
Upcoming Large format Conference
Sep 25-27, 2024 | 📍 Bangalore, India
Download the easiest way to
stay informed

Subscribe to The Belamy: Our Weekly Newsletter

Biggest AI stories, delivered to your inbox every week.

Flagship Events

Rising 2024 | DE&I in Tech Summit
April 4 and 5, 2024 | 📍 Hilton Convention Center, Manyata Tech Park, Bangalore
Data Engineering Summit 2024
May 30 and 31, 2024 | 📍 Bangalore, India
MachineCon USA 2024
26 July 2024 | 583 Park Avenue, New York
MachineCon GCC Summit 2024
June 28 2024 | 📍Bangalore, India
Cypher USA 2024
Nov 21-22 2024 | 📍Santa Clara Convention Center, California, USA
Cypher India 2024
September 25-27, 2024 | 📍Bangalore, India
discord-icon
AI Forum for India
Our Discord Community for AI Ecosystem, In collaboration with NVIDIA.