At its Spring Update event, OpenAI CTO Mira Murati unveiled GPT-4o, a new flagship model with ‘omni’ capabilities across text, vision, and audio, with the features rolling out iteratively to developer and consumer products over the coming weeks.
With GPT-4o, OpenAI trained a single new model end-to-end across text, vision, and audio, meaning that all inputs and outputs are processed by the same neural network. While introducing the model, OpenAI presented several demonstrations of its capabilities. Here, we have cherry-picked the top ones.
Customer Service
This was a fun one! Take a look at 2 AI agents resolving a customer service claim with #OpenAI new #GPT4o.
— Joe Beutler (@JoeBeutler) May 14, 2024
Working with customers to build transformational solutions always gets me fired up. The potential solutions we can build with this new SOTA model has my head spinning! pic.twitter.com/86SNgNI6Tl
OpenAI’s GPT-4o can hold natural, realistic voice conversations. That makes it an ideal foundation for customer service chatbots; in the demo above, two AI agents collaborate to resolve a customer service claim.
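A two-agent loop like the one in the demo can be sketched with a simple role-flipping wrapper around a chat API. The sketch below is illustrative only: `respond` is a placeholder for a real GPT-4o call (e.g. OpenAI’s chat completions endpoint), and the agent prompts are invented for this example; the point is the transcript mirroring that lets two agents share one conversation.

```python
# Minimal sketch of two AI agents talking to each other to resolve a claim.
# `respond` is a stand-in for a real GPT-4o call; a real version would use
# client.chat.completions.create(model="gpt-4o", ...) instead.

def respond(system_prompt: str, history: list[dict]) -> str:
    # Placeholder: a real implementation would send `system_prompt` plus
    # `history` to GPT-4o and return the model's reply.
    return f"[{system_prompt.split(':')[0]} reply after {len(history)} messages]"

def flip_roles(history: list[dict]) -> list[dict]:
    # What agent A said ("assistant") is what agent B heard ("user"),
    # so the transcript must be mirrored before the other agent's turn.
    flipped = {"user": "assistant", "assistant": "user"}
    return [{"role": flipped[m["role"]], "content": m["content"]} for m in history]

def run_dialogue(turns: int = 4) -> list[dict]:
    support_prompt = "Support agent: help resolve the customer's claim."
    customer_prompt = "Customer agent: describe the phone-replacement claim."
    history = [{"role": "user", "content": "Hi, I'm calling about a claim."}]
    for turn in range(turns):
        if turn % 2 == 0:  # support agent speaks
            reply = respond(support_prompt, history)
            history.append({"role": "assistant", "content": reply})
        else:              # customer agent replies, seeing the mirrored transcript
            reply = respond(customer_prompt, flip_roles(history))
            history.append({"role": "user", "content": reply})
    return history

transcript = run_dialogue()
for message in transcript:
    print(message["role"], ":", message["content"])
```

With a real model behind `respond`, the same loop produces the kind of back-and-forth claim resolution shown in the tweet above.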
Real-Time Translation
Live audience request for GPT-4o realtime translation pic.twitter.com/VSj5phFKM6
— OpenAI (@OpenAI) May 13, 2024
During the Spring Update event, OpenAI CTO Mira Murati demonstrated GPT-4o’s real-time translation capabilities, translating Italian to English and vice versa. This feature poses a significant threat to Google Translate and Duolingo, which offer similar services.
Interestingly, within minutes of OpenAI demoing GPT-4o’s real-time translation capabilities, Duolingo’s stock fell 3.5%, wiping out roughly $250M in market value.
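A non-realtime approximation of this interpreter demo can be built with plain chat completions. The sketch below only constructs the request payload; the model name `gpt-4o` matches OpenAI’s API, but the interpreter prompt is my own wording, and the actual network call (which needs an API key) is left out.

```python
# Sketch: building a chat-completions payload for two-way Italian/English
# interpretation, in the style of the live demo. Sending it would look like
# client.chat.completions.create(**payload) with the OpenAI Python client.

INTERPRETER_PROMPT = (
    "You are an interpreter between English and Italian. "
    "When you hear Italian, repeat it in English; "
    "when you hear English, repeat it in Italian."
)

def translation_payload(utterance: str) -> dict:
    # Each spoken utterance becomes one user message under a fixed
    # system prompt that sets up the two-way interpretation.
    return {
        "model": "gpt-4o",
        "messages": [
            {"role": "system", "content": INTERPRETER_PROMPT},
            {"role": "user", "content": utterance},
        ],
    }

payload = translation_payload("Ciao, come stai?")
print(payload["model"])  # gpt-4o
```

The real demo goes further: GPT-4o handles the speech itself natively, rather than chaining separate transcription and text-to-speech steps around a text model.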
Human-Computer-Computer Interaction
Introducing GPT-4o, our new model which can reason across text, audio, and video in real time.
— Greg Brockman (@gdb) May 13, 2024
It's extremely versatile, fun to play with, and is a step towards a much more natural form of human-computer interaction (and even human-computer-computer interaction): pic.twitter.com/VLG7TJ1JQx
GPT-4o can reason across text, audio, and video in real time, a step towards a much more natural form of human-computer interaction, and even human-computer-computer interaction. In this demo, OpenAI President Greg Brockman moderates a conversation between two instances of ChatGPT.
AI Education and Tutoring
This demo is insane.
— Mckay Wrigley (@mckaywrigley) May 13, 2024
A student shares their iPad screen with the new ChatGPT + GPT-4o, and the AI speaks with them and helps them learn in *realtime*.
Imagine giving this to every student in the world.
The future is so, so bright. pic.twitter.com/t14M4fDjwV
In another demo, presented by Khan Academy, a student shared their iPad screen with ChatGPT running GPT-4o, and ChatGPT assisted the student step by step in solving a mathematical problem. Rather than providing the entire solution at once, it guided the student towards it. Students can also share their notebooks using their mobile camera, and ChatGPT will understand the content.
Meeting AI with GPT-4o
One demo that's easy to miss, but I think is significant in what it shows is likely to be possible soon, is this demo — GPT-4o for meetings: https://t.co/UeT5285R9c
— Greg Brockman (@gdb) May 13, 2024
Through the desktop app, GPT-4o can join online meetings and moderate them, contributing its own input to the discussion and the decisions being made. Moreover, it can transcribe and summarize meeting discussions in real time, ensuring that no important details are missed and providing a reliable reference for participants.
Assistant for Visually Impaired Individuals
GPT-4o as tested by @BeMyEyes: pic.twitter.com/WeAoVmxUFH
— Greg Brockman (@gdb) May 14, 2024
Be My Eyes, a mobile app designed for visually impaired individuals, tested GPT-4o’s vision capabilities by helping a visually impaired person navigate the city. ChatGPT accurately identified the location and minute details of the surroundings.
Unlike human volunteers who may not be available at all times, GPT-4o can offer continuous support, ensuring that visually impaired users have access to assistance whenever they need it.
Interview Prep
Interview prep with GPT-4o pic.twitter.com/st3LjUmywa
— OpenAI (@OpenAI) May 13, 2024
In this demonstration, ChatGPT helps a candidate prepare for an interview. Using the front camera, ChatGPT can tell whether the candidate is dressed appropriately. It can also conduct mock interviews and provide feedback on answers, highlighting strengths and areas for improvement to enhance performance.
Jam with ChatGPT
Lullabies and whispers with GPT-4o pic.twitter.com/5T7ob0ItuM
— OpenAI (@OpenAI) May 13, 2024
GPT-4o has a surprise talent – it can sing! Users can request personalised songs for special occasions like birthdays, anniversaries, or just for fun. The chatbot can generate a variety of tunes and melodies based on emotions or specific details provided by the user, from soft whispers to energetic anthems.
AI Coding Assistant
Live demo of coding assistance and desktop app pic.twitter.com/GlSPDLJYsZ
— OpenAI (@OpenAI) May 13, 2024
OpenAI has also introduced a ChatGPT app for desktop. The app allows for voice conversations, screenshot discussions, and instant access to ChatGPT, acting as a friendly, go-to colleague in times of crisis. It can help with almost any problem you come across, from writing code to brainstorming ideas.
Rock, Paper, Scissors with GPT-4o
6. Rock, Paper, Scissors with GPT-4o pic.twitter.com/oMuMRRbrKO
— Angry Tom (@AngryTomtweets) May 13, 2024
With GPT-4o, you can enjoy playing games like Rock, Paper, Scissors, with ChatGPT as the referee. It can even hype you up and cheer for you during the game.