Recently, Bland AI put up a billboard promoting its AI agent, which can handle all sorts of phone calls for businesses in any voice, and it is creating a buzz.
However, this agent and others like Devin and Devika are just a glimpse of what’s to come.
“A lot of people talk about the ‘ChatGPT moment’, where you’re like ‘Wow, never seen anything like this’. I think you have not used planning algorithms; many people will have a kind of a ‘Wow I couldn’t imagine an AI agent doing this’ moment,” said Andrew Ng, founder of DeepLearning.AI and AI Fund, at Sequoia Capital’s AI Ascent.
Further, he said he ran live demos in which something failed, and the AI agent rerouted around the failures. “I’ve actually had quite a few of those ‘Wow, you can’t believe my AI system just did that autonomously’ moments,” he added.
Ng said that today, most of us use LLMs in a non-agentic workflow: we type a prompt, and the LLM generates an answer. An agentic workflow, by contrast, is more iterative. You can have the LLM write an essay outline, do the research, write the first draft, analyse which parts need revision, and then revise the draft.
“In such a workflow you may have the LLM do some thinking, revise the article, then do some more thinking, and iterate this through a number of times. And, what not many people appreciate is that this delivers remarkably better results,” he said.
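The iterative loop Ng describes can be sketched in a few lines of Python. This is a minimal illustration, not Ng's implementation: the function `agentic_write`, the prompt prefixes, the `max_rounds` limit, and the "OK" stopping signal are all assumptions made for the example, and `make_fake_llm` is a hypothetical stand-in for a real model call so that the sketch runs without any API.

```python
from typing import Callable

# Hypothetical type: any function that takes a prompt and returns text.
LLM = Callable[[str], str]


def agentic_write(topic: str, llm: LLM, max_rounds: int = 3) -> str:
    """Outline, research, draft, then critique and revise up to max_rounds times."""
    outline = llm(f"OUTLINE: write an essay outline about {topic}")
    notes = llm(f"RESEARCH: gather facts for this outline\n{outline}")
    draft = llm(f"DRAFT: write a first draft\n{outline}\n{notes}")
    for _ in range(max_rounds):
        critique = llm(f"CRITIQUE: list what needs revision, or reply OK\n{draft}")
        if critique.strip() == "OK":
            break  # the model found nothing left to fix (assumed convention)
        draft = llm(f"REVISE: apply the critique\n{draft}\n{critique}")
    return draft


def make_fake_llm() -> LLM:
    """Return a deterministic stand-in model that approves after two revisions."""
    state = {"revisions": 0}

    def fake_llm(prompt: str) -> str:
        step = prompt.split(":", 1)[0]
        if step == "CRITIQUE":
            return "OK" if state["revisions"] >= 2 else "tighten the intro"
        if step == "REVISE":
            state["revisions"] += 1
            return f"draft v{state['revisions'] + 1}"
        return {"OUTLINE": "outline", "RESEARCH": "notes", "DRAFT": "draft v1"}[step]

    return fake_llm
```

Calling `agentic_write("AI agents", make_fake_llm())` walks through the outline, research, and draft steps once, then loops through critique and revision until the stand-in model replies OK or the round limit is reached. Swapping in a real model only requires passing a different function as `llm`.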
He also shared an example of how his team analysed data from the ‘HumanEval’ coding benchmark. With zero-shot prompting, GPT-3.5 got 48% right, while GPT-4 performed much better, at 67%.
However, GPT-3.5, wrapped in an agentic workflow, performed better than the zero-shot GPT-4.
“This has significant consequences for how we all approach building applications. If you’re looking forward to running GPT-5/ Claude 4/ Gemini 2.0 (zero-shot) on your application, you might already be able to get similar performance with agentic reasoning on an earlier model,” he emphasised.
Everybody Seems to be Bullish on ‘AI Agents’
Recently, venture capitalist Vinod Khosla predicted a future in which most consumer access to the internet will be through agents that act for consumers, doing tasks and fending off marketers and bots. “Tens of billions of agents on the internet will be normal,” he wrote.
Meta CEO Mark Zuckerberg also noted that when a business interacts with a customer, the interaction is no longer limited to “the person sends you a message and you just reply”. It is a multi-step interaction in which the business has to think through how to accomplish the person’s goals, so the AI’s job is no longer just to respond to a question.
“If someone else solves reasoning and we’re sitting here with a basic chatbot, then our product is lame,” he said, envisioning a kind of Meta AI general assistant product that will shift from something that feels more like a chatbot to things where you’re giving it more complicated tasks and then it goes away and does them.
“I think a big part of what we’re going to do is interacting with other agents for other people whether it’s for businesses or creators. A big part of my theory on this is that there’s not going to be just one singular AI that you interact with because every business is going to want an AI that represents their interests,” he added.
He also cited the 200 million creators on Meta platforms, who want to engage with their communities but are limited by the hours in the day. He explained that if a creator could own an AI, train it the way they want, and use it to engage their community, that would be super powerful.
Agents, agents everywhere!
Recently Google introduced Vertex AI Agent Builder, a platform that enables the easy creation of autonomous agents with little to no coding required.
NVIDIA has also teamed up with the AI healthcare company Hippocratic AI to develop GenAI agents that not only outperform human nurses on video calls but also cost a lot less per hour.
Tech giants like Microsoft, OpenAI, and Google also seem to be racing to build more agent capabilities to position their technologies as essential tools.
Despite scepticism, Devin, for instance, resolved nearly 14 out of every 100 issues. This marked notable progress in AI’s ability to autonomously understand and address software development issues, and strengthened its potential to support developers. Devin can even do real jobs on Upwork!
Cognition, the company behind Devin, recently raised $175 million at a $2 billion valuation from Founders Fund.
Then there is Devika, an Indian open source AI software engineer capable of understanding human instructions, breaking them down into tasks, conducting research, and autonomously writing code to achieve set objectives.
All these developments strengthen the belief, held by many, that the future of AI is going to be agentic. “Honestly the path to AGI feels like a journey rather than a destination but I think agent workflows could help us take a small step forward on this very long journey,” said Ng.
Are You Ready?
“I think it’s very likely but perhaps not in a nice way,” Kailash Nadh, CTO at Zerodha, told AIM when asked for his opinion on agents running the internet.
Further, he said that agents that take our instructions and execute them on the internet already exist, and that with LLMs it is only getting worse.
“I’ve seen bots… I’ve seen agents being used by people to order pizza. So, are we headed towards a future where this will be the case? I think absolutely!” said Nadh, adding that it is only a matter of time. “Is it going to be a nice one? I don’t know, I don’t think so. There are people who ruin everything.”
Even Ng said that the agents today don’t work fully reliably and that “they are kind of finicky”.
However, because we can iterate on agents and they can recover from their failures, they become a lot more powerful. With continuously evolving agents, better agentic models, and more advanced tools and frameworks, these finicky aspects may diminish, painting an optimistic picture for the future.
Recent developments also seem promising, such as Anon, which is building the identity backbone for the AI-powered internet so that billions of AI agents can securely access user accounts and transform our digital lives.
All of this could enable developers to build next-generation consumer and enterprise agent workflows, transforming how people interact with AI. Also, when done with security and proper frameworks in mind, the future of AI agents could be truly exciting!