Only last month, Cognition Labs’ Devin, billed as the world’s first fully autonomous AI software engineer, took the internet by storm. Devin was set to make developers obsolete, at least if you believed the social media chatter.
However, a software developer decided to look closely, and in a YouTube video, claimed that the whole demo video Cognition Labs published a month ago was staged.
If you’ve been tracking developments in the AI realm, this might feel like déjà vu, reminiscent of Google’s controversy last year when it was accused of fabricating its Gemini video.
That is not all. Google DeepMind claimed last year that its AI tool GNoME had found 2.2 million new crystals, including 380,000 stable materials that could power future technologies. Together, Devin and GNoME represent two of the biggest recent claims about what AI can do.
Yet, in a perspective paper published in Chemistry of Materials, Anthony Cheetham and Ram Seshadri from the University of California, Santa Barbara concluded that none of these structures pass a three-part test assessing their credibility, usefulness, and novelty.
False claims or conflicting views?
Cheetham and Seshadri analysed a random subset of the 380,000 structures revealed by DeepMind, contending that the substances identified are, in fact, crystalline inorganic compounds and should be classified as such, rather than being broadly termed “materials.”
They propose reserving the term “material” for substances exhibiting tangible utility. Google, however, told 404 Media that it stands by all the claims made in DeepMind’s GNoME paper.
While we are refraining from jumping to any conclusions yet, it’s hard to side with Google on this one, given the staged Gemini video and the number of anti-competitive lawsuits filed against the company.
Devin, on the other hand, is a different scenario entirely. The startup, backed by Peter Thiel, also responded to the criticism.
A developer working at Cognition Labs took to X to clarify: “The primary criticism was that I didn’t transcribe the prompt verbatim, which looking back at the screenshot is accurate — I was thinking since Devin already runs inside an EC2 instance I’d try to get it to just do the job directly instead of writing instructions.”
Fuelling hype?
Cognition Labs is led by Scott Wu, a prominent programmer who was recognised as a child prodigy. His team also includes highly talented developers who have previously worked with Google DeepMind, Cursor, Scale AI, and other prominent tech companies.
Given Wu and his team’s stellar reputation, it appears unlikely that they would intentionally pass off a staged video and expect no one to notice. However, it does make one wonder if they released a product (albeit with limited access) prematurely due to investor pressure.
Notably, when Google unveiled its Gemini video, it apparently did so under immense pressure to introduce an AI product, driven by concerns of lagging behind competitors such as Microsoft and OpenAI in the AI race.
Meanwhile, Cognition Labs is reportedly seeking to raise fresh money at a USD 2 billion valuation and may have been forced to release a half-baked product. If true, in today’s fast-paced environment, such a revelation wouldn’t entirely be a shocker.
Silicon Valley founders are frequently criticised for overhyping their products to sustain enthusiasm and boost sales, reminiscent of Elon Musk’s assertions about achieving level five autonomy by 2021 and 2023.
Moreover, Musk, alongside NVIDIA CEO Jensen Huang, has asserted that we will witness superintelligent AI within this decade.
However, this claim could be interpreted as merely perpetuating the AI hype, especially considering that Geoffrey Hinton, the godfather of AI, believes it will take another 20 years for such advancements to materialise.
AI has its limitations
The reality is that Devin still has a long way to go before it can master software development completely and make developers obsolete. The Cognition Labs developer did acknowledge that Devin ‘makes mistakes’ and ‘often fails’.
One day, AI might discover 20 million new minerals. Between now and then, however, there are many hurdles that AI systems will have to overcome.
For instance, numerous AI startups today are pouring significant resources into R&D, often being valued at exorbitant levels based on what their AI systems might achieve in the coming years, rather than their current capabilities.
Take Cognition Labs, for example: after its latest funding round, it could be valued at USD 2 billion without even shipping a product.
However, these companies will soon face the imperative of turning a profit, a hurdle that many may fail to overcome. Stability AI, for instance, once lauded for its Stable Diffusion technology, is now grappling with financial distress and teetering on the brink of collapse.
Nonetheless, despite the hype, one thing is clear: AI is on the right trajectory and will achieve all those things people say it will. For now, however, current AI systems do have their limitations.