OpenAI News, Stories and Latest Updates | Analytics India Magazine
https://analyticsindiamag.com/news/openai/
Artificial Intelligence news, conferences, courses & apps in India

Humanoids: The New Employees Who Work Cheap and Never Complain
Thu, 08 Aug 2024 13:38:15 +0000
https://analyticsindiamag.com/ai-origins-evolution/owning-a-humanoid-will-soon-be-cheaper-than-humans/

Human labour can rack up additional costs, something that cannot happen with a humanoid.

Brett Adcock, founder of Figure AI, predicted that everyone will own a robot in the future, much like everyone owns a car or phone today.

Interestingly, the robotics company unveiled its second-generation humanoid robot, Figure 02. The company said it is one step closer to its goal of selling production humanoids to industrial users, with the newer design refining every element of the original Figure 01.

The first-generation robot, Figure 01, took its first steps within a year of its development. As technology advances, owning a humanoid could indeed become more cost-effective than employing human workers.

While the initial investment in robotics may be high, the long-term savings in wages, benefits, and training can be substantial. Meanwhile, Figure AI has secured both investment from, and a strong partnership with, OpenAI.

According to a report, Goldman Sachs estimates the cost of the Figure 01 humanoid at around $30,000 to $150,000 per unit. But with scaled-up production and wider adoption in factories, the cost could come down in the long run.

Interestingly, labour costs at major US automakers like Ford, GM, and Stellantis are approximately $64 per hour, and this is expected to rise to $150 per hour. These figures include wages, healthcare expenses, bonuses, and other benefits. Furthermore, human labour can rack up additional costs, something that cannot happen with a humanoid.
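As a rough back-of-the-envelope check on these figures, the breakeven point for a robot against hourly labour can be computed directly. This is a simplified sketch under assumptions that are ours, not the report's: it ignores maintenance, energy, financing, and downtime, and assumes the robot fully displaces one worker's hours.

```python
# Rough breakeven sketch using the figures cited above: a humanoid costing
# up to $150,000 versus all-in labour costs of about $64/hour. Simplifying
# assumptions (ours, not the report's): no maintenance, energy, financing,
# or downtime costs, and the robot fully displaces one worker's hours.

def breakeven_hours(robot_cost: float, hourly_labour_cost: float) -> float:
    """Hours of displaced labour needed to recoup the robot's purchase price."""
    return robot_cost / hourly_labour_cost

hours = breakeven_hours(150_000, 64)
print(round(hours))             # 2344 hours
print(round(hours / 2_000, 1))  # ~1.2 years at 2,000 working hours/year
```

At the reported upper-bound unit cost, the robot would pay for itself in roughly 2,300 displaced hours; at the projected $150-per-hour labour rate, breakeven drops to 1,000 hours.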

Bengaluru-based Control One shared similar sentiments. The startup has focused on the warehousing sector, where there is a significant labour shortage and a high demand for automation-based solutions, especially in the global market.

“The warehousing market is facing a huge labour crisis. Our system enables one person to manage multiple robots, effectively multiplying their productivity,” said Pranavan S, Control One’s founder and CEO, in an interaction with AIM.

However, this shift raises ethical questions about job displacement and the value of human labour. Balancing economic efficiency with social responsibility will be crucial as we navigate this transformation, ensuring that technological progress benefits society as a whole without exacerbating inequality.

Furthermore, Elon Musk recently stated that Tesla aims to produce “genuinely useful” humanoid robots to start operating in its factories next year.

Humanoid Race

Back in August 2021, Tesla introduced the concept of a humanoid robot, which Musk believes could help humanity achieve some quite ambitious goals. Later, at an event in October 2022, Musk expressed his hope to eventually produce millions of Optimus robots.

Optimus, a Tesla-built humanoid robot, weighs 56 kg, stands 170 cm tall, and is priced under $20,000 (€18,000) for mass production.

However, the robot had limited capabilities and Musk stated that he wouldn’t assign it more complex tasks because he “didn’t want it to fall on its face.” He said, “There’s still a lot of work to be done to refine Optimus. I think Optimus is going to be incredible in five or ten years.”  

Robot for the House

Bindu Reddy, CEO of Abacus.AI, posted on X that the next trillion-dollar company will be the one that ships a mass-market humanoid robot under $30k, capable of handling household chores like laundry, loading the dishwasher, and cooking.

While humanoids may be employed in factories and warehouses, a bigger application probably lies in the household. For instance, Figure’s humanoid can be dubbed a robot housekeeper capable of performing a variety of household chores.

This humanoid robot is designed as a general purpose solution capable of thinking, learning, and interacting with its environment. It is set to support the global supply chain and address labour shortages by performing structured and repetitive tasks. 

The robot utilises AI and machine learning algorithms to understand and execute tasks such as cleaning and organising. Equipped with sensors and robotic arms, it can navigate complex home environments, ensuring efficiency and precision. 

Interestingly, the company had shared an image of the humanoid being shipped to its first customer, which turned out to be automotive giant BMW.

Similarly, the German robotics company NEURA has unveiled a video of their humanoid robot, 4NE-1. It is one of the first to participate in the early access NVIDIA Humanoid Robot Developer Programme.

Not That Easy

Several users have expressed their views about how owning humanoid robots would be impractical for certain tasks and also very expensive. 

Despite the advancements predicted, creating a fully capable robot that matches human intelligence remains extremely costly. Certain aspects of the manufacturing process cannot be easily scaled, and it takes years to develop a single human-like AI, which makes these machines incredibly expensive.

“Robots need to be able to deal with uncertainty if they’re going to be useful to us in the future. They need to be able to deal with unexpected situations and that’s sort of the goal of a general purpose or multi-purpose robot, and that’s just hard,” said Robert Playter, CEO of Boston Dynamics, in an interview with Lex Fridman last year.

Playter emphasised the immense difficulty of advancing robotics. Boston Dynamics, which started developing general-purpose robots in the early 2000s, only introduced its humanoid robot Atlas in 2013. Besides facing challenges in securing investments for robotics, training robots has always been a significant hurdle.

Simpler Robots 

While humanoids are the advanced version, simpler home-grown alternative robotic solutions are being developed. For instance, one Indian student has developed an AI-powered machine that completes homework in his handwriting. 

The machine, developed by Devadath PR, a robotics and automation engineering undergrad student, has now garnered significant attention, with over 1,000 people inquiring about purchasing it. He built the device using parts from his old 3D printer and is now working on a second prototype.

Is OpenAI Intel’s Biggest Regret Ever?
Thu, 08 Aug 2024 11:10:36 +0000
https://analyticsindiamag.com/ai-origins-evolution/is-openai-intels-biggest-regret-ever/

Intel had the opportunity to acquire a 15% stake in OpenAI for $1 billion in 2017.

Intel has been going through some trouble lately. The company’s recent earnings fell short of analysts’ expectations, resulting in a 26% single-day selloff that brought its market cap below $100 billion for the first time in three decades.

CEO Patrick Gelsinger also announced to employees last week that the company would reduce its workforce by 15%, cutting about 15,000 jobs as part of a significant cost-cutting measure. All this while he has been posting proverbs from the Bible, which made people fret even more about the company.

But this unfortunate period could have played out differently had the company made one single investment back in 2017. According to reports, in 2017 and 2018, the tech giant had the opportunity to acquire a 15% stake in OpenAI for $1 billion. Additionally, Intel could have secured another 15% stake by offering OpenAI its hardware at cost, according to the sources.

This would have given Intel a 30% stake in OpenAI, which has arguably been the leader in generative AI for the past few years. OpenAI sought Intel as an investor to reduce its dependence on NVIDIA, whose chips power much of the AI world right now.

A Bet Gone Wrong

Intel declined the offer, partly because it doubted the immediate viability of generative AI models in 2018, which it believed would impact a timely return on investment. 

Cut to the present, and Intel is pushing hard to establish a strong presence in the AI industry. Once a world leader in chips, Intel failed to capitalise on the AI boom, ceding ground that propelled NVIDIA to become one of the most valuable companies globally.

But it is not just Gelsinger who is possibly praying for his business. NVIDIA chief Jensen Huang is also reportedly paranoid about the future of his company. In a recent podcast with Lex Fridman, Perplexity AI chief Aravind Srinivas revealed that he once asked Huang how he handles success and stays motivated. 

To this, Huang had replied, “I am paranoid about going out of business. Every day I wake up in a sweat, thinking about how things could go wrong.” Huang explained that in the hardware industry, planning two years in advance is crucial because fabricating chips takes time. 

“You need to have the architecture ready. A mistake in one generation of architecture could set you back by two years compared to your competitor,” Huang said. This definitely puts into perspective how a single investment could have changed Intel’s fortune, since even the CEO of the leading company is paranoid about things going wrong at any moment.

But Intel is No Sitting Duck

However, there is some positive news from Intel as well, which shows that the company is not giving up. 

For years, Intel focused on enabling CPUs, like those in laptops and desktops, for AI processes, rather than prioritising GPUs, which are more effective for AI calculations. In contrast, NVIDIA and AMD have thrived by concentrating on GPUs, while Intel largely missed the opportunity. 

However, in the third quarter, Intel plans to release its Gaudi 3 AI chip, which Gelsinger claims will outperform NVIDIA’s H100 GPUs, possibly even challenging NVIDIA’s Blackwell architecture.

Continuing its focus on chips, Intel has also announced that Panther Lake and Clearwater Forest, the leading products on the Intel 18A process, are now out of the fab and booting operating systems. They are expected to be ready for production next year.

Several people believe Gelsinger can get Intel back on its feet after it nearly lost out in the AI race. Had he been leading the company in 2017-18, the failed OpenAI deal might have turned out differently.

In May, Gelsinger had said that the company’s AI strategy is on the right track, which made everyone think Intel was living in denial. “We’re really starting to see that pipeline of activity convert,” said Gelsinger.

But apart from GPUs, Intel’s CPU and NPU plans still appear strong, along with its focus on edge use cases and on-device AI. Since Intel holds the majority of the laptop processor market, and the future of AI is racing towards smaller models, Intel might emerge in a year or two as the leader spearheading the AI PC game.

Intel anticipates shipping 40 million AI PCs in 2024, featuring over 230 designs spanning from ultra-thin PCs to handheld gaming devices. There are no PCs without Intel – that’s for sure.

Failed Deals and Poor Quarters are Part of the Game

Undoubtedly, failed AI deals are part of the business. Recently, Elon Musk’s xAI cancelled a $10-billion deal with Oracle, and Apple reportedly turned down an AI partnership with Meta.

Intel is not the first to fail to convert an OpenAI deal. Not many are aware that IT consulting giant Infosys, together with Musk, AWS, YC Research, and a few others, donated a sizable $1 billion to OpenAI back in 2015, when the latter began as a non-profit organisation. But the donation never turned into an investment.

Or what if it is a bigger regret for OpenAI not to have Intel as one of its partners? With the cost of running its NVIDIA-powered business and AI offerings weighing heavily on its revenue, OpenAI might have had its own hardware by now with Intel’s help.

Then again, owning such a huge part of OpenAI might have been a failed strategy for Intel. With Microsoft now holding a 49% stake in OpenAI’s for-profit arm, things could have been quite different for all three companies.

Moreover, Intel has a soft spot for India. It has partnered with several companies in India, such as Krutrim, Bharti Airtel, and Zoho, to provide its enterprise and data centre computing services. Maybe Gelsinger’s interest in India will soon put Intel in the driving seat in the generative AI race.

Sam Altman Confirms OpenAI’s Project Strawberry
Thu, 08 Aug 2024 06:27:14 +0000
https://analyticsindiamag.com/ai-news-updates/sam-altman-confirms-openais-project-strawberry/

OpenAI's teams are working on Strawberry to improve the models' ability to perform long-horizon tasks (LHT), which require planning and executing a series of actions over an extended period. 

OpenAI chief Sam Altman has hinted in a cryptic post that the AI startup is working on a project known internally as “Project Strawberry.” On X, Altman shared a post saying, “I love summer in the garden,” accompanied by an image of a pot with strawberries.

Project Strawberry, also referred to as Q*, was recently revealed in a Reuters report, which said that it will significantly enhance the reasoning capabilities of OpenAI’s AI models. “Some at OpenAI believe Q* could be a breakthrough in the startup’s search for artificial general intelligence (AGI),” said the report. 

Project Strawberry involves a novel approach that allows AI models to plan ahead and navigate the internet autonomously to perform in-depth research. This advancement could address current limitations in AI reasoning, such as common sense problems and logical fallacies, which often lead to inaccurate outputs.

An AI insider who goes by the name Jimmy Apples recently claimed that Q* hasn’t been released yet because OpenAI isn’t happy with the latency and other ‘little things’ it wants to optimise further.

OpenAI’s teams are working on Strawberry to improve the models’ ability to perform long-horizon tasks (LHT), which require planning and executing a series of actions over an extended period. 

The project involves a specialised “post-training” phase, adapting the base models for enhanced performance. This method resembles Stanford’s 2022 “Self-Taught Reasoner” (STaR), which enables AI to iteratively create its own training data to reach higher intelligence levels.

OpenAI recently announced DevDay 2024, a global developer event series scheduled to take place in San Francisco on October 1, London on October 30, and Singapore on November 21. While the company has stated that the focus will be on advancements in the API and developer tools, there is speculation that OpenAI might also preview its next frontier model.

Recently, a new model in the LMSYS chatbot arena showed strong performance in maths. Interestingly, GPT-4o and GPT-4o Mini were similarly spotted in the arena a few days before their release.

An internal document reportedly indicates that Project Strawberry includes a “deep-research” dataset for training and evaluating the models, though the contents of this dataset remain undisclosed.

This innovation is expected to enable AI to conduct research autonomously, using a “computer-using agent” (CUA) to take actions based on its findings. Additionally, OpenAI plans to test Strawberry’s capabilities in performing tasks typically done by software and machine learning engineers.

Last year, it was reported that Jakub Pachocki and Szymon Sidor, two leading OpenAI researchers, used Ilya Sutskever’s work to develop a model called Q* (pronounced “Q-Star”) that achieved an important milestone by solving math problems it had not previously encountered.

Sutskever raised concerns among some staff that the company didn’t have proper safeguards in place to commercialise such advanced AI models. Notably, he left OpenAI and recently founded his own company, Safe Superintelligence. Following his departure, Pachocki was appointed the new chief scientist.

What is Q*? 

Q* is probably a combination of Q-learning and A* search. The algorithm is considered a breakthrough in AI research, particularly in the development of AI systems with human reasoning capabilities: combining elements of Q-learning and A* promises improved goal-oriented thinking and solution finding.

This algorithm has shown impressive capabilities in solving complex mathematical problems (without prior training data) and marks an evolution towards artificial general intelligence (AGI).

Q-learning is a foundational concept in AI, specifically in the area of reinforcement learning. It is categorised as model-free reinforcement learning and is designed to learn the value of an action in a specific state.

The ultimate goal of Q-learning is to find an optimal policy that defines the best action to take in each state, maximising the cumulative reward over time.

Q-learning is based on the notion of a Q-function, aka the state-action value function. This function takes two inputs, a state and an action, and returns an estimate of the total reward expected when starting from that state, taking that action, and thereafter following the optimal policy.
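The Q-function update described above can be sketched in a few lines of tabular Q-learning. The corridor environment, reward, and hyperparameters below are invented for illustration; this is generic textbook Q-learning, not anything from OpenAI's Q*:

```python
import random
from collections import defaultdict

# Tabular Q-learning on a toy corridor: states 0..4, with a reward of 1 for
# reaching state 4. The environment and hyperparameters are invented for
# illustration -- this is generic Q-learning, not OpenAI's Q*.
ALPHA, GAMMA = 0.5, 0.9   # learning rate, discount factor
ACTIONS = (-1, +1)        # step left or right
GOAL = 4

Q = defaultdict(float)    # Q[(state, action)] -> estimated total reward

def step(state, action):
    """Deterministic corridor dynamics, clamped to [0, GOAL]."""
    nxt = min(max(state + action, 0), GOAL)
    return nxt, (1.0 if nxt == GOAL else 0.0)

random.seed(0)
for _ in range(2000):                 # training episodes
    s = 0
    while s != GOAL:
        a = random.choice(ACTIONS)    # explore randomly (off-policy)
        s2, r = step(s, a)
        # Core update: move Q(s, a) toward r + gamma * max_a' Q(s2, a')
        best_next = max(Q[(s2, a2)] for a2 in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2

# Greedy policy recovered from Q: walk right (+1) from every state.
policy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(GOAL)]
print(policy)  # [1, 1, 1, 1]
```

After training, the table approaches the optimal Q-function: Q(3, +1) converges to the full reward of 1, while Q(0, +1) converges to γ³ = 0.729, so the greedy policy heads for the goal from every state.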

OpenAI has recently unveiled a five-level classification system to track progress towards achieving artificial general intelligence (AGI) and superintelligent AI. The company currently considers itself at Level 1 and anticipates reaching Level 2 in the near future.

Other tech giants like Google, Meta, and Microsoft are also exploring techniques to enhance AI reasoning. However, experts like Meta’s Yann LeCun argue that large language models may not yet be capable of human-like reasoning.

Wait… Did OpenAI Just Solve ‘Jagged Intelligence’?
Wed, 07 Aug 2024 12:36:48 +0000
https://analyticsindiamag.com/ai-origins-evolution/wait-did-openai-just-solve-jagged-intelligence/

“9.11 > 9.9” is now in the OpenAI docs as a problem solved by requesting JSON-structured output to separate final answers from supporting reasoning.

Today, OpenAI released its latest API update featuring Structured Outputs, which it claims enhances reliability by ensuring model responses adhere precisely and consistently to output schemas. In OpenAI’s evals, gpt-4o-2024-08-06 achieved “100% reliability” in matching the supplied schemas.

Funnily, the OpenAI docs include the “9.11 > 9.9” problem as an example resolved using JSON-structured output to distinguish final answers from supporting reasoning. It is a nod to ‘Jagged Intelligence’, the term Andrej Karpathy coined for LLMs’ struggles with seemingly dumb problems.

This new feature helps ensure that model responses follow a specific set of rules (called JSON Schemas) provided by developers. OpenAI said it took a deterministic, engineering-based approach to constrain the model’s outputs and achieve 100% reliability.

“OpenAI finally rolled out structured outputs in JSON, which means you can now enforce your model outputs to stick to predefined schemas. This is super handy for tasks like validating data formats on the fly, automating data entry, or even building UIs that dynamically adjust based on user input,” posted a user on X.

OpenAI has used the technique of constrained decoding. Normally, when a model generates text, it can choose any word or token from its vocabulary. This freedom can lead to mistakes, such as adding incorrect characters or symbols.

Constrained decoding is a technique used to prevent these mistakes by limiting the model’s choices to tokens that are valid according to a specific set of rules (like a JSON Schema).

A Stop-Gap Mechanism for Reasoning?

Arizona State University professor Subbarao Kambhampati argues that while LLMs are impressive tools for creative tasks, they have fundamental limitations in logical reasoning and cannot guarantee the correctness of their outputs.

He said that GPT-3, GPT-3.5, and GPT-4 are poor at planning and reasoning, which he believes involves time and action. These models struggle with transitive and deductive closure, with the latter involving the more complex task of deducing new facts from existing ones.

Kambhampati aligns with Meta AI chief Yann LeCun, who believes that LLMs won’t lead to AGI and that researchers should focus on gaining animal intelligence first. 

“Current LLMs are trained on text data that would take 20,000 years for a human to read. And still, they haven’t learned that if A is the same as B, then B is the same as A,” LeCun said.

He has even advised young students not to work on LLMs. LeCun is bullish on self-supervised learning and envisions a world model that could learn independently.

“In 2022, while others were claiming that LLMs had strong planning capabilities, we said that they did not,” said Kambhampati, adding that their accuracy was around 0.6%, meaning they were essentially just guessing.

He further added that LLMs are heavily dependent on the data they are trained on. This dependence means their reasoning capabilities are limited to the patterns and information present in their training datasets. 

Explaining this phenomenon, Kambhampati said that when the old Google PaLM was introduced, one of its claims was its ability to explain jokes. He said, “While explaining jokes may seem like an impressive AI task, it’s not as surprising as it might appear.”

“There are humour-challenged people in the world, and there are websites that explain jokes. These websites are part of the web crawl data that the system has been trained on, so it’s not that surprising that the model could explain jokes,” he explained. 

He added that LLMs like GPT-4, Claude, and Gemini are ‘stuck close to zero’ in their reasoning abilities. They are essentially guessing plans for the ‘Blocks World’ concept, which involves ‘stacking’ and ‘unstacking’, he said.

This is consistent with a recent study by DeepMind, which found that LLMs often fail to recognise and correct their mistakes in reasoning tasks. 

The study concluded that “LLMs struggle to self-correct their reasoning without external feedback. This implies that expecting these models to inherently recognise and rectify their reasoning mistakes is overly optimistic so far”.

Meanwhile, OpenAI reasoning researcher Noam Brown also agrees. “Frontier models like GPT-4o (and now Claude 3.5 Sonnet) may be at the level of a “smart high schooler” in some respects, but they still struggle on basic tasks like tic-tac-toe,” he said.

Interestingly, Apple recently relied on standard prompt engineering for a bunch of its Apple Intelligence features, and someone on Reddit found the prompts.

The Need for a Universal Verifier 

To tackle the problem of accuracy, OpenAI has introduced a prover-verifier model to enhance the clarity and accuracy of language model outputs.

In the Prover-Verifier Games, two models are used: the Prover, a strong language model that generates solutions, and the Verifier, a weaker model that checks these solutions for accuracy. The Verifier determines whether the Prover’s outputs are correct (helpful) or intentionally misleading (sneaky).
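The structure of the game can be sketched in miniature. This is a heavily simplified stand-in: in OpenAI's setup both roles are trained language models and the weaker verifier judges the legibility of the prover's reasoning, whereas here they are plain functions on toy sums:

```python
import random

# Heavily simplified prover-verifier sketch. In OpenAI's game both roles are
# trained language models and the weaker verifier judges the legibility of
# the prover's reasoning; here they are stand-in functions on toy sums.

def prover(a: int, b: int, sneaky: bool = False) -> int:
    """Propose a + b; the 'sneaky' role is off by one on purpose."""
    return a + b + (1 if sneaky else 0)

def verifier(a: int, b: int, proposed: int) -> bool:
    """Weak checker: accept only proposals matching its own computation."""
    return proposed == a + b

random.seed(0)
accepted = rejected = 0
for _ in range(100):
    a, b = random.randint(1, 9), random.randint(1, 9)
    sneaky = random.random() < 0.5          # prover sometimes plays sneaky
    if verifier(a, b, prover(a, b, sneaky)):
        accepted += 1                       # helpful proposals pass
    else:
        rejected += 1                       # sneaky proposals are caught

print(accepted + rejected)  # 100 rounds in total
```

In the real game the adversarial pressure from the sneaky role is what trains the prover to produce solutions the weak verifier can actually check.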

Kambhampati said, “You can use the world itself as a verifier, but this idea only works in ergodic domains where the agent doesn’t die when it’s actually trying a bad idea.”

He further said that even with end-to-end verification, a clear signal is needed to confirm whether the output is correct. “Where is this signal coming from? That’s the first question. The second question is, how costly will this be?”

Also, OpenAI is currently developing a model with advanced reasoning capabilities, known as Q* or Project Strawberry. Rumours suggest that, for the first time, this model has succeeded in learning autonomously using a new algorithm, acquiring logical and mathematical skills without external influence.

Kambhampati is somewhat sceptical about this development as well. He said, “Obviously, nobody knows whether anything was actually done, but some of the ideas being discussed involve using a closed system with a verifier to generate synthetic data and then fine-tune the model. However, there is no universal verifier.”

Chain of Thought Falls Short

Regarding the Chain of Thought, Kambhampati said that it basically gives the LLM advice on how to solve a particular problem. 

Drawing an analogy with the Blocks World problem, he explained that if you train an LLM on three- or four-block stacking problems, it can improve its performance on those specific problems. However, if you increase the number of blocks, its performance degrades significantly.

Kambhampati quipped that Chain of Thought and LLMs remind him of the old proverb, “Give a man a fish, and you feed him for a day; teach a man to fish, and you feed him for a lifetime.”

“Chain of Thought is actually a strange version of this,” he said. “You have to teach an LLM how to catch one fish, then how to catch two fish, then three fish, and so on. Eventually, you’ll lose patience because it’s never learning the actual underlying procedure,” he joked.

Moreover, he said that this doesn’t mean AI can’t perform reasoning. “AI systems that do reasoning do exist. For example, AlphaGo performs reasoning, as do reinforcement learning systems and planning systems. However, LLMs are broad but shallow AI systems. They are much better suited for creativity than for reasoning tasks.” 

Google DeepMind’s AlphaProof and AlphaGeometry, based on a neuro-symbolic approach, recently won a silver medal at the International Mathematical Olympiad. Many feel that, to an extent, neuro-symbolic AI will keep the generative AI bubble from bursting.

Last year, AIM discussed the various approaches taken by big-tech companies, namely OpenAI, Meta, Google DeepMind, and Tesla, in the pursuit of AGI. Since then, tremendous progress has been made. 

Lately, it seems likely that OpenAI is working on causal AI, as its job postings, such as those for data scientists, emphasise expertise in causal inference.

LLM-based AI Agents will NOT Lead to AGI 

Recently, OpenAI developed a structured framework to track the progress of its AI models toward achieving artificial general intelligence (AGI). OpenAI CTO Mira Murati claimed that GPT-5 will reach a PhD level of capability, while Google’s Logan Kilpatrick anticipates AGI will emerge by 2025.

Commenting on the hype around AI agents, Kambhampati said, “I am kind of bewildered by the whole agentic hype because people confuse acting with planning.”

He further explained, “Being able to make function calls doesn’t guarantee that those calls will lead to desirable outcomes. Many people believe that if you can call a function, everything will work out. This is only true in highly ergodic worlds, where almost any sequence will succeed and none will fail.”

LLMs Have Stalled the Progress of AGI
Tue, 06 Aug 2024 12:30:00 +0000
https://analyticsindiamag.com/ai-insights-analysis/llms-have-stalled-the-progress-of-agi/

Current LLMs show low user trust, low accuracy and low reliability. This is not a new conversation, as the reasoning and logical problems with current LLMs have been brought up several times.

Everyone has questions about generative AI, the validity of LLMs, and the future they hold. While LLMs are often seen as overhyped in research circles, they are also considered by some as a major roadblock to achieving artificial general intelligence (AGI).

Reiterating this point, Mike Knoop, the co-founder of Zapier, recently expressed scepticism about the progress of AI language models toward achieving AGI. “LLMs have stalled in the progress to AGI and increasing scale will not help what is an inherently limited technology,” said Knoop in a recent interview.

Knoop’s assessment was based on the fact that current LLMs show low user trust, low accuracy and reliability. “And these problems are not going away with scale,” he added. This is not a new conversation as the reasoning and logical problems with current LLMs have been brought up several times. 

No Roads Lead to AGI

Regardless, the scepticism has not deterred big tech from the mad rush to build the best LLMs. Google, OpenAI, Microsoft, and Meta have been racing to build LLMs, bigger and smaller alike.

All this while Yann LeCun, the chief of Meta AI, has repeatedly said that LLMs won’t lead to AGI and that researchers entering the AI field should not work on them.

Similarly, Francois Chollet, the creator of Keras, also recently shared similar thoughts on this. “OpenAI has set back the progress towards AGI by 5-10 years because frontier research is no longer being published and LLMs are an offramp on the path to AGI,” he said in an interview.

Knoop’s concerns are well established and are underscored by his role in introducing the ARC Prize, a competition designed to encourage novel approaches to AGI, particularly through the Abstraction and Reasoning Corpus (ARC).

Since its establishment, this benchmark, which evaluates the capacity for efficient skill acquisition, has not shown much improvement, supporting Knoop’s contention that present AI models do not seem to be headed towards AGI.

Forget AGI, Aim for Animal-level Intelligence

LeCun says AI should reach animal-level intelligence before heading towards AGI and Knoop has his concerns about LLMs. Likewise, Andrej Karpathy, the founder of Eureka Labs, has been quite vocal about the issues with LLMs. 

In his latest experiment, Karpathy showed that LLMs struggle with seemingly simple tasks, coining the term “Jagged Intelligence” to describe the phenomenon.

Even Yoshua Bengio, one of the godfathers of AI, said in an exclusive interview with AIM that when it comes to achieving the kind of intelligence that humans have, some important ingredients are still missing.

These problems are not unique. Noam Brown, a research engineer at OpenAI, experimented with LLMs by making them play basic games like tic-tac-toe, and the outcome was dismal: the models performed poorly even in this simple game. 

Additionally, another developer tasked GPT-4 with solving tiny Sudoku puzzles. Here too, the model struggled, often failing to solve these seemingly simple puzzles. Even a study by Google DeepMind showed that LLMs lack genuine understanding and, as a result, cannot self-correct or adjust their responses on command. 

Subbarao Kambhampati, professor of AI at Arizona State University, agrees with the notion. “They are just not made and meant to do that,” he said. He gave an example of how LLM-based chatbots are still very weak at maths problems.

Too Early to Write off LLMs

But there is still time. While it’s true that we haven’t reached human-level intelligence yet, it doesn’t follow that we never will. 

OpenAI had claimed that GPT-4, released in March 2023, beat human psychologists in understanding complex emotions. In fact, OpenAI CTO Mira Murati claimed in a recent interview that GPT-5 will have ‘PhD-level’ intelligence.

Meanwhile, Ilya Sutskever, the former chief scientist at OpenAI and founder of Safe Superintelligence, believes that text is a projection of the world. But how much of that is linked with LLMs is still questionable. LLMs, in a way, are building cognitive architecture from scratch, echoing evolutionary and real-time learning processes, albeit with a bit more electricity.

Last month, AIM had discussed the various approaches taken by big tech companies, namely OpenAI, Meta, Google DeepMind, and Tesla, in the pursuit of AGI. Since then, tremendous progress has been made.

Undeterred, research on LLMs is still going strong, with companies finding ways to constantly improve the models. Recently, OpenAI released a research paper on Prover-Verifier Games (PVG) for LLMs, which could address the reliability problems Knoop raised. 

Similarly, causality research with LLMs would enable AI to understand cause-and-effect relationships, similar to human reasoning. Then there is neurosymbolic AI, which can enhance LLM efficiency. 

We can safely say that LLMs are worth more than one shot to smoothen the road to AGI.

The post LLMs Have Stalled the Progress of AGI appeared first on AIM.

OpenAI Faces Wave of Resignations as Top Talent Quits https://analyticsindiamag.com/ai-news-updates/openai-faces-wave-of-resignations-as-top-talent-quits/ Tue, 06 Aug 2024 10:14:48 +0000

“First time to relax since co-founding OpenAI 9 years ago,” says OpenAI co-founder Greg Brockman.

The post OpenAI Faces Wave of Resignations as Top Talent Quits appeared first on AIM.


OpenAI, the company behind ChatGPT, is undergoing significant leadership changes: co-founder John Schulman has left to join Anthropic, product manager Peter Deng has departed, and co-founder Greg Brockman has taken an extended leave of absence. 

“I’m taking a sabbatical through the end of year. First time to relax since co-founding OpenAI 9 years ago,” said Brockman, adding that the mission is far from complete and that there is still a safe AGI to build.  

Meanwhile, Schulman announced his intention to join Anthropic, stating, “This choice stems from my desire to deepen my focus on AI alignment and to start a new chapter of my career where I can return to hands-on technical work.” 

Further, he clarified, saying “To be clear, I’m not leaving due to lack of support for alignment research at OpenAI. My decision is a personal one, based on how I want to focus my efforts in the next phase of my career.”

It looks like Schulman left the company while remaining in their good books. Sam Altman thanked Schulman for his contributions to OpenAI, calling him “a brilliant researcher, a deep thinker about product and society, and mostly, you are a great friend to all of us,” reminiscing about their first meeting in 2015, where Schulman laid out much of OpenAI’s initial strategy in just 15 minutes.  

However, the reason for Deng’s departure remains unknown. He may be planning to start his own AI venture, given his experience in building products like ChatGPT.  

This development follows the departure of former OpenAI co-founder Andrej Karpathy, who left to start his own AI company, Eureka Labs. Eureka Labs is an AI+education company that aims to create an AI-native learning environment by integrating generative AI with traditional teaching methods.

Previously, Ilya Sutskever, chief scientist at OpenAI, also announced his resignation and launched a new company called Safe Superintelligence. The venture is dedicated to building superintelligent AI systems that are both advanced and secure, with a strong emphasis on safety alongside capabilities, while avoiding the distractions and pressures faced by larger AI firms.

Along with Sutskever, Jan Leike, OpenAI’s head of alignment, also resigned and joined Anthropic. Leike announced he would continue working on the “superalignment mission”, focusing on scalable oversight, weak-to-strong generalisation, and automated alignment research.

The OpenAI mafia is getting bigger with each passing day. The company certainly knows how to hire the best talent in the world – talent that big tech and rival labs eventually snap up. 

Kwai’s Kling vs OpenAI Sora https://analyticsindiamag.com/ai-breakthroughs/kwais-kling-vs-openai-sora/ Wed, 31 Jul 2024 11:53:14 +0000

Given that OpenAI only grants a limited number of select creators access to Sora, Kling AI might just be the top choice.

The post Kwai’s Kling vs OpenAI Sora  appeared first on AIM.


Kuaishou Technology, a Chinese AI and technology company, launched a new text-to-video model called Kling this year. Su Hua, a Chinese billionaire internet entrepreneur, is the co-founder and CEO of the video platform Kuaishou, known outside of China as Kwai.

Several AI enthusiasts shared their creations from Kling on X that captured the hearts of internet users worldwide. A series of animals and objects were featured enjoying a meal of noodles. From a panda munching on a bowl of ramen to a kangaroo slurping up some udon, the videos are both hilarious and heartwarming. 

A few others include blueberries turning into puppies and a tray of apples turning into guinea pigs, which mess with your head.

The level of detail and realism in the videos is a testament to the capabilities of Kling and the progress made in the field of AI.

Known for creating a TikTok competitor, Kuaishou joined the race with other Chinese tech companies to rival OpenAI’s Sora. 

With simple text prompts, Kling can generate highly realistic videos in 1080p high-definition resolution, up to two minutes long. Sora, on the other hand, makes 60-second videos from text prompts.

Kling boasts the ability to produce realistic motions on a large scale, simulate the attributes of the physical world, and weave concepts and imagination together, setting a new benchmark in AI-powered video creation. 

However, as impressive as Kling AI may be, its accessibility is primarily limited to a few users even though the company claimed it would be available worldwide. This poses significant challenges for its global adoption.

For some users who were looking forward to accessing its offerings, this situation may feel like a huge letdown. 

A worldwide release creates expectations of inclusivity and accessibility; when these are unmet, it can harm the company’s reputation. Despite Kling AI’s impressive features, its primary hurdle is limited availability. 

Currently, its access is mostly limited to invited beta testers, with some users in China able to experience a limited demo version through the Kuaishou app, as claimed by ChatGPT on Quora. 

At a time when the US is heavily debating AI ethics and incorporating ‘Responsible AI’, China seems unperturbed and is likely responding to these AI ethicists with a Kling. 

The AI company hit the headlines recently by announcing the global launch of its International Version 1.0, a platform designed to revolutionise industries worldwide. This milestone release features advanced machine learning, multilingual support, and enhanced data analytics, promising unparalleled efficiency and innovation across sectors. 

AI Video Generator War Begins! 

While systems like OpenAI’s Sora and Kuaishou’s Kling have showcased impressive capabilities, they remain accessible only to a select group of users. Similarly, Luma AI’s Dream Machine also boasts remarkable features but is limited to a restricted audience.

Interestingly, Kuaishou’s AI tool entered the market shortly after Vidu AI, another Chinese text-to-video AI model known for producing HD 1080p 16-second videos.

This model’s launch coincides with a flurry of activity in the generative AI sector, as startups and tech giants compete to develop advanced tools that create realistic images, audio, and video from text inputs.

Kling has a user-friendly interface that supports both text-to-video and text-to-image generation. 

Unlike Runway, Haiper, and Luma Labs, it accepts prompts of up to 2,000 characters, enabling highly detailed descriptions, and it performs better with lengthy, well-crafted prompts.

This cutting-edge AI model employs variable resolution training, enabling users to produce videos in various aspect ratios. Remarkably, it can showcase full expression and limb movement from a single full-body image. 

AI video creation seems like the next battleground for tech companies, with contenders like OpenAI’s Sora, Microsoft’s VASA-1, Adobe’s Firefly, Midjourney, and Pika Labs already in the game. 

Furthermore, Google recently introduced Veo, a new text-to-video AI model, at Google I/O to compete with OpenAI’s Sora. Veo improves on previous models, offering consistent, high-quality 1080p videos over a minute long.

While some were impressed with Veo’s capabilities, others argue that it may not exactly be state-of-the-art in its latency or abilities compared to Sora.

Now that Kling is here, the benchmark of making cinematically impressive and real-world-like videos has gone up. 

Why is Kling a big deal?

This month, Runway introduced Gen-3, which offers enhanced realism and the ability to generate 10-second clips. Last month, Luma Labs unveiled the impressive Dream Machine. 

These new model updates were initially spurred by the release of Sora earlier this year, which remains the benchmark for AI video generation. Recently, a series of short films on YouTube showcased Sora’s full potential. Additionally, Kling played a significant role in the wave of updates.

It also adopts a unique approach to AI by incorporating generative 3D in its creation process. It provides Sora-level scene changes, clip lengths, and video resolution. Given that OpenAI only grants a limited number of select creators access to Sora, Kling AI might just be the top choice for now.

Capabilities of Kling AI

Kling AI is accessible via the Kuaishou app, available on both iOS and Android platforms. This mobile app puts Kling AI’s advanced video generation capabilities directly at users’ fingertips, enabling them to create high-quality, realistic videos from their smartphones.

For users outside China, accessing Kling AI often requires navigating around these barriers. Some have resorted to emailing Kuaishou directly to request access, explaining their interest in becoming beta testers. 

The competitive landscape is evolving, but the restrictions on access can hinder Kling’s ability to gain traction outside China.

Chinese attempts to lure domestic developers away from OpenAI – considered the market leader in generative AI – will now be a lot easier, after OpenAI notified its users in China that they would be blocked from using its tools and services. 

“We are taking additional steps to block API traffic from regions where we do not support access to OpenAI’s services,” said an OpenAI spokesperson.

OpenAI has not elaborated about the reason for its sudden decision. 

ChatGPT is already blocked in China by the government’s firewall, but until this week developers could use virtual private networks to access OpenAI’s tools in order to fine-tune their own generative AI applications and benchmark their own research. Now the block is coming from the US side.

The OpenAI move has “caused significant concern within China’s AI community”, said Xiaohu Zhu, the founder of the Shanghai-based Centre for Safe AGI, which promotes AI safety, not least because “the decision raises questions about equitable access to AI technologies globally”.

Llama 3.1 Vs GPT-4o – Detailed Comparison https://analyticsindiamag.com/ai-trends-future/llama-3-1-vs-gpt-4o/ Sun, 28 Jul 2024 13:30:56 +0000

The Llama 3.1 405B model matches top closed models, supports 128k context length, eight languages, and excels in code generation, complex reasoning, and tool use.

The post Llama 3.1 Vs GPT-4o – Detailed Comparison appeared first on AIM.


Since Meta released its new model, Llama 3.1 405B, the tech world has been buzzing with excitement. After a leak of Llama 3.1, Meta officially launched the Llama 3.1 405B, an advanced open-source AI model, along with its 70B and 8B versions. They also upgraded the existing 8B and 70B models.

It is the first openly available model that rivals the top AI models in terms of state-of-the-art capabilities in general knowledge, steerability, math, tool use, and multilingual translation. 

As Meta CEO Mark Zuckerberg mentioned in a post, Meta’s long-term vision is to build general intelligence, open-source it responsibly, and make it widely available so everyone can benefit.

“I believe we should fear people who think open-source is dangerous when, in fact, open-source is the foundation for all greatness,” said Zuckerberg.

Llama 3.1 is reported to outperform GPT-4o on several benchmarks. Let’s examine the parameters on which Llama 3.1 excels.

Llama 3.1 vs GPT-4o Comparison

Availability Comparison

Meta Llama 3.1 is an open-source model, making it freely available for download and development. This aligns with Meta’s commitment to open-source innovation, allowing developers to use and modify the model without restrictions.

Additionally, the model can be downloaded from platforms like Hugging Face and Meta’s own distribution channels, making it widely accessible for developers and researchers. 

Meanwhile, GPT-4o is a closed-source model. Users can access GPT-4o through APIs provided by OpenAI, but they cannot customise or fine-tune the model in the way open-source models like Llama 3.1 allow.

Benchmark Performance Comparison

The benchmark comparison highlights key areas such as math reasoning, coding, and common sense reasoning. On math reasoning (the GSM8K benchmark), Llama 3.1 scored 96.82%, outperforming GPT-4o, which scored 94.24%, demonstrating superior capability in grade-school math reasoning tasks. 

In contrast, on coding (the HumanEval benchmark), GPT-4o excelled with a score of 92.07%, surpassing Llama 3.1’s 85.37%, indicating better performance on coding tasks.

On common sense reasoning (the Winogrande benchmark), Llama 3.1 again showed its strength with a score of 86.74%, compared to GPT-4o’s 82.16%.
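Putting the reported numbers side by side, a small sketch (scores taken verbatim from the comparison above; benchmark labels are shortened here) can tally which model leads on each benchmark:

```python
# Reported benchmark scores (percent) from the comparison above.
scores = {
    "GSM8K (math reasoning)":    {"Llama 3.1": 96.82, "GPT-4o": 94.24},
    "HumanEval (coding)":        {"Llama 3.1": 85.37, "GPT-4o": 92.07},
    "Winogrande (common sense)": {"Llama 3.1": 86.74, "GPT-4o": 82.16},
}

# Pick the higher-scoring model for each benchmark.
leaders = {bench: max(models, key=models.get) for bench, models in scores.items()}

for bench, leader in leaders.items():
    print(f"{bench}: {leader}")
```

On these three benchmarks, Llama 3.1 leads two (math and common sense reasoning) while GPT-4o leads on coding.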

Cost Efficiency Comparison

Meta claims that operating Llama 3.1 in production costs approximately 50% less than using GPT-4. This cost advantage is particularly appealing for organisations looking to implement AI solutions without the hefty operational expenses associated with proprietary models. 

While closed models are generally more cost-effective, Llama models offer some of the lowest costs per token in the industry, according to testing by Artificial Analysis.

Pricing Comparison

According to a projection by Artificial Analysis, Llama 3.1 405B is expected to be positioned as a more cost-effective alternative to current frontier models like GPT-4o and Claude 3.5 Sonnet, offering similar quality at a lower price. 

Providers will likely offer FP16 and FP8 versions at different price points; running the FP16 version requires two DGX systems with eight H100 GPUs each. 

The FP8 version of Llama 3.1 405B may become the more prominent offering, potentially delivering frontier-class intelligence at between $1.50 and $3 per million tokens (blended 3:1), while projections put the FP16 version at $3.50 to $5 (blended 3:1).
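“Blended 3:1” in these projections means a weighted average assuming three input tokens for every output token. A quick sketch shows how such a blended per-million-token price is derived (the example prices below are illustrative assumptions, not quotes from any provider):

```python
def blended_price(input_price_per_m: float, output_price_per_m: float,
                  input_ratio: int = 3, output_ratio: int = 1) -> float:
    """Weighted-average price per million tokens for a given input:output token mix."""
    total = input_ratio + output_ratio
    return (input_ratio * input_price_per_m + output_ratio * output_price_per_m) / total

# Illustrative: an endpoint priced at $1.00/M input and $3.00/M output
# lands at $1.50/M blended 3:1 -- the bottom of the projected FP8 range.
print(blended_price(1.00, 3.00))
```

Because input tokens dominate the 3:1 mix, the blended figure sits much closer to the input price than to the output price.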

Multilingual Capabilities Comparison

Llama 3.1 is designed to handle conversations in multiple languages, including Spanish, Portuguese, Italian, German, Thai, French, and Hindi. This broad multilingual support enhances its utility for various global applications and diverse user bases. 

GPT-4o demonstrates superior language comprehension, particularly in complex contexts and nuanced language use. It employs advanced techniques for ambiguity handling and context-switching, which allows it to maintain coherence in more extended conversations. 

Final Take 

When compared, both models have clear strengths. Discussions are ongoing on Reddit about which model is better, with users debating the limitations of running the 405B model locally and the potential for improved models through continued training.

Additionally, a user on Reddit noted that GPT-4o has a significant advantage with its new voice and vision features, which are very realistic and fast; no other model has come remotely close to the realism and response time showcased in the demos. This matters because it is likely how people will interact with chatbots in the future.

GenAI Startup Rabbitt.AI, Founded by IIT-D Alum, Raises $2.1M https://analyticsindiamag.com/ai-news-updates/genai-startup-rabbitt-ai-founded-by-iit-d-alum-raises-2-1m/ Sun, 28 Jul 2024 10:54:25 +0000

The startup claims to achieve 25% more annotations, 30% more success, and 15% more clients than OpenAI.

The post GenAI Startup Rabbitt.AI, Founded by IIT-D Alum, Raises $2.1M appeared first on AIM.


Rabbitt.AI raised $2.1 million from TC Group of Companies. The genAI startup enables businesses to create and deploy advanced AI applications with tools for custom LLM development, RAG fine-tuning, and data-centric AI. Their platform features MLOps integration, voice bot AI agents, and prioritises privacy-first strategies in AI deployment.

Rabbitt.AI was founded by an IIT-Delhi alumnus, and the recent funding round saw participation from executives at big tech companies including NVIDIA, Meta and Microsoft. “Smaller, custom and industry specific fine-tuned models are making more moves than one big model. At Rabbitt.AI we are the advocates of these and helping companies adopt Open Source AI models,” said Harneet S.N., founder and chief AI officer of Rabbitt.AI.

The London-headquartered startup has its majority development team based in India, with an office in Delhi. 

GenAI Services for Enterprises

Rabbitt.AI collaborates with enterprises to customise LLMs for specific use cases and develop AI applications using Generative AI models. One of the products being developed is a genAI-powered autonomous software engineer that can create production-ready software with no human intervention. This system auto-improves using Rabbitt’s proprietary agentic framework and fine-tuned LLMs.

“We are helping organisations own their data and own their AI. In the new world, we are helping companies become the landlords of the AI world rather than just the tenants like the past Internet world,” said Harneet, who is also a mentor and advisor at Delhi University’s Startup Incubation Fund and an official forum member at Confederation of Indian Industry (CII) Industry-Academia Partnership Forum. 

Rabbitt.AI also provides data annotation and curation services across domains including healthcare, education, marketing, and customer relationship management.

Interestingly, Rabbitt.AI claims to outperform services offered by OpenAI, achieving 25% more annotations, 30% more success, and 15% more clients. 

Source: Rabbitt.AI

OpenAI’s SearchGPT Could Blow Up Google’s $2 Tn Monopoly https://analyticsindiamag.com/ai-origins-evolution/openais-searchgpt-could-blow-up-googles-2-tn-monopoly/ Fri, 26 Jul 2024 12:42:32 +0000

Not really.

The post ​​OpenAI’s SearchGPT Could Blow Up Google’s $2 Tn Monopoly appeared first on AIM.


Google consistently builds cool AI products, while OpenAI seems to wait for these launches to steal the spotlight. 

“Be OpenAI, wait for Google to drop a cute math model, then launch a competing search engine that could potentially blow up Google’s $2T internet search monopoly and send Google execs into existential dread,” quipped Aidan McLau, the chief executive of Topology Invest. 

A few hours after Google launched its new model capable of solving International Math Olympiad problems, OpenAI grabbed attention with its new Google alternative, SearchGPT. It combines OpenAI’s AI models with real-time web information to provide fast and relevant answers to user queries. 

Currently in the prototype phase, it is available to a limited group of 10,000 test users.

“We believe there is room to make search much better than it is today. We are launching a new prototype called SearchGPT. We will learn from this prototype, improve it, and then integrate the technology into ChatGPT to make it real-time and maximally helpful,” said OpenAI chief Sam Altman.

This isn’t the first time OpenAI has done something like this. Earlier this year, when Google released Gemini 1.5, OpenAI announced Sora on the same day. Then, just a day before Google I/O 2024, OpenAI hosted its Spring Update and released GPT-4o. 

However, OpenAI has yet to make Sora publicly available, and the voice features for ChatGPT are still not accessible.

Who knows, OpenAI might have achieved AGI internally. “OpenAI has probably already achieved gold in the Math Olympiad—something even the most optimistic AI researchers would not have expected before 2025,” posted a user on X who goes by the name Chubby. 

This might actually be true, as Altman responded to Google DeepMind’s IMO score with a simple ‘lol’.

Math all the Way 

Solving problems at the IMO is any day a greater achievement than launching an AI-based web search. 

“AlphaProof is one of the most exciting applications of LLMs combined with RL. The Gemini model automatically translates natural language problem statements into formal statements (i.e., formalizer network),” said Elvis Saravia, the co-founder of DAIR.

“LLMs are alien beasts. It is deeply troubling that our frontier models can both achieve a silver medal in the Math Olympiad and fail to answer ‘Which number is bigger, 9.11 or 9.9?’” said Jim Fan, the lead of Embodied AI (GEAR Lab) at NVIDIA.

He further explained that AlphaProof and AlphaGeometry-2 are trained on formal proofs and domain-specific symbolic engines. “In a way, they are highly specialized towards solving Olympiads, even though they build on a general-purpose LLM base,” he said. 

Meanwhile, OpenAI is quietly developing Project Strawberry to significantly boost the reasoning capabilities of its AI models. While details about the project remain undisclosed, it focuses on a novel approach that enables AI to plan ahead and autonomously navigate the internet for in-depth research.

Internally, OpenAI has tested AI that scored over 90% on the MATH dataset, which benchmarks championship-level math problems. This progress aims to tackle current AI reasoning limitations, such as common sense issues and logical errors that often lead to inaccurate outputs.

Previously known as Project Q*, which was leaked last year and could solve new math problems, Project Strawberry is now working to enhance long-horizon tasks (LHT). This involves a specialised “post-training” phase, adapting base models for better performance, similar to Stanford’s 2022 Self-Taught Reasoner (STaR), which enables AI to generate its own training data for improved intelligence.

Do we Really Need SearchGPT? 

While SearchGPT is a nice feature to have in ChatGPT, it appears that Perplexity AI has already taken the lead in this segment. 

“By the way, Perplexity is awesome. They made me rethink what search & AI integration could be. It’s often the first place I go to now when I need to start researching a new topic,” posted Lex Fridman on X.

“SearchGPT is like a nice feature at this point. Nobody can take the crown from Perplexity after they introduced agentic search – it’s simply too good and OpenAI cannot just steamroll them,” posted a user on X. 

Building a tool like Perplexity AI isn’t particularly difficult today, so it’s puzzling why it took OpenAI so long. “This is cool, but the name sounds like something a high schooler would put on their resume as their first solo project,” joked a user on X.

On a related note, Bishal Saha, a dropout from Lovely Professional University, created Omniplex, an open-source alternative to Perplexity AI, over a single weekend.

It’s unclear how concerned Perplexity AI chief Aravind Srinivas is right now, but Google’s stock did plummet after the OpenAI SearchGPT demo. In response to competitors like Perplexity AI and ChatGPT, the search giant introduced ‘AI Overviews’ at Google I/O 2024. This feature generates summaries for user queries.

Is OpenAI ‘Sam’ of All Trades, Master of None? https://analyticsindiamag.com/ai-origins-evolution/is-openai-sam-of-all-trades-master-of-none/ Fri, 26 Jul 2024 11:29:58 +0000

The question arises - “How long can OpenAI continue operating this way?” Without raising funds soon, the company may run out of cash to sustain its operations.

The post Is OpenAI ‘Sam’ of All Trades, Master of None?  appeared first on AIM.


OpenAI is treading on thin ice. According to a recent report, the ChatGPT maker could lose as much as $5 billion this year. On the cost side, OpenAI is projected to spend nearly $4 billion this year on renting Microsoft’s servers to power ChatGPT and its underlying LLMs.

OpenAI’s expenses for training, including data acquisition, could soar to nearly $3 billion. Additionally, with a rapidly growing workforce of around 1,500 employees, its costs could reach approximately $1.5 billion, driven partly by fierce competition for technical talent with companies like Google.

In total, OpenAI’s operational expenses this year could reach up to $8.5 billion.


However, there is still hope. OpenAI recently reported $3.4 billion in annualised revenue. 

ChatGPT Plus, the premium version of OpenAI’s popular chatbot, emerged as the primary revenue driver, contributing $1.9 billion. The service boasts 7.7 million subscribers paying $20 per month. Following closely is ChatGPT Enterprise, which brought in $714 million from 1.2 million subscribers at $50 per month. The API generated $510 million, while ChatGPT Team added $290 million from 80,000 subscribers paying $25 monthly.
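As a sanity check, annualising the reported subscriber counts at the flat monthly prices (an assumption; actual billing mixes and discounts may vary) roughly reproduces the Plus and Enterprise figures quoted above:

```python
def annual_run_rate(subscribers: int, monthly_price_usd: float) -> float:
    """Annualised revenue run-rate, assuming every subscriber pays the flat monthly price."""
    return subscribers * monthly_price_usd * 12

plus = annual_run_rate(7_700_000, 20)        # ChatGPT Plus at $20/month
enterprise = annual_run_rate(1_200_000, 50)  # ChatGPT Enterprise at $50/month
print(f"Plus: ${plus/1e9:.2f}B, Enterprise: ${enterprise/1e9:.2f}B")
# Plus: $1.85B, Enterprise: $0.72B -- close to the reported $1.9B and $714M.
```

The small gaps between these back-of-the-envelope numbers and the reported figures are consistent with subscriber counts growing over the period rather than holding flat.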

The question arises – “How long can OpenAI continue operating this way?” Without raising funds soon, the company may run out of cash to sustain its operations.

“OpenAI may end up losing $5 billion this year and run out of cash in 12 months unless it raises more money. Investors should ask: What is their moat? Unique tech? What is their route to profitability when Meta is giving away similar tech for free? Do they have a killer app? Will the tech ever be reliable? What is real and what is just a demo?” posted Gary Marcus on X. He may not be wrong.

OpenAI has raised approximately $13.3 billion in total funding to date from Microsoft.

OpenAI’s Jugaad 

OpenAI is cutting costs with GPT-4o Mini. At 15 cents per million input tokens and 60 cents per million output tokens, GPT-4o Mini is over 60% cheaper than GPT-3.5 Turbo. Recently, CEO Sam Altman posted that GPT-4o mini is “already processing more than 200B tokens per day!” 

If we do some number crunching: processing 200 billion tokens daily, GPT-4o Mini costs OpenAI roughly $90,000 per day, compared to about $225,000 with GPT-3.5 Turbo. The transition thus allows OpenAI to save around $135,000 daily, showcasing enhanced efficiency and cost-effectiveness.
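Exactly how those daily figures fall out depends on the input/output token mix, which OpenAI has not disclosed. A rough sketch (the 1:2 input:output split below is an inferred assumption, chosen because it reproduces the ~$90,000 figure at GPT-4o Mini's published prices):

```python
def daily_cost_usd(tokens_per_day: float, input_price_per_m: float,
                   output_price_per_m: float, input_fraction: float) -> float:
    """Daily API cost for a total token volume under an assumed input/output split."""
    millions = tokens_per_day / 1e6
    blended = input_fraction * input_price_per_m + (1 - input_fraction) * output_price_per_m
    return millions * blended

# GPT-4o Mini at $0.15/M input and $0.60/M output; a 1:2 input:output mix
# gives a $0.45/M blended rate, i.e. about $90,000/day for 200B tokens.
print(daily_cost_usd(200e9, 0.15, 0.60, input_fraction=1/3))
```

Under the same 1:2 split, a pricier model's blended rate scales the daily bill linearly, which is where the six-figure daily savings come from.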

OpenAI Has no Money, no Moat 

Regardless of these reports, OpenAI has now announced SearchGPT, a prototype with AI search features that delivers quick, timely answers with clear and relevant sources. However, Perplexity AI, the much-touted Google-search alternative, already offers similar capabilities. 

Moreover, Altman has also revealed that the voice feature on GPT-4o will be rolled out to Alpha users next week. 

However, Hume AI has already been leading the way with its Empathetic Voice Interface (EVI). Companies such as SenseTime and Kyutai are also developing voice-based products like SenseNova 5.0 and Moshi, respectively. 

At Google I/O 2024, Google introduced its AI agent Project Astra, which processes and integrates multiple modalities of data, including text, speech, images, and video. 

Apple has also upgraded Siri with ‘Apple Intelligence,’ enabling it to perform more actions for users.

From the perspective of LLMs, Meta released Llama 3.1, which performs even better than OpenAI’s GPT-4o and GPT-4o Mini in categories such as general knowledge, reasoning, reading comprehension, code generation, and multilingual capabilities. 


“Guys, fine-tuned Llama 3.1 8B is completely cracked. Just ran it through our fine-tuning test suite and it blows GPT-4o mini out of the water on every task,” posted a user on X. “There has never been an open model this small, this good.”

Within a day, Paris-based Mistral AI also released Mistral Large 2, which offers substantial improvements in code generation, mathematics, and multilingual support, even outperforming Llama 3.1 in code generation and maths abilities. 

With a 128k context window and support for dozens of languages, including French, German, Spanish, and Chinese, Mistral Large 2 aims to cater to diverse linguistic needs. It also supports 80+ coding languages, such as Python, Java, and C++.

This puts the race to build LLMs and SLMs back into perspective. It now seems that OpenAI, the pioneer of generative AI, is lagging behind when it comes to releases.

Not to forget, OpenAI has been postponing the launch of Sora, which is now expected later this year. Meanwhile, the video generation space has seen a surge of startups like Kling, RunwayML, and Luma AI. Recently, Kling AI, the Chinese company, announced the global launch of Kling AI International Version 1.0.

Simply put, OpenAI is certainly a jack of all trades, master of none.  

The post Is OpenAI ‘Sam’ of All Trades, Master of None?  appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-origins-evolution/is-openai-sam-of-all-trades-master-of-none/feed/ 0
Meta’s Llama 3.1 is Starting to Look a Lot Like OpenAI’s GPT-4o https://analyticsindiamag.com/ai-origins-evolution/metas-llama-3-1-is-starting-to-look-a-lot-like-openais-gpt-4o/ https://analyticsindiamag.com/ai-origins-evolution/metas-llama-3-1-is-starting-to-look-a-lot-like-openais-gpt-4o/#respond Tue, 23 Jul 2024 12:01:23 +0000 https://analyticsindiamag.com/?p=10130007

Rumours suggest that Meta has already begun training Llama 4, which is expected to be multimodal with audio features and integrated into the Meta Ray-Ban glasses.

The post Meta’s Llama 3.1 is Starting to Look a Lot Like OpenAI’s GPT-4o appeared first on AIM.

]]>

Mark Zuckerberg seems to be on the right track as Meta prepares to unveil its next iteration, Llama 3.1, expected to be released today. The new version will come in three sizes: 8B, 70B, and 405B, with a context length of 128K. 

However, even before Meta could officially release the model, its benchmark card was leaked and is doing the rounds on social media.

According to the leaked information, Llama 3.1 has been trained on over 15 trillion tokens sourced from publicly available datasets. Its fine-tuning data comprises publicly available instruction-tuning datasets, along with an additional 15 million synthetic samples. 

The models are explicitly advertised as multilingual, offering support for French, German, Hindi, Italian, Portuguese, Spanish, and Thai. 

According to benchmarks, Llama 3.1 outperforms OpenAI’s GPT-4o in categories such as general knowledge, reasoning, reading comprehension, code generation, and multilingual capabilities. “Open-source is about to be SOTA — even the 70B is > gpt-4o, and this is before instruct tuning, which should make it even better,” posted a user on X. 

Llama 3.1 405B achieves a macro average accuracy of 85.2% on the MMLU benchmark, whereas GPT-4o scores 87.5%. This indicates that while GPT-4o performs well, Llama 3.1 is highly competitive.

“The 70b is really encroaching on the 405b’s territory. I can’t imagine it being worthwhile to host the 405B. This feels like a confirmation that the only utility of big models right now is to distil from it,” posted another user. 

Llama 3.1 405B is expected to be highly effective in generating datasets for smaller models. One user on Reddit pointed out that this could be a major advancement for “distillation”, likening it to the relationship between GPT-4 and GPT-4o. 

They suggested using the 3.1 70B for “fast inference” and the Llama 3.1 405B for dataset creation and critical flows. “Who will use Llama-3.1-405B to create the best training datasets for smaller models?” asked Jiquan Ngiam, founder of Lutra AI. 

“Honestly might be more excited for 3.1 70b and 8b. Those look absolutely cracked, must be distillations of 405b,” posted another user on Reddit, who goes by the name thatrunningguy.

OpenAI co-founder Andrej Karpathy also explained that in the future, as larger models help refine and optimise the training process, smaller models will emerge. “The models have to first get larger before they can get smaller because we need their (automated) help to refactor and mould the training data into ideal, synthetic formats.”

Last week, we saw the release of several small models that can be run locally without relying on the cloud. Small language models, or SLMs, are expected to become the future alongside generalised models like GPT-4 or Claude 3.5 Sonnet.  

“For everyday use, an 8B or even a 70B LLM will suffice. If you don’t need to push a model to its limits, a SOTA model isn’t necessary for routine questions.” 

OpenAI has just caught its breath

OpenAI’s recent compact and cost-effective model, GPT-4o mini, has excelled on benchmarks, achieving 82% on MMLU, 87% on MGSM for maths reasoning, and 87.2% on HumanEval for coding tasks. However, Meta’s Llama 3.1 70B Instruct is closely competitive, matching these impressive scores.

“GPT-4o mini, launched just 4 days ago, is already processing over 200 billion tokens per day! I’m very happy to hear how much people are enjoying the new model,” posted OpenAI chief Sam Altman on X.

OpenAI’s ongoing concern has been the computational resources required, which delays the development of their next frontier model. Notably, GPT-4o’s voice capabilities have not yet been made available, and Sora remains unpublished for general use. 

Meanwhile, OpenAI has been holding talks with chip designers, including Broadcom, about developing a chip to reduce its dependence on NVIDIA. Notably, CEO Jensen Huang personally hand-delivered the first NVIDIA DGX H200 to OpenAI. 

OpenAI has recently begun training its next frontier model, most likely GPT-5, and the company anticipates that the resulting systems will bring us to the next level of capabilities on the path to AGI. 

At Microsoft Build, CTO Kevin Scott said that if the system that trained GPT-3 was a shark and GPT-4 an orca, the model being trained now is the size of a whale. “This whale-sized supercomputer is hard at work right now,” he added.

“We’re bringing in the latest H200s to Azure later this year and will be among the first cloud providers to offer NVIDIA’s Blackwell GPUs in B100 as well as GB200 configurations,” said Microsoft chief Satya Nadella.

On the other hand, earlier this year, Zuckerberg announced that they are building massive compute infrastructure to support their future roadmap, including 350,000 H100s by the end of this year, and a total of nearly 600,000 H100-equivalent compute units.

With Llama 3.1, Meta has made it clear that their focus spans the entire LLM market, regardless of size. Rumours suggest that Meta has already begun training Llama 4, which is expected to be multimodal with audio features and integrated into the Meta Ray-Ban glasses.

The post Meta’s Llama 3.1 is Starting to Look a Lot Like OpenAI’s GPT-4o appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-origins-evolution/metas-llama-3-1-is-starting-to-look-a-lot-like-openais-gpt-4o/feed/ 0
AI Still Needs Humans for the Cumbersome Task of Data Annotation https://analyticsindiamag.com/industry-insights/ai-still-needs-humans-for-the-cumbersome-task-of-data-annotation/ https://analyticsindiamag.com/industry-insights/ai-still-needs-humans-for-the-cumbersome-task-of-data-annotation/#respond Mon, 22 Jul 2024 09:20:01 +0000 https://analyticsindiamag.com/?p=10129795

Currently, foundational model-based platforms can incorporate feedback and corrections from domain experts to improve the accuracy of annotations.

The post AI Still Needs Humans for the Cumbersome Task of Data Annotation appeared first on AIM.

]]>

According to recent studies, OpenAI’s GPT-4 has been successful in accurately annotating cell types using marker gene information in single-cell RNA sequencing analysis.

Generative AI has expanded the use cases of data annotation and labelling. However, human expertise will always remain core to its success.

Acknowledging the same, Radha Basu, the co-founder and CEO of iMerit, told AIM in an exclusive interaction, “Traditional data annotation methods, which relied on low-skill crowdsourced workforces, were effective for simple tasks but were limited in scalability and efficiency for complex data sets.”

Now, the close alignment of GPT-4’s annotations with manual ones across various tissues and cell types has helped cut down the time and expertise needed for annotation.

“Firstly, such (generative AI) models inherently have a larger context window than traditional predictive AI. Secondly, the need for data training is more at the expert practitioner level since the underlying foundation model is built with unstructured or semi-structured data,” added Basu. 

However, there have been mixed opinions on the future of data annotation platforms with generative AI. 

As per recent reports, the global data annotation market is projected to be valued at around $8.22 billion by 2028. But Voxel51 co-founder and University of Michigan professor Jason Corso is of the opinion that data annotation jobs are slowly becoming obsolete.

Contrary to his expectations, however, technology is not eating up traditional data annotation jobs.

Generative AI’s larger context window and expert-level training capabilities differentiate it from traditional data annotation methods, making it more adaptable. 

Currently, foundational model-based platforms can incorporate feedback and corrections from domain experts, such as medical professionals, agronomists, or mathematicians, to refine and improve the accuracy of annotations.

“This expert collaboration ensures the data used to train AI models is of higher quality,” Basu highlighted. “Combining generative AI’s processing power with the knowledge of trained professionals leads to more reliable and accurate AI systems.”

Basu noted that LLM-based systems can handle more intricate data labelling tasks by integrating automation into workflows, improving efficiency and scalability without long hours of manual effort. Meanwhile, human annotation has been vital for AI, encouraging the growth of supervised machine learning.

So, although self-supervised learning and auto-annotation are emerging, they cannot fully replace human annotation.

How Does iMerit Fit in?

iMerit’s proprietary platform, Ango Hub, leverages generative AI to propel growth across industries such as medicine, autonomous mobility, and precision agriculture. Ango Hub offers a flexible workflow manager that integrates human and machine efforts, allowing public models to be included in the process. 

The platform utilises a curated network of experts for tasks like reinforcement learning with human feedback (RLHF) and supervised fine-tuning (SFT), providing high-quality, domain-specific data training. 

The ability of GenAI to handle and integrate various data types will further improve its role in data annotation, leading to more comprehensive and accurate AI models.

Apart from being an exceptional technologist, Basu is also known for fostering a good work culture. Her leadership strategy centres on authenticity, dedication, and innovation, focussing on giving one’s all and taking risks. 

“My leadership style is one of constant innovation while staying focused on my company’s role in the ecosystem and what our customers actually need from us. I am always looking at embracing change, mixing and matching the three pillars – technology, talent, and technique,” she added.

In the context of AI and ML innovations, the vision for iMerit is to build a responsible, inclusive organisation by prioritising a diverse, motivated workforce, applying technology to societal needs, and maintaining sustainable business practices with strong financial discipline. 

About 52% of the organisation comprises women. 

India as a Market

Over the past decade, the company has adapted to the shifting data requirements by evolving from simple, prescriptive tasks to more complex, domain-specific projects requiring a consultative approach and expert collaboration. 

This evolution included the adoption of heavy dashboards and production metrics, automation, and workflow orchestration. 

“India is a very exciting market due to the rapid growth in GenAI companies and the interest in building an India stack, including local languages and problems to be solved,” said Basu.

The revenue share from the Indian market has seen rapid growth during 2023-24.

Meanwhile, there has also been a noticeable increase in inquiries for such data corpus creation, domain tuning, and red-teaming in various local languages, leading to active collaborations with customers on Indian language-based stacks.

“We have to do justice to both [the customer and the workforce] in order to achieve quality and consistency,” Basu emphasised.

This has been recognised with iMerit and Ango Hub winning two awards in India this year alone for best-in-class machine learning applications and solutions.

Looking ahead, Basu believes that the integration of generative AI with multimodal data (combining image, speech, text, LiDAR, and video) is expected to change the visual domain in industries such as medical AI and autonomous mobility.

The post AI Still Needs Humans for the Cumbersome Task of Data Annotation appeared first on AIM.

]]>
https://analyticsindiamag.com/industry-insights/ai-still-needs-humans-for-the-cumbersome-task-of-data-annotation/feed/ 0
The OpenAI Mafia Just Got Bigger https://analyticsindiamag.com/ai-origins-evolution/the-openai-mafia-just-got-bigger/ https://analyticsindiamag.com/ai-origins-evolution/the-openai-mafia-just-got-bigger/#respond Mon, 22 Jul 2024 09:12:35 +0000 https://analyticsindiamag.com/?p=10129790

Close to 75 employees have left OpenAI and founded around 30 AI startups. 

The post The OpenAI Mafia Just Got Bigger appeared first on AIM.

]]>

The OpenAI mafia just got bigger. Last year, AIM set out to investigate how many former OpenAI employees had quit the company to launch their own ventures. The results are astonishing! Close to 75 employees have left OpenAI and founded around 30 AI startups.

Most recently, OpenAI co-founder Andrej Karpathy turned his decades-old passion for AI and education into a company called Eureka Labs. “It’s a new kind of school that is AI-native, combining generative AI with traditional learning methods,” said Karpathy. 

However, Karpathy’s departure from OpenAI wasn’t as dramatic as Ilya Sutskever’s. 

“Nothing ‘happened’,” said Karpathy, “and it’s not a result of any particular event, issue, or drama (but please keep the conspiracy theories coming as they are highly entertaining :)).”

Meanwhile, Sutskever left OpenAI, citing AGI safety concerns. Months later, he launched Safe Superintelligence. 

Along with Sutskever, Jan Leike, OpenAI’s head of alignment, also resigned and joined Anthropic, an AI company founded by former OpenAI employees. At Anthropic, Leike announced he would continue working on the “superalignment mission”, focusing on scalable oversight, weak-to-strong generalisation, and automated alignment research.

“I think the OpenAI mafia will reach #1 in a few years. Ex-employees of OpenAI in senior positions left to start their own cutting-edge projects like Anthropic and xAI, and they are wildly successful,” posted a user on X. “Prediction: the group at OpenAI will make the PayPal mafia look like peanuts,” posted another user.

To be honest, OpenAI is an intriguing place to work. Recently, Steven Heidel, an employee at OpenAI, posted, “The only real downside to working at OpenAI is all the cool stuff you get to see but aren’t allowed to talk about yet.” 

Super Dramatic Founders  

Another notable startup emerging from OpenAI is Perplexity AI. Interestingly, when Aravind Srinivas joined OpenAI as a research intern, Elon Musk left the company. “I got into OpenAI for an internship in 2018, when Musk was still there,” recalled Srinivas in a recent interview.

“I remember Musk left around the time I joined. I was idolizing this guy, and then he calls an all-hands meeting, announces that he’s no longer going to be involved, and swears at people left, right, and center before leaving the room. It was a lot of drama.”

Notably, last year Musk founded his own AI startup, xAI. One of xAI’s first major products is Grok, an AI chatbot integrated with X (formerly Twitter). Grok was unveiled in November 2023, with subsequent improvements leading to the Grok-1.5 model, which includes long-context capabilities and image understanding. 

One certainly cannot overlook Anthropic, founded by ex-OpenAI employees Dario Amodei and Daniela Amodei, which is now giving OpenAI stiff competition. The company recently released Claude 3.5 Sonnet, a model that rivals GPT-4o in performance.

Tim Salimans, a former research scientist and team lead at OpenAI, founded Aidence in 2015, a company specialising in generative modelling, semi-supervised and unsupervised deep learning, and reinforcement learning. Tim Shi, previously a member of technical staff at OpenAI, founded Cresta in 2017, which uses AI to help sales and service agents enhance their customer interactions.

In 2021, a former research intern at OpenAI, Anish Athalye, started Cleanlab, a data-centric AI platform that automatically finds and fixes errors in ML datasets. Another former research engineer at OpenAI, Taehoon Kim, recently founded Symbiote AI, a real-time, 3D avatar generation platform. 

Other prominent startups from the OpenAI alumni network include Pilot, Covariant, Adept, Living Carbon, Quill (now part of Twitter), and Daedalus, among others.

Will we see more? It’s highly likely that more employees might leave the company to start their own ventures. Last year, Altman bragged that OpenAI had a very lean team of about 375 employees. However, when he was fired, at least 745 of OpenAI’s approximately 750 staff signed a letter demanding his reinstatement, indicating that the company’s headcount is gradually increasing.

OpenAI had approximately 2,500 employees as of last month, and who knows how many more AI startups will spin out of the company. 

It is fascinating to see new tech entrepreneurs being born out of the OpenAI ecosystem, which is acting as a training ground for future AI leaders. “For a company like ours, the researchers and engineers that create the tech have far more impact than the CEO,” said Altman.

A quick look at its work culture shows that OpenAI encourages employees to take creative risks in pursuit of advancements in AI, with an emphasis on collaboration and open communication. 

The post The OpenAI Mafia Just Got Bigger appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-origins-evolution/the-openai-mafia-just-got-bigger/feed/ 0
The Need for Prover-Verifier Games for LLMs  https://analyticsindiamag.com/ai-breakthroughs/the-need-for-prover-verifier-games-for-llms/ https://analyticsindiamag.com/ai-breakthroughs/the-need-for-prover-verifier-games-for-llms/#respond Fri, 19 Jul 2024 04:47:32 +0000 https://analyticsindiamag.com/?p=10129573

The methodology will further remove reliance on human judgement for AI model legibility.

The post The Need for Prover-Verifier Games for LLMs  appeared first on AIM.

]]>

OpenAI has broken its silence with a new research paper on ‘Prover-Verifier Games’ (PVG) for LLMs. PVGs aim to improve the ‘legibility’ of LLM outputs, ensuring that LLMs produce understandable and logical text even for complex tasks such as solving maths problems or coding. 

In this method, OpenAI trained advanced language models to generate text that can be easily verified by weaker models. It was observed that this training improved the comprehensibility of the text for human evaluators, which hints at improving legibility. 

The ‘Prover’ and ‘Verifier’

“Techniques like this seem promising for training superhuman models to explain their actions in a way that humans can understand better (and get less fooled by). I’d be excited to see this method tried on harder tasks and with stronger models,” said Jan Leike, the co-author and former researcher at OpenAI, who had worked on the recent PVG paper. 

The paper is based on the first concept of PVG released in 2021, which is a game-theoretic framework designed to incentivise learning agents to resolve decision problems in a verified manner. 

Akin to a check system, a ‘prover’ generates a solution which a ‘verifier’ checks for accuracy. OpenAI’s method trains small verifiers to judge solution accuracy, encourages “helpful” provers to produce correct solutions approved by verifiers, and tests “sneaky” provers with incorrect solutions to challenge verifiers. 

It was noticed that over the course of training, the prover’s accuracy and the verifier’s robustness to adversarial attacks increased. Interestingly, the PVG system resembles a form of reinforcement learning, something OpenAI’s co-founder and former chief scientist Ilya Sutskever strongly advocated. 
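The incentive split described above can be sketched in a few lines. This is a simplified toy illustration of the helpful/sneaky reward structure, not OpenAI’s actual training code; the role names and scoring scheme are assumptions made for clarity:

```python
def verifier_accepts(convincingness: float, threshold: float) -> bool:
    """Toy verifier: accepts any solution whose convincingness score clears a threshold."""
    return convincingness > threshold

def prover_reward(role: str, is_correct: bool, accepted: bool) -> int:
    """Helpful provers score when a correct solution is accepted; sneaky provers
    score when an incorrect one slips past the verifier, which is what pressures
    the verifier to become robust."""
    if role == "helpful":
        return int(is_correct and accepted)
    if role == "sneaky":
        return int(not is_correct and accepted)
    raise ValueError(f"unknown role: {role}")
```

Under these rewards, the helpful prover is pushed towards solutions that are both correct and convincing, while the sneaky prover supplies the adversarial pressure that hardens the verifier.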


Looking back at the history of OpenAI’s models well before ChatGPT, the company had been working extensively on reinforcement learning systems. In 2018, OpenAI Five, built on five neural networks, defeated human teams at Dota 2. The system played 180 years’ worth of games against itself, with a sort of reward mechanism in the loop to train itself. 

“The neural network is going to take the observations and produce actions and then for a given setting of the parameters, you could figure out how to calculate how good they are. Then you could calculate how to compute the way to change these parameters to improve the model,” said Sutskever at an old Berkeley EECS seminar.

Interestingly, PVG also works along similar lines. However, it comes with limitations. The experiment was done on maths problems, whose answers can be checked as right or wrong. With topics that involve broad subjectivity, however, the PVG system may struggle. 

“It’s hard and expensive to codify the rules of life. How do we objectively determine whether one poem is more beautiful than another?

“I think a very interesting metric would be to measure the accuracy of the fine-tuned models on unrelated tasks to see if the lessons learned to be better at explaining maths problems would help the model perform better on explaining other problems (such as logic or reasoning),” said a user on HackerNews.

PVG for Superintelligence


The prover-verifier gaming system looks to improve the accuracy of LLM-generated results. Beyond that, it also charts a path towards superintelligence alignment. 

The methodology has a significant advantage in reducing the dependence on human demonstrations or judgments of legibility. This independence is particularly relevant for future superintelligence alignment. 

While the study focused on a single dataset and currently requires ground-truth labels, these methodologies are anticipated to prove pivotal in developing AI systems that not only ensure correct outputs but also facilitate transparent verification, thereby enhancing trust and safety in real-world applications. Whether the new method will become the next standard for LLM accuracy remains to be seen. 

The post The Need for Prover-Verifier Games for LLMs  appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-breakthroughs/the-need-for-prover-verifier-games-for-llms/feed/ 0
OpenAI Introduces GPT-4o Mini, 30x Cheaper than GPT-4o https://analyticsindiamag.com/ai-news-updates/openai-introduces-gpt-4o-mini-30x-cheaper-than-gpt-4o/ https://analyticsindiamag.com/ai-news-updates/openai-introduces-gpt-4o-mini-30x-cheaper-than-gpt-4o/#respond Thu, 18 Jul 2024 18:48:56 +0000 https://analyticsindiamag.com/?p=10129558

And smarter than Claude Haiku and Gemini Flash

The post OpenAI Introduces GPT-4o Mini, 30x Cheaper than GPT-4o appeared first on AIM.

]]>

LLMs are getting cheaper. OpenAI just released GPT-4o mini, a highly cost-efficient small model designed to expand AI applications by making intelligence more affordable. 

Priced at 15 cents per million input tokens and 60 cents per million output tokens, GPT-4o mini is 30x cheaper than GPT-4o and 60% cheaper than GPT-3.5 Turbo. 

OpenAI chief Sam Altman made a cost comparison, saying, “Way back in 2022, the best model in the world was text-davinci-003. It was much, much worse than this new model. It cost 100x more.”
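The “30x” claim can be sanity-checked against list prices. The GPT-4o rates below ($5 input / $15 output per million tokens) are the publicly listed mid-2024 prices; treat them as assumptions if they have since changed:

```python
# Prices in USD per million tokens
GPT_4O = {"input": 5.00, "output": 15.00}
GPT_4O_MINI = {"input": 0.15, "output": 0.60}

input_ratio = GPT_4O["input"] / GPT_4O_MINI["input"]     # roughly 33x cheaper on input
output_ratio = GPT_4O["output"] / GPT_4O_MINI["output"]  # roughly 25x cheaper on output
```

Depending on the input/output mix, the blended discount lands between roughly 25x and 33x, consistent with the headline figure.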

The model excels in various tasks, including text and vision, and supports a context window of 128K tokens with up to 16K output tokens per request. GPT-4o mini demonstrates superior performance on benchmarks, scoring 82% on the MMLU, 87% on MGSM for math reasoning, and 87.2% on HumanEval for coding tasks. It outperforms other small models like Gemini Flash and Claude Haiku in reasoning, math, and coding proficiency.

GPT-4o mini’s low cost and latency enable a wide range of applications, from customer support chatbots to API integrations. It currently supports text and vision, with future updates planned for text, image, video, and audio inputs and outputs.

Safety measures are integral to GPT-4o mini, incorporating techniques like reinforcement learning with human feedback (RLHF) and the instruction hierarchy method to improve model reliability and safety.

GPT-4o mini is now available in the Assistants API, Chat Completions API, and Batch API. It will be accessible to Free, Plus, and Team users in ChatGPT today, and to Enterprise users next week. Fine-tuning capabilities will be introduced soon.

GPT-4o mini comes after OpenAI co-founder Andrej Karpathy recently demonstrated how the cost of training large language models (LLMs) has significantly decreased over the past five years, making it feasible to train a model like GPT-2 for approximately $672 on “one 8XH100 GPU node in 24 hours”.

“Incredibly, the costs have come down dramatically over the past five years due to improvements in compute hardware (H100 GPUs), software (CUDA, cuBLAS, cuDNN, FlashAttention) and data quality (e.g., the FineWeb-Edu dataset),” said Karpathy. 
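Karpathy’s $672 figure is consistent with simple rental arithmetic. The ~$3.50 per H100-hour rate below is an assumed cloud rental price, not a quoted one:

```python
def training_cost_usd(num_gpus: int, usd_per_gpu_hour: float, hours: float) -> float:
    """Total rental cost of a training run billed per GPU-hour."""
    return num_gpus * usd_per_gpu_hour * hours

# One 8xH100 node for 24 hours at an assumed ~$3.50/GPU-hour
cost = training_cost_usd(8, 3.50, 24)  # 672.0
```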

That explains how Tech Mahindra was able to build Project Indus for well under $5 million, which again, is built on GPT-2 architecture, starting from the tokeniser to the decoder. 

It would be interesting to see what innovative applications developers will create using this new AI model. 

It looks like it’s already in motion. A few days ago, a mysterious model appeared on the Chatbot Arena. Unsurprisingly, that model is none other than GPT-4o mini. 

With over 6K user votes, the model reached GPT-4 Turbo performance. 

The post OpenAI Introduces GPT-4o Mini, 30x Cheaper than GPT-4o appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-news-updates/openai-introduces-gpt-4o-mini-30x-cheaper-than-gpt-4o/feed/ 0
OpenAI Introduces GPT-4o Mini, Phases Out GPT-3.5 https://analyticsindiamag.com/ai-news-updates/openai-introduces-gpt-4o-mini-phases-out-gpt-3-5/ https://analyticsindiamag.com/ai-news-updates/openai-introduces-gpt-4o-mini-phases-out-gpt-3-5/#respond Thu, 18 Jul 2024 15:50:32 +0000 https://analyticsindiamag.com/?p=10129546

GPT-4o mini will be available to free users and ChatGPT Plus and Team subscribers starting today, with enterprise customers gaining access next week.

The post OpenAI Introduces GPT-4o Mini, Phases Out GPT-3.5 appeared first on AIM.

]]>

OpenAI has announced the release of GPT-4o mini, a more affordable and streamlined version of its flagship AI model, GPT-4o, according to a report by CNBC. The new model aims to cater to a broader range of developers and businesses in the competitive AI services market.

GPT-4o mini will be available to free users and ChatGPT Plus and Team subscribers starting today, with enterprise customers gaining access next week. This model will replace GPT-3.5 Turbo in ChatGPT, providing updated functionality at a lower cost.

The announcement of the mini model is a key part of OpenAI’s initiative to lead in “multimodality”, integrating various forms of AI-generated media (text, images, audio, and video) within the ChatGPT platform.

OpenAI introduced GPT-4o in May, highlighting its ability to process audio and visual information in real time. While some of these features are still pending release due to safety concerns, GPT-4o mini offers comparable capabilities, with plans to expand its functionality over time.

AI companies, including Anthropic and Alphabet’s Google, often release smaller, less advanced versions of their models to provide developers with more options. Smaller models are suitable for high-volume, basic tasks, while larger models handle more complex work. Developers may choose to use both within a single application.

“In our mission to enable the bleeding edge, we want to continue developing frontier models while also offering the best small models,” said Olivier Godement, head of product for OpenAI’s API, in an interview with Bloomberg News. Over the past week, some developers have tested GPT-4o mini, with email startup Superhuman using it for automated replies and financial services startup Ramp using it to extract information from receipts.

Initially, GPT-4o mini will handle text and image inputs and outputs. OpenAI plans to expand its capabilities to process other types of content in the future.

GPT-4o mini is also the first model to incorporate a new safety approach called “instruction hierarchy”, which prioritises certain instructions over others. This tactic aims to prevent the AI from performing undesirable actions by giving precedence to directives from organisations.

The post OpenAI Introduces GPT-4o Mini, Phases Out GPT-3.5 appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-news-updates/openai-introduces-gpt-4o-mini-phases-out-gpt-3-5/feed/ 0
Chinese AI Companies Surpass American Rivals https://analyticsindiamag.com/ai-insights-analysis/chinese-ai-companies-surpass-american-rivals/ https://analyticsindiamag.com/ai-insights-analysis/chinese-ai-companies-surpass-american-rivals/#respond Wed, 17 Jul 2024 12:40:40 +0000 https://analyticsindiamag.com/?p=10129434

Forget text generation—China has leaped ahead in video generation models too

The post Chinese AI Companies Surpass American Rivals appeared first on AIM.

]]>

Chinese tech heavyweight SenseTime recently released SenseNova 5.5 at the World Artificial Intelligence Conference in Shanghai, claiming a 30% performance increase over its predecessor and superiority over GPT-4o in several criteria.

The company said that key enhancements include improved mathematical reasoning, English proficiency, and command following capabilities, putting it on par with GPT-4o in terms of interactivity and other core indicators.

In early May, SenseTime released a demo similar to OpenAI’s GPT-4o demo, showing off the model’s visual skills. SenseNova 5.5 can recognise and describe specific items in real time when a smartphone camera is pointed at them.

Similarly, Baidu USA CEO Robin Li claimed that Ernie 4.0 can produce better results than GPT-4o. Meanwhile, Alibaba’s Tongyi Qianwen (Qwen) models saw downloads shooting up to 20 million, tripling in only two months.

Forget text generation—China has leaped ahead in video generation models too. While the world awaits the official release of Sora, the internet is abuzz with Kling’s videos depicting a series of animals enjoying a meal of noodles.

In the meantime, OpenAI recently announced that as of July 9, it had blocked API access for users in unsupported countries and territories, including mainland China, Hong Kong, and Macau.

China’s Race To Dominance

Hugging Face co-founder and CEO Clement Delangue praised the progress made by Chinese AI firms in a post on X, saying, “Qwen 72B is the king, and Chinese open models are dominating overall.” This is borne out by Alibaba’s Qwen2-72B model claiming the top spot on Hugging Face’s Open LLM Leaderboard, outperforming all other open-source models. 

“China’s advantage is doing whatever it takes to catch up,” said Kai-Fu Lee, a Taiwanese computer scientist and founder of China-based AI startup 01.AI. Lee’s firm open-sourced Yi-34B, its foundational LLM that outperformed Llama 2 on various benchmarks.

He went on to explain how 14 months ago, they had nothing, including no GPT. “At that time, we were six or seven years behind, and at this moment, we are six to nine months behind. So the catch-up has already been happening. Going forward, we hope that it will continue,” Lee said. 

In May, China-based DeepSeek open-sourced its DeepSeek LLM, a 67-billion parameter model trained from scratch on a dataset consisting of 2 trillion tokens in both English and Chinese, hinting at its bid to go global. The model managed to outperform Llama 2, Claude 2, and Grok-1 in various metrics.

Government Support 

Chinese tech companies have also been receiving support from the government in its ongoing AI battle against America. The Chinese government has made AI a national priority, aiming to become the world leader in AI by 2030.  

The Cyberspace Administration of China (CAC) has issued approvals for over 40 LLMs in the past six months, granting operational licences to 1,432 AI-driven applications.

Meanwhile, according to a survey, over 80% of Chinese business leaders are currently using GenAI in their operations, well above the global average of 54% and the US average of 65%.

China also dominates the global race in GenAI patents, filing more than 38,000 patents from 2014 to 2023, according to a UN report. That’s six times more than the number filed by US-based inventors. Geographically, China leads with 38,210 inventions, far surpassing the US (6,276), South Korea (4,155), Japan (3,409) and India (1,350).

With the release of these models, the fear of China catching up with US open-source models, particularly in the field of generative AI, is not unfounded. This competition is driving significant advancements in AI technology.

Just a Copycat? 

But despite the nation’s mad rush to develop generative AI, Chinese businesses are almost wholly dependent on American underpinnings, such as open-sourced foundational research and technology developed by leading US companies and research institutions.

“As a measure of how far behind they are, leading Chinese firms are comparing their performance to ChatGPT,” said Paul Triolo, technology policy lead and senior VP for China, Dentons Global Advisors.

China’s businesses typically use “fine-tuned versions of Western models” because their own AI models “aren’t very good,” said Jenny Xiao, a partner at San Francisco-based venture capital firm Leonis Capital. She added that Silicon Valley is unquestionably far ahead of the curve.

For instance, some of the technology in 01.AI’s open-source model came from Llama. Former Google CEO Eric Schmidt said that while China intends to take the lead in several industries, the US is still far ahead in artificial intelligence.

The post Chinese AI Companies Surpass American Rivals appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-insights-analysis/chinese-ai-companies-surpass-american-rivals/feed/ 0
Chinese Company SenseTime Releases SenseNova 5.5, Beats OpenAI’s GPT-4o https://analyticsindiamag.com/ai-news-updates/chinese-company-sensetime-releases-sensenova-5-5-beats-openais-gpt-4o/ https://analyticsindiamag.com/ai-news-updates/chinese-company-sensetime-releases-sensenova-5-5-beats-openais-gpt-4o/#respond Tue, 16 Jul 2024 10:24:04 +0000 https://analyticsindiamag.com/?p=10129275

SenseTime has expanded its suite of applications with the release of Vimi, an AI avatar video generator capable of creating short video clips.

The post Chinese Company SenseTime Releases SenseNova 5.5, Beats OpenAI’s GPT-4o appeared first on AIM.

]]>

SenseTime, a Chinese AI company, has released its upgraded SenseNova 5.5 LLM at the 2024 World Artificial Intelligence Conference & High-Level Meeting on Global AI Governance (WAIC 2024). 

The new release includes China’s first real-time multimodal model, SenseNova 5o, which offers streaming interaction capabilities comparable to OpenAI’s GPT-4o.

The SenseNova 5.5 upgrade boasts a 30% improvement in overall performance compared to its predecessor, SenseNova 5.0, released just two months ago. Key enhancements include improved mathematical reasoning, English proficiency, and command following abilities, putting it on par with GPT-4o in terms of interactivity and core indicators.

Dr. Xu Li, Chairman of the Board and CEO of SenseTime, emphasised the significance of this release, saying, “This is a critical year for large models as they evolve from unimodal to multimodal. In line with users’ needs, SenseTime is also focused on boosting interactivity.”

The company has also introduced a cost-effective edge-side large model, reducing the cost per device to as low as RMB 9.90 per year. This move aims to facilitate widespread deployment across various IoT devices, including smartphones, tablets, and in-vehicle computers.

SenseTime has expanded its suite of applications with the release of Vimi, an AI avatar video generator capable of creating short video clips with precise control over facial expressions and upper body movements from a single photo. The company has also upgraded its SenseTime Raccoon Series, improving coding precision and response speed in the Code Raccoon tool.

To lower entry barriers for enterprise users, SenseTime launched the “Project $0 Go” scheme, offering a free onboarding bundle for new enterprise users migrating from the OpenAI platform.

The SenseNova Large Model has already been deployed at more than 3,000 government and corporate customers across various industries, including technology, healthcare, finance, and programming. 

SenseTime continues to develop AI applications for vertical industries such as finance, agriculture, cultural tourism, and healthcare, aiming to boost productivity and cost-efficiency in these sectors.

In related news, Kyutai, a French non-profit AI research laboratory, has introduced Moshi, a real-time native multimodal foundational AI model. This open-source project features a voice-enabled AI assistant offering capabilities that rival OpenAI’s GPT-4o and Google’s Astra.

Meanwhile, Anthropic’s latest model, Claude Sonnet 3.5, continues to challenge GPT-4o by recently dethroning it and securing the top spot in both the Coding Arena and Hard Prompts Arena.

The post Chinese Company SenseTime Releases SenseNova 5.5, Beats OpenAI’s GPT-4o appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-news-updates/chinese-company-sensetime-releases-sensenova-5-5-beats-openais-gpt-4o/feed/ 0
OpenAI Secretly Working on Project ‘Strawberry’ to Enhance Reasoning and Build Autonomous AI Agents https://analyticsindiamag.com/ai-news-updates/openai-secretly-working-on-project-strawberry-to-enhance-reasoning-and-build-autonomous-ai-agents/ https://analyticsindiamag.com/ai-news-updates/openai-secretly-working-on-project-strawberry-to-enhance-reasoning-and-build-autonomous-ai-agents/#respond Sat, 13 Jul 2024 06:13:35 +0000 https://analyticsindiamag.com/?p=10126753

Project Strawberry was previously known as Project Q*, which was leaked last year and was capable of solving previously unseen math problems.

The post OpenAI Secretly Working on Project ‘Strawberry’ to Enhance Reasoning and Build Autonomous AI Agents appeared first on AIM.

]]>

OpenAI, the creator of ChatGPT, is reportedly working on a new AI technology under the code name “Strawberry.” This project aims to significantly enhance the reasoning capabilities of its AI models, as revealed by internal documents and a source familiar with the development.

The project’s specifics, which have not been previously disclosed, involve a novel approach that allows AI models to plan ahead and navigate the internet autonomously to perform in-depth research. This advancement could address current limitations in AI reasoning, such as common sense problems and logical fallacies, which often lead to inaccurate outputs.

Project Strawberry was previously known as Project Q*, which was leaked last year and was capable of solving previously unseen math problems.

OpenAI’s teams are working on Strawberry to improve the models’ ability to perform long-horizon tasks (LHT), which require planning and executing a series of actions over an extended period. 

The project involves a specialised “post-training” phase, adapting the base models for enhanced performance. This method resembles Stanford’s 2022 “Self-Taught Reasoner” (STaR), which enables AI to iteratively create its own training data to reach higher intelligence levels.
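The STaR idea — generate rationales, keep only those that reach the known correct answer, and fine-tune on them — can be sketched in a few lines. This is a hedged illustration of the published method's control flow only; `generate_rationale` and `finetune` are hypothetical stand-ins for a real model API:

```python
# Sketch of one Self-Taught Reasoner (STaR) round: the model generates its own
# rationales, and only rationales that lead to the correct answer are kept as
# new training data. The stub functions below are illustrative, not real APIs.
def star_iteration(model, problems, generate_rationale, finetune):
    """Filter self-generated rationales by answer correctness, then fine-tune."""
    training_data = []
    for question, answer in problems:
        rationale, predicted = generate_rationale(model, question)
        if predicted == answer:  # keep only rationales that reach the right answer
            training_data.append((question, rationale, answer))
    return finetune(model, training_data), training_data

# Toy demo with stubs: the "model" gets the first problem right, the second wrong.
def gen(model, q):
    return f"think({q})", sum(q)       # pretend rationale + predicted answer

def ft(model, data):
    return model + len(data)           # pretend fine-tuning improves the model

model, kept = star_iteration(0, [((1, 2), 3), ((2, 2), 5)], gen, ft)
print(model, len(kept))  # -> 1 1  (one correct example kept and trained on)
```

Iterating this loop lets a model bootstrap higher-quality reasoning from its own filtered outputs, which is the resemblance the article draws to Strawberry's post-training phase.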

A spokesperson from OpenAI acknowledged ongoing research into new AI capabilities but did not directly address the specifics of Strawberry. The internal document indicates that Strawberry includes a “deep-research” dataset to train and evaluate the models, although the contents of this dataset remain undisclosed.

In recent months, OpenAI has privately hinted at releasing technology with advanced reasoning capabilities, aiming to overcome challenges in AI research and development. 

This innovation is expected to enable AI to conduct research autonomously, using a “computer-using agent” (CUA) to take actions based on its findings. Additionally, OpenAI plans to test Strawberry’s capabilities in performing tasks typically done by software and machine learning engineers.

OpenAI has recently unveiled a five-level classification system to track progress towards achieving artificial general intelligence (AGI) and superintelligent AI. OpenAI executives shared this classification system with employees during an internal meeting and plan to share it with investors and external parties. The company currently considers itself at Level 1 and anticipates reaching Level 2 in the near future.

Other tech giants like Google, Meta, and Microsoft are also exploring techniques to enhance AI reasoning. However, experts like Meta’s Yann LeCun argue that large language models may not yet be capable of human-like reasoning.

OpenAI’s CEO, Sam Altman, emphasised earlier this year that reasoning ability is crucial for AI progress. The Strawberry project could mark a significant step towards AI models achieving human or super-human-level intelligence, potentially revolutionising how AI assists in scientific discoveries and software development.

The post OpenAI Secretly Working on Project ‘Strawberry’ to Enhance Reasoning and Build Autonomous AI Agents appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-news-updates/openai-secretly-working-on-project-strawberry-to-enhance-reasoning-and-build-autonomous-ai-agents/feed/ 0
OpenAI Clocks $3.4 Bn in Revenue from ChatGPT Subscriptions https://analyticsindiamag.com/ai-news-updates/openai-clocks-3-4-bn-in-revenue-from-chatgpt-subscriptions/ https://analyticsindiamag.com/ai-news-updates/openai-clocks-3-4-bn-in-revenue-from-chatgpt-subscriptions/#respond Fri, 12 Jul 2024 03:52:55 +0000 https://analyticsindiamag.com/?p=10126601

ChatGPT Plus, the premium version of OpenAI's popular chatbot, emerged as the primary revenue driver, contributing $1.9 billion.

The post OpenAI Clocks $3.4 Bn in Revenue from ChatGPT Subscriptions appeared first on AIM.

]]>

OpenAI, the company behind ChatGPT, has generated an impressive $3.4 billion in revenue, according to a recent report by FutureSearch. The company’s financial success is largely attributed to its various ChatGPT offerings, with ChatGPT Plus leading the pack.

ChatGPT Plus, the premium version of OpenAI’s popular chatbot, emerged as the primary revenue driver, contributing $1.9 billion. The service boasts 7.7 million subscribers paying $20 per month. Following closely is ChatGPT Enterprise, which brought in $714 million from 1.2 million subscribers at $50 per month.

The company’s API services and ChatGPT Team also made significant contributions to the bottom line. The API generated $510 million, while ChatGPT Team added $290 million from 80,000 subscribers paying $25 monthly.

Microsoft takes a cut from some of OpenAI’s AI model sales since they run on Microsoft’s cloud. Additionally, OpenAI receives a share from Microsoft’s sales of OpenAI models to Microsoft’s Azure cloud customers. This share now amounts to about $200 million on an annualised basis, or roughly 20% of the revenue Microsoft generates from that business, according to Altman.

Investor and serial entrepreneur Kai-Fu Lee is bullish about OpenAI becoming a trillion-dollar company in the next two to three years. “OpenAI will likely be a trillion-dollar company in the not-too-distant future,” said Lee at a recent event with Fortune. 

Despite the impressive revenue figures, OpenAI faces challenges. The company is reportedly operating at a loss, with CEO Altman acknowledging that OpenAI is likely to be “the most capital-intensive startup in Silicon Valley history”. The high operational costs are primarily due to the expensive nature of running AI models like ChatGPT, estimated at around $700,000 per day. 

The post OpenAI Clocks $3.4 Bn in Revenue from ChatGPT Subscriptions appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-news-updates/openai-clocks-3-4-bn-in-revenue-from-chatgpt-subscriptions/feed/ 0
OpenAI CTO Mira Murati is an Absolute PR Disaster https://analyticsindiamag.com/ai-breakthroughs/openai-cto-mira-murati-is-an-absolute-pr-disaster/ https://analyticsindiamag.com/ai-breakthroughs/openai-cto-mira-murati-is-an-absolute-pr-disaster/#respond Thu, 11 Jul 2024 09:14:32 +0000 https://analyticsindiamag.com/?p=10126522

OpenAI has a history of bad PR but knows how to turn a crisis into an opportunity. 

The post OpenAI CTO Mira Murati is an Absolute PR Disaster appeared first on AIM.

]]>

During a recent podcast at Johns Hopkins University, Mira Murati, the chief technology officer of OpenAI, acknowledged the criticism that ChatGPT has received for being overly liberal and emphasised that this bias was unintentional. 

“We’ve been very focused on reducing political bias in the model behaviour. ChatGPT was criticised for being overly liberal, and we’re working really hard to reduce those biases,” said Murati. 

However, no specific details of these redressal efforts have been shared yet. This is all part of OpenAI’s ongoing work to make the model more balanced and fair.

However, in an interview back in March, Murati was asked where the video data used to train Sora came from. The CTO feigned ignorance, claiming not to know the answer, and became the talk of the town on social media.

Netizens were quick to create memes highlighting her as “an absolute PR disaster”.

OpenAI Needs No Safety Lessons

OpenAI has a history of bad PR, but it knows how to turn a crisis into an opportunity. In an earlier discussion at Dartmouth, Murati focused on safety, usability, and reducing biases to democratise creativity and free up humans for higher-level tasks.

In a recent post on X, she said that to make sure these technologies are developed and used in a way that does the most good and the least harm, they work closely with red-teaming experts from the early stages of research.

“You have to build them alongside the technology and actually in a deeply embedded way to get it right. And for capabilities and safety, they’re actually not separate domains. They go hand in hand,” she added.

Notably, her optimism on AI stems from the belief that developing smarter and more secure systems will lead to safer and more beneficial outcomes for the future. However, she is now facing questions about ChatGPT’s perceived liberal bias.

Meanwhile, OpenAI’s former chief scientist Ilya Sutskever launched Safe Superintelligence shortly after leaving the company in May 2024, allegedly due to disagreements with CEO Sam Altman over AGI safety and advancement.

In an apparent response to this and to ward off safety concerns, OpenAI formed a Safety and Security Committee led by directors Bret Taylor, Adam D’Angelo, Nicole Seligman, and Altman.

Murati to the Rescue 

In a July 2023 discussion with Microsoft CTO Kevin Scott, Murati expressed concerns about the prevailing uncertainty in the AI field, emphasising the need for clear guidance and decision-making processes. 

She highlighted the challenge of determining which aspects of AI to prioritise, develop, release, and position effectively. “When we began building GPT more than five years ago, our primary focus was the safety of AI systems,” said Murati.

Highlighting the risks of letting humans directly set goals for AI systems—due to the potential for complex, opaque processes to cause serious errors or unintended consequences—Murati and her team shifted their focus to using RLHF to ensure AI’s safe and effective development.
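At the heart of RLHF's reward-modelling step is a simple pairwise-preference loss: the reward model learns to score the response humans preferred above the one they rejected. The sketch below is a minimal, generic illustration of that loss, not OpenAI's code:

```python
# Minimal sketch of the Bradley-Terry style pairwise loss used to train
# RLHF reward models: loss = -log(sigmoid(score_chosen - score_rejected)).
# Purely illustrative; real training operates over batches of model outputs.
import math

def preference_loss(score_chosen, score_rejected):
    """Penalise the reward model when it fails to rank the human-preferred
    response above the rejected one."""
    margin = score_chosen - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Loss shrinks as the preferred answer is ranked higher, and grows when the
# ranking is wrong:
print(preference_loss(2.0, 0.0))  # confident correct ranking -> small loss
print(preference_loss(0.0, 2.0))  # wrong ranking -> large loss
```

Minimising this loss over many human comparisons yields a reward signal that can then steer the policy model, which is the mechanism the article refers to when it says OpenAI shifted its focus to RLHF.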

Shortly after GPT-3 was developed and released through the API, OpenAI was able to integrate AI safety into real-world systems for the first time.

An Accidental PR 

Murati’s acknowledgement of ChatGPT’s perceived liberal bias and her emphasis that this bias was unintentional represent a significant and positive step towards the responsible use of AI. 

Her addressing criticisms openly demonstrates a commitment to transparency and accountability, which are crucial for the ethical development of technology. 

Murati’s approach not only seeks to rectify past concerns but also underscores a proactive stance on refining AI systems to better serve diverse user needs. This openness fosters trust and shows that OpenAI is dedicated to addressing issues constructively. 

Murati’s tryst with responsible AI is not new-found. In a 2021 interview, she discussed AI’s potential for harm, emphasising that unmanaged technology could lead to serious ethical and safety concerns. Some critics argued that Murati’s comments were too alarmist or did not fully acknowledge the positive potential of AI. 

While Murati aimed to promote responsible AI, the backlash led to broader debates on the technology’s future and its societal impacts.

Not to forget the ‘OpenAI is nothing without its people’ campaign started by Murati during Sam Altman’s ousting. One thing is for sure: Murati is truly mysterious, and no one knows what she’s going to say next to the media. We are not complaining! 

The post OpenAI CTO Mira Murati is an Absolute PR Disaster appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-breakthroughs/openai-cto-mira-murati-is-an-absolute-pr-disaster/feed/ 0
OpenAI Partners with Lab that Built the Atomic Bomb for AI Bioscience Research https://analyticsindiamag.com/ai-news-updates/openai-partners-with-lab-that-built-the-atomic-bomb-for-ai-bioscience-research/ https://analyticsindiamag.com/ai-news-updates/openai-partners-with-lab-that-built-the-atomic-bomb-for-ai-bioscience-research/#respond Thu, 11 Jul 2024 04:30:20 +0000 https://analyticsindiamag.com/?p=10126475

The joint evaluation study will assess how models like GPT-4o can assist with tasks in a physical laboratory setting through multimodal capabilities such as vision and voice. 

The post OpenAI Partners with Lab that Built the Atomic Bomb for AI Bioscience Research appeared first on AIM.

]]>

OpenAI and Los Alamos National Laboratory (LANL) have announced a partnership to develop evaluations for the safe use of multimodal AI models in laboratory settings, aiming to advance bioscientific research. 

This collaboration continues the U.S. tradition of public-private partnerships to drive innovation in critical areas like healthcare and bioscience. The partnership responds to the recent White House Executive Order on AI safety, tasking national laboratories to evaluate frontier AI models’ capabilities, including biological applications. 

OpenAI’s technology, already utilised by companies like Moderna and Color Health, aims to enhance the speed and impact of scientific research.

“We’re thrilled to announce a first-of-its-kind partnership with Los Alamos National Laboratory to study bioscience capabilities,” said Mira Murati, OpenAI’s Chief Technology Officer. “This partnership marks a natural progression in our mission, advancing scientific research while also understanding and mitigating risks.”

The joint evaluation study will assess how models like GPT-4o can assist with tasks in a physical laboratory setting through multimodal capabilities such as vision and voice. This includes biological safety evaluations for GPT-4o and its real-time voice systems, aiming to support bioscience research. 

“The potential upside to growing AI capabilities is endless,” said Erick LeBrun, research scientist at Los Alamos. “However, measuring and understanding any potential dangers or misuse of advanced AI related to biological threats remain largely unexplored.”

The study will build upon OpenAI’s existing work on biothreat risks and follow their Preparedness Framework, consistent with their commitments to Frontier AI Safety from the 2024 AI Seoul Summit.

The upcoming evaluation will test multimodal frontier models in a lab setting by assessing the performance of both experts and novices on standard laboratory tasks. Tasks will include genetic transformation, cell culture, and cell separation. The goal is to quantify how GPT-4o can enhance task completion and accuracy, potentially upskilling both professionals and novices in biological tasks.

LANL was established in 1943 as part of the Manhattan Project during World War II. Initially known as Project Y, it was a top-secret site dedicated to designing and developing the first atomic bomb. The laboratory is located about 35 miles northwest of Santa Fe, New Mexico.

The post OpenAI Partners with Lab that Built the Atomic Bomb for AI Bioscience Research appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-news-updates/openai-partners-with-lab-that-built-the-atomic-bomb-for-ai-bioscience-research/feed/ 0
‘Odyssey’ AI Built for Hollywood, Sora Can Wait https://analyticsindiamag.com/ai-news-updates/odyssey-ai-built-for-hollywood-sora-can-wait/ https://analyticsindiamag.com/ai-news-updates/odyssey-ai-built-for-hollywood-sora-can-wait/#respond Tue, 09 Jul 2024 13:05:20 +0000 https://analyticsindiamag.com/?p=10126293 Odyssey text-to-video

The new text-to-video platform looks to compete with OpenAI’s Sora and other similar options.

The post ‘Odyssey’ AI Built for Hollywood, Sora Can Wait appeared first on AIM.

]]>
Odyssey text-to-video

Odyssey, dubbed a Hollywood-grade AI system, was recently unveiled by co-founder and CEO Oliver Cameron. The model aims to give people visual control so they can tell a story exactly the way they have imagined it. 

The co-founder believes their text-to-video platform will result in higher-quality movies, shows and video games. 

Multiple AI Models

Compared to OpenAI’s Sora or Google’s Veo, Odyssey hands the reins of control to users, who can direct the visuals to their needs, thereby offering better customisation options. In fact, there are already a number of AI video generation platforms on the market. 

Odyssey achieves this by delving deeper than traditional text-to-visual models. Instead of using a single model that limits you to one input and one unchangeable output, Odyssey uses four generative models. These give precise control over each main part of visual storytelling: creating detailed shapes, realistic materials, customisable lighting, and adjustable motion. Together, they let users quickly create scenes and shots exactly as envisioned. 

Furthermore, the company is developing workflows for expert users that integrate seamlessly into the production methods already used in Hollywood, gaming, and elsewhere. It supports established workflows in top-tier production and lets users edit and export in multiple formats, including 3D file formats. 

Super Team

The team behind Odyssey comprises Hollywood artists and AI researchers from companies including Cruise, Wayve, Waymo, Tesla, and Meta. The artists have worked on major productions such as Dune, Godzilla, and Avengers. In addition to Cameron, Jeff Hawke serves as co-founder and CTO of Odyssey. 

The company has raised $9M from investors including Y Combinator, Google Ventures, and others. 

The post ‘Odyssey’ AI Built for Hollywood, Sora Can Wait appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-news-updates/odyssey-ai-built-for-hollywood-sora-can-wait/feed/ 0
OpenAI and Thrive Global Launch Thrive AI Health to Tackle Chronic Diseases with GenAI https://analyticsindiamag.com/ai-news-updates/openai-and-thrive-global-launch-thrive-ai-health-to-tackle-chronic-diseases-with-genai/ https://analyticsindiamag.com/ai-news-updates/openai-and-thrive-global-launch-thrive-ai-health-to-tackle-chronic-diseases-with-genai/#respond Tue, 09 Jul 2024 08:46:15 +0000 https://analyticsindiamag.com/?p=10126264

The company will be creating an AI health coach to provide expert-level health coaching to improve health outcomes and address health inequities, particularly in chronic disease management and improve lifespan.

The post OpenAI and Thrive Global Launch Thrive AI Health to Tackle Chronic Diseases with GenAI appeared first on AIM.

]]>

OpenAI is once again venturing into AI-driven healthcare, this time with New York-based behaviour change technology company Thrive Global.

The OpenAI Startup Fund and Arianna Huffington’s Thrive Global have announced the formation of Thrive AI Health, an entirely new company dedicated to creating an AI health coach to provide expert-level health coaching to improve health outcomes and address health inequities, particularly in chronic disease management.

“Using AI in this way would also scale and democratise the life-saving benefits of improving daily habits and address growing health inequities,” said Sam Altman and Huffington, stating that while those with more resources already access trainers and life coaches, a hyper-personalised AI health coach can make healthy behaviour changes accessible to everyone, especially those disproportionately affected by chronic diseases like diabetes and cardiovascular disease.

The new company will be led by former Google product leader DeCarlos Love, with backing from the Alice L. Walton Foundation. Love brings extensive experience from Google and Apple, as well as his work on childhood obesity programs. 

“This product will solve the limitations of current AI and LLM-based solutions by providing personalised, proactive, and data-driven coaching across the five daily behaviours. This is how it will improve health outcomes, reduce healthcare costs and significantly impact chronic diseases worldwide,” said Love. 

Thrive AI Health’s mission is to use generative AI to offer hyper-personalised health coaching focusing on five key daily behaviours: sleep, nutrition, fitness, stress management, and social connections. These behaviours impact health outcomes more than medical care or genetics. By promoting healthier habits in these areas, the AI coach aims to enhance both health spans and lifespans.

The AI Health Coach will combine peer-reviewed science, biometric data, and user preferences to offer a transformative health experience. It will be powered by a unified health data platform with strong privacy and security measures. 

The platform will utilise Thrive Global’s behaviour change methodology, Microsteps, and benefit from the latest AI advancements, including enhanced memory capabilities and a custom behavioural coaching model.

The initiative has established research partnerships with Stanford Medicine, the Alice L. Walton School of Medicine, and the Rockefeller Neuroscience Institute. These partnerships aim to integrate the AI Health Coach into their communities and explore its potential to improve health outcomes.

Salesforce CEO Marc Benioff also took to X to appreciate the move. 

However, this is not the first time OpenAI has worked in healthcare. It previously partnered with pharmaceutical and biotechnology company Moderna to develop mRNA medicines.

Moderna is developing a pilot program called Dose ID with ChatGPT Enterprise. This tool reviews and analyses clinical data, integrates large datasets, and visualises them, helping clinical study teams improve data analysis and decision-making. OpenAI has also partnered with other health tech platforms, such as WHOOP and Summer Health, to accelerate its healthcare ambitions.

The post OpenAI and Thrive Global Launch Thrive AI Health to Tackle Chronic Diseases with GenAI appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-news-updates/openai-and-thrive-global-launch-thrive-ai-health-to-tackle-chronic-diseases-with-genai/feed/ 0
Amazon Announces ‘Trusted AI Challenge’ for LLM Coding Security https://analyticsindiamag.com/ai-news-updates/amazon-announces-trusted-ai-challenge-for-llm-coding-security/ https://analyticsindiamag.com/ai-news-updates/amazon-announces-trusted-ai-challenge-for-llm-coding-security/#respond Mon, 08 Jul 2024 14:02:06 +0000 https://analyticsindiamag.com/?p=10126222 Amazon AI Challenge

Amazon’s AI challenge mirrors OpenAI’s method of building responsible AI.

The post Amazon Announces ‘Trusted AI Challenge’ for LLM Coding Security appeared first on AIM.

]]>
Amazon AI Challenge

Amazon has announced a global university competition focused on responsible AI for LLM coding security. The “Amazon Trusted AI Challenge” is offering $250,000 in sponsorship and monthly AWS credits to each of the 10 teams selected for the competition, which begins in November 2024. The winning team stands to take home $700,000 in cash prizes. 

The students will participate in a tournament-style competition in which they will act either as model developers or as red teams, working to improve the AI user experience, prevent misuse, and help users create safer code. 

Model developers will focus on adding security features to AI models that generate code, while testers will create automated methods to test these models. Each round of the competition will also involve multiple interactions, allowing teams to improve their models and techniques by identifying strengths and weaknesses.
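As a flavour of what automated testing of code-generating models can look like, the sketch below flags known-insecure patterns in generated code. This is purely an illustrative assumption about one simple technique a red team might use; real tournament tooling would be far more sophisticated:

```python
# Illustrative sketch: a trivial automated check that scans model-generated
# code for known-insecure patterns. Patterns and messages are examples only.
import re

INSECURE_PATTERNS = {
    r"\beval\(": "use of eval() on dynamic input",
    r"subprocess\..*shell=True": "shell=True enables command injection",
    r"pickle\.loads\(": "unpickling untrusted data",
}

def scan_generated_code(code: str) -> list[str]:
    """Return a finding for each insecure pattern present in the code."""
    return [msg for pat, msg in INSECURE_PATTERNS.items() if re.search(pat, code)]

sample = "import pickle\nobj = pickle.loads(payload)\n"
print(scan_generated_code(sample))  # -> ['unpickling untrusted data']
```

In the competition's framing, model developers would harden models so such findings become rare, while testers automate ever-better versions of checks like this one.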

Improving AI Through Red Teaming

“We are focusing on advancing the capabilities of coding LLMs, exploring new techniques to automatically identify possible vulnerabilities and effectively secure these models,” said Rohit Prasad, senior vice president and head scientist, Amazon AGI.

“The goal of the Amazon Trusted AI Challenge is to see how students’ innovations can help forge a future where generative AI is consistently developed in a way that maintains trust, while highlighting effective methods for safeguarding LLMs against misuse to enhance their security,” said Prasad.  

Amazon’s AI challenge is a promising way to build more robust and secure coding systems by collaborating with some of the brightest young minds in the industry. Similar methods have been adopted by other companies, including OpenAI, which runs cybersecurity and bounty challenges. Its last competition invited people to help frame ways to deploy responsible AI models.

The post Amazon Announces ‘Trusted AI Challenge’ for LLM Coding Security appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-news-updates/amazon-announces-trusted-ai-challenge-for-llm-coding-security/feed/ 0
‘Fears of an AI takeover are unfounded due to data limits and finite growth,’ says Aidan Gomez https://analyticsindiamag.com/ai-news-updates/fears-of-an-ai-takeover-are-unfounded-due-to-data-limits-and-finite-growth-says-aidan-gomez/ https://analyticsindiamag.com/ai-news-updates/fears-of-an-ai-takeover-are-unfounded-due-to-data-limits-and-finite-growth-says-aidan-gomez/#respond Mon, 08 Jul 2024 07:38:00 +0000 https://analyticsindiamag.com/?p=10126168

The intelligence of these models is limited by the humans who create them.

The post ‘Fears of an AI takeover are unfounded due to data limits and finite growth,’ says Aidan Gomez appeared first on AIM.

]]>

In a recent interview, Aidan Gomez, CEO and co-founder of Cohere, stated that fears of an AI takeover are unfounded, given AI’s reliance on human training data and the limits of exponential growth.

He explained, “I think I’m empathetic to the fears, you know, the sci-fi narrative of computers or AI taking over and destroying the world. It’s been going on for decades, and so it’s really deeply embedded within our culture. It gets lots of clicks, headlines, it gets attention. It’s shocking. I understand why people are scared of it and why some say it to get attention.” 

Gomez highlighted that such fears do not reflect the technical reality of the technology: scaling is not continuously exponential, and there are friction points and complexities. He also mentioned that the intelligence of these models is limited by the humans who create them, as it is our data and knowledge that teach them.

However, he believes real risks lie in deploying AI in high-stakes scenarios like medicine and advocates for scrutiny and tough discussions on AI deployment, rather than sensational sci-fi narratives.

What is Cohere Up To

In the interview, he also discussed how the original goal of the project was to improve Google Translate, a very well-known problem. He noted that it has been extraordinary to see the broad impact of a technology developed to enhance translation. Gomez was also a co-author of the original Transformer paper, which forms the crux of today’s generative AI products. 

Recently, AIM spoke to Saurabh Baji, SVP of Engineering at Cohere, about the mixed emotions in Silicon Valley regarding achieving AGI, as seen in the recent banter between Meta’s Yann LeCun and xAI’s Elon Musk.

“We remain concentrated on designing AI solutions that deliver better workforce and customer experiences for businesses today rather than pursuing abstract concepts like AGI,” said Baji.

The post ‘Fears of an AI takeover are unfounded due to data limits and finite growth,’ says Aidan Gomez appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-news-updates/fears-of-an-ai-takeover-are-unfounded-due-to-data-limits-and-finite-growth-says-aidan-gomez/feed/ 0
What’s the California AI Bill, and Why Does Meta’s Yann LeCun Think it Sucks? https://analyticsindiamag.com/ai-insights-analysis/whats-the-california-ai-bill-and-why-does-metas-yann-lecun-think-it-sucks/ https://analyticsindiamag.com/ai-insights-analysis/whats-the-california-ai-bill-and-why-does-metas-yann-lecun-think-it-sucks/#respond Sat, 06 Jul 2024 04:36:35 +0000 https://analyticsindiamag.com/?p=10126036

Mentioning worst-case scenarios like nuclear war or building chemical weapons only serves to stoke a pervasive fear of AI that is already common among the general public.

The post What’s the California AI Bill, and Why Does Meta’s Yann LeCun Think it Sucks? appeared first on AIM.

]]>

A staunch proponent of the open development and research of AI, Meta’s chief AI scientist, Yann LeCun, recently posted a call to action to oppose SB1047, colloquially known as the California AI Bill.

Setting aside the actual content of the proposed Safe and Secure Innovation for Frontier Artificial Intelligence Models Bill, California is home to a majority of the AI and AI-adjacent companies that operate globally.

This means that a comprehensive bill governing AI within the state will affect not only companies within the US state but also those operating globally.

Overkill Much?

There’s plenty of reason not to like the actual Bill. But the main point of contention that had LeCun concerned was the regulation of research and development within the ecosystem.

“SB1047 is a California bill that attempts to regulate AI research and development, creating obstacles to the dissemination of open research in AI and open source AI platforms,” he said.

However, the Bill also attempts to predict where AI could go, thereby imposing strict and near-unattainable compliance requirements on companies.

It uses the potential for AI to “create novel threats to public safety and security, including by enabling the creation and the proliferation of weapons of mass destruction, such as biological, chemical, and nuclear weapons, as well as weapons with cyber-offensive capabilities” as justification for overarching measures that, in the end, are unlikely to be implemented.

For example, the Bill essentially prohibits building a model that can enable critical harm, given certain provisions. However, as AIM has covered before, virtually any model can be jailbroken to produce instructions on how to build nuclear weapons.

The Bill is filled with similar guidelines that are either near impossible to adhere to or mere generalisations, driven by a desire to enforce safety protocols but a lack of actual knowledge of how these systems work.

Meta’s vice president and deputy chief privacy officer, Rob Sherman, said it perfectly in a letter sent to the lawmakers, “The bill will make the AI ecosystem less safe, jeopardise open-source models relied on by startups and small businesses, rely on standards that do not exist, and introduce regulatory fragmentation.”

Stick to What You Know

The general consensus is that it’s basically impossible to implement future-proof regulations for AI. 

Mentioning worst-case scenarios like nuclear war or building chemical weapons only serves to stoke a pervasive fear of AI that is already common among the general public. There have been several AI leaders, as well as government officials, who have stated that over-regulation of AI is something that they’re hoping to avoid.

However, regulations like these broadly generalise what AI is, with little input from those working within the tech space who are familiar with ongoing developments in the industry.

While there are several concerns about the development and usage of AI, these never seem to get addressed in regulations like this one or the EU’s Artificial Intelligence Act (AIA). Instead, regulators focus on trying to future-proof AI usage, thereby making generalisations, and fail to address problems that are already prevalent within communities and the industry.

“The sad thing is that the regulation of AI R&D is predicated on the illusion of “existential risks” pushed by a handful of delusional think-tanks, and dismissed as nonsense (or at least widely premature) by the vast majority of researchers and engineers in academia, startups, larger companies, and investment firms,” LeCun said.

There are many gaps in regulation that AI companies and startups actively take advantage of, though admittedly this must be done carefully so as not to cross ethical boundaries. However, with no proper regulation in place, companies are not bound by any kind of legal obligation.

Many big players like OpenAI, Meta, Google and Microsoft have been in staunch favour of regulations but have asked that preliminary conversations be held with stakeholders, which, for anything regulation-related, makes sense.

However, it seems that the California AI Bill is just another in a long line of examples where governments push regulations as a reactionary measure rather than one with thought and rationale behind it. This is evidenced in the open letter written to the legislators, signed by several researchers, founders and other leaders in the AI space.

Further regulations can only serve to push companies, particularly startups, to pursue prospects in other countries that take a less ham-fisted approach to policing AI.

The post What’s the California AI Bill, and Why Does Meta’s Yann LeCun Think it Sucks? appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-insights-analysis/whats-the-california-ai-bill-and-why-does-metas-yann-lecun-think-it-sucks/feed/ 0
Claude 3.5 Sonnet vs GPT-4o – Which is Best? https://analyticsindiamag.com/ai-origins-evolution/claude-3-5-sonnet-vs-gpt-4o/ https://analyticsindiamag.com/ai-origins-evolution/claude-3-5-sonnet-vs-gpt-4o/#respond Fri, 05 Jul 2024 07:56:09 +0000 https://analyticsindiamag.com/?p=10125893

Claude 3.5 Sonnet's features spark debate and interest, showcasing its strengths as a compelling model.

The post Claude 3.5 Sonnet vs GPT-4o – Which is Best? appeared first on AIM.

]]>

Since the release of Anthropic’s Claude 3.5 model family, social media platforms, particularly X, have been abuzz with comparisons and testing of Claude 3.5 Sonnet and OpenAI’s GPT-4o. These models are being evaluated based on their features through various testing methods.

Claude 3.5 Sonnet, released in June 2024, is the first model in the Claude 3.5 family. It outperforms its predecessor, Claude 3 Opus, as well as other leading AI models in various evaluations, combining enhanced intelligence with improved speed and efficiency.

The latest model is available for free on Claude.ai and the Claude iOS app, with higher rate limits for Claude Pro and Team plan subscribers. The model can also be accessed via the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI, priced at $3 per million input tokens and $15 per million output tokens. 
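To put the per-token pricing in concrete terms, here is a minimal sketch of a cost estimate based on the rates quoted above ($3 per million input tokens, $15 per million output tokens). The function name and example token counts are illustrative, not part of any official SDK:

```python
# Rough cost estimate for Claude 3.5 Sonnet API usage, using the
# published rates: $3 per 1M input tokens, $15 per 1M output tokens.
INPUT_RATE_PER_M = 3.0    # USD per million input tokens
OUTPUT_RATE_PER_M = 15.0  # USD per million output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one API request."""
    return (input_tokens * INPUT_RATE_PER_M
            + output_tokens * OUTPUT_RATE_PER_M) / 1_000_000

# e.g. a 2,000-token prompt with a 500-token reply:
# 2000 * 3/1e6 + 500 * 15/1e6 = 0.006 + 0.0075 = 0.0135
print(f"${estimate_cost(2000, 500):.4f}")  # prints $0.0135
```

At these rates, even heavy interactive use stays well under a cent per typical request; output tokens dominate the bill at 5x the input rate.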

As posted by Perplexity CEO Aravind Srinivas, the model is now available on Perplexity. With twice the speed of Claude 3 Opus, Claude 3.5 Sonnet unlocks new possibilities for complex AI applications across reasoning, knowledge, and coding tasks.

OpenAI’s GPT-4o, released earlier, has also demonstrated significant improvements over its predecessors, including GPT-3.5. It shows enhanced language understanding, broader knowledge, and better contextual comprehension, often generating more accurate, coherent, and contextually relevant responses.

Does Claude 3.5 Sonnet Outperform GPT-4o?

Several features of Claude 3.5 Sonnet have generated debate and interest, highlighting why it is a good model. Let’s take a look at a few standout features and compare them to GPT-4o.

Artifacts Feature

Artifacts is an interesting addition introduced with Claude 3.5 Sonnet. The feature expands how users interact with Claude, offering a dedicated window alongside conversations. While generating content like code snippets, text documents, or web designs, users can now see a preview of the output.

However, GPT-4o lacks this feature, making the one in Claude stand out even more. 

Coding Abilities

Many users report that coding with Claude 3.5 Sonnet is significantly more efficient and faster than with GPT-4o or other available LLMs. The Artifacts feature enhances the experience further by allowing you to generate and run code directly within your chat. 

In a Reddit discussion, many users found that Claude 3.5 Sonnet outperforms GPT-4o in coding tasks, often producing nearly bug-free code on the first try. Claude is praised for its accuracy in text summarisation and natural, human-like communication style.

Developing Games From Scratch

Claude 3.5 Sonnet goes far beyond simple text generation: with Artifacts, games can be made playable inline, making it more enjoyable to create interactive experiences.

For instance, Pietro Schirano, the founder of EverArt AI, used Claude 3.5 Sonnet to create a new and original game designed for quick sessions. It generated Color Cascade, a game where players catch the correct colour from a series of falling shapes, demonstrating the model’s advanced capabilities. 

Reasoning Capabilities 

Claude 3.5 Sonnet shows advanced visual reasoning, surpassing earlier models. It accurately interprets charts, graphs, and imperfect images, making it valuable for retail, logistics, and finance sectors that rely on visual data analysis.

For instance, Muratcan Koylan, a marketing professional, tried Claude 3.5 to analyse financial data and provide trading insights. The model demonstrated impressive capabilities in data extraction, correlation analysis, and generating trading strategies. 

It provided detailed predictions for interest rates, the USD Index, and the S&P 500, along with sophisticated trading strategies and potential black swan events. 

Compared with models like GPT-4o, users were particularly impressed by Claude’s nuanced, context-specific insights and advanced reasoning capabilities, which they found superior to other AI models.

Solving Pull Requests

Claude 3.5 Sonnet shows major improvements in coding tasks, especially pull requests. It solved 64% of problems in an internal evaluation, up from 38% for Claude Opus. This leap demonstrates Sonnet’s enhanced reasoning and coding abilities, making it a potentially valuable tool for collaborative software development.

Alex Albert from Anthropic posted a demo video on X of Claude resolving a simple pull request:

He mentioned that Claude is getting really good at coding and autonomously fixing pull requests, predicting that within a year a large percentage of code will be written by LLMs. 

As for GPT-4o, there is no clear evidence that it can directly solve pull requests, though there are related developments and applications of GPT models in the context of GitHub and pull requests.

Final Verdict?

A Reddit discussion pitted GPT-4o against Claude 3.5 Sonnet. The users generally found Claude 3.5 Sonnet to be superior to GPT-4o for many tasks, particularly coding and writing. 

One user described Claude as a doctoral candidate, while GPT-4o was more like an intelligent undergraduate or master’s-level student. 

The post Claude 3.5 Sonnet vs GPT-4o – Which is Best? appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-origins-evolution/claude-3-5-sonnet-vs-gpt-4o/feed/ 0
French AI Lab Kyutai Releases OpenAI GPT-4o Killer ‘Moshi’ https://analyticsindiamag.com/ai-news-updates/french-ai-lab-kyutai-releases-openai-gpt-4o-killer-moshi/ https://analyticsindiamag.com/ai-news-updates/french-ai-lab-kyutai-releases-openai-gpt-4o-killer-moshi/#respond Thu, 04 Jul 2024 12:21:36 +0000 https://analyticsindiamag.com/?p=10125810

Built on the Helium 7B model, Moshi integrates text and audio training, optimised for CUDA, Metal, and CPU backends with support for 4-bit and 8-bit quantization.

The post French AI Lab Kyutai Releases OpenAI GPT-4o Killer ‘Moshi’ appeared first on AIM.

]]>

Kyutai, a French non-profit AI research laboratory, has introduced Moshi, a real-time native multimodal foundational AI model. This open-source project features a voice-enabled AI assistant with capabilities that rival OpenAI’s GPT-4o and Google’s Project Astra. 

Moshi, developed by a team of just eight researchers in six months, can understand and express 70 different emotions and styles, speak with various accents, and handle two audio streams simultaneously, allowing it to listen and talk at the same time.

Built on the Helium 7B model, Moshi integrates text and audio training, optimised for CUDA, Metal, and CPU backends with support for 4-bit and 8-bit quantization.

Key features of Moshi include:

  1. Real-time interaction with end-to-end latency of 200 milliseconds
  2. Ability to run on consumer-grade hardware, including MacBooks
  3. Support for multiple backends (CUDA, Metal, CPU)
  4. Watermarking to detect AI-generated audio (in progress)
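The 4-bit and 8-bit quantisation mentioned above is what lets a 7B-parameter model like Moshi fit on consumer hardware. As a generic illustration of the idea (not Kyutai's actual scheme), a symmetric 8-bit quantiser maps float weights to single-byte integers plus a scale factor:

```python
import numpy as np

def quantise_int8(weights: np.ndarray):
    """Map float weights to int8 plus a per-tensor scale (symmetric)."""
    scale = np.abs(weights).max() / 127.0  # largest magnitude maps to +/-127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantise(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the int8 representation."""
    return q.astype(np.float32) * scale

w = np.array([0.02, -0.5, 0.31, -0.07], dtype=np.float32)
q, s = quantise_int8(w)
w_hat = dequantise(q, s)
# int8 storage uses 1 byte per weight instead of 4 (float32),
# at the cost of a small reconstruction error bounded by the scale:
print(q, np.abs(w - w_hat).max())
```

Cutting each weight from 4 bytes to 1 (or half a byte at 4-bit) is what brings a 7B model from ~28 GB down to a size a MacBook can hold in memory, at the cost of a small, bounded reconstruction error per weight.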

Kyutai chief Patrick Pérez said that Moshi has the potential to revolutionise human-machine communication: “Moshi thinks while it talks”.

Kyutai plans to release the full model, including the inference codebase, the 7B model, the audio codec, and the optimised stack. 

Founded in November 2023 with €300 million in backing from investors including French billionaire Xavier Niel, the lab aims to contribute to open research in AI and foster ecosystem development. 

The lab’s approach challenges major AI companies like OpenAI, which have faced criticism for delaying releases due to safety concerns. Notably, OpenAI has been withholding the release of its video generation model Sora, as well as the Voice Engine and voice mode features of GPT-4o.

Moshi contributes to France’s increasing influence in the AI sector, alongside other French-origin projects such as Hugging Face and Mistral.

The post French AI Lab Kyutai Releases OpenAI GPT-4o Killer ‘Moshi’ appeared first on AIM.

]]>
https://analyticsindiamag.com/ai-news-updates/french-ai-lab-kyutai-releases-openai-gpt-4o-killer-moshi/feed/ 0