UHG
Search
Close this search box.

Sam Altman Confirms OpenAI’s Project Strawberry

OpenAI's teams are working on Strawberry to improve the models' ability to perform long-horizon tasks (LHT), which require planning and executing a series of actions over an extended period. 

Share

OpenAI chief Sam Altman has hinted in a cryptic post that the AI startup is working on a project known internally as “Project Strawberry.” On X, Altman shared a post saying, “I love summer in the garden,” accompanied by an image of a pot with strawberries.

Project Strawberry, also referred to as Q*, was recently revealed in a Reuters report, which said that it will significantly enhance the reasoning capabilities of OpenAI’s AI models. “Some at OpenAI believe Q* could be a breakthrough in the startup’s search for artificial general intelligence (AGI),” said the report. 

Project Strawberry involves a novel approach that allows AI models to plan ahead and navigate the internet autonomously to perform in-depth research. This advancement could address current limitations in AI reasoning, such as common sense problems and logical fallacies, which often lead to inaccurate outputs.

AI Insider who goes by the name Jimmy Apples recently revealed that the Q* hasn’t been released yet as they ( OpenAI) aren’t happy with the latency and other ‘little things’ they want to further optimise.  

OpenAI’s teams are working on Strawberry to improve the models’ ability to perform long-horizon tasks (LHT), which require planning and executing a series of actions over an extended period. 

The project involves a specialised “post-training” phase, adapting the base models for enhanced performance. This method resembles Stanford’s 2022 “Self-Taught Reasoner” (STaR), which enables AI to iteratively create its own training data to reach higher intelligence levels.

OpenAI recently announced DevDay 2024, a global developer event series scheduled to take place in San Francisco on October 1, London on October 30, and Singapore on November 21. While the company has stated that the focus will be on advancements in the API and developer tools, there is speculation that OpenAI might also preview its next frontier model.

Recently, a new model in the LMsys chatbot arena showed strong performance in math. Interestingly, before the release of GPT-4o and GPT-4o Mini, these models were also observed in the chatbot arena a few days earlier.

The internal document indicates that Project Strawberry includes a “deep-research” dataset for training and evaluating the models, though the contents of this dataset remain undisclosed.

This innovation is expected to enable AI to conduct research autonomously, using a “computer-using agent” (CUA) to take actions based on its findings. Additionally, OpenAI plans to test Strawberry’s capabilities in performing tasks typically done by software and machine learning engineers.

Last year, it was reported that Jakub Pachocki and Szymon Sidor, two leading OpenAI researchers, used Ilya Sutskever’s work to develop a model called Q* (pronounced “Q-Star”) that achieved an important milestone by solving math problems it had not previously encountered.

Sutskever, raised concerns among some staff that the company didn’t have proper safeguards in place to commercialise such advanced AI models. Notably, he  left OpenAI and recently founded his own company called Safe Superintelligence. Following that Pachocki was appointed as the new chief AI scientist. 

What is Q*? 

Q* is probably a combination of Q-learning and A* search.  OpenAI’s Q* algorithm is considered a breakthrough in AI research, particularly in the development of AI systems with human reasoning capabilities. Q* combines elements of Q-learning and A* (A-star search), which leads to an improvement in goal-oriented thinking and solution finding. 

This algorithm shows impressive capabilities in solving complex mathematical problems (without prior training data) and symbolizes an evolution towards general artificial intelligence (AGI).

Q-learning  is a foundational concept in the field of AI, specifically in the area of reinforcement learning. Q-learning’s algorithm is categorised as model-free reinforcement learning, and is designed to understand the value of an action within a specific state. 

The ultimate goal of Q-learning is to find an optimal policy that defines the best action to take in each state, maximising the cumulative reward over time.

Q-learning is based on the notion of a Q-function, aka the state-action value function. This function operates with two inputs: a state and an action. It returns an estimate of the total reward expected, starting from that state, alongside taking that action, and thereafter following the optimal policy. 

OpenAI has recently unveiled a five-level classification system to track progress towards achieving artificial general intelligence (AGI) and superintelligent AI. The company currently considers itself at Level 1 and anticipates reaching Level 2 in the near future.

Other tech giants like Google, Meta, and Microsoft are also exploring techniques to enhance AI reasoning. However, experts like Meta’s Yann LeCun argue that large language models may not yet be capable of human-like reasoning.

📣 Want to advertise in AIM? Book here

Picture of Siddharth Jindal

Siddharth Jindal

Siddharth is a media graduate who loves to explore tech through journalism and putting forward ideas worth pondering about in the era of artificial intelligence.
Related Posts
19th - 23rd Aug 2024
Generative AI Crash Course for Non-Techies
Upcoming Large format Conference
Sep 25-27, 2024 | 📍 Bangalore, India
Download the easiest way to
stay informed

Subscribe to The Belamy: Our Weekly Newsletter

Biggest AI stories, delivered to your inbox every week.

Flagship Events

Rising 2024 | DE&I in Tech Summit
April 4 and 5, 2024 | 📍 Hilton Convention Center, Manyata Tech Park, Bangalore
Data Engineering Summit 2024
May 30 and 31, 2024 | 📍 Bangalore, India
MachineCon USA 2024
26 July 2024 | 583 Park Avenue, New York
MachineCon GCC Summit 2024
June 28 2024 | 📍Bangalore, India
Cypher USA 2024
Nov 21-22 2024 | 📍Santa Clara Convention Center, California, USA
Cypher India 2024
September 25-27, 2024 | 📍Bangalore, India
discord-icon
AI Forum for India
Our Discord Community for AI Ecosystem, In collaboration with NVIDIA.