Claude-2 vs GPT-4 – Which is Better?

A true competitor for OpenAI is finally here and might make the company drop its prices and come to the ground to compete

Google-backed Anthropic, an AI lab based in San Francisco, has unveiled Claude 2, a publicly accessible alternative to GPT-4. Previously, Claude, the earlier iteration, was exclusively offered to enterprises, but the latest version is now open to the general public in the United States and the United Kingdom. Distinguishing itself from its predecessor, Claude 2 is accessible through both a beta website and an API.

The timing couldn’t have been better. Claude-2 comes at a time when the popularity of GPT has seen a decline in recent months. Users are seeking alternatives that offer superior performance and affordability. Claude-2 appears to fit the bill, with its enhanced capabilities and cost-effectiveness.

Learning from Google’s Bard and OpenAI’s ChatGPT and taking user feedback into account, Anthropic has made significant enhancements to Claude-2. Users on Twitter have been lauding Claude’s ability to engage in natural language conversations, clearly explain its reasoning, and produce less harmful outputs. Claude-2 builds on these strengths and adds several key features that elevate its performance to new heights.

One notable improvement is Claude-2’s enhanced coding, maths, and reasoning skills. This includes reading PDFs, something that GPT-based models still struggle with. The release comes at around the same time that OpenAI introduced Code Interpreter for its paid users.

Let’s Evaluate

Anthropic has put considerable effort into fine-tuning the model. According to the model card of Claude-2, the model is built using unsupervised learning and reinforcement learning from human feedback (RLHF), similar to what OpenAI used for GPT. Moreover, the model is trained on data up to early 2023 and cannot access the internet.

Claude-2 now boasts an impressive 71.2% score on the Codex HumanEval, a Python coding test, up from the 56.0% achieved by its predecessor, Claude-1.3. GPT-4 scores 67% on the same test, so Claude-2 wins.

Similarly, on the GSM8k maths problem set, Claude-2 scored 88%, an improvement from Claude-1.3’s score of 85.2%. GPT-4 wins here with a score of 92%. Even so, these advancements position Claude-2 as a valuable asset for developers and individuals seeking assistance with technical challenges.

The most important aspect is the expansion of Claude-2’s input and output capabilities. Users can now input up to 100,000 tokens per prompt, compared with GPT-4’s 32,000, allowing Claude-2 to process extensive technical documentation or even entire books. Additionally, Claude-2 can generate longer documents, ranging from memos to letters to stories, up to a few thousand tokens in length.

This is also roughly 4-5 times cheaper than GPT-4-32K. Prompt tokens cost $11 per million tokens versus $60 per million for GPT-4-32K, and completion tokens cost $32 versus $120 per million, assuming similar tokenisation length. This will definitely push a lot of users to start using Claude-2 instead of GPT-4.
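
To make the price gap concrete, here is a minimal Python sketch that takes the figures quoted above as US dollars per million tokens. The model keys, the request_cost helper, and the 30,000-token prompt with a 1,000-token completion are hypothetical and chosen only for illustration. Like the article, it assumes both models tokenise the text to a similar length.

```python
# Prices in US dollars per million tokens, as quoted in the article.
# The dictionary keys are illustrative labels, not official API model names.
PRICES = {
    "claude-2": {"prompt": 11.0, "completion": 32.0},
    "gpt-4-32k": {"prompt": 60.0, "completion": 120.0},
}


def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the cost in dollars of one request at the quoted prices."""
    price = PRICES[model]
    total = (prompt_tokens * price["prompt"]
             + completion_tokens * price["completion"])
    return total / 1_000_000


# Per-token price ratios (GPT-4-32K divided by Claude-2).
prompt_ratio = PRICES["gpt-4-32k"]["prompt"] / PRICES["claude-2"]["prompt"]
completion_ratio = (PRICES["gpt-4-32k"]["completion"]
                    / PRICES["claude-2"]["completion"])

# Hypothetical workload: a long document plus a short answer.
claude_cost = request_cost("claude-2", 30_000, 1_000)
gpt4_cost = request_cost("gpt-4-32k", 30_000, 1_000)

print(f"prompt price ratio:     {prompt_ratio:.2f}x")
print(f"completion price ratio: {completion_ratio:.2f}x")
print(f"Claude-2:  ${claude_cost:.3f} per request")
print(f"GPT-4-32K: ${gpt4_cost:.3f} per request")
```

At these prices the ratio is about 5.5x for prompt tokens and 3.75x for completion tokens, so the overall saving depends on how prompt-heavy the workload is.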

Price drop and availability

Anthropic has made Claude-2 available through multiple channels. Users can access Claude-2 via the API, allowing businesses to integrate it into their systems seamlessly. Remarkably, Anthropic has maintained the same pricing for the Claude-2 API as its predecessor, Claude-1.3, making the upgrade to the latest model even more appealing to budget-conscious users.

Partners like Jasper, a generative AI platform, have reported Claude-2’s strength in a wide range of use cases, particularly those involving extended content generation. With a 3X larger context window and improved semantics, Claude-2 has empowered Jasper’s customers to stay ahead of the curve and achieve their content strategy goals. 

Another notable collaboration involves Sourcegraph, a code AI platform that assists developers in writing, fixing, and maintaining code. Sourcegraph’s coding assistant, Cody, leverages Claude-2’s improved reasoning and access to a larger context window of up to 100,000 tokens. By providing accurate answers and incorporating codebase context, Cody assists developers in speeding up their workflow and staying up to date with the latest frameworks and libraries.

Safe but still hallucinatory

According to Anthropic, the model has undergone rigorous evaluation, including internal red-teaming and automated tests on harmful prompts. In these evaluations, Claude-2 demonstrated a twofold improvement in providing harmless responses compared to Claude-1.3. Even so, Anthropic accepts that no model is completely immune to misuse.

“For example, Claude models could support a lawyer but should not be used instead of one, and any work should still be reviewed by a human,” reads the paper. People on Twitter have already been pointing out that the claims of being good at maths are overstated.

Anthropic acknowledges the evolving nature of AI and is committed to responsible deployment. Claude-2 is poised to become a trusted companion for individuals and a valuable tool for businesses. 

As ChatGPT usage declines and users seek alternatives, Claude-2’s budget-friendly offering and remarkable feature set make it an enticing option. It seems that a true competitor for OpenAI is finally here and might finally make the company drop its prices and come to the ground to compete.

Mohit Pandey

Mohit dives deep into the AI world to bring out information in simple, explainable, and sometimes funny words.