Google-backed Anthropic, an AI lab based in San Francisco, has unveiled Claude 2, a publicly accessible alternative to GPT-4. Previously, Claude, the earlier iteration, was exclusively offered to enterprises, but the latest version is now open to the general public in the United States and the United Kingdom. Distinguishing itself from its predecessor, Claude 2 is accessible through both a beta website and an API.
The timing couldn’t have been better. Claude-2 arrives as the popularity of GPT has declined in recent months, with users seeking alternatives that offer better performance at a lower price. Claude-2 appears to fit the bill, with its enhanced capabilities and cost-effectiveness.
Learning from Google’s Bard and OpenAI’s ChatGPT and taking user feedback into account, Anthropic has made significant enhancements to Claude-2. Users on Twitter have been lauding Claude’s ability to engage in natural language conversations, clearly explain its reasoning, and produce less harmful outputs. Claude-2 builds on these strengths and adds several key features that elevate its performance to new heights.
One notable improvement is Claude-2’s enhanced coding, maths, and reasoning skills. It can also read PDFs, something that GPT-based models still struggle with. The release comes at around the same time that OpenAI introduced Code Interpreter on its paid models.
Let’s Evaluate
Anthropic has put considerable effort into fine-tuning the model. According to the model card of Claude-2, the model is built using unsupervised learning and reinforcement learning from human feedback (RLHF), similar to what OpenAI used for GPT. Moreover, the model is trained on data up to early 2023, but cannot access the internet.
Claude-2 now boasts an impressive 71.2% score on the Codex HumanEval, a Python coding test, up from the 56.0% achieved by its predecessor, Claude-1.3. GPT-4 scores 67% on the same test, so Claude-2 wins here.
Similarly, on the GSM8k maths problem set, Claude-2 scored 88%, an improvement on Claude-1.3’s score of 85.2%. GPT-4 wins this one, however, with a score of 92%. Even so, these advancements position Claude-2 as a valuable asset for developers and individuals seeking assistance with technical challenges.
The most important aspect is the expansion of Claude-2’s input and output capabilities. Users can now input up to 100,000 tokens per prompt, compared to 32,000 for GPT-4, allowing Claude-2 to process extensive technical documentation or even entire books. Additionally, Claude-2 can generate longer documents, ranging from memos to letters to stories, up to a few thousand tokens in length.
Claude-2 is also roughly four to five times cheaper than GPT-4-32K. Prompt tokens cost $11 per million versus $60 per million for GPT-4-32K, and completion tokens cost $32 per million versus $120 per million, assuming similar tokenisation length. This is likely to push a lot of users to start using Claude-2 instead of GPT-4.
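To make the price gap concrete, here is a minimal Python sketch of the arithmetic. It takes the per-million-token prices quoted above ($11 and $32 for Claude-2, $60 and $120 for GPT-4-32K) at face value and, like the comparison itself, assumes both models tokenise text to a similar length. The function names and the 30,000-token sample workload are purely illustrative and do not come from either company.

```python
# Per-million-token prices in USD, as quoted in this article.
PRICES_PER_MILLION = {
    "claude-2": {"prompt": 11.0, "completion": 32.0},
    "gpt-4-32k": {"prompt": 60.0, "completion": 120.0},
}


def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the USD cost of one request at the quoted per-million rates."""
    rates = PRICES_PER_MILLION[model]
    total = prompt_tokens * rates["prompt"] + completion_tokens * rates["completion"]
    return total / 1_000_000


def price_ratio(kind: str) -> float:
    """How many times more GPT-4-32K costs than Claude-2 for 'prompt' or 'completion'."""
    return PRICES_PER_MILLION["gpt-4-32k"][kind] / PRICES_PER_MILLION["claude-2"][kind]


if __name__ == "__main__":
    print(f"prompt tokens:     {price_ratio('prompt'):.2f}x cheaper on Claude-2")
    print(f"completion tokens: {price_ratio('completion'):.2f}x cheaper on Claude-2")

    # Hypothetical workload that fits both context windows:
    # a 30,000-token document plus a 2,000-token summary.
    for model in PRICES_PER_MILLION:
        print(f"{model}: ${request_cost(model, 30_000, 2_000):.3f} per request")
```

At these rates the gap works out to about 5.5x on prompt tokens and 3.75x on completion tokens. The blended figure for a real workload depends on how long the prompts are relative to the answers.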
Price drop and availability
Anthropic has made Claude-2 available through multiple channels. Users can access Claude-2 via the API, allowing businesses to integrate it into their systems seamlessly. Remarkably, Anthropic has maintained the same pricing for the Claude-2 API as its predecessor, Claude-1.3, making the upgrade to the latest model even more appealing to budget-conscious users.
Partners like Jasper, a generative AI platform, have reported Claude-2’s strength in a wide range of use cases, particularly those involving extended content generation. With a 3X larger context window and improved semantics, Claude-2 has empowered Jasper’s customers to stay ahead of the curve and achieve their content strategy goals.
Another notable collaboration involves Sourcegraph, a code AI platform that assists developers in writing, fixing, and maintaining code. Sourcegraph’s coding assistant, Cody, leverages Claude-2’s improved reasoning and access to a larger context window of up to 100,000 tokens. By providing accurate answers and incorporating codebase context, Cody assists developers in speeding up their workflow and staying up to date with the latest frameworks and libraries.
Safe but still hallucinatory
According to Anthropic, the model has undergone rigorous evaluation, including internal red-teaming and automated tests on harmful prompts. In these evaluations, Claude-2 demonstrated a twofold improvement in providing harmless responses compared to Claude-1.3. Anthropic accepts, however, that no model is completely immune to misuse.
“For example, Claude models could support a lawyer but should not be used instead of one, and any work should still be reviewed by a human,” reads the paper. People on Twitter have already been pointing out that the claims of being good at maths are overstated.
Anthropic acknowledges the evolving nature of AI and is committed to responsible deployment. Claude-2 is poised to become a trusted companion for individuals and a valuable tool for businesses.
As ChatGPT usage declines and users seek alternatives, Claude-2’s budget-friendly offering and remarkable feature set make it an enticing option. It seems a true competitor to OpenAI is finally here, one that might make the company drop its prices and come down to earth to compete.