Anthropic Introduces Claude 3.5 Sonnet and Claude 3.5 Haiku That Can Use Your Computer

Anthropic has made a significant leap forward with major updates to its Claude AI models, introducing Claude 3.5 Sonnet and Claude 3.5 Haiku, along with an experimental new capability called Computer Use.

These updates bring improvements in coding, reasoning, and general AI capabilities while also adding an unprecedented feature that allows AI to interact with computers just like humans.

Introducing an upgraded Claude 3.5 Sonnet, and a new model, Claude 3.5 Haiku. We’re also introducing a new capability in beta: computer use.

Developers can now direct Claude to use computers the way people do—by looking at a screen, moving a cursor, clicking, and typing text. pic.twitter.com/ZlywNPVIJP
— Anthropic (@AnthropicAI) October 22, 2024

Let’s dive into what each of these updates means for developers, businesses, and the future of AI.

Table of Contents

Claude 3.5 Sonnet: Leading the Field in AI-Powered Coding

Performance Improvements

The upgraded Claude 3.5 Sonnet offers significant performance improvements, particularly in coding and reasoning tasks.

It has now outperformed leading models like GPT-4o from OpenAI and Gemini 1.5 Pro from Google. For example, Claude 3.5 Sonnet scored an impressive 49.0% on the SWE-bench Verified test, the highest score ever recorded by any AI model.

Additionally, the model excelled in TAU-bench tests, showing improvements in both retail and airline domains, with scores of 69.2% and 46.0%, respectively. These results further demonstrate the model’s ability to handle complex, industry-specific tasks.

Other benchmarks, like GPQA and MMLU Pro, saw increases to 65% and 78%, making Claude 3.5 Sonnet a strong contender for both technical and visual tasks.

Availability

Claude 3.5 Sonnet is now available for developers via the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI. Developers can start utilizing these improvements immediately to enhance their projects, especially those involving complex coding or reasoning tasks.

Claude 3.5 Haiku: Fast, Efficient, and Cost-Effective

While Claude 3.5 Sonnet leads in coding capabilities, Claude 3.5 Haiku is designed for speed and efficiency. It delivers three times faster performance than many competing models while maintaining high accuracy.

Performance and Speed

Despite its focus on speed, Claude 3.5 Haiku doesn’t sacrifice performance. It matches the results of Claude 3 Opus, Anthropic’s previous flagship model, across several benchmarks.

This model is perfect for tasks requiring low latency and real-time responses. For example, it excels at processing large volumes of data, such as purchase history or inventory records.

At launch, Claude 3.5 Haiku will be a text-only model, but multimodal support for images will be added soon.

Availability

The Claude 3.5 Haiku model will be released later this month. Initially, it will only support text inputs, but it will soon gain the ability to handle images as well, making it even more versatile for developers and businesses.

Computer Use: Teaching AI to Navigate Computers

Anthropic’s most groundbreaking update comes with the introduction of “Computer Use,” a feature that lets AI models interact with computers just like humans.

How we developed computer use, Claude's newest ability: https://t.co/gAAULqZAM8
— Anthropic (@AnthropicAI) October 22, 2024

With this capability, Claude can now navigate websites, move a cursor, click buttons, and even fill out forms. This feature could revolutionize how AI is used for automating tasks that usually require human intervention.

How It Works

Through the Anthropic API, developers can instruct Claude to perform tasks like navigating web browsers, filling out forms, and manipulating data. This makes it ideal for automating repetitive processes or conducting complex tasks such as software testing and data entry.

Several companies, including Asana, DoorDash, and Replit, have already started exploring how this feature can be used to simplify their workflows. For example, Replit is using this capability to develop an automated app verifier for its Replit Agent product.

Limitations and Beta Status

While promising, the Computer Use feature is still in its beta stage and remains experimental. Anthropic has acknowledged that the feature is currently error-prone, especially when it comes to basic actions like scrolling or zooming.

Developers are encouraged to use Computer Use for low-risk tasks while providing feedback to help Anthropic improve the feature over time.

Safety Measures: Ensuring Responsible AI Use

Because Computer Use introduces new possibilities, Anthropic has taken steps to ensure its safe deployment.

For one, the model is not trained on users’ screenshots or prompts, and it cannot access the web during training. This limits the risk of misuse, such as fraud or data leaks.

Additionally, classifiers have been developed to detect when Claude is performing a high-risk action, such as posting on social media or interacting with sensitive websites. This helps ensure that the model is being used responsibly.

Testing and Validation

Before releasing the Claude 3.5 Sonnet model, Anthropic worked with both the US AI Safety Institute (US AISI) and the UK Safety Institute (UK AISI) to rigorously test the model for any potential risks. The model was found to meet the ASL-2 Standard, which ensures that it is safe for various applications.

What These Updates Mean for the Future

With these updates, Anthropic’s Claude models are set to compete even more aggressively in the AI market.

The Claude 3.5 Sonnet is ideal for developers who need powerful coding and reasoning capabilities, while Claude 3.5 Haiku offers speed and efficiency for quick, low-latency tasks.

The experimental Computer Use feature could open up new possibilities for automation, especially for tasks that have traditionally required manual effort.

As AI continues to evolve, Anthropic’s latest updates position their models at the forefront of innovation, helping businesses and developers tackle increasingly complex tasks with ease.

Claude 3.5 Opus: Anthropic’s Next Leap

Anthropic is preparing to launch Claude 3.5 Opus, a new model that aims to build on the success of Claude 3.5 Sonnet. This version is expected to set new benchmarks in AI intelligence, versatility, and autonomous capabilities, positioning it as a top tool for developers and enterprises.

What to Expect from Claude 3.5 Opus:

More Intelligent AI

Claude 3.5 Opus is likely to expand on the strengths of its predecessor, offering even more powerful coding and complex reasoning abilities. It’s expected to excel in areas like data analysis and long-form content generation, making it a versatile tool for both technical tasks and creative projects.

Insiders suggest that Claude Opus could significantly outperform previous models, making it a go-to solution for developers and businesses that need high-performance AI.

Bigger Context Window

One of Claude’s standout features is its large context window, which allows it to hold more information in memory during conversations. The upcoming Claude 3.5 Opus might take this further by expanding the context window to up to 500,000 tokens. This will enable it to handle tasks that require understanding vast amounts of context, making it ideal for long-term interactions and in-depth projects.

Autonomous Capabilities

There are rumors that Claude 3.5 Opus could introduce more autonomous functionality, allowing it to tackle complex, multi-step tasks with less user guidance. This would move the model closer to becoming an agentic AI, capable of managing entire projects independently—a game-changing feature for enterprises.

Release Schedule for Claude 3.5 Opus

Although Anthropic has not yet confirmed an official release date, experts speculate that Claude 3.5 Opus could be launched by late 2024. The anticipation around this release is growing, particularly among developers and businesses seeking advanced AI tools to streamline their workflows.

Conclusion

Anthropic’s new Claude 3.5 Sonnet and Claude 3.5 Haiku models, along with the innovative Computer Use feature, represent a significant step forward in AI capabilities. Whether you need powerful coding support or fast, efficient task management, these models are designed to meet the needs of modern developers and businesses. As the technology matures, we can expect even more exciting updates in the near future.

FAQs

What is the difference between Claude 3.5 Sonnet and Claude 3.5 Haiku?

Claude 3.5 Sonnet is designed for advanced coding and reasoning tasks, showing improvements across multiple industry benchmarks. Claude 3.5 Haiku, on the other hand, is faster and more efficient, designed for tasks that require quick responses, such as real-time data processing.

What is the “Computer Use” feature?

The Computer Use feature allows Claude to interact with a computer in the same way a human would. It can navigate browsers, click buttons, type text, and fill out forms, making it useful for automating repetitive tasks.

Is the Computer Use feature fully functional?

Not yet. The feature is still in beta and experimental. It currently struggles with actions like scrolling and zooming, so Anthropic recommends using it for low-risk tasks while they work on improvements.

How can developers access these models?

Developers can access Claude 3.5 Sonnet and Claude 3.5 Haiku through the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI.

When will Claude 3.5 Opus be released?

While no official date has been announced, it’s expected to arrive by the end of 2024.

What are the main features of Claude 3.5 Opus?

Key features include improved coding and reasoning, a larger context window, and more autonomous functionality for handling complex tasks.

Will Claude 3.5 Opus be available to the public?

It is expected to be available to enterprise clients and developers, with potential for broader public access in the future.

How can I use Claude on a Mac with Fello AI?

To use Claude on Mac, download and install Fello AI, a desktop app that supports multiple LLMs like Claude, ChatGPT, and Gemini.

Once installed, you can interact with Claude directly through the Fello AI app without needing separate subscriptions for the models. This makes it easy to access Claude’s capabilities, including coding, reasoning, and other advanced features, all from your Mac.