Claude Opus 4.5: new features, prices and real improvements

Worldbytes » Artificial Intelligence » Claude Opus 4.5: all the new features, prices and improvements

Opus 4.5 leads in programming and agents, with 80,9% in SWE-Bench and better than rivals.
New effort parameter and price drop to 5/25 USD per million tokens.
Updates in Chrome, Excel and Claude Code, with reinforced security and memory.
Strict usage limits and regional endpoints with a 10% cloud premium.

AI image and productivity with Claude Opus 4.5

Claude Opus 4.5 It's here, and it comes with a clear ambition: to position itself at the top in programming, intelligent agents, and office tasks. Anthropic's proposal doesn't stop at grandiose headlines; it comes with figures, comparisons, and product changes that, on paper, could redefine how we use IA in real work.

Beyond direct competition with other giants in the sector, The update incorporates a leap in token efficiency and effort controlAlong with improvements in security, tool usage, memory, and a host of new application integrations, it's not just a faster model: it's a platform designed for long, multi-agent, and less frictionless work cycles.

Vibe working in Excel and Word: Agent Mode and Office Agent

What is Opus 4.5 and why has it caused such a stir?

It comes just days after moves by the competition, with a resounding positioning: Anthropic presents it as its most capable model and the Better AI for every task in programming, agent management, and computer useThe company also highlights its value in in-depth research, document creation, and visual and mathematical reasoning.

To avoid remaining in the realm of theory, the firm maintains that the model It surpasses other benchmarks such as Gemini 3 Pro and GPT-5.1 Codex-Max in software engineering testing. And in practice, the idea is clear: fewer steps, more precision, and better coordination with tools in complex workflows.

Claude Opus 4.5 new releases

Measured performance: benchmarks and real-world tests

In the benchmark for software engineering, SWE-Bench Verified, Opus 4.5 achieves 80,9% accuracyoutperforming both its predecessors and top-tier competitors. This data matters because it measures the ability to solve real-world issues in GitHub projects, not just play exercises.

Anthropic has gone further with a real hiring test for engineersThe test, timed at two hours and rated for its difficulty, was, according to the company, that the model not only solved the exercises, but also... He beat all the human candidates who took the same exam, relying on strategies such as parallel computation of hypotheses.

In everyday office tasks, the leap is also noticeable: better results in spreadsheetswith reported increases in accuracy of 20% and efficiency of 15% in financial models, in addition to the ability to organize databasesPrepare presentations and write lengthy reports without losing the thread.

All of this is supported by a long-term context of up to 200.000 tokens in internal testing and reinforced working memory management, where Long conversations benefit from automatic summaries to maintain consistency without running into window limits.

Image of AI agent and tools

Efficiency, cost, and the new effort parameter

One differentiating factor is the price: The API price drops from $15 and $75 per million entry and exit tokens to $5 and $25.respectively. This opens the door to automations that were previously too expensive for everyday use.

The key technical element for maximizing savings is the effort parameter, with low, medium, and high levels. At medium effort, Opus 4.5 matches the performance of Sonnet 4.5 on SWE-Bench Verified using 76% fewer output tokensWith great effort, It outperforms Sonnet 4.5 by 4,3 percentage points and still spends 48% less tokensThe novelty here is not only the control, but also the ability to vary the depth of reasoning without changing the model.

This adjustment affects the entire response: text, tool calls, and extended thinkingWith low effort, you get more concise and efficient answers; with high effort, you get detailed analysis and extensive explanations for complex scenarios.

Billing also introduces practical nuances: Anthropic recognizes automatic optimizations that add few tokens to requestsHowever, he clarifies that these tokens added by the system are not charged. Small details, yes, but they matter when scaling.

Fix Error 8DDD0020 in Microsoft Update

More than code: agents, office automation and computer use

Opus 4.5 aims high in programming, but Their improvements don't stop thereThe model excels at creating professional documents, spreadsheets, and presentations, and at research tasks with multiple sources, managing long threads without losing relevant context.

In agent capabilities, coordination goes up a level: Effective management of sub-agent teams For complex multi-agent systems, capable of dividing work, prioritizing, and progressing stably for hours in long workflows.

On the computer usage front, the update incorporates a zoom action for detailed inspection of on-screen regions at full resolution. This is useful for reading small print, analyzing interfaces with dense information, or verifying details before taking action.

The combination of reasoning, tools, and memory allows Opus 4.5 to undertake tasks include migration and code refactoring, report generation, and desktop automation. with fewer turns and less token waste.

Safety and robustness: alignment versus prompt injection

Autonomy raises questions about control and reliabilityHere, Anthropic claims that this is its more robustly aligned model To date, there has been concrete progress against instruction injection attacks that attempt to deflect system behavior.

This is not a minor detail: Deploying agents with access to tools requires additional defenses.The company maintains that it has strengthened the barriers without compromising usability. Even so, they recommend good design practices and human oversight in sensitive scenarios.

Ecosystem and apps: What's new in Claude Code, Chrome and Excel

The update doesn't stop at the model. It extends to the product stack. Claude Code improves its planning modeBefore getting started, ask clarifying questions and create an editable file with the plan to facilitate review and control.

In the browser, Claude for Chrome is released for Max userswith the promise of managing tasks across multiple tabs and coordinating actions within longer work sessions. For those who work with spreadsheets, Claude for Excel comes to Max, Team and Enterprise, with support for charts, pivot tables and file uploads.

In the app, one of the most practical new features is that Long conversations no longer get stuckThe system automatically summarizes the previous context as needed to extend sessions, maintaining consistency and traceability of decisions.

All of this comes alongside the availability of Opus 4.5 in the API and on the main cloud platformsThis facilitates integration into existing pipelines without waiting for dedicated deployments.

Three 4.5 models for different needs: Opus, Sonnet and Haiku

The 4.5 family is organized into three profiles. Opus 4.5 is the ultimate intelligence With practical performance for high-level specialized tasks, professional engineering, and advanced agents. It is the only one that accepts the effort parameter.

sun 4.5 It's the workhorse for coding and complex agents. It brings improvements across the entire development lifecycle: systems planning and design, security engineering, more accurate instruction following and a concise and natural communication style, with fact-based progress updates.

In agent capabilities, Sonnet 4.5 works autonomously for hours while maintaining focus. with awareness of the context and the token budget in real time. It uses parallel tool calls, better coordinates multiple sources, and preserves state between long sessions.

Haiku 4.5 It focuses on speed and cost, achieving near-frontier performance at a third of the price, with more than double the speed of Sonnet 4. It brings for the first time thought extended to the Haiku line, with optional thought summary, interspersed between tool calls and thought token budget control.

With this, Anthropic restores balance to its catalogIn recent months, Sonnet 4.5 overshadowed the older Opus 4.1; now each model is regaining its place in terms of cost, speed, and capacity.

What Is An SRT File? What Is It For And How To Open It

Using tools and new APIs: what changes on a daily basis

For multi-tool workflows, Anthropic introduces programmatic tool callsThe model can write code that invokes tools within an execution container, reducing round-trip latency and filtering data before loading it into the context window.

If you have hundreds of tools, the new search for tools It allows you to discover and dynamically load only what's needed. There are two variations: using regex patterns with the tool tool_search_tool_regex_20251119and through natural language queries with tool_search_tool_bm25_20251119, saving 10.000 to 20.000 context tokens by not loading the entire catalog.

To improve the accuracy of summons, you can contribute examples of tool usage with valid inputs to guide the model through complex schemes; and if you're concerned about context, there are context editing which automatically cleans up old calls and results when the token limit is approaching.

In execution control, the 4.5 models include new reasons for stopping: model_context_window_exceeded to indicate that the context window has been reached, differentiating it from the top of max_tokensand the reason refusal This update addresses issues that arise when the system refuses to generate content for security reasons. Additionally, it fixes a bug that preserves line breaks when passing parameters to tools.

Extended thinking returns a summary of the internal process in the messaging API, and when transmitting, it may arrive in fragmented deliveries with small delays; nothing critical, but it's worth keeping in mind for the UX of streaming.

Development tools: text editor and code execution

If you use Claude's text editor, there's a new version: type of tool text_editor_20250728 with name str_replace_based_edit_tool, and the command undo_edit It is no longer supported. Note if you are migrating from Sonnet 3.7.

For code execution, the following is recommended: self code_execution_20250825which adds commands Bash and file manipulation. The legacy variant code_execution_20250522 It's still available, but since it's only Python It is not recommended for new implementations.

These changes, along with the support of interspersed use of tools and extended thinkingThey push towards more natural flows in which the model reasons, consults tools and continues the conversation without artificial jumps.

Pricing, endpoints, and cloud availability

With the price dropping to $5 per million tokens entering and $25 per million exiting, The 4.5 models maintain competitive pricesThere is also a new feature for endpoints when they are consumed via cloud providers.

AWS Bedrock and Google Vertex AI offers global and regional endpoints For Opus 4.5, Sonnet 4.5, and Haiku 4.5, regional services guarantee geographic routing with a 10% price premium. Anthropic's proprietary API is global by default and is unaffected by this change.

Opus 4.5 is available in Anthropic's applications, API, and main platforms, including integrations like Amazon Bedrock. This reduces There for implementation in business environments.

Limitations and fine print: what you should know

The Achilles' heel, for the moment, is the usage limits and quotasEven for Pro and Max plans, tokens run out quickly, and the counter resets every five hours from the first message. Since Opus is the most powerful plan, it also consumes tokens more rapidly, leading to frustration for users who pay $20 or even $100 per month.

Anthropic prioritizes availability. United States and Western EuropeIf you operate from Latin America or Asia-Pacific, latency may increase and local language support may be more limited. It's advisable to measure real-time latency before committing to critical deployments.

Another point is the dependence on connectivity and cloud servicesIntegrations like Excel and Chrome rely on cloud services. For regulated sectors that require on-premises deployments, private deployments will need to be negotiated, or open models considered in specific scenarios.

Finally, getting the most out of Opus 4.5 requires Training in prompt engineering, context management, and debuggingWithout good practices, capacity is wasted and token spending skyrockets; it's worth investing in internal training.

How to use Meta's MusicGen locally without uploading files to the cloud

Strategy and comparison: where it fits in against OpenAI and Google

With Opus 4.5, Anthropic positions itself as premium provider for professionals and developersCompeting head-to-head in applications where precision and reliability are paramount, this competition has unleashed a price and capability war that benefits the buyer, and the combination of performance, cost, and effort control is a powerful lure.

Compared to rivals, Opus 4.5 excels in workflows with autonomous tools and agentsIn multimodality or pure reasoning, the gap with some competing models is smaller, but the set of functions of the 4.5 ecosystem tips the scales in production scenarios that require persistence and coordination.

When to migrate and how to evaluate it within the company

If you're coming from Claude 3.5 or Opus 4.1, consider making the jump when you need complex reasoning, high token volume, or agent capability with access to tools. At over 10 million tokens per month, the savings offset the reconfiguration effort.

Anthropic documents migration routes with and without break-away shifts: Sonnet 3.7 to Sonnet 4.5, Haiku 3.5 to Haiku 4.5 (more changes), and smooth upgrades from Opus 4.1 to Sonnet 4.5 or Opus 4.5. It is advisable to review the checklists before moving production environments.

To make the decision, ask yourself if you have repetitive processes with sufficient volume, control over data and validation criteriaAnd clear KPIs that measure hours saved, errors, and response times. Without metrics, any pilot is left adrift.

Practical checklist for secure adoption: privacy policies and DPA, bounded proof of concept (for example, support tickets or meeting summaries), internal training of two key people, cost monitoring with alerts, and a contingency plan in case the service changes or fails.

For developers: Use Opus 4.5 in Cursor and Claude Code

To take advantage of Opus 4.5 in day-to-day development, Create an Anthropic account and generate an API keyActivate access to the model according to your plan (Max, Team, or Enterprise) and configure your usual tools, including Deepseek Coder.

In Cursor, add the Anthropic API key in the models section and select Opus 4.5 in the AI-powered chat panel. You can work with chat-assisted autocomplete and multi-agent flows directly in the IDE; there is a paid Cursor Pro plan that, according to the tool, enables simplified access to advanced models.

In Claude Code, launch the CLI in your project directory, Log in with your password and change the model using the selection command. From there, activate planning mode to suggest steps before you touch code, and use it to refactor, debug, or run goal-directed scripts.

Best practices: switch between templates as needed (Haiku or Sonnet for light tasks, Opus when reasoning demands itMonitor token usage to prevent drift and respect rate limits. If authorization errors appear in third-party tools, check that your account has the model enabled and that you are using the latest version of the client.

For frequently asked questions, please consult the tools help center and forums, where known incidents such as fragmented responses in extended thinking or unauthorized model messages are listed when the API key does not match the contracted plan.

In light of all the above, Opus 4.5 combines benchmarking muscle, fine-tuned cost control, and platform improvements This makes it especially attractive for software engineering, office automation, and autonomous agents. The issue of usage limits remains to be resolved to fully round out the experience, but the direction is clear: higher quality per token and an ecosystem better prepared for real, sustained work.

Isaac

Passionate writer about the world of bytes and technology in general. I love sharing my knowledge through writing, and that's what I'll do on this blog, show you all the most interesting things about gadgets, software, hardware, tech trends, and more. My goal is to help you navigate the digital world in a simple and entertaining way.