Anthropic Launches Smaller, Faster Claude Haiku 4.5 AI Model

The new Claude generative AI model, called Haiku 4.5, has the same coding capability as the company’s Sonnet 4 model in a smaller, faster package, Anthropic said in a press release Wednesday. The new model is available to everyone and will be the default model for free users on Claude.ai.
Anthropic claims that Haiku 4.5 is significantly faster than Sonnet 4 at a third of the cost. When using Claude for Chrome, an extension that gives Chrome users AI capabilities in their browser, Anthropic said Haiku 4.5 is faster and better at agentic tasks.
Since Haiku 4.5 is a small model, it can be deployed as a subagent for Sonnet 4.5. So while Sonnet 4.5 plans and organizes complex projects, the smaller Haiku subagents can complete other tasks in the background. For coding tasks, Sonnet can handle high-level thinking while Haiku takes care of other tasks such as refactors and migrations. For financial analysis, Sonnet can perform predictive modeling while Haiku monitors data feeds and tracks regulatory changes, market signals and portfolio risks. On the research side, Sonnet can perform comprehensive analysis while Haiku reviews literature, gathers data, and synthesizes documents from multiple sources.
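The planner/subagent pattern described above can be sketched in a few lines. This is a minimal illustration, not Anthropic's implementation: the `call_model` helper and the model-name strings are hypothetical stand-ins for real API calls.

```python
# Sketch of the orchestration pattern: a larger model plans,
# smaller subagent models execute pieces of the plan in parallel
# or in the background. call_model is a hypothetical stand-in
# for an actual API request.

def call_model(model: str, task: str) -> str:
    # Stand-in for a model API call; tags the result with the model name.
    return f"[{model}] {task}"

def orchestrate(project: str, subtasks: list[str]) -> dict:
    # The large model produces the high-level plan...
    plan = call_model("sonnet-4.5", f"plan: {project}")
    # ...while the small model works through the individual subtasks.
    results = [call_model("haiku-4.5", t) for t in subtasks]
    return {"plan": plan, "results": results}

out = orchestrate("migrate billing service",
                  ["refactor module A", "update schema migration"])
print(out["plan"])  # prints "[sonnet-4.5] plan: migrate billing service"
```

In a real deployment the subtask calls would run concurrently, which is where the smaller model's speed pays off.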
Haiku’s speed also helps on the chatbot side, processing requests faster.
“Haiku 4.5 is the latest iteration of our smallest model, and it’s designed for anyone who wants Claude’s superior intelligence, reliability and creative partnership in a lightweight package,” Anthropic CEO Mike Krieger said in a statement provided to CNET.
Given the high expense of training and deploying AI models, companies have looked for ways to build smaller, more efficient models that still perform well. An AI query generally consumes significantly more energy than a Google search, though how much depends on the size of the model. A large model with more than 405 billion parameters can consume 6,706 joules of energy per query, enough to run a microwave for eight seconds, according to a report from MIT Technology Review. A small model with eight billion parameters, by contrast, can consume as little as 114 joules, the equivalent of running a microwave for about a tenth of a second. A Google search consumes roughly 1,080 joules.
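The microwave comparisons above are easy to sanity-check. The implied microwave wattage below is our own back-of-envelope assumption, derived from the article's 6,706 joules ≈ 8 seconds figure.

```python
# Back-of-envelope check of the energy figures cited above.
# The microwave wattage is inferred from the large-model figure
# (6,706 J over 8 seconds), not stated in the source.
MICROWAVE_WATTS = 6706 / 8  # ≈ 838 W

def microwave_seconds(joules: float) -> float:
    # Energy (J) divided by power (W) gives runtime in seconds.
    return joules / MICROWAVE_WATTS

print(round(microwave_seconds(6706), 2))  # large model: 8.0 s
print(round(microwave_seconds(114), 2))   # small model: 0.14 s
print(round(microwave_seconds(1080), 2))  # Google search: 1.29 s
```

The small-model figure works out to slightly more than the "tenth of a second" cited, but the orders of magnitude hold: the large model uses nearly 60 times the energy of the small one per query.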
Letting smaller, more efficient models handle simpler queries or background tasks can yield significant savings on server costs. OpenAI's GPT-5, for example, can switch between models, giving instant answers to lighter questions while reserving more compute for complex queries. Such energy-saving measures matter as AI companies look to recoup the billions being spent on data center investments.
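The model-switching idea can be illustrated with a toy router. The heuristic and model names here are entirely our own for illustration; real routers use learned classifiers, not keyword checks.

```python
# Toy illustration of query routing: light queries go to a cheap,
# fast model; complex ones go to a larger model. The heuristic
# below is a made-up example, not any vendor's actual logic.

def route(query: str) -> str:
    # Crude complexity signal: long prompts or multi-step requests
    # get the large model; everything else gets the small one.
    complex_markers = ("step by step", "analyze", "prove")
    if len(query) > 200 or any(m in query.lower() for m in complex_markers):
        return "large-model"
    return "small-model"

print(route("What time is it in Tokyo?"))          # prints "small-model"
print(route("Analyze this dataset step by step"))  # prints "large-model"
```

Even a router this crude captures the cost logic: if most traffic is light, most requests never touch the expensive model.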




