Would you rather have a coder who finishes tasks faster, or one who writes code you’ll thank them for six months later?
That’s the question we found ourselves asking when comparing Opus 4.1 with the newly released Sonnet 4.5. One model often nailed the result right away. The other sometimes stumbled at first but, with a little help from our agentic framework—error fixes, lint checks, and a visual feedback loop—produced cleaner, more maintainable code.
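The loop described above (generate, check, feed the problems back, retry) can be sketched in a few lines. This is only an illustration under stated assumptions: `runFeedbackLoop`, `Check`, `fakeModel`, and `noVar` are hypothetical names, the "model" is a toy function, and the real framework's checks (compile errors, lint, visual diff) are stood in for by a single regex rule.

```typescript
// Hypothetical sketch of a generate-check-repair loop.
// None of these names come from the framework discussed in the post.

// A check inspects generated code and returns a list of problems (empty when clean).
type Check = (code: string) => string[];

interface LoopResult {
  code: string;
  attempts: number;
  clean: boolean;
}

function runFeedbackLoop(
  generate: (feedback: string[]) => string,
  checks: Check[],
  maxAttempts: number = 3,
): LoopResult {
  let feedback: string[] = [];
  let code = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // The generator sees the problems found in the previous attempt.
    code = generate(feedback);
    feedback = [];
    for (const check of checks) {
      feedback.push(...check(code));
    }
    if (feedback.length === 0) {
      return { code, attempts: attempt, clean: true };
    }
  }
  // Out of attempts: return the last output and flag it as not clean.
  return { code, attempts: maxAttempts, clean: false };
}

// Toy stand-ins: a "model" that only fixes its output after receiving feedback,
// and a "lint check" that rejects `var`.
const fakeModel = (feedback: string[]): string =>
  feedback.length === 0 ? "var x = 1;" : "const x = 1;";
const noVar: Check = (code) =>
  /\bvar\b/.test(code) ? ["use const or let instead of var"] : [];

const result = runFeedbackLoop(fakeModel, [noVar]);
console.log(result.attempts, result.clean, result.code);
```

In this toy run the first attempt fails the lint rule and the second passes, which mirrors the "stumbles first, finishes clean" pattern the comparison is about.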
Opus 4.1: Fast, Feature-Rich, and Literal
Opus 4.1 often got things right on the first attempt. It produced fewer initial errors, delivered more complete features (inline editing, keyboard shortcuts, richer avatars), and followed instructions faithfully, even down to keeping typos from the Figma design. That literalness can actually be important when those “typos” are client-specific terms.
If your goal is to ship something quickly and get it working, Opus 4.1 has a clear edge.
Sonnet 4.5: Slower Start, Stronger Finish
Sonnet 4.5 didn’t always shine on the first try. It introduced more errors and sometimes missed smaller instructions. But once it went through the full feedback loop, its advantages stood out: better accessibility (contrast, focus states, scoped widths), cleaner component architecture with encapsulated UX, and smarter prioritization of project-wide theming over Figma quirks.
The end result is code that’s easier to maintain, scale, and reuse—qualities that become more important the longer a product lives.
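Of the accessibility gains listed above, contrast is the one that can be checked numerically. Here is a minimal sketch using the WCAG 2.x relative luminance and contrast ratio formulas, with the 4.5:1 threshold for normal-size text at level AA. The function names and the sample colors are illustrative assumptions, not taken from either model's output.

```typescript
// Sketch of a WCAG 2.x contrast check. Names and sample colors are illustrative.
type Rgb = [number, number, number]; // 8-bit sRGB channels, 0..255

// Relative luminance as defined by WCAG 2.x.
function relativeLuminance([r, g, b]: Rgb): number {
  const linear = (v: number): number => {
    const c = v / 255;
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * linear(r) + 0.7152 * linear(g) + 0.0722 * linear(b);
}

// Contrast ratio: (lighter + 0.05) / (darker + 0.05), ranging from 1 to 21.
function contrastRatio(a: Rgb, b: Rgb): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

// WCAG AA minimum for normal-size text.
const AA_NORMAL_TEXT = 4.5;

function meetsAA(foreground: Rgb, background: Rgb): boolean {
  return contrastRatio(foreground, background) >= AA_NORMAL_TEXT;
}

const white: Rgb = [255, 255, 255];
const black: Rgb = [0, 0, 0];
const gray767676: Rgb = [118, 118, 118];
const gray777777: Rgb = [119, 119, 119];

console.log(contrastRatio(black, white).toFixed(2)); // prints "21.00"
console.log(meetsAA(gray767676, white)); // prints true
console.log(meetsAA(gray777777, white)); // prints false
```

The two grays differ by a single step per channel yet land on opposite sides of the threshold, which is why a check like this is easier to trust than an eyeball comparison against a design file.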
Why Error Recovery Changed Everything
One of the quiet but impactful improvements in Sonnet 4.5 was how it handled error recovery. Earlier, Claude-based runs struggled with unterminated string literals, tripping over contractions like We’ll and I’d. We even had to patch our flow with Gemini 2.5 Pro just to handle these cases.
Sonnet 4.5 fixed that. Suddenly, we no longer needed the Gemini patch. Retries ran faster. The whole system became cleaner and more reliable. A small improvement at the model level reshaped the entire architecture of our pipeline.
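The failure mode is easy to reproduce by hand. The sketch below assumes the generated code is TypeScript, which the comparison above does not state, and `toStringLiteral` is a hypothetical helper rather than part of the pipeline. It shows why an apostrophe breaks a single-quoted string and a few ways to write the same copy safely.

```typescript
// Broken (kept as a comment so this file still compiles): the apostrophe in
// "We'll" closes the single-quoted string early, and the compiler typically
// reports an unterminated string literal.
//
//   const label = 'We'll be in touch';

// Safe alternatives for the same copy:
const doubleQuoted = "We'll be in touch"; // switch the outer quotes
const escaped = 'We\'ll be in touch';     // escape the apostrophe
const template = `I'd like a demo`;       // template literal

// Hypothetical helper for a code generator: JSON.stringify returns a
// double-quoted literal with quotes and backslashes escaped, so arbitrary
// UI copy can be embedded in generated source.
function toStringLiteral(text: string): string {
  return JSON.stringify(text);
}

console.log(doubleQuoted === escaped);     // prints true
console.log(template);
console.log(toStringLiteral("I'd"));       // prints "I'd" (including the double quotes)
```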
Side-by-Side Comparisons
To really see the differences, here are the outputs side by side:
[Three rows of screenshots, one column each for Opus 4.1, Sonnet 4.5, and the Figma design]
This view makes it easier to spot where each model excelled and where they stumbled.
Evolution, Not Just Iteration
The story here isn’t simply that Sonnet 4.5 is “better.” Opus still wins if you measure by speed to first result. But once you give Sonnet the benefit of an agentic framework, its strengths are amplified. It builds code you’ll be glad to maintain months later.
This isn’t just iteration on a model—it’s an evolution in how these systems collaborate with engineering culture.
Additional Insights
Accessibility and interactivity showed up as the biggest tradeoff. Opus leaned toward richer interactivity, while Sonnet leaned toward stronger accessibility. Both matter, depending on your product stage.
We also noticed a different kind of production-readiness between the two. Opus feels ready to ship today. Sonnet feels ready to maintain tomorrow.
And then there’s the question of instruction following versus initiative. Opus followed instructions to the letter, even down to Figma typos. Sonnet sometimes “corrected” them, which can be helpful or harmful depending on context.











