Navigating the AI Ecosystem as a Senior Engineer

A practical approach to selecting AI models, IDEs, and CLI tools in the rapidly evolving agentic development landscape.

By Katia Wheeler · November 17, 2025

As a Senior Developer at Shopify, I’ve had access to a large chunk of the current AI ecosystem since Shopify went all in on AI back in April. With new models, IDEs, and CLI tools launching what feels like every other week, it can be overwhelming to figure out:

- Which model to use for what
- Which IDE actually improves your workflow
- When a CLI agent is better than an in-editor one

This post walks through how I work in this new agentic style and how I evaluate the tools I use day to day.

Agents

It’s currently Nov 17, 2025 as I’m writing this and the rate at which agents are changing is wild. Anthropic’s Opus 4.1 dropped in August and is already considered “legacy.” That’s how fast the agent landscape is moving.

New agent models ship almost bi-weekly, each claiming to out-reason, out-code, or out-perform the others. I’ve stopped trying to evaluate everything and instead doubled down on a small set of agents that I actually rely on.

These are the ones that stuck for me and why.

Sonnet 4.5

Sonnet 4.5 is my default coding workhorse.

Use cases: bug fixes, quick iteration, and most planning
Why I like it:
- Fast enough to stay in flow
- Solid reasoning and implementation quality
- Consistent across sessions and codebases

The Sonnet series has always struck a nice balance between execution and reasoning, and 4.5 continues that trend. I reach for it when I want something that “just works” across a wide range of tasks without a lot of prompt wrangling.

I used to default to Opus 4.1 for planning (see below), but since it’s now treated as “legacy,” I lean more on Sonnet 4.5 so I’m not building workflows on top of models that are actively being phased out.

Haiku

If Sonnet 4.5 is my reliable all‑rounder, Haiku is my speed demon. Haiku can handle many of the same categories of work (such as bug fixes, quick edits, short explanations, small refactors) but with an emphasis on latency and cost-efficiency over depth.

Use cases: tight feedback loops while coding; small, well-scoped changes (e.g., “rewrite this function,” “add basic tests,” “clean up this file”); quick clarifications in context (“what does this function do?”, “why is this failing?”)
Why I like it:
- Very fast responses
- “Good enough” reasoning for well-framed, local tasks
- Great for staying in flow when I don’t need full architectural thinking

Haiku can reason and plan, but where it really shines for me is execution on clearly bounded tasks, especially when I already know what I want and just need it done quickly.

How I Split Reasoning vs Execution

Both Sonnet 4.5 and Haiku can technically handle reasoning and execution, but I get the best results by giving them distinct roles in my workflow:

Reasoning / Planning → Sonnet 4.5
- Designing a feature end-to-end
- Exploring tradeoffs between approaches
- Breaking down a ticket into implementable steps
- Thinking through edge cases and failure modes

Example: “Given this existing service, design how we’d add multi-tenant support, including data model changes, API updates, and migration strategy.”

Execution / Local Changes → Haiku
- Implementing a single step from the plan
- Editing or refactoring a file or two
- Adding tests for a specific scenario
- Translating a clear spec into code

Example: “Here’s the plan from Sonnet. Implement step 2 in billing_service.rb and add tests for the new error handling behavior.”

In practice, a typical flow might look like:

Use Sonnet 4.5 to:
- Understand the problem
- Propose an approach
- Break the work into concrete steps

Use Haiku to:
- Implement those steps quickly and iteratively
- Handle the repetitive or mechanical parts of the work
- Polish and refine code as you go

This split keeps me from overloading one model with every kind of task and lets each one operate where it’s strongest: Sonnet for thinking, Haiku for doing.
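
The split above can be sketched as a small plan-then-execute loop. This is only an illustration under stated assumptions: `Model`, `parse_steps`, `plan_then_execute`, and the two stub models are hypothetical names, the stubs stand in for whatever client you use to reach a planning model and an execution model, and the one-step-per-line plan format is an assumption rather than something either model guarantees.

```python
import re
from typing import Callable, List

# Hypothetical: a "model" is any callable that takes a prompt and returns text.
Model = Callable[[str], str]

# Matches a leading "- ", "* ", "1. " or "1) " marker on a plan line.
_BULLET = re.compile(r"^\s*(?:[-*]|\d+[.)])\s+")


def parse_steps(plan: str) -> List[str]:
    """Split a plan into steps, assuming one step per non-empty line."""
    steps = []
    for line in plan.splitlines():
        cleaned = _BULLET.sub("", line).strip()
        if cleaned:
            steps.append(cleaned)
    return steps


def plan_then_execute(task: str, planner: Model, executor: Model) -> List[str]:
    """Ask the planner for steps, then hand each step to the executor."""
    plan = planner(f"Break this task into concrete steps, one per line:\n{task}")
    results = []
    for step in parse_steps(plan):
        results.append(executor(f"Implement this step only:\n{step}"))
    return results


# Stub models so the sketch runs without any API access.
def stub_planner(prompt: str) -> str:
    return "1. Add tenant_id column\n2. Scope queries by tenant\n"


def stub_executor(prompt: str) -> str:
    return "done: " + prompt.splitlines()[-1]


if __name__ == "__main__":
    for outcome in plan_then_execute(
        "Add multi-tenant support", stub_planner, stub_executor
    ):
        print(outcome)
```

Keeping the planner and executor as plain callables means either side can be swapped for a different model without touching the loop.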

Composer‑1

Composer‑1 came onto my radar with the release of Cursor 2.0.

Use cases: greenfield work, exploration, rough drafts
Why I like it:
- Very fast
- “Good enough” reasoning for early iterations
Where it falls short:
- Not always accurate
- Can be brittle when working in large, established codebases

If I’m spiking a new idea, prototyping a feature, or scaffolding something from scratch, I’ll happily throw Composer‑1 at it. But when I’m working inside a mature, complex codebase (especially at Shopify scale), I still default to Sonnet 4.5 for its stability and consistency.

GPT‑5

GPT‑5 has essentially replaced Google for me.

Use cases: research, learning, concept explanation, broad overviews
Why I like it:
- Strong at explanation and education
- Great at pulling together context and summarizing
- Usually the most reliable in terms of factual accuracy when carefully prompted

It’s not always the fastest option, and I don’t lean on it as heavily for day-to-day coding compared to Sonnet. But for answering “why” questions, exploring unfamiliar domains, or synthesizing long-form content, GPT‑5 sits at the top of my stack.

Honorable Mention: Opus 4.1

Even with its “legacy” status, Opus 4.1 is still one of my favorite models for planning.

Use cases:
- High-level system design
- Implementation plans
- Architectural blueprints
Why I like it:
- Concise while still going deep where it matters
- Strong at exploring tradeoffs and edge cases

When I need a detailed, well-structured plan that balances pragmatism and thoroughness, Opus 4.1 still delivers. I use it a bit less now due to its legacy status, but it’s absolutely worth mentioning because of how much it shaped my current workflow.

IDEs

VS Code is no longer the only serious option in town. New AI-native or AI-augmented editors like Cursor, Zed, and Kiro have entered the picture, and I’ve ended up with a multi-IDE setup depending on what I’m working on.

Cursor

I avoided Cursor for a while. Before Cursor 2.0, it felt bloated, clunky, and absolutely brutal on my CPU.

Post‑2.0, it’s become my primary editor of choice.

Why I like it:
- The agentic view is clean and well-integrated
- It genuinely supports a flow state while coding with an AI partner
- The review experience of seeing diffs, understanding changes, and integrating them is solid
- Performance has improved to the point where it no longer hammers my machine

If you bounced off Cursor before because it felt heavy or unstable, it’s worth giving 2.0+ another shot.

Zed

Zed has carved out a distinct niche for me:

- Open source
- Lightweight and extremely fast
- A growing extension ecosystem
- Highly configurable

I use Zed primarily for:
- Navigating large codebases
- Quickly jumping between files and symbols
- Situations where I want raw speed over heavy AI integration

It does have a built-in agent pane that you can configure for any agent, but I often still default to a CLI-based agent instead (see below). The in-editor agent sometimes feels a bit laggy to me, though that might just be my perception next to the snappiness of the rest of Zed.

Overall, Zed is a very solid choice when you care about performance and minimal friction.

Kiro

Kiro, Amazon’s spec-driven IDE, is an interesting experiment in how to structure AI-assisted development.

Pros:
- Spec-first flow can be helpful during planning
- Encourages explicit design before implementation
Cons (for my workflow):
- Feels slow
- Writing a spec for every change becomes tedious fast

I’ll occasionally use Kiro during a planning-heavy phase, especially when I want stricter structure around what I’m building. But more often than not, I find myself reaching back for Opus 4.1 or Sonnet in a dedicated agent environment instead.

CLI Tools

Having an agent that’s not tied to an IDE but still fully integrated into your machine is an underrated superpower.

With CLI-based agents, I’ve been able to:

- Drive workflows in tools like Obsidian
- Generate daily summaries of my work
- Track “wins” and notable events for my brag doc
- Work across multiple repos or non-repo code and notes

These tools shine outside the narrow world of a single editor or git repository.
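
To make the daily-summary idea concrete, here is a minimal sketch of the kind of script a CLI agent might write or run for you. It is an assumption-laden illustration, not the author's setup: the repo path, the function names `todays_commits` and `render_summary`, and the note layout are all hypothetical. The only external command used is `git log` with its standard `--since` and `--pretty=format:` options.

```python
import subprocess
from datetime import date
from pathlib import Path
from typing import Dict, List


def todays_commits(repo: Path) -> List[str]:
    """Return today's commit subjects for one repo via `git log`."""
    result = subprocess.run(
        ["git", "-C", str(repo), "log", "--since=midnight", "--pretty=format:%s"],
        capture_output=True,
        text=True,
        check=False,  # a missing or non-git directory just yields no commits
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


def render_summary(day: date, commits_by_repo: Dict[str, List[str]]) -> str:
    """Format commit subjects as a plain-text note, skipping idle repos."""
    lines = [f"Daily summary for {day.isoformat()}", ""]
    for repo, subjects in commits_by_repo.items():
        if not subjects:
            continue
        lines.append(f"{repo}:")
        lines.extend(f"- {subject}" for subject in subjects)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Hypothetical repo location; point this at your own checkouts.
    repos = [Path.home() / "src" / "example-repo"]
    note = render_summary(
        date.today(), {repo.name: todays_commits(repo) for repo in repos}
    )
    print(note)
```

Writing the rendered note into a notes folder (for example an Obsidian vault) is then a single `Path.write_text` call, and the same output can feed a brag doc.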

Claude Code

You’re absolutely right if you think Claude Code is still the reigning champion of CLI-based agents in my setup.

Strengths:
- Consistent reasoning
- Strong coding output, especially for refactors and multi-file changes
- Good at maintaining context over longer sessions
When it goes off the rails:
- Like any model, it can hallucinate
- The upside is that you can usually:
  - Redirect it with clearer instructions, or
  - Clean up its “memory” / context and nudge it back on track

I use Claude Code for anything that feels like “AI terminal pair-programming.” From writing and editing code to generating summaries of my day and updating my personal knowledge base, I know Claude Code can handle it.

OpenCode

OpenCode fills a key gap: it gives you a Claude Code–style CLI experience but with the ability to plug in different models.

Why it’s useful:
- Open source
- Model-agnostic: you can bring your own keys and swap in alternatives
- Great if:
  - You don’t have an Anthropic subscription
  - You want to experiment with multiple providers behind a single workflow

If you like the idea of a CLI agent but don’t want to be tied to one vendor, OpenCode is worth a look.
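
The general idea of a model-agnostic workflow can be sketched in a few lines. To be clear, this is not OpenCode's actual design or API: `ChatBackend`, `EchoBackend`, and `Agent` are hypothetical names used only to show how a workflow written against one small interface lets providers be swapped underneath it.

```python
from typing import Dict, Protocol


class ChatBackend(Protocol):
    """Hypothetical interface: anything that can answer a prompt."""

    def complete(self, prompt: str) -> str: ...


class EchoBackend:
    """Stand-in backend; a real one would call a provider's API with your key."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class Agent:
    """Routes prompts to whichever registered backend is currently active."""

    def __init__(self, backends: Dict[str, ChatBackend], default: str) -> None:
        if default not in backends:
            raise KeyError(f"unknown backend: {default}")
        self._backends = backends
        self._active = default

    def use(self, name: str) -> None:
        """Switch the active backend without changing any calling code."""
        if name not in self._backends:
            raise KeyError(f"unknown backend: {name}")
        self._active = name

    def ask(self, prompt: str) -> str:
        return self._backends[self._active].complete(prompt)


if __name__ == "__main__":
    agent = Agent(
        {"provider_a": EchoBackend("provider_a"), "provider_b": EchoBackend("provider_b")},
        default="provider_a",
    )
    print(agent.ask("Summarize this diff"))
    agent.use("provider_b")
    print(agent.ask("Summarize this diff"))
```

Because callers only ever see `ask`, trying a second provider is a one-line change rather than a rewrite of the workflow.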

Codex

In my setup, Codex is the OpenAI-flavored alternative to Claude Code:

What it does:
- Lets you interact with GPT models via the CLI
- Brings the power of GPT‑5 (and friends) to your terminal workflows
Where I use it:
- Research tasks directly from the terminal
- Quick Q&A while working inside a repo
- Generating or tweaking scripts and one-off utilities

I tend to reach for Codex when I specifically want GPT’s style of reasoning or explanation but in a CLI-first workflow rather than a web UI or IDE integration.

Closing Thoughts

The AI tooling landscape is moving too fast for any one person to thoroughly evaluate every new model, IDE, or agent that drops. Instead of chasing everything, I’ve focused on a small, opinionated stack:

Agents
- Sonnet 4.5 for day-to-day coding and planning
- Haiku for fast execution of small, well-scoped changes
- Composer‑1 for fast greenfield iteration
- GPT‑5 for research and explanation
- Opus 4.1 (legacy, but still excellent) for deep planning
IDEs
- Cursor 2.0+ as my primary AI-native editor
- Zed for performance and large codebase navigation
- Kiro occasionally, when I want a spec-first workflow
CLI Tools
- Claude Code as my main terminal agent
- OpenCode for model-flexible CLI workflows
- Codex when I want GPT models at the command line

The “right” stack will vary by team, company, and personal preference. But if you’re feeling overwhelmed by the sheer volume of options, my recommendation is:

- Pick one primary agent, one IDE, and one CLI tool to really invest in and learn
- Use them deeply for a few weeks
- Only then start swapping components to see what meaningfully improves your flow

The goal isn’t to use all the tools. It’s to build a setup where the AI feels less like a novelty and more like a trusted teammate embedded in your everyday workflow.

👋 Bye bye!

Originally published on Medium.