AI-Native Engineer Playbook: Tool Convergence and Why Your 2024 Tool Map Is Obsolete
PREVIOUSLY IN THIS SERIES
Part 1 made the case that AI-native engineering isn't a tool decision - it's a shift in what the practitioner's job looks like. Five activities dominate the role: scoping, delegating, reviewing, extending and redirecting. The shift is from author to orchestrator; the differentiator is judgment about which tool to use when. Part 2 maps where the tool landscape actually stands today.
Tool convergence is the most important shift in how to think about AI engineering tools in 2026, and it's where most teams' mental models are outdated.
The map many people still use - "Copilot is autocomplete, Cursor is autocomplete-plus, Claude Code is the terminal one, Devin is the autonomous one" - was approximately right eighteen months ago. It isn't anymore. The tools have encroached on each other's territory, and the major platforms have all shipped substantive releases in the six weeks leading up to this writing. Six platforms now matter at the frontier of agentic coding.
Where the six major platforms actually stand
Modern Cursor has evolved into a full agentic IDE and control plane. Cloud Agents execute in isolated VMs with terminal, browser and desktop access. The /worktree and /best-of-n commands run parallel agents across git worktrees and models, then surface the best result. Design Mode lets agents act on visual prompts while editing the underlying code; Bugbot runs PR reviews in under 90 seconds. Codebase indexing, Jira integration and the Cursor SDK give agents full project context and a programmatic extension surface. Cursor is used by more than half the Fortune 500 - including 20,000+ developers at Salesforce - and its own framing captures the shift: "the IDE is now a fallback, not the default."
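The best-of-n idea is easy to picture in code. What follows is a minimal sketch of the general pattern, not Cursor's implementation: `run_agent`, `score_patch`, the `Candidate` type and the model names are all hypothetical stand-ins, and a real scorer would more plausibly be a test run or a review pass than a character count.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Candidate:
    model: str
    patch: str
    score: float


def best_of_n(
    task: str,
    models: list[str],
    run_agent: Callable[[str, str], str],
    score_patch: Callable[[str], float],
) -> Candidate:
    """Run one attempt per model in parallel and return the top-scoring one."""
    if not models:
        raise ValueError("need at least one model")

    def attempt(model: str) -> Candidate:
        patch = run_agent(model, task)
        return Candidate(model=model, patch=patch, score=score_patch(patch))

    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        candidates = list(pool.map(attempt, models))

    # max() keeps the first candidate on ties, so model order is the tie-breaker.
    return max(candidates, key=lambda c: c.score)


# Stand-ins for illustration only: a fake agent and a toy scorer (assumptions).
def fake_agent(model: str, task: str) -> str:
    return f"{task} solved by {model}" + "!" * len(model)


def toy_score(patch: str) -> float:
    return float(patch.count("!"))


winner = best_of_n(
    "fix flaky test", ["model-a", "model-bb", "model-c"], fake_agent, toy_score
)
print(winner.model)
```

The interesting design decision is the scorer, not the fan-out: whatever ranks the candidates decides what "best" means, so it deserves the same scrutiny as the agents themselves.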
Microsoft Build 2026 brought GitHub Copilot's biggest architectural shift: the Copilot App, an agent-native desktop experience with a unified "My Work" view across connected repos - active sessions, issues, PRs and background automations. Each session runs in its own git worktree so parallel agents don't collide, with cloud and local sandbox options. Microsoft also shipped the MAI model family - MAI-Thinking-1 for reasoning and MAI-Code-1/Code-1-Flash for coding, trained on the Copilot harness rather than benchmark-tuned - alongside Project Polaris, a mixture-of-experts model that replaces GPT-4 Turbo as Copilot's default in August 2026. Microsoft IQ adds a four-layer context system grounding the AI in M365 signals, structured data, live web and cross-source retrieval. The Copilot SDK is GA across six languages, and multi-agent VS Code ships an orchestrator that decomposes tasks to specialist subagents.
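Worktree-per-session isolation is plain git underneath. As a rough sketch of the idea (not how the Copilot App does it), the helper below builds a `git worktree add` command for each agent session. The directory layout, the `agent/` branch prefix, the repo path and the session names are assumptions made for illustration; only the `git worktree add -b <branch> <path> <commit-ish>` form comes from git itself.

```python
from pathlib import PurePosixPath


def worktree_command(repo_root: str, session_id: str, base: str = "main") -> list[str]:
    """Build the git command that gives one agent session its own checkout."""
    if not session_id or "/" in session_id or session_id.startswith("-"):
        raise ValueError(f"unsafe session id: {session_id!r}")
    # Hypothetical layout: worktrees live beside the repo, one directory per session.
    path = PurePosixPath(repo_root).parent / "agent-worktrees" / session_id
    branch = f"agent/{session_id}"
    # `git worktree add -b <branch> <path> <commit-ish>` creates branch and checkout.
    return ["git", "-C", repo_root, "worktree", "add", "-b", branch, str(path), base]


# Print the commands rather than running them, so the sketch has no side effects.
for session in ["fix-login", "bump-deps"]:
    print(" ".join(worktree_command("/srv/repos/shop", session)))
```

Because each session gets its own directory and branch, two agents can edit the same file at the same time without touching each other's working tree; conflicts surface later, at merge time, where a human or reviewer agent can resolve them.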
Windsurf became Devin Desktop on June 2, 2026. The Agent Command Center is now the default surface - a Kanban board of every local and cloud agent session, shown before you see any code. Cascade reaches end of life on July 1, replaced by Devin Local, a Rust rewrite claiming up to 30% better token efficiency and native subagent support. The platform ships with Agent Client Protocol (ACP), an open Apache-2.0 standard adopted by JetBrains, Google, GitHub and 25+ agents - so Codex, Claude Code, and Gemini can run as first-class agents inside the IDE. Core differentiators remain: Codemaps (AI-annotated visual codebase representations), FedRAMP/HIPAA/ITAR coverage in a single contract, and a fully vertically integrated stack - model, retrieval, IDE, and autonomous agent under one vendor.
Claude Code has evolved from a terminal interface into a complete agent platform. Opus 4.8 is the default model, with a fast mode available for latency-sensitive workflows. Dynamic workflows (research preview) let Claude plan a task and spawn hundreds of parallel subagents in a single session. Agent Teams let multiple Claude sessions message each other, claim tasks and challenge findings - genuine multi-agent coordination without a separate orchestrator. The plugin system supports ten component types, including commands, agents, skills, hooks and MCP servers; the community marketplace has surpassed a million contributions. A built-in Security-guidance plugin runs pattern checks per edit, model reviews per turn, and a deeper agentic review on commit. Anthropic has committed to bringing Mythos-class models to all customers in the near term.
OpenAI Codex has emerged as a legitimate frontier platform in its own right. What was a CLI a year ago is now a unified agent system spanning terminal, IDE extension, ChatGPT web, GitHub bot and computer-use surfaces on a shared model. The current generation runs on GPT-5.5 - OpenAI's first fully retrained base model since GPT-4.5, with agentic-first training - serving roughly four million weekly active developers. Codex Goal Mode turns Codex into a persistent autonomous runtime that runs for hours, retains state across sessions, and surfaces results when done. Codex Security finds and fixes vulnerabilities automatically.
Google now has a coherent coding suite for the first time - and it's the shift most teams haven't internalized yet. Three products form the stack. Antigravity 2.0, Google's agent-first IDE, is a modified VS Code fork that orchestrates parallel subagents powered by Gemini 3.5 Flash and Gemini 3 Pro, shipping with a CLI, SDK and Managed Agents in the Gemini API. It replaces the older Gemini Code Assist IDE extensions and Gemini CLI. Jules, Google's async coding agent, handles the autonomous-platform piece: give it a GitHub issue and it provisions a Cloud VM, plans, codes, runs tests, and opens a pull request. Code Wiki auto-generates structured documentation for any public GitHub repository - architecture, class and sequence diagrams that update on every commit, plus a chat agent grounded in the always-current wiki. For teams onboarding into unfamiliar open-source code or evaluating new dependencies, it has no direct competitor.
What's actually different now
The tool comparison conversations from 2024 are mostly obsolete. Capability differences between Cursor, Copilot, Devin Desktop, Claude Code, Codex and Antigravity on day-to-day work are now small. All six do agentic multi-file editing. All six run parallel agents. All six route between frontier models from multiple providers (often including their own in-house models). All six support MCP, and most now support ACP as well. All six ship cloud-execution surfaces of some kind. Differentiation now comes from four places:
- Workflow style. IDE-embedded vs. terminal-native vs. cloud-autonomous vs. agent-command-center-first. Same capabilities, different ergonomics.
- Extensibility ecosystem. What you can attach to the tool: skills, subagents, rules, plugins, MCP servers, ACP-connected external agents, custom commands. This is where the largest velocity gaps between teams using the same tool now show up.
- Integration posture: vertical vs. ecosystem. One vendor (Devin Desktop) owns the model, IDE, and autonomous agent in a single stack. Some vendors control the surface while leasing the frontier-model layer; others own the model but not the IDE. Each posture trades off differently on lock-in, optionality, and feature speed - and the right pick depends on your organization's broader cloud and identity strategy.
- Model strategy and context management. Which model defaults, which you can swap in, and how the tool handles long context, large repos, and parallel agents. Each platform takes a meaningfully different approach - and at scale, context quality is often the largest performance variable.
If you're picking a tool today and the question is "can this tool do X?" the answer is almost always yes. The better question is "which tool's workflow and ecosystem fit the way our team actually works?" - and that's where the rest of this series goes deeper.
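One way to make the "fit" question concrete is to weight the four axes above by how much each matters to your team, then rate each candidate. This is only a sketch of that exercise: the axis keys, weights, ratings and the `tool_a`/`tool_b` names are placeholders, not assessments of any real product.

```python
AXES = ("workflow", "extensibility", "integration", "model_context")


def fit_score(weights: dict[str, float], ratings: dict[str, int]) -> float:
    """Weighted average of 1-5 ratings across the four axes."""
    total_weight = sum(weights[a] for a in AXES)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[a] * ratings[a] for a in AXES) / total_weight


# Placeholder numbers for illustration; not assessments of any real product.
team_weights = {"workflow": 3, "extensibility": 4, "integration": 2, "model_context": 1}
candidates = {
    "tool_a": {"workflow": 5, "extensibility": 2, "integration": 4, "model_context": 3},
    "tool_b": {"workflow": 3, "extensibility": 5, "integration": 3, "model_context": 4},
}

ranked = sorted(
    candidates,
    key=lambda name: fit_score(team_weights, candidates[name]),
    reverse=True,
)
print(ranked)
```

The number that comes out matters less than the argument it forces: a team that cannot agree on the weights has not yet agreed on how it wants to work.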
What's next
Part 3 covers the extensibility layer underneath the tools - skills, subagents and Agent Teams, rules, MCP, plugins, hooks and the newly relevant Agent Client Protocol (ACP). This is where the productivity multiplier actually lives, and where the largest velocity gaps between teams using identical tools now show up.