Claude Code in May 2026: what the AI CLI can do

Most conversations Rough Works has with prospective clients in 2026 start with a question: "What's actually possible with Claude right now?" The honest answer is that the surface area has expanded faster than any single article can keep up with. Below is the May 2026 snapshot: what Anthropic has shipped recently, what's available across the developer platform, what you can do with it on your own without hiring anyone, and where an agency partner adds value beyond what individual access provides.

The model layer: Opus 4.7 is the new default

Claude Opus 4.7 dropped on April 16, 2026. It's the most consequential model release of the year for developers because it changes the cost-benefit math of using the top-tier Claude model on production workloads.

Specifics that matter, pulled from the launch announcement:

A reported 13% lift in resolution on a 93-task coding benchmark over Opus 4.6. For real-world coding workloads — refactors, multi-file changes, integration work — that's a real and measurable improvement, not benchmark noise.
Vision inputs up to 2,576 pixels on the long edge. That's more than three times the previous limit. Dense screenshots — full-page Figma exports, codebase diagrams, screenshots of broken UI — go in without downscaling. The model reads them at the resolution they were captured.
Substantially better instruction following. Earlier Claude models interpreted instructions loosely; 4.7 follows them more literally. Important caveat: prompts tuned for older models may now over- or under-trigger. Production prompts need retuning.
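To make the long-edge limit concrete, here is a minimal Python sketch of the arithmetic: given an image's pixel dimensions, it works out whether the image already fits under 2,576 pixels on the long edge and, if not, what size preserves the aspect ratio. The function name `fit_long_edge` is ours, purely for illustration, and the sketch says nothing about how Anthropic's API itself handles oversized images.

```python
from typing import Tuple

# Long-edge limit quoted above for Opus 4.7 vision inputs.
MAX_LONG_EDGE = 2576


def fit_long_edge(width: int, height: int,
                  max_long_edge: int = MAX_LONG_EDGE) -> Tuple[int, int]:
    """Return (width, height) scaled down so the longer side fits the limit.

    Images already within the limit come back unchanged; otherwise the
    aspect ratio is preserved. Hypothetical helper, not part of any SDK.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    long_edge = max(width, height)
    if long_edge <= max_long_edge:
        return width, height
    scale = max_long_edge / long_edge
    # Round to whole pixels, never letting a side collapse below 1 px.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

A 1920x1080 screenshot passes through untouched, while a 5152x2576 export would come back as 2576x1288.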

For real work, the practical effect is: Opus 4.7 is now the right default for high-stakes coding tasks. Sonnet 4.6 remains the right call for high-throughput, latency-sensitive work; Haiku 4.5 still handles classification and short-form transforms cheaply.
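The split above can be written down as a small routing rule. This is a sketch under our own assumptions: the `Task` fields and the tier labels are hypothetical placeholders, not real API model identifiers, which should be taken from Anthropic's model documentation.

```python
from dataclasses import dataclass

# Placeholder tier labels that mirror the article. These are NOT real API
# model identifiers; look those up in Anthropic's model documentation.
OPUS = "opus-4.7"
SONNET = "sonnet-4.6"
HAIKU = "haiku-4.5"

# Task kinds treated as cheap, short-form work (an assumption of this sketch).
CHEAP_KINDS = {"classification", "short_transform"}


@dataclass(frozen=True)
class Task:
    kind: str                        # e.g. "refactor", "classification", "chat"
    high_stakes: bool = False        # a wrong answer is expensive
    latency_sensitive: bool = False  # a user or pipeline is waiting on it


def pick_model(task: Task) -> str:
    """Pick a tier using the rule of thumb described above."""
    if task.kind in CHEAP_KINDS:
        return HAIKU
    if task.high_stakes and not task.latency_sensitive:
        return OPUS
    # Everything else, including latency-sensitive work, goes to Sonnet.
    return SONNET
```

Keeping the rule in one function means a model upgrade or a pricing change touches a single place instead of every call site.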

Claude Design is the surprise new product

On April 17, 2026 — one day after Opus 4.7 — Anthropic Labs released Claude Design, a visual collaboration tool that produces HTML/CSS prototypes from prompts. It's not a Figma replacement, and it's not a Webflow competitor exactly. It sits in a category Anthropic mostly invented: a design-to-code environment where the design and the implementation are the same artifact from the moment you start.

The output is production-grade HTML/CSS/JS in a single file. No build step, no framework, no compilation. Modern HTML and CSS are powerful enough that a single-file site can use scroll-driven animations, view transitions, CSS Grid, container queries, and variable fonts — without any toolchain. It runs in the browser. A client can review a prototype in any browser tab.

For more on what Claude Design does well and where it stalls — and why it's not a one-click design solution despite the marketing — see our walkthrough of Claude Design and what it means for design agencies.

The CLI surface: what Claude Code actually does

The core Claude Code product is the CLI. It runs in any modern terminal and is also available through VS Code (including forks such as Cursor and Windsurf), JetBrains IDEs (in beta), and a native desktop app. There's also a web interface at claude.ai/code and a Slack integration for triggering tasks from team chat.

Capabilities that are now standard:

Agentic search across an entire codebase — no need to manually point Claude at specific files for context. It finds relevant code itself.
Coordinated multi-file edits — refactors, migrations, and large-scale renames happen as a single reviewed change.
Full git workflow integration — reading issues, writing code, running tests, opening PRs.
Permission gating — Claude asks before modifying files. There's also --dangerously-skip-permissions for trusted environments and a safer Auto mode for long-running unattended work.
MCP server integration — connect Claude Code to GitHub, Linear, your databases, internal APIs. The Model Context Protocol is now an industry-wide standard with hundreds of community-built servers.
Routines — configurable tasks that run on schedules, API calls, or event triggers. Effectively cron + Claude.
CLAUDE.md project files — repo-level configuration that teaches Claude your project's conventions, constraints, and preferred patterns.

What you can do with Claude Code today, on your own

You don't need an agency to start getting value from Claude Code. The most practical patterns for a small business or solo operator, in order of "value per hour spent learning":

1. Use it as a research assistant in your terminal. "Read these three articles I'll paste in and give me a one-page summary with the action items." Most knowledge workers spend more time than they realize on reading-then-summarizing. The CLI is faster than tabs.

2. Use it to do the boring half of a task. Writing email templates, formatting a CSV, drafting a meeting agenda from rough notes, turning a bullet-list into a thank-you message. The high-leverage version of Claude isn't replacing complicated work — it's eliminating the tedious work that surrounds complicated work.

3. Treat it as a review partner: for code if you can read code, and for documents and data if you can't. Non-developers benefit here too. Paste in a CSV export and ask Claude to flag anomalies. Paste in a vendor's contract and ask for the riskiest clauses. The "I know enough to recognize a problem when it's pointed out" zone is where Claude's value is highest.

4. Set up one repeating workflow. Pick a single task you do every week (generating a status update, drafting client correspondence, summarizing customer feedback) and codify it as a Claude Code Routine. You'll get the result in seconds instead of an hour, every week, at no cost beyond your subscription.

5. Try it on a small first project. Set a budget of five hours and a single goal: "replace this one spreadsheet with a tiny web app" or "automate the invoice-generation script I run on the 1st of every month." Watch what Claude does. The feel for what it's good at is something you can only develop by doing.

The Pro tier is affordable enough that this entire learning curve costs less than one professional services consultation. Anyone reading this article can run the entire playbook above this week.

Where agency partners actually add value

If self-serve gets you 60% of the way, what does an experienced agency partner add beyond a Pro subscription and a willing team?

1. Production architecture, not just prompts. The hard part of an AI-integrated product isn't the prompts — it's everything around them: error handling when the model is slow or rate-limited, retry logic, prompt caching strategy, fallback paths when the API is down, observability so you can see what's happening when something breaks. We've shipped enough Claude-integrated production systems that the operational playbook is reflexive. A first-time integrator will spend two months learning these patterns; we plug them in on day one.

2. Integration with messy real-world systems. Most clients don't actually want "an AI tool." They want their existing CRM, billing system, content stack, and customer-support workflow to quietly absorb Claude-grade capability without changing how their team works. That integration is the work. It's not glamorous; it's where the time goes.

3. Editorial taste over prompt-engineering. Claude Code is increasingly good at the engineering part. It's still not good at deciding what to build. The agency value is in the conversations before the code — figuring out which problems are worth solving, scoping the work realistically, sequencing the build so each piece compounds on the last. That's still a human conversation.

4. Discipline around what not to do. A frequent failure mode in AI projects is "we built five things and shipped none." Agencies that have shipped real Claude-powered work develop a calibrated sense for scope. We say no to a lot of feature requests because we know they'll cause maintenance pain six months from now. That's experience worth paying for.

5. Post-launch operational sense. AI features have ongoing costs (tokens, monitoring, prompt updates as models evolve, occasional unexpected behavior). The work doesn't end at launch. We carry the operational load so the client's internal team doesn't have to become AI specialists.
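As a concrete illustration of the retry logic and fallback paths from point 1, here is a minimal Python sketch. It is not Rough Works' production code and not an SDK feature: `TransientError`, `call_with_retry`, and the `primary` and `fallback` callables are hypothetical names. In a real integration, `primary` would wrap the actual model call and `TransientError` would map to whatever timeout or rate-limit exceptions your client library raises.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Stand-in for a timeout, rate limit, or server error from the model API."""


def call_with_retry(
    primary: Callable[[], T],
    fallback: Optional[Callable[[], T]] = None,
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `primary`, retrying transient failures with exponential backoff.

    If every attempt fails, use `fallback` when one is given (a cached
    answer, a cheaper model, a "try again later" message); otherwise
    re-raise the last error.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Optional[TransientError] = None
    for attempt in range(max_attempts):
        try:
            return primary()
        except TransientError as exc:
            last_error = exc
            if attempt < max_attempts - 1:
                # Backoff doubles each attempt; jitter avoids synchronized retries.
                delay = base_delay * (2 ** attempt)
                sleep(delay + random.uniform(0, delay / 2))
    if fallback is not None:
        return fallback()
    assert last_error is not None  # guaranteed because max_attempts >= 1
    raise last_error
```

Passing `sleep` as a parameter keeps the function testable without real waiting. Only errors you have classified as transient are retried, so a malformed request fails fast instead of being retried three times.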

The pattern for a real engagement

Here's what a typical Claude-integration project looks like when Rough Works runs it:

Week 1 — Discovery. Three meetings. We figure out what business outcome the client actually wants (usually different from the AI feature they initially asked for), map the existing system landscape, identify the two or three highest-leverage integration points. Output: a brief that fits on one page.

Weeks 2-4 — Prototype. A working, end-to-end version of the feature against real data. It isn't pretty yet and it doesn't handle every edge case, but the loop runs. The client clicks through it, finds the parts that don't match their mental model, and redirects us.

Weeks 5-8 — Production hardening. Error handling, retry logic, monitoring, security, performance work, integration with the actual production data, documentation for the client's internal team. This phase is roughly 60% of the total project effort and the part where experienced agency partners earn their fee.

Week 9+ — Launch and operate. We hand off, but stay on retainer for the first three months. Things will break. Prompts will need tuning. The model itself will get updated and behavior will shift. The agency value continues as long as the system is in production.
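One way to keep prompt tuning manageable after a model update is a small regression set: a handful of representative prompts, each with a check that says whether the output is still acceptable. The sketch below is illustrative only. `PromptCase` and `run_regression` are hypothetical names, and `model` is any function that takes a prompt and returns text, so the same cases can be run against the old and the new model before switching.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class PromptCase:
    name: str
    prompt: str
    check: Callable[[str], bool]  # True when the output is acceptable


def run_regression(model: Callable[[str], str],
                   cases: List[PromptCase]) -> List[str]:
    """Run every case through `model` and return the names of failing cases."""
    failures = []
    for case in cases:
        output = model(case.prompt)
        if not case.check(output):
            failures.append(case.name)
    return failures


# Example cases. The checks are deliberately simple string tests; real
# ones would encode whatever "acceptable" means for your feature.
CASES = [
    PromptCase(
        name="summary_stays_short",
        prompt="Summarize this ticket in one sentence: ...",
        check=lambda out: len(out.split()) <= 40,
    ),
    PromptCase(
        name="label_is_valid",
        prompt="Classify this ticket as BUG, FEATURE, or QUESTION: ...",
        check=lambda out: out.strip() in {"BUG", "FEATURE", "QUESTION"},
    ),
]
```

An empty list from `run_regression` means the prompts still behave; any names that come back point to the prompts that need retuning.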

Total clock time: 8-10 weeks for a meaningful Claude-integrated feature on top of an existing system. Total cost: less than what most companies spend on a single quarterly software license.

What's visibly on the roadmap

Anthropic doesn't publish a formal roadmap, but the announcements of the last six weeks point at three clear directions:

1. Managed Agents are about to be the dominant pattern. The Managed Agents API (beta) lets you persist agent configs, version them, and run sessions against pre-built containers. Expect Claude Code itself to expose more of this surface in the next two quarters.

2. The browser is becoming a first-class agent surface. Between Claude Design's in-browser code generation, Claude Code's web interface at claude.ai/code, and computer-use capabilities maturing rapidly, the path of least resistance for many new Claude products is "open a browser tab," not "install a CLI."

3. Vision is no longer a side feature. Opus 4.7's 2,576px vision input changes what's possible for design feedback, visual debugging, screenshot-based QA, and document understanding. Expect more workflows where you screenshot something and ask Claude about it directly.

Stainless and the KPMG signal

Two recent moves that tell you where the industry is going:

The Stainless acquisition (May 18, 2026). Anthropic now owns the toolchain that generates their official SDKs. Expect tighter feature parity between API releases and SDK availability, official SDKs in more languages, and generated client libraries for your APIs becoming a Claude-adjacent product.

KPMG at 276,000 seats (May 19, 2026). The largest single Claude deployment publicly announced to date. A Big Four firm is betting workforce productivity on Claude specifically, not on a multi-vendor strategy. The downstream effect: prospective clients arrive at AI conversations already familiar with Claude.

What to actually do with all this

If you're a business considering an AI engagement:

Spend your first hour playing with Claude directly, on the free tier or on Pro, before you talk to any agency. The fluency you build is worth more than the time it costs.
Identify three boring tasks your team does weekly. Audit reports, content review, lead enrichment. Those are the right first targets.
Don't pick the biggest, most ambitious thing first. "Replace our entire CMS" adds risk; "auto-generate weekly newsletters from approved blog posts" adds leverage. Start small, ship fast, expand.
Then decide whether you need an agency. If the work fits inside what a careful operator can do alone, do it yourself. If it requires production-grade architecture, integration with messy systems, or post-launch operational support — that's where partners earn their keep.

If you're considering Rough Works specifically — come talk to us. We're happy to look at a candidate project and tell you honestly whether you need an agency, or whether what you actually need is a focused weekend with Claude Code's free trial.
