Essay

Three Surfaces, One Model

2026-06-21 · 7 min read

TL;DR — Design, build, and runtime inspection each run on a separate Claude surface. It’s the same model in all three, so the leverage isn’t a smarter AI per task — it’s controlling what each surface can see and what it’s allowed to decide. Designing with no access to the code tree is the load-bearing move: it stops me anchoring requirements to the implementation that already exists, so the design follows from what the thing should do, not from how it happens to work today. The build surface implements that design and can’t quietly invent its own enforcement. The handoff between them is friction, and the friction is the point — it forces the design into a real, reviewable contract.

For most of my career the craft moved slowly — I learned how to build, and the methods held for years. That stopped, more or less overnight. The work is done with models in the loop now, and the habits I relied on for most of my career don’t transfer cleanly. So I’m rebuilding my own workflow for this world on purpose, rather than letting it form by default. What follows is one piece of that.

I build software with three Claude surfaces open at once, and the split is deliberate. Design happens in a Claude.ai project. Building happens in Claude Code. When the running thing needs a closer look — and it’s running in the browser — that happens in Claude in Chrome. Running one agent end to end would be simpler, and the simpler thing is the wrong thing: the separation does work that a single window cannot.

This is all recent for me. Seven months ago I was building the way I had built for fifteen years — by hand, no model anywhere in the loop. So nothing here is inherited practice. It is what shook out once I started handing real work to an agent and had to decide where the seams belong.

Start with the fact that kills the obvious explanation: it is the same model in all three places. Whatever reasons about my architecture in the project is the same weights reasoning about my goroutines in the editor. So the value cannot be “a sharper AI for each job.” There is no sharper AI. The value is entirely in what each surface can see and what each surface is allowed to decide. Two variables — context and authority — and I get to set them per surface.

The design surface is where I think. It carries the accumulated knowledge of the project: the decisions log, the architecture, the long memory of how the system arrived at its current shape, and a searchable history of every design conversation I have had inside it. What it does not have, deliberately, is the live code tree. It cannot open a file. It cannot grep. It reasons about the system from the record of intent, not from the current state of the bytes.

The build surface is the inverse. Claude Code has the live tree, runs the tests, drives git, greps the codebase, and sees exactly what is there right now. It is blind to the project grouping and to the design history. It knows the code and knows nothing about the conversations that argued the code into existence.

The third surface is the shortest leg, and the one I reach for least. I don’t run a verification pass every cycle. But when I notice something off in the running thing by hand, I don’t go to the browser myself — I have the build surface write the inspection for me. Claude Code drafts a prompt for Claude in Chrome, sometimes with screenshots folded in, aimed at deep debugging of what is actually deployed: check the live behavior against what was designed, not “does it compile” but “does this match the contract.” A different kind of seeing again: neither the intent nor the source, but the runtime, and only when something prompts the question.

Here is the part that sounds backwards until the alternative has burned you. Designing without the code in context is a feature, not a missing capability. The first time I noticed this I was specifying a security boundary. When you do that with the existing wiring loaded in front of you, you do not reason about what the boundary should be. You reason about what it already is. You read the auth that exists and you start rationalizing it. The implementation becomes the anchor, and every requirement quietly bends to fit the mechanism that happens to be sitting there. You end up reverse-engineering the requirements out of your own code.

Detach the design from the tree and that anchor is gone. I am forced to start from what the seam is supposed to do, not from the function that already handles it. What is it protecting, who is on each side, what must not cross — answered before I am allowed to look at how it is wired today. The same logic holds for verification: I decide what behavior would prove the boundary holds before I see the boundary’s code, so the test derives from the requirement instead of being shaped to pass against whatever got written. Requirements first, verify first. The surface with no code makes that posture structural instead of aspirational.

The membrane splits authority too, not just context. The design surface owns the shape of the seam — where it sits, what it guarantees, what is allowed to cross. The build surface implements that shape and is structurally prevented from inventing enforcement on its own, because it never received the authority to decide what enforcement means. It cannot quietly loosen a check to turn a test green, because the check is not its to define. The decision and the implementation live in different rooms, and the build agent does not hold the keys to the first one. A discipline I would otherwise have to remember on every commit becomes a property of the setup.

None of this is free, and I would be selling something if I pretended otherwise. The handoff between surfaces is friction. I have to take what the design produced and carry it across to the builder by hand. But the friction is the asset. To cross the membrane, the design has to stop being a vibe in a chat window and become an actual artifact — an explicit contract that states what gets built and what the seam guarantees. And the crossing is a checkpoint: a deliberate last look before anything is constructed, at the exact moment when changing my mind is cheapest.

One honest limit: this works best on greenfield. When the tree is small or not there yet, designing without it costs almost nothing — there’s little existing wiring to anchor to in the first place, so the detachment is nearly free. Brownfield is the unsolved case. On an established codebase you have to understand what’s already there to change it safely, and a design surface that can’t read the tree fights you instead of helping. For that work I’m still entirely in Claude Code and Claude in Chrome — build and runtime, no separate design seat — and I haven’t worked out how to bring the design-first split to a system that already exists without losing the context the change depends on. That part is still an open question.

I did not start here. For a long time everything went straight into the build agent — design, implementation, and the slow erosion of one into the other, all in the same window. The split came later, from watching what actually got decided when the code was always in the room. Starting real work in the design surface is a habit I earned, not one I began with.

The newest piece is making the bookkeeping survive the surfaces. Each window is stateless — close it and the reasoning is gone — so the record of what was decided has to live outside all three. Decisions go into an append-only log; the current shape of the system sits in one lean file the build surface keeps current as it works; the commit history carries the implementation. A single tag at the top of each build prompt joins them, so any later session can grep from a change back to the decision that authorized it and the conversation that argued for it. It is half-built. But the direction is right: the membrane is worth more when crossing it leaves a trace.