Generate less

The developers who feel most productive with AI tools are often producing the least trustworthy work. The discipline is not in how much you can generate. It is in how little.

There is a seductive lie at the heart of AI-assisted engineering. It goes like this: the model is fast, so you should let it be fast. Generate large. Accept whole files. Let it refactor broad and deep. You are the pilot now, not the mechanic. Let the machine do what machines do.

This is how you lose control of a codebase in a week.

I have watched it happen. A developer asks a model to implement a feature. The model produces two hundred lines. The developer scans it — not reads it, scans it — and accepts. The tests pass. The feature works. Everyone is happy. Then it happens again, and again, and again. Within a fortnight there are thousands of lines in the codebase that no human has properly understood. The architecture has drifted. Conventions have shifted between files because the model was not told, and does not care, that consistency matters. The code works. Nobody knows why.

This is not velocity. It is debt with a delayed invoice.

The practice that changed my output more than any other was brutally simple: generate less. Not less per day. Less per cycle. Small changes. Tight loops. One function, one test, one commit. Then again.

When a model gives you two hundred lines, something has gone wrong. Either the prompt was too broad, or you are asking the machine to make decisions that belong to you. Architecture is a human responsibility. The boundaries between components, the names of things, the shape of the abstraction — these are judgment calls, and judgment does not come from a language model. It comes from the person who will have to live with the consequences.

So I work in narrow passes. I describe a single, bounded change. The model produces it. I read every line — not scan, read — and I read it in an environment that lets me see it properly, because we’ve already solved that problem. I accept, adjust, or reject. Then I move to the next pass.
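To make the unit of work concrete: a single pass, for me, is about the size of the sketch below. The function and test are hypothetical, chosen only to show the scale. The point is that the whole change fits on one screen and can be read in full before it is committed.

    # A hypothetical single pass: one function, one test, one commit.
    # Small enough that every line gets read, not scanned.

    def normalise_email(raw: str) -> str:
        """Strip whitespace and lower-case an address so lookups are consistent."""
        return raw.strip().lower()


    def test_normalise_email():
        assert normalise_email("  Ada@Example.COM ") == "ada@example.com"
        assert normalise_email("plain@example.org") == "plain@example.org"

Anything much larger than that becomes its own pass.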

This is slower per cycle. It is dramatically faster per week. The difference is compounding trust. Each small change is understood. Each commit is clean. Each test is deliberate. When something breaks — and something always breaks — I know where to look, because I was present for every decision. There is no mystery code. There is no section of the codebase that exists only because a model put it there and nobody questioned it.

The developers who feel most productive are often the ones I worry about most. They have found a way to generate enormous volumes of code in a single session. They talk about it with genuine excitement — look how much I shipped today. But shipped is not the right word. Deposited, perhaps. Accumulated. The code arrived. Whether anyone chose it, in any meaningful sense, is a different question.

There is an analogy to writing, and it is not a loose one. First drafts are easy to produce and dangerous to trust. The skill is in the editing — the line-by-line scrutiny that turns raw output into something you would stake your name on. Nobody publishes a first draft and calls it done. But that is precisely what developers do when they accept large generations without forensic review.

Generate less. Review more. Steer at every point. The model is a collaborator, not an author, and the moment you forget that distinction you have lost the only advantage that matters: your judgment.


This is the second in a series on LLM-assisted engineering practices. Previously: Your Screen Is the Bottleneck.

Next: why I stopped typing and started talking to my computer — and why voice-first prompting changes more than just speed.

If you are an engineering leader wondering whether your team is getting real value from AI tooling, I would like to hear from you. Find me on LinkedIn or at [email protected].