
But do it better

Engraph · 6 min read

It's almost midnight and you're staring at a diff that doesn't match what you asked for. Not wrong, exactly. The code works. But the agent made a decision about your project that doesn't match the decision you made three sessions ago, and you don't have the words to explain why it's wrong without relitigating every choice you've ever made. So you type the only thing you have left: "no, but do it better."

The agent apologizes. Rewrites the file. Gets it closer this time, or maybe just differently wrong. You tweak the prompt. Try again. It's 12:40 and you're doing something that feels productive but isn't - you're re-teaching your agent something it already knew an hour ago, in a session that no longer exists.

I've been watching people I care about build things with AI agents. Genuinely creative people with real ideas, using tools that are more capable than ever. And I keep seeing the same thing happen around week three.

The loop

You know the move. The session has gone sideways, the agent keeps making the same weird choice, and you've run out of ways to rephrase what you want. So you kill it. Start fresh. Maybe this agent will get it. You're not debugging anymore. You're gambling - rolling for a better agent like it's a gacha game, hoping the next draw reads your mind. At least gacha games have pity counters. Your agent sessions don't get better with repetition.

Sometimes it works. The new session picks up a different pattern from your codebase and accidentally does the right thing. That feels like progress. It's luck, and luck doesn't compound. Tomorrow you'll open a new session and the lucky agent is gone, replaced by one that's never heard of you or your project's opinions about anything.

The ecosystem keeps offering solutions that sound right (MCPs, skills, better rules files, custom instructions) and each one helps with a piece of the problem. None of them address the thing that actually breaks people: the corrections don't survive. You spend a real hour of your real life figuring out that the sidebar goes on the left and that the contact form needs proper validation. That knowledge exists for exactly one session. The next time you sit down, it's gone. Not deprecated. Not overwritten. Just gone, like the conversation never happened.

And it's worse when your project abstracts the code away from you. If you're building something where the interface is Figma or a no-code tool or natural language prompts, you can't even read the diff to figure out what changed. You just know it feels wrong, and all you can say is "but do it better" because you don't have the vocabulary to say what "better" means technically. The effort is real. The tooling isn't meeting it.

Week three

A friend of mine started building a recipe-sharing app in January. She's a graphic designer who's never written a line of code, but she had a clear idea and strong opinions about how it should feel. Three weeks in she was correcting the same patterns she'd already corrected. Not because the agent was bad. Because the agent didn't know she'd already solved this. Every session was a first date - except on a first date, at least the other person is trying to remember.

By week four she was spending more time managing the agent than building the thing she cared about. The audit documents piled up - markdown files at 1:30am, trying to figure out why the agent made a choice that contradicted something she'd explicitly decided two weeks earlier. She'd go to sleep promising herself she'd organize it all tomorrow.

You go to sleep promising yourself you'll pick it up tomorrow. You never do.

The project didn't fail dramatically. It got quieter. Fewer sessions per week. Longer gaps between them. By February she was describing it in past tense even though the repo was still there. I've seen this exact arc four times now, with four different people building four different things, and the pattern is always the same.

"Just learn to code"

The obvious response. And I want to take it seriously because it's not entirely wrong. Understanding your tools, learning to read what the agent produces, being able to say "the login page should follow the same layout as the landing page" instead of "it looks off" - all of that helps. More vocabulary means faster communication.

But some of the people who flame out at week three flame out because they don't know what they want yet. No persistence layer fixes that. If you can't tell the agent what's wrong, capturing your corrections won't help because there's nothing to capture. That's a different problem and this piece isn't trying to solve it.

Most of the people I'm watching DO know what they want. They demonstrate it every time they correct the agent. Always reuse the components from the landing page. The navigation stays fixed at the top. These aren't vague preferences. They're ground rules, decisions about how things should work and patterns that should be consistent. Right now they live nowhere. They're in your head, or scattered across old chat sessions you'll never reread, and every correction you make is one of these decisions trying to exist.

An experienced engineer hits the same wall, just with more precise language. They can write a detailed CLAUDE.md. They can articulate exactly what they want. And the agent still ignores rule fourteen because the session ran long and attention faded, or because they started a new session and the context reset. Worse, the agent's own system prompt can actively work against the rules you wrote. Claude Code's built-in brevity directives ("lead with the answer, not the reasoning," "avoid over-engineering") override CLAUDE.md instructions because the system prompt arrives with more structural weight than your rules file. The system prompt sits at the top of the context window, in the high-attention zone. Your CLAUDE.md arrives later, dimmer. One developer resorted to prefixing critical rules with "THIS IS NOT OVER-ENGINEERING" just to stop the agent from dismissing their specific architectural decisions as unnecessary complexity. That's a workaround for a structural problem, not a workflow anyone should accept.

The gap that matters is between "made a decision once" and "that decision carries forward into how every future session works." Technical skill helps you articulate the decision faster, but it doesn't make the decision persist. Capturing those decisions through the natural act of correcting your agent does.

What if the corrections stuck

I build Engraph, and what Engraph does is directly relevant to the problem I just spent several paragraphs describing. I'm not going to pretend the framing is neutral. But the problem predates the product, and if Engraph didn't exist, the problem would still be real and this piece would still be worth writing.

Imagine every correction you make (the nav bar stays fixed at the top, use the same font stack as the landing page) persists past the session. Not as a line in a rules file you have to remember to maintain, or one you have to prompt the agent to update. Just a ground rule with context: why you decided this, when, what part of the project it applies to. The next agent, in the next session, on the next late night, gets that ground rule before it starts working. The thing you learned last Tuesday is still here on Thursday.
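What a ground rule with context might look like, as data, can be sketched in a few lines. This is an illustrative toy, not Engraph's actual schema - every name and field here is hypothetical:

```python
from dataclasses import dataclass, field
from datetime import date

# Hypothetical shape for a persisted correction; field names are
# illustrative, not Engraph's real data model.
@dataclass
class GroundRule:
    rule: str          # the decision itself
    rationale: str     # why you decided this
    decided_on: date   # when
    scope: str         # what part of the project it applies to
    revoked: bool = False

@dataclass
class Project:
    rules: list[GroundRule] = field(default_factory=list)

    def capture(self, rule: str, rationale: str, scope: str) -> GroundRule:
        # capturing happens as a side effect of correcting the agent
        r = GroundRule(rule, rationale, date.today(), scope)
        self.rules.append(r)
        return r

    def revoke(self, rule: GroundRule) -> None:
        # the delete button: revoking is as cheap as capturing
        rule.revoked = True

    def preamble(self, scope: str) -> list[str]:
        # what the next session's agent gets before it starts working
        return [r.rule for r in self.rules
                if not r.revoked and r.scope == scope]

project = Project()
nav = project.capture("The navigation stays fixed at the top",
                      "decided after the sidebar experiment", scope="layout")
project.capture("Reuse the components from the landing page",
                "keep the look consistent", scope="components")
print(project.preamble("layout"))   # the rule survives into the next session
project.revoke(nav)                 # a week-two rule turned out to be wrong
print(project.preamble("layout"))   # revoked rules stop shaping new work
```

The point of the sketch is the lifecycle, not the storage: a correction becomes a record with provenance, the record is served to future sessions, and revoking it is a first-class operation rather than an admission of failure.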

That's the concept. It doesn't prevent the first mistake. The first time, you still have to catch it and say "no, not like that." What it prevents is the second, third, and fortieth time, because the correction compounds instead of evaporating. The dread of starting a new project shrinks, because the lessons travel with you.

You don't start from zero either. Point Engraph at an existing codebase and it suggests rules based on the patterns already there. Pick the ones that sound right, skip the rest, and every correction from that point sticks.

The tradeoffs are real though, and they're sharpest for the people this piece is most about. A non-technical builder is more likely to encode a bad pattern as a rule and less likely to notice when it's causing problems downstream. My friend could easily have locked in "always center the main content" early on and spent weeks wondering why certain layouts felt cramped, never connecting it back to a decision she made in session two. Persistence carries good decisions and bad ones forward with equal enthusiasm.

Over-constraining is the other risk. Too many rules and the agent gets conservative, follows them so carefully it stops making the useful suggestions that surprised you early on. For someone who doesn't know exactly which rules matter and which are just preferences, the line between "my project has opinions" and "my project is rigid" isn't obvious. The mitigation is that rules can be revoked as easily as they're created. The system needs a delete button as much as it needs a save button, and using it has to feel cheap, not like admitting a mistake.

These aren't edge cases. They're the likely experience for the exact audience I'm describing. What makes persistence worth it despite these risks is visibility. A bad decision that lives in your head compounds invisibly - you keep correcting symptoms without seeing the cause. A bad decision captured as a rule is at least something you can look at, question, and revoke. The failure mode shifts from "silent drift" to "visible mistake," and visible mistakes are fixable.

The quiet version

A few weeks later, same project. You describe what you want. The agent tells you what it's going to do before it does it, and you recognize your own decisions in its plan - the component style you chose, the navigation pattern you established, the validation rules you spent that late night figuring out. The things you learned three weeks ago didn't disappear when you closed the laptop. They just became part of how the project works.

You still have to check its work. One of the rules from week two turned out to be wrong - you locked in a layout choice that doesn't work for the new feature, and you have to revoke it. The project is still yours to manage, still full of judgment calls. But when you open a new session, you don't start with a fifteen-minute preamble about your entire architecture. You start where you left off.


Related: "Why Engraph" describes the governance worldview underneath the product. "Your agents don't read the middle" examines what happens when the rules you write for agents fade in the middle of the context window.