
Why Engraph

Engraph · 10 min read

Three things are true simultaneously. AI agents are more capable than they've ever been. Engineering organisations are producing more code than they've ever reviewed. And the informal systems that used to distribute organisational knowledge - code review, pair programming, the senior engineer who remembers why the architecture looks the way it does - are degrading faster than anyone's replacing them.

These aren't three separate problems. They're one problem: governance isn't scaling.

Not the compliance-checklist kind of governance. The kind that means every engineer and every agent operates with the accumulated judgment of everyone who's worked in this codebase before them. The kind that used to travel through PR comments, hallway conversations, and the person who was on-call the night the billing service went down.

The organisations that survive AI acceleration won't be the ones with the best agents. They'll be the ones that learned to govern them.

Engraph exists because that problem doesn't solve itself as agents get smarter. Smarter agents are better at reasoning about public knowledge. They're not better at reasoning about yours - your incident history, your regulatory environment, your team's earned understanding of what breaks and why.

This is the worldview underneath the product. Five positions that define how we think about governance in an agent-accelerated world.

AI proposals are probes, not deliverables

The industry treats agent output as work product. Draft code that needs review, maybe some cleanup, ship it. That framing misses the more interesting thing happening.

Every AI-generated change is a test of your organisation's boundaries. The agent doesn't know your constraints - that's the whole problem. So when it proposes something, the value isn't the code. It's what your response reveals about the constraints your organisation actually holds versus the ones it thinks it holds.

An agent builds a payment handler with retry logic. You catch it and say payment webhooks need idempotency keys before retries are safe. That correction isn't a bug fix. It's the system surfacing a constraint that existed only in the heads of the two engineers who did the original Stripe integration. Before the agent's proposal, that constraint was invisible. Now it has a name and a rationale.
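To make the surfaced constraint concrete, here is a minimal sketch of the corrected handler - not Engraph code, just the standard pattern, using the Idempotency-Key header convention popularised by Stripe (the function and names are ours):

```python
import uuid

import requests


def deliver_payment_webhook(url: str, payload: dict, attempts: int = 3) -> requests.Response:
    # One key for the whole retry loop: the receiver deduplicates on it,
    # so a retried delivery can never be processed - or charged - twice.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    resp = requests.post(url, json=payload, headers=headers, timeout=5)
    for _ in range(attempts - 1):
        if resp.status_code < 500:  # only retry server-side failures
            break
        resp = requests.post(url, json=payload, headers=headers, timeout=5)
    return resp
```

The retry logic the agent proposed was never the problem. The key is what makes it safe.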

Scale that across every agent session in your organisation. Every proposal that gets redirected is a probe that found a boundary. Every correction is a boundary declaring itself. The question is whether anyone's capturing what the probes reveal, or whether each one surfaces the same invisible rule and then vanishes when the session closes.

Start with the scars, not the code

When people think about seeding a constraint system, they think about architecture documents. API contracts. Style guides. These are useful, but they're not the highest-signal source of organisational knowledge.

The highest-signal source is incident history. Outages, rollbacks, hotfixes, reverts. A bug that made it to production means every gate in the process failed - the spec didn't cover it, the tests didn't catch it, the review didn't flag it, monitoring didn't alert in time. That's a constraint trying to exist, proven by failure.

audit-repository-writes · critical · enforced

All writes to the accounts table must route through AuditedRepository. Direct writes bypass the audit trail required for SOC 2 compliance. Discovered after incident #247 - a direct write during a migration bypassed audit logging for 6 hours.

That constraint didn't come from an architecture review. It came from a 3am page and a postmortem that twelve people attended. The incident taught the organisation something about itself that no specification would have predicted. Starting with scars means starting with what the organisation has already paid to learn - the constraints whose absence has been empirically demonstrated.
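For illustration, a scar-derived constraint might be captured as a record like this - a hypothetical schema, with field names that are ours rather than Engraph's:

```python
from dataclasses import dataclass, field


@dataclass
class Constraint:
    id: str
    rule: str          # what must hold
    rationale: str     # why - the part that otherwise lives in someone's head
    scope: str         # where it applies
    severity: str
    status: str
    provenance: list[str] = field(default_factory=list)  # what taught us this


audit_writes = Constraint(
    id="audit-repository-writes",
    rule="All writes to the accounts table must route through AuditedRepository.",
    rationale="Direct writes bypass the audit trail required for SOC 2 compliance.",
    scope="accounts table",
    severity="critical",
    status="enforced",
    provenance=["incident #247"],
)
```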

Specs capture what you know before you build. Corrections during building capture what was never written down in the spec - the senior engineer saying "not like that - use the Money type for all amounts in the payments domain". Scars capture what you learn after it breaks. All three belong in the system. But if you're choosing where to start, start with the knowledge that cost something to acquire.

Two kinds of authority

Not all constraints are equal, and the difference isn't severity. It's provenance.

A constraint discovered through practice - we cache auth tokens in memory, so this service can't scale horizontally without a session store - earned its authority through experience. Someone learned it, probably the hard way. It's real. But it's also possible that it's no longer true. The service might have been refactored. The caching layer might have moved. Constraints like this can and should be questioned when the codebase changes underneath them.

A constraint imposed by regulatory, legal, or executive authority - all PII must be encrypted at rest - carries a different kind of authority entirely. It doesn't need to prove itself through practice. Its legitimacy comes from its source. When agents encounter resistance to this kind of constraint, the right interpretation isn't "the constraint might be wrong." It's "the team might need support complying."

Emergent constraint. Discovered through practice. Earns trust over time. Subject to self-doubt - the system can question it when patterns suggest it's stale. If agents consistently work around it without negative outcomes, that's a signal worth investigating.

Imposed constraint. Carries authority from its source. Exempt from self-doubt. If agents consistently work around it, that's a compliance problem, not a signal that the constraint is wrong. Different response, same detection mechanism.

Same graph. Same lifecycle mechanics. Same detection. Different interpretation of the signals, determined entirely by where the constraint came from. Most governance systems flatten this distinction. They treat all rules as the same kind of authority, which means either emergent rules become too rigid to question or imposed rules become too soft to enforce.
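A sketch of how that one distinction might thread through detection - hypothetical types and names, ours rather than Engraph's:

```python
from enum import Enum


class Provenance(Enum):
    EMERGENT = "emergent"  # discovered through practice
    IMPOSED = "imposed"    # regulatory, legal, or executive authority


def interpret_workarounds(provenance: Provenance, workarounds: int) -> str:
    """Same detection signal, different interpretation by provenance."""
    if workarounds == 0:
        return "no action"
    if provenance is Provenance.EMERGENT:
        # Repeated workarounds without bad outcomes suggest staleness.
        return "review: constraint may no longer hold"
    # Imposed constraints are exempt from self-doubt.
    return "escalate: team may need support complying"
```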

Governance as capabilities, not roles

A constraint system needs people who can interpret ambiguous signals, author new constraints from corrections, arbitrate conflicts between competing rules, and maintain graph hygiene as the codebase evolves. These are real capabilities. They require judgment.

They're not job titles.

The capabilities the system needs

Interpretive judgment - deciding what a pattern of agent behaviour means.

Constraint authoring - turning a correction into a scoped, rationale-bearing rule.

Conflict arbitration - resolving tensions between valid constraints that push in opposite directions.

Graph hygiene - deprecating constraints the codebase has absorbed, merging duplicates, tightening scope.

Nobody needs a "Chief Constraint Officer." What everyone needs is constraint literacy - the ability to recognise when a correction is worth capturing, when a constraint has gone stale, when two rules are in tension and someone needs to make a call.

This is closer to how code review actually worked. There was no "Head of Knowledge Transfer." Senior engineers transferred knowledge as a side effect of reviewing code, because they had the context and the judgment and they happened to be in the review queue. Governance capabilities work the same way - they show up when the workflow demands them, distributed across whoever's doing the work.

The mistake is institutionalising these capabilities into permanent roles before understanding how they naturally distribute. Centralise too early and you create a bottleneck. Decentralise too early and nobody does the hard parts. Watch where the capabilities emerge. Then support them.

Visibility is enforcement

A common instinct is "visibility before enforcement." Our position goes further.

For many constraints, visibility with auditability IS the enforcement. Mechanical blocking is a separate tier that only some constraints ever need to reach.

An agent receives a constraint that says user-facing error messages must be actionable - no generic 'something went wrong' responses. The agent sees the constraint. It reasons about it. The compliance narrative tracks whether it engaged with the rule or ignored it. A reviewer can see, in three seconds, whether the agent took the constraint seriously.

Nothing mechanically prevented the agent from writing a generic error message. But the constraint was present, the agent's response to it was auditable, and someone could verify compliance during review. That constraint is governed. It doesn't need a linter rule. It doesn't need a pre-commit hook. The visibility tier delivered the value.

A constraint the agent sees, reasons about, and accounts for in the compliance narrative is governed - even if nothing prevents the agent from violating it.
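What the reviewer sees might be shaped like this - an illustrative narrative entry, not Engraph's actual output format:

```python
narrative_entry = {
    "constraint": "actionable-error-messages",
    "delivered": True,   # the constraint was in the agent's context at task time
    "engaged": True,     # the agent reasoned about it rather than past it
    "agent_account": (
        "Replaced the generic catch-all with per-failure-mode messages; "
        "the timeout path now tells the user to retry and links the status page."
    ),
    "reviewer_verdict": None,  # set at review: complied / violated / not applicable
}
```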

Mechanical enforcement - static analysis, schema validation, generated tests - adds teeth where constraints are specific enough to be programmatically verifiable. All writes to the accounts table must go through AuditedRepository can be detected by import scanning. That constraint should be mechanically enforced.
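A minimal version of that scan fits in a few lines - a sketch assuming the raw table access lives in a module named accounts_table and the sanctioned path in audited_repository.py (both names are hypothetical):

```python
import ast
import sys

FORBIDDEN = "accounts_table"             # module exposing direct table writes
ALLOWED_FILE = "audited_repository.py"   # the one place allowed to import it


def violations(path: str) -> list[int]:
    """Return line numbers where a file imports the raw accounts table."""
    if path.endswith(ALLOWED_FILE):
        return []
    with open(path) as f:
        tree = ast.parse(f.read(), filename=path)
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import) and any(a.name == FORBIDDEN for a in node.names):
            hits.append(node.lineno)
        elif isinstance(node, ast.ImportFrom) and node.module == FORBIDDEN:
            hits.append(node.lineno)
    return hits


if __name__ == "__main__":
    for path in sys.argv[1:]:
        for line in violations(path):
            print(f"{path}:{line}: direct accounts-table import bypasses AuditedRepository")
```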

But user-facing error messages must be actionable can't be. Don't retry payment webhooks without idempotency keys requires context a linter doesn't have. This service caches auth tokens in memory is a fact about architecture, not a pattern in code. These constraints are real. They govern real decisions. And they may never need mechanical enforcement because visibility is already doing the job.

Treating the visibility tier as a warmup for "real" enforcement undervalues the tier where most organisational knowledge actually lives.


These five positions form a worldview, not a feature set. Engraph is the product that operationalises them - the constraint graph, the lifecycle engine, the compliance narrative, the hooks that deliver constraints into agent context at task time. But the positions are larger than the product. We've explored how this plays out in practice: how code review's hidden knowledge pipeline is breaking, why authorised context going stale is harder to catch than prompt injection, and what happens when organisations try to reorganise around agents.

If this worldview matches what you're seeing in your own organisation, the product will make sense. If it doesn't, no feature list will convince you.