We generate more code with AI; review capacity is the real constraint

As of mid-2026, one tension dominates software headlines: most developers use AI coding tools, but trust in the output stays low. Agents open PRs, write tests, and even touch deploy pipelines, yet merge velocity exceeds review capacity.

The surge in agent-driven commits and PRs on GitHub has strained infrastructure itself. It's not just "more code"; it's more unreviewed code. For a decade the constraint was implementation capacity; many teams now hit verification, security, and architectural debt first. Senior engineers quietly become "AI code janitors": fixing, deleting, and rewriting agent output.

AI code governance

Usage rises while confidence falls because agents lack full context. Hallucinated dependencies, weak error handling, and needless abstraction are common. "Build passed" doesn't mean production-ready.

What works in the field: a deterministic layer (unit and integration tests, SAST, dependency audit) running outside the agent loop; if CI is red, nothing merges. Architecture tests such as ArchUnit or NetArchTest enforce forbidden dependencies and handler boundaries. A change that emits no log, span, or metric is not done. Keep PRs small and diffs readable, and make negative-path tests mandatory. Trust the diff, not the agent summary.
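The rules above can be sketched as one deterministic gate function. This is an illustration under stated assumptions, not any CI system's API: the `PullRequest` fields, the check names, and the 400-line limit are all hypothetical and would come from your own pipeline.

```python
from dataclasses import dataclass

# Hypothetical check names; each would be reported by CI, outside the agent loop.
REQUIRED_CHECKS = (
    "unit_tests",
    "integration_tests",
    "sast",
    "dependency_audit",
    "architecture_rules",  # e.g. an ArchUnit or NetArchTest suite
)
MAX_CHANGED_LINES = 400  # assumed threshold for a "small PR"; tune per team


@dataclass
class PullRequest:
    checks: dict[str, bool]        # check name -> passed
    changed_lines: int
    has_negative_path_tests: bool
    adds_telemetry: bool           # at least one log, span, or metric


def merge_blockers(pr: PullRequest) -> list[str]:
    """Return every reason the PR may not merge; an empty list means mergeable."""
    blockers: list[str] = []
    for name in REQUIRED_CHECKS:
        # A check that never reported counts as a failure, not as a pass.
        if not pr.checks.get(name, False):
            blockers.append(f"check failed or missing: {name}")
    if pr.changed_lines > MAX_CHANGED_LINES:
        blockers.append(f"diff too large: {pr.changed_lines} > {MAX_CHANGED_LINES} lines")
    if not pr.has_negative_path_tests:
        blockers.append("no negative-path tests")
    if not pr.adds_telemetry:
        blockers.append("no log, span, or metric")
    return blockers


if __name__ == "__main__":
    pr = PullRequest(
        checks={"unit_tests": True, "sast": False},
        changed_lines=900,
        has_negative_path_tests=False,
        adds_telemetry=True,
    )
    for reason in merge_blockers(pr):
        print("BLOCKED:", reason)
```

The agent's own summary of the change is deliberately not an input: the gate reads only results that were produced independently of the agent.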

Agents sell throughput; governance sustains it. The edge isn't "most lines generated" — it's "fewest production incidents".