I wrote this article (yes, AI-enhanced) and wanted to share a methodology I personally use more and more every day:
We can oversimplify every problem to: input → process → output. This isn't new. What is new is that AI agents are increasingly writing the code in between — and that shift demands we rethink where human expertise actually belongs. The instinct to apply pre-AI methodologies like TDD, BDD, or DDD to agentic workflows is understandable, but these were designed for humans writing code.
In a world where AI is doing most of the implementation, our job changes. We stop being craftsmen and become architects of correctness. Our primary artifact is no longer code — it's the contract that tells the agent what "done" looks like.
This suggests a three-layer methodology, but the layers aren't equal, and one of them isn't free.
*The Output Layer — own this completely* This is the most important layer and the one that demands the most human attention, especially early on. The questions to answer here are deceptively simple: what does ideal output look like? What are the schemas, types, formats, ranges, and invariants that define correctness? Building a robust validation and sanitization layer here is not optional — it's the foundation everything else rests on. Once it exists, it becomes a target. Use agentic AI to generate adversarial fixtures: malformed data, boundary violations, unexpected types, edge cases humans wouldn't think to try. Then use those fixtures to attack your own validation layer until it holds. No errors allowed here. This layer is what gives agents something concrete to validate assumptions against — without it, they're flying blind and so are you.
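As a minimal sketch of what such an output contract might look like, here is a hypothetical validator for a pricing result. The `PriceQuote` shape, its fields, and the invariant are all invented for illustration; the point is that the contract rejects anything malformed, and the adversarial fixtures (agent-generated in practice) are used to attack it until it holds:

```python
from dataclasses import dataclass

# Hypothetical output contract: schema, ranges, and one invariant
# that together define what "done" looks like for this output.
@dataclass(frozen=True)
class PriceQuote:
    currency: str    # 3-letter code, e.g. "USD"
    subtotal: float  # must be >= 0
    tax: float       # must be >= 0
    total: float     # invariant: total == subtotal + tax

def validate_quote(raw) -> PriceQuote:
    """Reject anything that violates the output contract."""
    if not isinstance(raw, dict):
        raise ValueError("quote must be an object")
    for field in ("currency", "subtotal", "tax", "total"):
        if field not in raw:
            raise ValueError(f"missing field: {field}")
    if not (isinstance(raw["currency"], str) and len(raw["currency"]) == 3):
        raise ValueError("currency must be a 3-letter code")
    for field in ("subtotal", "tax", "total"):
        v = raw[field]
        # bool is a subclass of int in Python, so exclude it explicitly
        if not isinstance(v, (int, float)) or isinstance(v, bool) or v < 0:
            raise ValueError(f"{field} must be a non-negative number")
    if abs(raw["subtotal"] + raw["tax"] - raw["total"]) > 1e-9:
        raise ValueError("invariant violated: total != subtotal + tax")
    return PriceQuote(raw["currency"], float(raw["subtotal"]),
                      float(raw["tax"]), float(raw["total"]))

# Adversarial fixtures: every one of these must be rejected.
bad_fixtures = [
    None,                                                          # wrong type
    {},                                                            # missing fields
    {"currency": "US", "subtotal": 1, "tax": 0, "total": 1},       # bad code
    {"currency": "USD", "subtotal": -1, "tax": 0, "total": -1},    # range violation
    {"currency": "USD", "subtotal": 1, "tax": True, "total": 2},   # sneaky bool
    {"currency": "USD", "subtotal": 1, "tax": 0.5, "total": 2.0},  # broken invariant
]
```

Once a validator like this exists, agents can run their own output through it after every change, and you can keep extending `bad_fixtures` until the layer stops leaking.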
*The Input Layer — define it, don't neglect it* Less critical than the output layer but still worth deliberate human attention. The same questions apply: what does valid input look like? What are the schemas, constraints, and expected shapes of incoming data? The temptation is to be loose here and let the system handle whatever arrives. Resist it. Sloppy input contracts create ambiguity that propagates inward, and agents will make quiet assumptions to fill the gaps. Those assumptions compound.
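A strict input contract can be sketched the same way. The field names and types below are hypothetical; what matters is the posture: no missing fields, no unknown fields, no silent type coercion (the one allowed coercion, int to float, is written down explicitly rather than left for an agent to assume):

```python
# Hypothetical input contract for an incoming payment request.
ALLOWED = {"user_id": int, "amount": float, "currency": str}

def parse_input(raw: dict) -> dict:
    """Strict input contract: reject extras, missing fields, wrong types."""
    extra = set(raw) - set(ALLOWED)
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    parsed = {}
    for name, typ in ALLOWED.items():
        if name not in raw:
            raise ValueError(f"missing field: {name}")
        value = raw[name]
        # The only coercion we permit, made explicit: int -> float.
        if typ is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, typ) or isinstance(value, bool):
            raise ValueError(f"{name}: expected {typ.__name__}")
        parsed[name] = value
    return parsed
```

Anything that survives `parse_input` is unambiguous by construction, so nothing downstream has to guess.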
*The Functional Layer — powerful, but not free* With solid input and output contracts in place, agents can take on the bulk of the transform logic: the validation, the mapping, the processing. This is genuinely powerful. Implement performance benchmarks and target latencies, then let agents iterate until the system meets them. Much of this layer can be owned by AI. But the word "free" needs a qualifier. The functional layer is where business logic lives, and business logic encodes decisions that are often subtle, ambiguous, or politically loaded. Rounding rules, partial failure handling, priority in edge cases — these aren't derivable from schemas alone. Well-specified contracts constrain the space; they don't eliminate judgment. Agents will make decisions in the gaps between your tests, and those decisions may be quietly wrong. The output layer limits the blast radius of that wrongness, but it doesn't eliminate it. Human review of business-critical paths remains necessary.
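A benchmark target of this kind can be as simple as a p95 latency budget the agent must stay under while iterating. The transform body and the 5 ms budget below are placeholders, not a recommendation; the harness around them is the point:

```python
import statistics
import time

TARGET_P95_MS = 5.0  # hypothetical latency budget for the transform

def transform(record: dict) -> dict:
    # Placeholder for the agent-written business logic under test.
    return {"id": record["id"], "total": round(record["net"] * 1.2, 2)}

def p95_latency_ms(fn, samples, runs=1000):
    """Measure the 95th-percentile latency of fn over repeated calls."""
    timings = []
    for i in range(runs):
        rec = samples[i % len(samples)]
        start = time.perf_counter()
        fn(rec)
        timings.append((time.perf_counter() - start) * 1000)
    # quantiles(n=100) yields 99 cut points; index 94 is the 95th percentile
    return statistics.quantiles(timings, n=100)[94]

samples = [{"id": i, "net": i * 0.5} for i in range(100)]
p95 = p95_latency_ms(transform, samples)
print(f"p95 = {p95:.4f} ms (budget {TARGET_P95_MS} ms)")
```

The agent's loop is then mechanical: change the transform, rerun the harness, keep going until the number clears the budget and the output contract still holds.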
*The Layer Nobody Talks About — change* There's a fourth concern that sits above all three layers: how does the contract itself evolve? Schemas rot. Requirements shift. In a mature agentic system with many agents operating against a shared contract, changing the spec becomes the hardest engineering problem in the room. Migration, versioning, and backward compatibility need to be first-class concerns from the start, not afterthoughts. These should be explicitly set in our plans/specs from the get-go.
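One common way to make contract evolution first-class is to version every payload and chain explicit migrations up to the current version. The versions, renamed field (`curr` to `currency`), and added field (`tax`) below are invented for illustration:

```python
# Every payload carries a schema_version; migrations upgrade old
# payloads one step at a time until they match the current contract.
CURRENT_VERSION = 3

def migrate_v1_to_v2(p: dict) -> dict:
    p = dict(p)
    p["currency"] = p.pop("curr", "USD")  # hypothetical: field renamed in v2
    p["schema_version"] = 2
    return p

def migrate_v2_to_v3(p: dict) -> dict:
    p = dict(p)
    p.setdefault("tax", 0.0)  # hypothetical: field added in v3 with a default
    p["schema_version"] = 3
    return p

MIGRATIONS = {1: migrate_v1_to_v2, 2: migrate_v2_to_v3}

def upgrade(payload: dict) -> dict:
    """Bring any historical payload up to the current contract."""
    version = payload.get("schema_version", 1)  # unversioned data is v1
    while version < CURRENT_VERSION:
        payload = MIGRATIONS[version](payload)
        version = payload["schema_version"]
    return payload
```

Because each migration is a small pure function, agents can be asked to write the next one whenever the spec changes, and old fixtures keep working as regression tests for the whole chain.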
PRO TIP: code is now "cheap," so your first prototype doesn't necessarily have to start with the above framework. You can hack something together to understand the problem, its limits, and its requirements. Once those are clear, create a plan and follow the steps above in order: output, input, process.