Don't ask if it's done: make the agent prove it works

The agent finishes and reports back: "done." You open the result — and it falls apart. Half the page won't load, the automation lets through some malformed input, the diagram has labels overlapping each other. And through all of it, the model sounded completely sure of itself. That's the first thing you have to settle in your own head when you start working with AI agents: "done" does not mean "works."

I'll show you why the agent's say-so is worth nothing on its own, why asking "does this look good?" catches nothing, and what to do instead — how to set up one small mechanism that makes the agent check its own work before it ever hands it back to you.

"Done" is not proof

A common mistake I see in people starting out with agents: they treat the word "done" as if the work has been signed off. The agent wrote the code, generated the document, built the automation — it said it finished, so you assume it finished.

But the agent judges its own work by looking at the very thing it just made. It's a little like a student checking their own essay by reading it through one more time — without ever checking whether the answers are actually correct. It can feel like everything's fine, because the agent is thinking exactly the way it was when it wrote the thing. The blind spot it had while writing, it carries straight into the review.

That's why, in my view, verification comes down to one sentence worth repeating to yourself: prove to me that this is genuinely finished and genuinely works. Not "tell me." Not "judge it yourself." Prove it — show me a result I can see.

Why "is this good?" will fool you

There's a second trap, less obvious. Language models have a tendency I'd call being a yes-man — the technical term is sycophancy, the model telling you what it senses you want to hear. It tunes its answer to the response you're looking for. You ask "I did it this way, did it come out well?" — and the odds are high you'll hear "yes, great," because the model picks up that this is exactly the answer you're expecting.

Asking an agent for its opinion is a slippery road. The more a question is about judgment, taste, "is this any good" — the more the answer fills up with flattery and the less of it is true.

And here's the crux: a concrete check leaves no room for the yes-man. If you ask not for an opinion but for a fact — does the page open, did the automation handle the data correctly, does anything in the diagram overlap — the answer is black and white. It works or it doesn't. There's no grey zone where the model can flatter you.

This is the whole shift I'm teaching you. Stop asking "do you think this is good?" Start saying "show me that it works."

Give the agent a harness to check its own result

Since "done" alone isn't enough, and asking for an opinion isn't worth it, you need a mechanism that forces the agent to deliver proof. In practice you build it once, and from then on it runs on every task. I'll call it the harness.

At its simplest, the harness is what wraps around the model itself — the tools and context that let it not just think, but actually do something and check it. The AI tool you already use is one such harness: underneath sits a language model, and around it the ability to run commands, open files, reach for data. On top of that you can add your own small checking step, fitted to what you're specifically doing. And that's all you need to take from it — the rest is technical detail not worth digging into here.

The principle that makes the whole thing work is simple: have the agent check the result the way a user would. Looking at its own code, or at its own description of the work, isn't enough — that's still just "rereading the essay." Give the agent a way to genuinely use what it built and look at the outcome.

Abstract illustration: a green ring-shaped cursor clicks on a glowing mockup of a finished page, simplified into blocks, and sets a screenshot of it aside for inspection. The agent checks the result the way a user would, instead of reading its own code.

Two concrete examples of what this mechanism looks like in practice:

A website. Instead of reading the page's code, the agent gets a tool that opens it in a real browser — exactly the way a visitor would. It clicks through, takes screenshots, and checks on those whether everything looks and behaves the way it should. If something's off, it fixes it and checks again.
A diagram or graphic. The agent doesn't inspect the "recipe" for the image. It renders it to a real image file, then looks at that picture — and catches what the description can't show: overlapping labels, bad spacing, a messy layout. It fixes, renders again, and looks once more.

In both cases the key is the same: the agent uses the result and looks at the outcome, rather than assuring you everything's fine.
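To make the website case concrete, here is a minimal sketch under loud assumptions. The name check_page and its parameters are mine, invented for illustration, and a plain HTTP fetch stands in for the real browser tool. That means it can tell you whether the page loads and contains the text a visitor should see, but not how the page looks; for that, the agent still needs the browser and the screenshots.

```python
import urllib.error
import urllib.request


def check_page(url: str, must_contain: list[str], timeout: float = 10.0) -> list[str]:
    """Load a page over HTTP and return a list of problems (empty means it passed).

    Assumption: a plain fetch is a stand-in for a real browser session.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        return [f"{url} returned HTTP {err.code}"]
    except OSError as err:
        # Connection failures and timeouts: the page did not load at all.
        return [f"{url} could not be loaded: {err}"]

    # The page loaded; now check that the content a visitor expects is there.
    return [f"expected text missing: {text!r}" for text in must_contain if text not in body]
```

The return value is a list of facts, not an opinion. An empty list means the check passed; anything else is a specific thing for the agent to fix before it checks again.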

The first attempt doesn't have to be perfect

This is where a lot of people get it backwards. The point isn't for the agent to do everything flawlessly on the first try; that almost never happens. The first version of a diagram will often have overlapping labels, and the first version of a page will often be missing something. That's normal.

Early, messy attempts don't matter. Only one thing matters: whether the agent can fix its own work all the way to a clean result — without you in the loop at every step, and without burning an absurd amount of effort along the way. What it finally hands you is what counts.

That's exactly what the harness is for. Without the checking step, the first version tends to be only roughly right — it looks sensible, but breaks in a couple of places. With it, what you get back on the first pass is far closer to actually working, because the agent has already gone through several rounds of fixes before it ever showed you the result.
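The rounds of fixing can be sketched as a small loop. This is an illustration under assumptions, not how any particular tool works: build_until_clean, build and check are hypothetical names, build stands in for the agent producing a new version, and check stands in for whatever concrete check you gave it. The max_rounds cap is my way of expressing "without burning an absurd amount of effort."

```python
from typing import Callable


def build_until_clean(
    build: Callable[[list[str]], str],
    check: Callable[[str], list[str]],
    max_rounds: int = 5,
) -> tuple[str, int]:
    """Rebuild until check() reports no problems, or give up after max_rounds."""
    problems: list[str] = []
    for round_number in range(1, max_rounds + 1):
        result = build(problems)   # the builder sees what failed last time
        problems = check(result)   # concrete findings, not an opinion
        if not problems:
            return result, round_number
    raise RuntimeError(f"still failing after {max_rounds} rounds: {problems}")


# Toy stand-ins: the "agent" adds whatever the check said was missing.
parts = ["body"]


def toy_build(problems: list[str]) -> str:
    for problem in problems:
        parts.append(problem.removeprefix("missing "))
    return " ".join(parts)


def toy_check(result: str) -> list[str]:
    return [f"missing {word}" for word in ("title", "footer") if word not in result.split()]


result, rounds = build_until_clean(toy_build, toy_check)
print(result, "after", rounds, "rounds")
```

In the toy run the first version fails the check and the second one passes. You only ever see the second.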

Define "done" before the agent starts

There's one condition without which none of this verification gets off the ground: the agent has to know up front how it's supposed to prove to you that it finished. The checking only works because, before it ever started, you told it exactly what would count as proof of completion.

You can see this clearly across the whole arc of working with an agent: first you plan, then you hand the building over to the agent, and at the end you verify. This article is about that last part — but whether it works at all is something you settle at the very beginning, in the plan. The definition of "done" — how, specifically, the agent will show you the work is finished and works — belongs in the plan, not in improvisation at the end. (Planning itself I teach separately; here it's enough to know that verification without an upfront definition is blind.)

So you start by writing down what you're actually building and what proof that it works will look like. Only then do you hand the job to the agent — because both you and it know what the check is supposed to consist of.
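Written down, a definition of "done" can be as plain as a list of named checks. What follows is a sketch with made-up names (Criterion, evaluate, DEFINITION_OF_DONE) and made-up criteria for a hypothetical landing page; your own list will look different, and real checks would usually be richer than a substring test.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Criterion:
    """One piece of proof, agreed on before the work starts."""
    description: str
    passes: Callable[[str], bool]


def evaluate(result: str, criteria: list[Criterion]) -> dict[str, bool]:
    """Run every criterion against the result and report each one by name."""
    return {c.description: c.passes(result) for c in criteria}


# Hypothetical definition of "done" for a small landing page.
DEFINITION_OF_DONE = [
    Criterion("page has a main heading", lambda html: "<h1>" in html),
    Criterion("contact form is present", lambda html: "<form" in html),
    Criterion("no placeholder text left", lambda html: "lorem ipsum" not in html.lower()),
]

draft = "<h1>Welcome</h1><p>Lorem ipsum</p>"
report = evaluate(draft, DEFINITION_OF_DONE)
done = all(report.values())
print(report)
print("done:", done)
```

The useful part is the order of events: the list exists before the draft does, so neither you nor the agent gets to decide afterwards what counts as finished.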

Ask "what could go wrong?"

To close, I'll show you a question that's asked surprisingly rarely and that, in my view, is one of the most valuable in this whole line of work: what could break here?

Some people are afraid to ask it — as if hunting for holes in your own work were an admission of failure. It's exactly the opposite. This question is worth building permanently into the checking step. When the agent finishes, you have it ask itself: what could go wrong here? Where would this result fall apart?

It's especially worth hunting for edge cases — the rare, unusual inputs where things break. Picture an automation that takes in some data. You ask: what if someone supplies the data in a strange, invalid format? Then you design a small test that deliberately tries to break it — you feed in exactly that malformed input and watch what happens.
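Here is roughly what such a breaking test could look like. Everything in it is assumed for the sake of the example: parse_quantity is a hypothetical automation step that should accept only positive whole numbers, MALFORMED is my guess at inputs someone might actually send, and find_leaks is a helper name I made up.

```python
from typing import Callable


def parse_quantity(raw: str) -> int:
    """Hypothetical automation step: turn a quantity field into a positive integer."""
    value = int(raw.strip())  # raises ValueError on empty or non-numeric text
    if value <= 0:
        raise ValueError(f"quantity must be positive, got {value}")
    return value


# Inputs chosen on purpose to break things: empty, blank, words, negatives, zero, decimals.
MALFORMED = ["", "   ", "three", "-1", "0", "2.5", "1e3"]


def find_leaks(parser: Callable[[str], int], bad_inputs: list[str]) -> list[str]:
    """Return the malformed inputs the parser accepted instead of rejecting.

    Any exception other than ValueError is left to propagate, so an
    unexpected crash shows up loudly rather than being counted as a pass.
    """
    leaks = []
    for raw in bad_inputs:
        try:
            parser(raw)
        except ValueError:
            continue  # rejected cleanly, which is what we want
        leaks.append(raw)
    return leaks


print(find_leaks(parse_quantity, MALFORMED))  # prints []
```

An empty list means every malformed input was turned away. Any entry in the list is a concrete hole, named by the exact input that got through.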

Abstract illustration: a single distorted, maroon element has been injected into an orderly green stream of data; at the verification gate it triggers a controlled crack with an amber warning signal. This is a deliberate edge-case test that tries to break the result and catches the fault.

And if it broke, you fix it and run the same test again. This is the step that's easy to forget, and it matters because a fix doesn't necessarily repair the problem. You found a flaw, fixed it, and retested. Only then do you know it's genuinely good.
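A small sketch of why the retest matters, with every name invented for illustration. Suppose an earlier test showed that a quantity field wrongly accepted "-1" and "0". The first fix below looks reasonable and still lets one of them through; only rerunning the exact saved inputs shows that.

```python
from typing import Callable

# Hypothetical inputs that got through in an earlier breaking test.
SAVED_FAILURES = ["-1", "0"]


def first_fix(raw: str) -> int:
    """A plausible but incomplete fix: it only blocks a leading minus sign."""
    if raw.startswith("-"):
        raise ValueError("negative quantity")
    return int(raw)


def second_fix(raw: str) -> int:
    """A fuller fix: reject anything that is not a positive integer."""
    value = int(raw)
    if value <= 0:
        raise ValueError("quantity must be positive")
    return value


def retest(parser: Callable[[str], int], saved: list[str]) -> list[str]:
    """Rerun the exact inputs that failed before; return the ones still accepted."""
    still_accepted = []
    for raw in saved:
        try:
            parser(raw)
        except ValueError:
            continue  # now rejected, so this case is repaired
        still_accepted.append(raw)
    return still_accepted


print(retest(first_fix, SAVED_FAILURES))   # prints ['0']
print(retest(second_fix, SAVED_FAILURES))  # prints []
```

Without the rerun, the first fix would have been reported as done.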

That closes the loop. You define up front what "done" means. You give the agent a way to use the result the way a user would. You have it ask what could break, and try to break it. And when something cracks — it fixes and tests again, until it passes.

That's all it takes for what the agent hands you on the first pass to stop being "looks sensible but falls apart" and become "genuinely works." Don't ask if it's done — make it prove it works. And next time, before you hand the agent any task at all, start with one sentence: how, specifically, will you show me that this works?