Yesterday I spent four hours watching Claude solve the same problem five different times. Each fix felt like it would be the last one. Each time it was wrong.
The problem itself was boring: I’d deployed a small AWS Lambda to process PDFs as part of our book printing pipeline. It worked perfectly when I tested it inside the AWS console. The moment I called it from anywhere else — my own computer, our automation tool, anywhere that wasn’t the console itself — it returned 403 Forbidden. Locked out by its own configuration.
Claude’s suggestion was confident and clear: “Just re-add the permission, it’s a 2-minute fix.” I did. It didn’t work. “Try deleting the old permission first, then re-adding it.” I did. It didn’t work. “Let’s use the CLI instead of the console.” I did. It didn’t work. Every fix came with the same breezy certainty, and every failure was met with a new suggestion delivered with that same breezy certainty.
At some point I noticed something that should have been obvious from the start: we were going in circles. Claude wasn’t tracking what had already failed. It was pattern-matching each new attempt to a different prior problem, confidently prescribing solutions that I had already tried, sometimes twice. I was the memory. I was the thing keeping the debugging session coherent, and I was terrible at it because I was tired and it was 1am and I wanted to go to bed.
Eventually I got genuinely annoyed and typed something I don’t usually type at a computer:
“you really need to get better at recording what HASNT worked when debugging because we are just going round in circles”
That single sentence flipped the entire session.
What changed when I forced the step back
Claude wrote a file called DEBUG_403.md. Inside it: every attempt, every result, every hypothesis still alive. It read the actual error messages character by character instead of skimming them. It did a proper web search. And within about ninety seconds it had the real answer.
Turns out AWS quietly changed how Lambda Function URLs work in October 2025. They used to need one permission (lambda:InvokeFunctionUrl). They now silently require two (lambda:InvokeFunctionUrl AND lambda:InvokeFunction). If you only have the first one, every public request returns 403, forever. The AWS console had actually been telling me this the whole time — there was a little blue banner explicitly stating both permissions were needed. I read it nine times over the course of the debugging session. Claude generated a response to it nine times. Neither of us actually understood what it said, because we were both too busy pattern-matching to similar-looking past problems.
The fix was one CLI command.
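The article doesn’t reproduce the command, but in spirit it was a single `aws lambda add-permission` call granting the second action. A minimal sketch, assuming a hypothetical function name and statement ID (the real ones aren’t given in the post):

```bash
# Grant the second permission a public Function URL now needs,
# alongside the existing lambda:InvokeFunctionUrl grant.
# --function-name and --statement-id are placeholders.
aws lambda add-permission \
  --function-name pdf-pipeline-fn \
  --statement-id allow-public-invoke-function \
  --action lambda:InvokeFunction \
  --principal "*"
```

With only `lambda:InvokeFunctionUrl` in the function’s resource policy, every request to the Function URL comes back 403; adding `lambda:InvokeFunction` completes the pair the console banner was describing.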
The thing I keep relearning
I’ve been working with AI coding agents intensively for almost a year now. I build Memolio almost entirely with them — most days I don’t write code directly, I describe what I want and review what comes back. And I keep learning the same lesson in different forms: AI agents will loop on an approach forever if you let them. Sometimes the most valuable thing a human can do is force a step back.
It’s tempting to read this as “AI is unreliable, don’t trust it.” That’s not quite right. Claude wasn’t being lazy or malfunctioning. It was doing what large language models do: generating the next most plausible response given the state of the conversation. The problem was that the state of the conversation kept looking like a problem it had almost-seen-before, and “almost-seen-before” is a bad foundation for debugging something you haven’t actually seen at all.
What the model needed was a different kind of context: not more suggestions, but an enforced structure. A file on disk listing what had been tried. A requirement to read the error message literally before guessing. A rule that says “on the second failed attempt, stop guessing and start researching.” None of this is beyond the model’s capabilities — it did all of it brilliantly once asked. It just wouldn’t do it on its own.
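The post doesn’t reproduce DEBUG_403.md, but the structure it describes looks roughly like this (the wording is invented for illustration; the attempts are the ones listed earlier):

```markdown
# DEBUG_403.md — running log for this session

## Symptom
Lambda Function URL returns 403 from anywhere outside the AWS console.

## Attempts (do not repeat)
1. Re-added lambda:InvokeFunctionUrl in the console — still 403.
2. Deleted the old permission, then re-added it — still 403.
3. Re-created the permission via the CLI instead of the console — still 403.

## Hypotheses still alive
- The error message says something we haven't read literally yet.
- A recent AWS change altered which permissions a Function URL needs.

## Rule
On the second failed attempt, stop guessing: read the error literally,
then do proper research before suggesting more button-clicking.
```

Nothing here is sophisticated. The point is that the file, not the conversation, holds the state, so the model can’t quietly re-propose attempt 1 as attempt 4.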
I think this is the actual skill of working with AI agents in 2026. Not prompt engineering. Not picking the right model. Not even building clever evaluation loops. It’s knowing when to interrupt. Knowing when the confident-sounding next step is actually the fourth iteration of the same wrong step, dressed in different words. Knowing when to stop letting the machine drive and to make it justify itself.
When I did that last night, Claude wrote me a memory file — literally, a permanent note to its own future sessions — called feedback_record_what_hasnt_worked.md. The rule: “On the second failed attempt at any fix, stop guessing. Create a debug log. Read error messages literally. Do proper research before suggesting more button-clicking.” It will remember this next time I start a new session. I don’t know if that’s exactly what machine learning researchers mean when they talk about AI that learns from feedback, but it felt like something.
The thing that actually shipped
Oh, right. The Lambda works now. The entire print pipeline for Memolio is automated end-to-end for the first time. When a grandparent finishes their WhatsApp interview and approves their book in the review page, everything from “interview complete” to “physical book printed and shipped” now runs without a human touching it. A webhook fires, PDFs get generated, the Lambda adds the tiny invisible printer metadata that commercial presses require, real MD5 hashes get computed, the order lands in CloudPrinter’s system, and forty-five seconds later there’s a confirmed sandbox order waiting to go.
And it happened in ninety seconds once we stopped pretending we knew the answer.
Memolio turns a grandparent’s stories into a personalised, illustrated book — captured over WhatsApp, illustrated by AI, printed and shipped in hardcover. We’re currently in closed user testing. If you want early access or just want to follow along as we build, you can reply to this email or sign up at memolio.io.
