There is a specific moment in a system's life when the dashboards still look green, the test suite is still passing, the bug report rate is still falling — and the codebase has already become something no human in the room actually understands.
Mitchell Hashimoto called this out yesterday in a thread that has now passed 487,000 likes. He named it "AI psychosis" — entire companies operating under the implicit belief that "MTTR is all you need," that it's fine to ship bugs because the agents will fix them so quickly. His warning is sharper than the usual AI-skeptic line: "you can automate yourself into a very resilient catastrophe machine."
I have been shipping production agents for the last six months — Setu, Sandesh, Swayam, Sankalp, a Sutra desktop middleman, all stitched together with MCP tools and a Claude Code instance per channel. The agents write a fair amount of the code. They also reach into the database, fire scheduled routines, post to LinkedIn on a cron. Mitchell's tweet hit me harder than I expected, because the trap he is describing is the exact one I have had to defend against, more than once, in a stack that is mostly me and the model.
Here is what the trap actually looks like from inside the code, and the three disciplines I have ended up trusting.
What "globally incomprehensible" looks like in practice
The first time I felt it was when a routine fired at 8:30 AM and silently posted nothing. The dashboard said "ran successfully." The logs said "ran successfully." The next routine fired at 2 PM and did the same. By evening I had three "successful" runs and zero output. The cron was healthy. The MCP server was healthy. The Realtime broadcast was healthy. Each individual subsystem was passing its own test.
The bug was an SSE bridge that re-subscribed to the wrong channel on restart. Each piece was locally correct; the system was globally lying. No agent could have found that bug by reading the green checks. I found it by sitting with three terminals open for an hour and watching what the broadcast actually carried versus what the bridge actually filtered. The fix took five lines. The diagnosis took ninety minutes.
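To make the shape concrete, here is a minimal sketch of that bug. The in-process EventEmitter stands in for the real broadcast layer, and the channel names are invented; only the restart path is faithful to what happened.

```ts
import { EventEmitter } from "node:events";

// Stand-in for the real broadcast layer; channel names are illustrative.
const bus = new EventEmitter();

class SseBridge {
  private channel: string;

  constructor(channel: string) {
    this.channel = channel;
    this.subscribe();
  }

  private subscribe(): void {
    bus.on(this.channel, (msg) => console.log("forwarded:", msg));
    console.log(`bridge subscribed to ${this.channel}`); // the log that lies
  }

  restart(): void {
    bus.removeAllListeners(this.channel);
    // BUG: the reconnect path falls back to a default instead of the channel
    // the bridge was constructed with, so it now listens where nothing publishes.
    this.channel = "routines:default";
    this.subscribe(); // prints "subscribed" again; every health check stays green
  }
}

const bridge = new SseBridge("routines:linkedin");
bus.emit("routines:linkedin", "8:30 AM post"); // forwarded
bridge.restart();
bus.emit("routines:linkedin", "2 PM post");    // dropped silently, "successfully"
```

Every log line in that sketch is true. That is the whole problem.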
If I had been operating under "MTTR is all you need," I would have shipped the next ten routines, the bridge would have collected ten more silent misses, and by the time anyone noticed, the failure surface would be a different shape entirely. That is the catastrophe machine. The MTTR for each individual incident keeps falling. The system keeps becoming less describable.
Discipline 1: Keep the failure mode in writing
Every multi-hour debug session in this stack ends with two artifacts: a fix, and a paragraph in a CLAUDE.md somewhere that says "this looks like a normal X, it is actually a Y, here is the tell." If I do not write that paragraph, the model will rediscover the failure mode the next time it touches that code, and I will pay the diagnosis cost again.
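The paragraph has a fixed shape: what it looks like, what it actually is, and the tell. A hypothetical entry, in the format I use (the section name and table are invented):

```md
## Known failure mode: silent routine runs
Looks like: cron fires, the run logs "success", the Realtime broadcast is healthy.
Is actually: the SSE bridge re-subscribed to the wrong channel after a restart.
The tell: "successful" runs with zero rows in the published-output table.
Check the bridge's subscribed channel against the broadcast channel first,
before touching the cron or the MCP server.
```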
The agent is genuinely fast at fixing bugs. It is not faster than me at remembering bugs I have already seen. That asymmetry is the whole game.
Discipline 2: Treat "the test suite passes" as evidence about the test suite, not the system
Mitchell's complaint about declining bug reports as a metric lands here. In a stack where the agent writes most of the code AND most of the tests, a green suite tells you the agent's model of correctness is internally consistent. It does not tell you that the model's idea of correctness matches yours.
I now require the test failure modes to be designed by me even when the test code is written by the agent. What can break? What would it look like when it breaks? Which specific user surface goes silent? Those questions are mine to answer before the agent writes a single expect. The agent then implements; I verify the failure space, not just the test code.
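Concretely, the split looks like this. A hedged sketch using vitest; runRoutine and fetchPublishedPosts are invented names, because the assertion shape is the point, not the helpers:

```ts
import { test, expect } from "vitest";
// Hypothetical helpers, named for illustration only.
import { runRoutine, fetchPublishedPosts } from "./routines";

// The failure mode is mine: "the run reports success but the user surface
// is silent." The agent wrote the test body; the question came first.
test("a successful run leaves a visible artifact, not just a green status", async () => {
  const result = await runRoutine("linkedin-morning");
  expect(result.status).toBe("success");

  // This is the assertion that would have caught the bridge bug: success with
  // zero delivered output is a failure, whatever the health checks say.
  const posts = await fetchPublishedPosts({ since: result.startedAt });
  expect(posts.length).toBeGreaterThan(0);
});
```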
Discipline 3: Read the diff before you run it, even when it is small
The cheapest, most boring discipline. The agent will hand you a 12-line patch that looks obviously correct, and 11 of those lines actually are. The 12th will quietly drop a WHERE user_id = ... because the agent's mental model of the function didn't carry that constraint forward. Reading 12 lines costs forty seconds. Recovering from a forgotten WHERE clause in production costs a weekend.
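Here is the shape of that twelfth line, reconstructed as a sketch. The Db interface, table, and function names are stand-ins, not my schema:

```ts
// Stand-in types for illustration; not the real client or schema.
interface Db {
  query(sql: string, params?: unknown[]): Promise<void>;
}

async function pruneStaleDrafts(db: Db, userId: string): Promise<void> {
  // Before the agent's cleanup, the query read:
  //   DELETE FROM drafts
  //   WHERE user_id = $1 AND updated_at < now() - interval '30 days'
  //
  // After the patch, it reads:
  await db.query(
    "DELETE FROM drafts WHERE updated_at < now() - interval '30 days'"
  );
  // The user_id constraint did not survive the refactor. Note that userId is
  // still in the signature: the call sites didn't change, only the query did,
  // which is exactly why nothing in the diff looks wrong at a glance.
}
```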
This is the one rule that comes up cleanest in Hashimoto's follow-up: when someone asked what to do instead, his answer was a single word plus a parenthetical, "Think (use AI, but think)." That is not anti-AI. It is anti-handing-the-system-to-the-AI-and-stepping-away.
Why this matters for an Indian indie builder
The Indian SaaS layer is being told the loudest version of the AI-native story right now. Layoffs are being narrated as agent-substitution. Funding decks have "AI-native" headers that two years ago said "mobile-first." Founders running with two engineers and one agent will absorb the "MTTR is all you need" mindset by osmosis, because it sounds like leverage.
It is leverage, right up until it isn't. The companies that quietly survive this cycle will be the ones that kept human comprehension as the gating function — not the ones that optimized for time-to-merge. The dashboards will lie convincingly until they don't, and the recovery cost from a globally incomprehensible system is not three engineers; it is a rewrite.
Use the agent. Keep the failure modes in writing. Read the diff. The trap is real, and the way out of it is unglamorous in exactly the way the SV pitch deck never admits.
