How Better Tools Can Make Things Worse

A different kind of honesty

We started the last article with a fictional team. You and I are beyond that now.

Let me tell you about teams I actually worked with – teams I trained, teams I served as a programmer, teams I inherited code from. Names and details have been changed to protect the innocent. The disease was real. The teams were good. None of this was anyone’s fault, which is the most annoying thing about it.

A few years ago I started writing a book called AI for Programmers. We had three anchoring ideas. By the time we finished a draft of chapter four, all three had been integrated into the tools themselves. We stopped writing. If the people writing the book about a thing can’t write it fast enough, what chance does a working team have? Not much, it turns out. Not without a different kind of effort than anyone is currently making.

Here’s what I saw.

Copilot: the magic of completion

The first time I watched a senior developer use Copilot for a full afternoon, they laughed out loud three times. Once at a particularly good completion. Once at a particularly bad one. Once at the realization that the tool was, in some quiet way, paying attention to them. Not the way an editor pays attention. The way a colleague does.

It was magic. There’s no other word. Tab. Tab. Tab. The boring half of every function appeared, mostly right, occasionally so right you’d lean back in your chair.

When Copilot was new, the first pushback against it was about training data, and we mostly didn’t resolve it. We used the tools anyway. The codebase impact was small. PRs looked like PRs. Reviews looked like reviews. In Elixir specifically, I noticed two things – module documentation got better, and inline comments got more numerous than they needed to be. Neither felt like a problem at the time. Both were early signals that the shape of how code was being written had started to drift.

Everything was fine. At least at first I thought it was. We were all watching the wrong horizon.

Cursor: the delegate, and the day everything worked

Then we started asking the tools to do things. Write this test. Refactor this method. Generate this boilerplate. Small, scoped delegations. The [Accept this edit?] button – the one that would come to define the next several years – entered our working day quietly, almost politely, and never left. Click. Click. Click. Decisions made in rapid succession, mostly correctly, on information that was just barely sufficient. You absorbed the rhythm without noticing the rhythm was absorbing you.

And then we had The Sprint.

I’m being deliberately vague – there were several of these, on several teams. Small project, clean problem, tight scope, careful steering. It worked beautifully. The team got almost exactly what they asked for, faster than they could have built it, at a quality that would have made them proud in any era. People walked out of that sprint changed. This is what programming is going to be now, somebody said. We nodded. It seemed obvious.

That sprint was real. The success wasn’t a setup. The team didn’t secretly build something that fell apart later. The prototype shipped, the internal tool got used, the feature held up. The success was honest.

The lesson I drew from it was wrong. The conditions that made it work – small scope, clean problem, careful steering – did not generalize. We mistook a beautiful day for a forecast. So did most of the people I respected. So did, as far as I can tell, most of the industry. We took the existence of the magical sprint as evidence that the magical sprint was the new baseline, and we organized our optimism – and our staffing decisions, our project plans, our hiring – around that assumption.

Not enough mistakes

Somewhere in the process of selling my soul to the [Always Accept] button, I was badly shaken. Not because AI was making too many mistakes, but because it was not making enough.

I’d been running a mentoring program for underrepresented juniors, and I sat with a few of them and told them I was afraid. Not of being replaced. I was afraid that the kind of mistake that had taught my generation how to be programmers – the wrong-but-shippable function, the abstraction that almost worked, the bug you only understood after staring at it for two days – was being smoothed away by tools good enough to deliver mostly-right code from a mostly-right prompt. I didn’t know how anyone was going to learn the craft if the craft was no longer made of those mistakes. I still don’t. I told them so. I owed them the honesty.

How fast it happened

ElixirConf in September of 2025 was, for me, a black cloud. Everyone was talking about AI. More than half the room were skeptics. Josh Price of Alembic and I were giving the closing keynote – A New Case for Elixir – and the mood in the hall was already squeezed. People kept finding me in the hallways with the same look on their faces. They didn’t quite have words yet. They were trying to figure out what the next year was going to do to them.

The juniors I talked to felt it sharpest. They could do more than they used to, but they also felt they were doing too much – more than they understood, more than they could account for, more than they could explain to anyone reviewing their work. They wanted oversight. They couldn’t quite say from whom. The seniors and managers were quieter about it but said the same thing in a different key. They could feel the leverage. They could also feel the cost. They were watching their codebases drift away from them in a direction nobody had charted.

I closed that keynote with a line I didn’t deliver lightly: you are going to be OK. The room exhaled. Several people came up to me afterward and said they’d needed to hear it. I didn’t know if I believed it. I said it anyway, because I thought it was probably true, and because the room needed somebody to say something.

Six months later, in April of 2026, José Valim gave a talk where he asked the room how many of us were using AI, and how much. The answers came back like weather. Most of us. About half our code. Half. Six months from a black cloud of skepticism to half the code in the room being written with these tools. Nobody had prepared the language community for that pace. Nobody had prepared the industry for that pace. We’d crossed a threshold, somewhere between Copilot and Cursor, between helper and delegate, and most of us hadn’t noticed when. By the time José counted us, we were already deep on the other side.

Where we actually are

I’m a trainer and a consultant. I’ve dealt with dozens of customers in the last few years, and certain problems are frighteningly common. Three of them show up, usually all at once, in different proportions depending on the team.

Functional duplication. The codebase has a date-formatting helper in utils/dates.ts and another in lib/formatting.js and a third inlined in some component nobody has touched in eight months. They do almost the same thing. They drift apart slowly. One eventually develops a bug the others don’t have, a customer hits it, somebody fixes that one, and now you have two correct date-formatters and one subtly wrong one, all live, all in production. Multiply by every common operation in your codebase.

Conceptual proliferation. A junior asks the senior tech lead what the idiomatic way to handle errors is in this codebase. The senior pauses for slightly too long. There used to be one way. There are now four. The junior watches the pause and learns something neither of them wanted them to learn – that the codebase no longer has a single answer to give.

Feature accretion. The obsolete version from March is still in the repo alongside the rewrite from September and the replacement from December. Removing them would require understanding which is current, and nobody is sure.

These three damages cohabit. They feed each other. The codebase isn’t rotting in the old way; it’s accumulating in a new one, faster than any human can hold in their head.
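To make the first of those damages concrete, here is a minimal sketch of functional duplication in TypeScript. Everything in it is hypothetical: the function names and bodies are stand-ins I made up for the three helpers described above, not code from any real engagement. The point is only that three near-copies can agree on most inputs and still disagree on some.

```typescript
// Hypothetical stand-ins for three date helpers that grew up separately.
// All names and bodies are illustrative assumptions.

// Zero-pads a number to two digits.
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Stand-in for the helper in utils/dates.ts: builds YYYY-MM-DD by hand.
function formatDateUtil(d: Date): string {
  return `${d.getUTCFullYear()}-${pad2(d.getUTCMonth() + 1)}-${pad2(d.getUTCDate())}`;
}

// Stand-in for the helper in lib/formatting.js: same intent, different route.
function formatDateLib(d: Date): string {
  return d.toISOString().slice(0, 10);
}

// Stand-in for the copy inlined in a component: forgets the zero padding.
function formatDateInline(d: Date): string {
  return `${d.getUTCFullYear()}-${d.getUTCMonth() + 1}-${d.getUTCDate()}`;
}

// True when all three helpers produce the same string for a date.
function allAgree(d: Date): boolean {
  const expected = formatDateUtil(d);
  return expected === formatDateLib(d) && expected === formatDateInline(d);
}

const lateNovember = new Date(Date.UTC(2024, 10, 23)); // 23 November 2024
const earlyMarch = new Date(Date.UTC(2024, 2, 5)); // 5 March 2024

console.log(allAgree(lateNovember)); // true: two-digit month and day hide the gap
console.log(allAgree(earlyMarch)); // false: the inline copy yields "2024-3-5"
```

A test suite that only happens to use late-month dates would pass against all three copies, which is roughly how the drift stays invisible until a customer hits it.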

But the place I see them hit hardest, in nearly every engagement, is wherever the codebase uses inversion of control – protocols and behaviours in Elixir, dependency injection in Java or Ruby, events in JavaScript. These were precisely the abstractions that used to simplify our code and make it more powerful. They worked because senior engineers built them with finesse and skill, defending the boundaries deliberately, choosing what crossed the seam and what didn’t. That work was the most important and least visible thing seniors did.

It’s exactly the work the agents can’t do. The coherence isn’t local. It lives across files, across modules, across time, in the heads of people who were no longer steering. The agents wrote what they were prompted to write, individually correctly, and the system added up to less than the sum of its parts. Bloated abstractions became bloated protocols. Misfit seams became misfit boundaries. Functional duplication climbed up a level into behavioural duplication, where two modules implemented nearly the same protocol for nearly the same reason and neither author had written enough of the system to have a complaint about it.
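Behavioural duplication is easier to see than to describe, so here is a small sketch, again in TypeScript and again with made-up names. `Notifier`, `AlertSender`, and both email classes are assumptions for illustration. Each interface is reasonable on its own; the problem only appears when you put them side by side.

```typescript
// Hypothetical names throughout; not taken from any real codebase.

// Seam one, written for one feature.
interface Notifier {
  notify(userId: string, message: string): string;
}

// Seam two, written later for another feature, for nearly the same reason.
interface AlertSender {
  sendAlert(recipient: string, body: string, urgent?: boolean): string;
}

class EmailNotifier implements Notifier {
  notify(userId: string, message: string): string {
    return `email to ${userId}: ${message}`;
  }
}

class EmailAlertSender implements AlertSender {
  sendAlert(recipient: string, body: string, urgent: boolean = false): string {
    const prefix = urgent ? "[URGENT] " : "";
    return `email to ${recipient}: ${prefix}${body}`;
  }
}

// A one-line adapter makes the overlap visible: every Notifier call
// can be expressed through the AlertSender seam.
function asNotifier(sender: AlertSender): Notifier {
  return { notify: (userId, message) => sender.sendAlert(userId, message) };
}

const direct = new EmailNotifier().notify("u1", "invoice ready");
const adapted = asNotifier(new EmailAlertSender()).notify("u1", "invoice ready");

console.log(direct === adapted); // true: two seams, one behaviour
```

The adapter is there only to expose the overlap, not as a recommended fix. Deciding which seam should survive, and what is allowed to cross it, is the design work that nobody did.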

And here is the most uncomfortable thing I’ll tell you in this article. Almost every time I see this pattern in a new engagement, my instinct, after twenty-plus years as a professional programmer, is the same: I can rebuild it. I can see what the right protocols would have looked like, how the IoC layer should have been structured. A few weeks of careful work and the codebase will snap into shape. Sometimes I can. More often I can’t, because there was no coherent design hiding underneath waiting to be revealed – no coherent design had ever been made. The instinct that I can rebuild the spine is the same instinct that built the mess in the first place: the belief that the tools are powerful enough that with one more good pass, this becomes the thing I want it to be. It doesn’t. The discipline that produces a coherent system has to be applied while you’re building, not after.

Seeing the disease doesn’t grant immunity to it. I’ve written about this exact pattern. I recognize it within hours. And I still catch myself, every time, wanting to outsmart it.

One thing to take with you

If you take nothing else from this article, take this: it’s okay to throw it away. The bigger the chunk, the more important this freedom is.

We came up in a profession where code was expensive to write, and we built habits around protecting code as if the typing was the precious part. The typing was never the precious part. The thinking was. And when the thinking didn’t happen – when the code arrived without it, generated by a tool that didn’t know your codebase or your team or your conventions – the right move is to throw it away and ask for it again, smaller, with more design in front of it.

You can do this. You’re allowed. The next pull request you look at that doesn’t fit, doesn’t quite work, doesn’t quite belong – close it. Delete the branch. Ask for a smaller version. Nothing bad will happen. Code is cheap now. That’s the whole point.

How dark it can get

I want to be honest about something the rest of this article hasn’t said yet. We don’t know how this ends.

There’s a version where it works out. Agents get better at refactoring and at attacking complexity. The discipline catches up to the volume. We learn to work alongside the agents in a way that produces consumable, reviewable, right code at a sustainable pace. Juniors grow into seniors by doing what seniors used to do, just with different tools in their hands. The craft survives in a new shape. I think this version is genuinely possible. I am also not sure how possible.

There’s a version where it doesn’t. The economics turn out to be a bubble, and the tools get more expensive than the work they replace. We look back in five years and decide most of this was a mistake – that we damaged a generation of codebases, and a generation of programmers, in service of a productivity story that didn’t quite hold up. The damage is already done by the time we admit it. We rebuild slowly, and the people who would have been the seniors of 2035 are doing something else for a living.

There’s a version that’s worse than that. The agents keep getting better, and the question stops being how do we stay in the loop and starts being whether we stay in the loop at all. I am not predicting that one. I am also not pretending I can rule it out, and the people who tell you they can are either selling you something or guessing.

The honest answer is that we are navigating without a map, at speed, in a domain where the second-order effects on craft and learning take years to show up while the tools change every six months. That asymmetry is the scary thing. Not the tools themselves. The fact that the time horizon for understanding what we’re doing is longer than the time horizon for doing more of it.

I don’t know which version we’re in. Neither do you. I’m writing the rest of this series anyway, because I think the optimistic version is real and reachable, and because not trying is itself a choice for one of the worse versions. But I owe you the acknowledgment that I’m betting on a path I can’t fully see, and so are you, and so is everyone who reads these articles and decides to do something about them.

Eyes open.

Coda

We’re tired of pretending we don’t know.

The teams are still shipping. The dashboards are still mostly green. From the outside, nothing is wrong. But all of us – the people who care about this work – we know. We know the codebases have accumulated something we don’t have a good name for yet. We know the flow is no longer ours. We know the standard slipped, and we never quite agreed to let it slip. And we don’t know which version of the future we’re in.

That part is honest. What we do know is that doing nothing is a vote, and so is mopping up after the agents one PR at a time, and so is throwing away code that doesn’t fit, and so is teaching juniors what a good pull request looks like in a world that’s been remaking itself faster than our books can keep up.

The next article is about how to come back from here. There’s a real program. It works. I’ve seen it work. But I needed this article to exist before any of that would land, because we have to start by being honest about how far we drifted, and how fast, and how much of this is still uncertain.

Next, we cook.

🛠 Train Your Team Through the AI Coding Crisis

This post is from Bruce Tate's series on what the AI coding crisis is doing to engineering teams — and what it would take to train through it instead of around it. Groxio runs private training and ongoing advisory for engineering teams using AI with Elixir, Phoenix, OTP, LiveView, Ecto, Ash, and Postgres. We start with a diagnostic conversation about where your review queue, your seniors, and your codebase actually are.
