Your Engineers Don't Have an AI Skill Gap. They Have a Permission Problem.
There is a gap, in most engineering organizations right now, between what the AI coding tools can actually do and what their engineers are actually doing with them. The gap is wide, and it is not closing on its own. The standard moves the buying organization is making to close it (the procurement, the launch email, the lunch and learn, the no-meeting Friday) are not closing it either. It is the corporate gym membership problem at industrial scale: the licenses have been deployed, the welcome email has been sent, a couple of enthusiasts are getting real value out of the machines, and the rest of the team is still doing roughly what they did before the gym opened.
I am writing from both sides of that gap. I spent years as a consultant in agile transformation, embedded in organizations for weeks or months at a stretch, helping teams change how they planned and how they shipped. I believed in the work, and I still do. The principles are sound, and the proof is in the years since I left consulting, in which I have built multiple product engineering teams into the kind of disciplined, collaborative, well-oiled organizations that the best of agile is meant to produce. The methods work, but like any craft they work only when they are practiced, stressed, exercised, and internalized to the point that they feel natural, which is not a condition any workshop or slide deck has ever delivered. Reps do. What I came to understand, slowly and uncomfortably, was that the consulting firms I worked for did not actually need the changes to stick. They needed them to stick just long enough to invoice, and then to fail thoroughly enough for the same company to call back six or twelve months later for retraining. The failure was not a defect of the engagement, oh no my friends, it was the baked-in revenue model.
I now spend a meaningful portion of my time helping organizations adopt AI-native engineering practices, which is precisely why I am writing this. The pattern is repeating. The same firms that profited from the partial success of agile transformation have rebranded as AI transformation, with slicker deliverables, denser decks, and the same underlying engine.
Consider the most visible current version: the Forward Deployed Engineer, or FDE, who arrives from an AI lab to embed with a customer team and help them get real work done with the lab's tools. The model is genuinely useful at its best, because an FDE brings deep, current expertise about how the agent actually behaves and what techniques produce reliable results at scale, which is knowledge most organizations cannot build internally as quickly as they need it. The question worth asking up front of any such engagement, FDE or otherwise, is what your team looks like when the embedded expert leaves. Did your engineers develop the capability to build, evolve, and control their own AI process, or did they primarily learn how the expert thinks, in a way that benefits from the expert returning every six months? Neither answer is predetermined; it depends entirely on how the engagement is structured, on whether internal capability transfer is a design goal or a byproduct. The structural risk is the same one that followed every agile transformation: when the capability lives in the consultant rather than in the team, the capability leaves with the consultant.

Before I figured out what actually works, I ran my own engineering teams through every variant of this pattern, the ones I had once sold as a consultant and the ones I later bought as a manager, with effectively zero behavioral change to show for it. And then I tried something different, and it worked so completely that I have not been able to stop talking about it since. We will get to that.
First the playbook, because the playbook is everywhere, and once you have seen it you cannot unsee it. A vendor, or your own L&D organization, or a consultant in artfully thin spectacles sells your CTO a transformation roadmap, a kickoff gets convened, a slide deck appears titled something like "The Future of Work" or "AI-Native Engineering," and a workshop occurs with breakouts, color-coded stickies, and a closing exercise in which everyone shares one thing they are "committing to try." Then everybody returns to their tickets, the sprint pressure that was real before is still real, the agent mode that nobody actually trusted to do real work before is still nobody's primary mode of work, and six months later, when nothing has changed, somebody hires a different consultant and the cycle resumes.
I have watched this happen at companies large enough to know better, run by people I respect. The pattern is consistent enough that, once you have seen the diagnosis, you can predict the next engagement before it starts.

What follows is a taxonomy of the three most common failure modes, the lies the industry tells itself while not actually transforming anybody. If you are an engineering leader, you are almost certainly nodding through at least one of them.
Lie #1: The License Dump
This is the most common move in the playbook. Procurement buys the licenses, IT provisions the seats, and an email goes out announcing that Claude Code (or Cursor, or Copilot, or whichever tool dominates the trade press that quarter) is now "available," accompanied by a link to the vendor's getting-started page. Then everybody waits for the productivity numbers to rise.
They do not rise. The license is not the transformation; the license is a receipt of intent. What you have actually purchased, by depositing a tool on a team and walking away, is the right to claim on a board slide that your engineering organization "has access to AI." That is the entire product. Your engineers' behavior remains exactly what it was, because a tool sitting unused on a developer's machine is, by every measurable standard, indistinguishable from no tool at all.
The cynical part is not the deposit, the cynical part is what happens next, which is to say nothing. The licenses get dropped, the needle does not move, and the same executives who approved the budget begin asking six months later why the productivity gains have not materialized, and at no point in the entire arc does anyone open the vendor dashboard to check the token-burn or session-count data that would tell them, in five minutes, that agent mode is not being used by anybody on the team. The investigation is never run because the investigation would yield an answer that contradicts the slide deck. It is easier to perform surprise.
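The five-minute check really can be this plain. What follows is a minimal sketch, not any vendor's API: it assumes a hypothetical per-engineer usage export with columns named engineer, agent_sessions, and tokens, and a made-up threshold of five sessions. Real dashboards name and shape these fields differently, so treat every name here as a placeholder.

```python
import csv
import io

# Hypothetical export: one row per engineer for the period under review.
# Column names are assumptions, not any vendor's real schema.
SAMPLE_EXPORT = """engineer,agent_sessions,tokens
ana,42,1850000
ben,0,12000
chi,1,30000
dev,0,0
"""


def agent_mode_holdouts(export_text, min_sessions=5):
    """Return engineers whose agent-session count is below min_sessions."""
    reader = csv.DictReader(io.StringIO(export_text))
    return [row["engineer"] for row in reader
            if int(row["agent_sessions"]) < min_sessions]


def adoption_rate(export_text, min_sessions=5):
    """Fraction of engineers at or above the session threshold."""
    rows = list(csv.DictReader(io.StringIO(export_text)))
    if not rows:
        return 0.0
    active = sum(1 for row in rows
                 if int(row["agent_sessions"]) >= min_sessions)
    return active / len(rows)


if __name__ == "__main__":
    print(agent_mode_holdouts(SAMPLE_EXPORT))
    print(f"{adoption_rate(SAMPLE_EXPORT):.0%}")
```

The point of the sketch is how little it takes: a list of names and one percentage is enough to contradict the slide deck.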
Look, what you are really asking for when you deposit a tool and walk away is not a migration between IDEs, and not the transition from Subversion to Git. You are asking a senior engineer to invert the most fundamental act of their craft, which is the act of typing code into a file, after fifteen or twenty years of being the author, and to start directing instead, and you are asking them to do it on their own time, between sprint commitments, on the strength of a vendor link and a Slack announcement. It is not a tool change, it is an identity change wearing a vendor-link disguise. The muscle memory, the workflow, and the way the problem gets decomposed from the moment a ticket arrives in their queue: all of it has to be rebuilt from the ground up.
It gets worse. A significant fraction of the engineers you are asking to do this have already tried the tool on a weekend side project, or a small internal utility, or a greenfield script with three files in it, and the tool, in that context, performed well enough to confirm what they already suspected, which is that this is a toy. A clever toy, perhaps even a useful one for quick scripts and prototypes, but not a serious engineering instrument for production-critical systems, and certainly not for the seven-year-old brownfield monolith with its tangled service boundaries and its documentation that is mostly wrong. They tried it, drew their conclusion, and now constitute your most articulate, most senior, most credentialed resistance, so when you drop the enterprise license on their machine you are not introducing them to the tool, you are confirming, by treating it as a self-service problem, that you do not understand the chasm between what they believe the tool can do and what it can actually do when used correctly.
Lie #2: The Permission Slip
This is the gesture aimed at organizations that recognize learning matters but cannot quite admit that asking for it is not enough to produce it. You announce a no-meeting Friday, or a reading day, or an hour a day for AI exploration, and you assume your engineers will use the space to learn the tools. They will not. The reason is that high-performing engineering teams do not actually take the permission they are given. Velocity is the unit they measure themselves on, and ceding velocity to learning, even with explicit cover from above, feels wrong in the body. The engineers you most want to upskill are the ones least willing to use the slack. They will use the no-meeting Friday to catch up on the four tickets they fell behind on during the four meetings you trapped them in on Tuesday. Pressure does not need to be applied from above when the engineers carry it themselves.
Underneath the self-imposed pressure is the deeper problem, which is that learning agent mode is not the kind of learning that fits a Friday afternoon. It takes deliberate, repeated, sustained practice on real work, the kind that produces a feel for the tool over time, and reading a blog post does not produce that feel, and neither does watching a tutorial. Reps do. And reps require a unit of work shaped to make the reps the job, which is something a permission slip cannot deliver no matter how generously it is granted.
Lie #3: The Echo Chamber
You schedule a lunch and learn, bring in pizza, and ask a senior engineer who is "really excited about Claude" to deliver a forty-five-minute demo. Or you spin up a #ai-tips Slack channel and seed it with three articles and a YouTube link. Or you do both, on the theory that exposure plus a shared workspace will produce momentum.
The same five people show up to the lunch and post in the channel, and they are the engineers who would have figured this out without you, because they are the AI-curious minority who read newsletters and watch tutorials on their own time, and they were going to keep doing that regardless of whether you bought them pizza. The interventions feel productive because the five people are productive, and they appear, in the post-lunch survey and the channel engagement metrics, to be the leading edge of a broader shift, except they are not the leading edge of anything, they are the people who did not need the intervention in the first place. The twenty other engineers on your team, the ones whose behavior actually needs to change for the organization to shift, did not attend the lunch and are not reading the channel, so the intervention reaches the already-converted while the unconverted remain unconverted, and you have purchased the appearance of evangelism without any of the conversions.
None of this is to say that lunches, channels, or designated champions are worthless, because they are excellent supporting structures and they remain so even after a team has been transformed. A champion who has lived through a sprint becomes a real evangelist, with lived experience to teach from, and given how quickly AI moves, a designated person whose job is to track the changes is genuinely useful in perpetuity. A #ai-tips channel populated by engineers who actually use agent mode every day becomes a working knowledge base, and a lunch and learn led by somebody with two weeks of forced reckoning behind them becomes a real teaching moment because there is a real teacher in the room. None of these interventions is useless; they are insufficient as the first move, and you cannot run them before the sprint and expect them to do the sprint's work.
I ran all of these in some form, and I believed in them, and they did not change how my engineers actually worked. They were using Cursor the way they had used Copilot the year before, highlighting a function for a small refactor, tabbing through inline autocomplete to push out changes quickly, occasionally posing a question to the chat the way they would have once posed it to Stack Overflow. Agent mode sat in the corner of the IDE, summoned now and then for a quick rewrite of a highlighted block, but never trusted to drive the work. Six months in, they were still on Sonnet 3.5, still on the same workflows they had a year before, still treating the agent as a curiosity rather than a craft, and I had spent real budget, given real time, and produced exactly nothing. The mistake was not running these interventions, the mistake was treating them as the first move.
The reason none of it worked is, in every case, the same. None of these interventions changes the unit of work the engineer is measured on, and the standup is still about your story, the sprint review is still about your story, the performance review is still about your story. When learning competes with shipping, shipping wins every single time, because shipping is what the system you operate within is actually built to reward, and the system will consume a hundred posters about innovation and a hundred lunches and ship the same way it did the week before.

What Actually Worked

The sprint is the conclusion of a longer arc, not its premise. The arc is worth narrating briefly, because it explains why the format is shaped the way it is.
When I first put modern coding tools in front of my engineers, the gains were marginal. The multi-tab autocomplete made small changes faster, the inline refactors handled highlighted blocks well enough, and the chat served as an occasional Stack Overflow replacement. It was a better Copilot. Pushed to use agent mode for real work, the tool hallucinated, blew its context window within three turns, and produced code so unusable that the engineer would close the panel and write it by hand.
Meanwhile, the AI YouTube ecosystem was selling the opposite story. No experience required. Follow this method. Ship a million-dollar SaaS overnight. To any practicing engineer it was obvious nonsense, but the gap between the fantasy on YouTube and what I was watching in my own team was so wide that the truth had to live somewhere between them. A few people, somewhere, were doing real work with these tools. The trick was not the tools; the trick was how the tools were being used.
So I spent a couple of months, mostly at night on my own couch with Cursor open and Sonnet 3.5 hallucinating into slow mode, trying to build a complicated application end to end without touching the code myself. Implementation plans, PRDs that named the tech stack in a table the agent could refer to, small sequences of work, fresh sessions when the context collapsed, rules written into the repo so the agent stopped making the same wrong decision twice. These techniques, accumulated through what felt like a thousand restarts, became BMad V1: six markdown files in a repository, the entire methodology shorter than a chapter of a book.
I brought it to my team, and it did not stick. They had the documents and they had me, available and enthusiastic, to walk them through every piece of it, and they still used Cursor like Copilot, because the pressure was real and the methodology was mine, not theirs. The sprint was the format I designed after I realized that no amount of giving people BMad would ever substitute for letting them discover the same techniques on their own.
The shape of the sprint was deliberately simple: two weeks, one story per engineer, sized to something that would normally take three or four days. That story was the only thing the engineer was responsible for during the sprint, and the only constraint was that it had to be executed end to end using agent mode, and if they finished before the two weeks were up they had to start the same story over from scratch. That last constraint is not a joke, it is the most important rule in the entire format.
Within two weeks, almost every engineer on the team was operating in agent mode at full capacity. People who had been writing code the same way for a decade rewired themselves in fourteen days. And the product manager on my team, who had nothing to do during the sprint because he was not a developer, became so envious of the engineers' visible excitement that I suggested he go play with Gemini Gems on his own. He called me at six the next morning, breathless, because he had built fifty-three of them overnight, including one that served as a domain expert on a corner of our codebase he had been quietly nagging engineers about for months.
That is what happens when permission is real. Permission, in this format, is not a slogan, it is built into the math by replacing the unit of work for two weeks rather than supplementing around it, and none of the three failure modes above does that. Only the sprint does.
And the techniques that emerged were precisely the techniques nobody can teach, which includes me. I had tried. People arrived at them on their own, in real time, watching each other in the morning standup. If I write an implementation plan and feed the agent in small sequences, it goes further. If I restart the session when it deteriorates, it stops being dumb. If I write rules into the repo, it stops making the same wrong decision twice. These were the same conclusions I had reached on my own couch a year earlier, the conclusions that had become BMad V1, the conclusions every framework released since has independently rediscovered. Every engineer on my team arrived at them themselves, in two weeks, because the format afforded them no other option.
The deeper outcome, the one nobody planned for, was a shift in how the team thought about every part of the job. Once engineers had spent two weeks questioning their assumptions about the most fundamental act of their craft, the questioning did not stop at the editor, and they began looking at every manual process in their day, every error-triage routine, every code-review ritual, every deploy runbook, every postmortem template, and asking the same kind of question they had spent the sprint asking about their code, which is whether the agent could handle this instead and what the workflow would look like if it did. This is the AI-native cultural shift that every slide deck promises and every all-hands email tries to mandate into existence, and look, mandates do not produce it, only the right format does. Two weeks of forced reckoning with the agent on a single piece of real work does what no executive proclamation has ever done, which is to make every engineer on the team start asking the AI-native question of their own work, on their own initiative, every day, without being told to. The sprint was engineered for agent-mode adoption, and what it actually delivered was an organization that had begun to think differently about everything.
How to Actually Run It
If you have read this far, you have stopped paying for the gym membership and are ready to install the squat rack. Good. Here is the shape.
Secure the rest of the organization before you begin. If your scrum master, your product manager, or anyone above you intends to inject sprint pressure halfway through, the format collapses. The limited scope must be sacred for the full two weeks, and that deal is struck out loud, with everyone who could override it, before day one.
Two weeks is the correct length: not one, not three. The reason is the weekend in the middle. Almost every time I have run this, I have watched people get excited by Friday of week one and devote part of their weekend to exploring on their own. They return Monday with more to share, and week two compounds atop it. One-week sprints do not afford that gap, and anything longer dilutes the urgency.
Handpick the stories. This is the one part of the sprint where you cannot be agile-organic and permit self-selection, so as the engineering manager you select, and each person receives exactly one item sized to what would normally take them three or four days without agent mode, no more than a week. It can be a story, a defect, a bug, anything, and what matters more than the type is variety across the team, one engineer on something UI-heavy, another on a complex backend story, another on something data-heavy or performance-sensitive. Mix greenfield and brownfield, and especially brownfield, because a great many people will inform you that AI does not function in brownfield, and putting that claim to the test against real legacy code is where this format earns its keep.
Make the stories independent of one another so nobody is blocked awaiting somebody else's work.
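The selection rules above are easy to state and easy to violate under deadline, so it can help to check the slate before day one. Here is a minimal sketch, assuming a hypothetical Story record that I made up for illustration. The five-day ceiling stands in for "no more than a week," and the "at least three kinds" variety threshold is my own assumption, not a rule of the format.

```python
from dataclasses import dataclass, field


@dataclass
class Story:
    key: str          # ticket id, e.g. from Jira or Linear
    engineer: str
    kind: str         # e.g. "ui", "backend", "data", "performance"
    brownfield: bool
    est_days: float   # estimate WITHOUT agent mode
    depends_on: list = field(default_factory=list)  # keys of other stories


def check_slate(stories):
    """Return a list of problems; an empty list means the slate passes."""
    problems = []
    keys = {s.key for s in stories}
    seen = set()
    for s in stories:
        # Exactly one story per engineer.
        if s.engineer in seen:
            problems.append(f"{s.engineer} has more than one story")
        seen.add(s.engineer)
        # Sized at three or four days, a week at the outside.
        if not 3 <= s.est_days <= 5:
            problems.append(
                f"{s.key} is sized at {s.est_days} days; aim for 3-4")
        # Nobody should be blocked on another sprint story.
        blocked_by = [d for d in s.depends_on if d in keys]
        if blocked_by:
            problems.append(f"{s.key} depends on sprint stories {blocked_by}")
    # Variety across the team (threshold is an assumption).
    if len({s.kind for s in stories}) < min(3, len(stories)):
        problems.append("too little variety in story kinds")
    # At least one story against real legacy code.
    if not any(s.brownfield for s in stories):
        problems.append("no brownfield story on the slate")
    return problems
```

Calling check_slate on the list you handpicked returns plain sentences you can fix one at a time. The judgment about which story suits which engineer stays with you.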
No frameworks during the sprint. Do not hand them BMad, do not hand them Spec Kit, do not hand them any playbook, because the entire point is that they arrive at the techniques on their own. That is also what makes them able to use BMad meaningfully when you do introduce it after the sprint, as the framework that turns their newly internalized techniques into a repeatable team practice. You may offer a few starting tips: good documentation in the repo, an AGENTS.md or CLAUDE.md file, the practice of writing rules, the discipline of producing an implementation plan and feeding it in measured chunks, the wiring of the agent to Jira or Linear via an MCP server or a CLI. That is the floor, and past that, leave them alone.
Reconstruct the standup. From day one, it is no longer a status meeting. Each person shares one new or interesting discovery from the day before, and that is the entire standup. The first day will feel slightly awkward, and by day three it will not, and people will begin saving things to share, and the morning will become the most generative part of the sprint.
Bookend the two weeks with two demonstrations. On the first Friday, conduct an extended standup so people can elaborate on what they have been finding, and on the final Friday, host a real demo, one to two hours depending on team size, where everyone presents not the story they shipped but what they learned.
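If it helps to put the rhythm on a calendar before you negotiate it with the rest of the organization, here is a small sketch. It assumes the sprint starts on a Monday and runs ten working days. The function name and the ritual labels are mine, invented for illustration.

```python
from datetime import date, timedelta


def sprint_calendar(start):
    """Map each working day of a two-week sprint to its ritual.

    Assumes the sprint starts on a Monday and runs ten working days.
    """
    if start.weekday() != 0:
        raise ValueError("this sketch assumes a Monday start")
    calendar = {}
    # Offsets 0..11 cover Monday of week one through Friday of week two.
    for offset in range(12):
        day = start + timedelta(days=offset)
        if day.weekday() >= 5:
            continue  # the weekend in the middle is left unscheduled
        if offset == 4:
            calendar[day] = "extended standup"
        elif offset == 11:
            calendar[day] = "final demo: what you learned"
        else:
            calendar[day] = "discovery standup"
    return calendar


if __name__ == "__main__":
    for day, ritual in sprint_calendar(date(2025, 1, 6)).items():
        print(day.isoformat(), ritual)
```

The weekend is skipped on purpose. Nothing is scheduled there, which is exactly why it works.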
And the rule I lead with on day one, every single time, is that each engineer stays on their one story for the duration of the sprint: if you finish, you begin again, and if you hit a wall, you back up and try a different approach. Cycling through the same story two or three times with different techniques is where the techniques compound, and the breakthrough you are looking for is not at the finish line, it is in every restart your engineers make on the way there.


One Last Thing
Here is the part that does not get said often enough.
The AI training and transformation industry is young, but the playbook it is using is not. The shape of the engagements, the deliverables, the maturity models, the certification programs, the quarterly retainer structure, all of it is borrowed wholesale from the agile transformation industry that preceded it by twenty years, and that playbook has a structural problem I lived inside and watched up close. It is engineered for the appearance of motion at the buying organization, not for actual behavioral change in the team, so the deliverables produce beautiful slides for the board deck but they do not, on their own, produce engineers who use the tool differently on Monday morning. This is not a moral failing of the people running the engagements, most of whom believe in the work. It is a design defect baked into the shape of the agreements, one that the buying market and the selling market have not yet had the incentive to fix together.
Two weeks. One story per person, handpicked. Agent mode only, end to end. Daily share-something-cool. Bookend Fridays. Restart and try again if anyone finishes early.
Do not hedge it. Do not shrink it to one week because two feels long; the two weeks are the format, not the budget. And do not hand them a framework. And do not inform me that your team is too senior, or your brownfield too messy, or your engineers will refuse, because mine were senior, ours was brownfield, they were skeptical and busy, and they participated. Yours will too.
One more thing, before you assume that running the sprint will solve your AI transformation problem on its own. If you run it and change nothing else in your organization, your engineers will use agent mode well, look at every manual process in their workflow with new eyes, and ship somewhere between ten and twenty percent faster than they did before, which is meaningful and worth doing but is not transformative on its own. The transformative shift is what happens when the same thinking moves through every role in the software development lifecycle, when product, design, technical leadership, quality, and operations all begin working AI-natively against the same backlog, in the same coordinated way that agile reorganized engineering twenty years ago, and the two-week sprint is the engineering prerequisite for that shift, not the shift itself. The BMad method, which is the framework I have been developing since the early restart cycles on my own couch, is the blueprint for the larger restructuring, and the framework many engineering organizations have already begun adopting at scale, and the next essay in this series is about how to run it. The sprint is where you start.
If you run this and it works for your team the way it has worked for mine and for every other team I have observed run it, I would love to hear what your version looked like.