If you want a roadmap for how AI is gonna tukkerjerbs, well, here it is. Jesus.
Let's talk about Photoshop, shall we? I suck at Photoshop. Always have, despite taking an in-person course (in Photoshop 2.0!) and at least three online courses. I started hacking at Photoshop when people were still using it to retouch photos, fer chrissake, then adopted Lightroom when Photoshop became too bloated to use on photos, then adopted Lightroom 2 and Lightroom 3 while still trying to make headway against Photoshop, then abandoned Lightroom when they abandoned cataloguing.

Photoshop, meanwhile, got folded into "Creative Suite," which meant you were heavily penalized for buying Photoshop when for less than double the price you could get five other programs you'd never use. So yeah, you'd pay $1100 or some tedious shit to get the whole thing, then $100 or $200 to update it, but mostly you torrented it because fuck you, Adobe. Then it got folded into Creative Cloud and yeah, you could rent Photoshop for $30 a month but you could rent all of Creative Cloud for $50 a month! But then nobody rented Creative Cloud because fucking hell you can do 90% of what you need in Canva. Effectively for free. Adobe tried to staunch the bleeding by offering $20b for Figma, but it didn't work. Instead they had to pay Figma $1b for letting Canva eat their lunch while they fucked around trying to buy themselves out of a problem. Adobe is dealing with AI by forcing everyone to use it and hiking their prices, which Barclays thinks is bullish.

My kid's art class is learning photo editing in - wait for it - Canva. Her friends do their video editing in CapCut, because of course they do. Meanwhile she is the undisputed heavyweight champion of the world because I spent 10 minutes showing her how to punk around in iMovie.

Thousands a day!? But it's money incredibly well spent! Your engineering org will start to be able to go as fast as you want them to go, for once. Can you believe it? It'll be like being a startup again. You'll be able to "surprise and delight your customers", as Jeff Bezos is fond of saying, at an elite level you never dreamed possible.

Who's got two thumbs and knows what Jira is? This guy! Because I beta-test. And in the past ten years, the three platforms I beta test have moved from Jira to Centercode because Jira is bloated and expensive. Two of those companies? Publicly traded. The third? One of the biggest privately-held firms in Hollywood. I don't know about one of them, but I know the other two have a coder or two in the US working on any given feature and an army of offshore developers. You are now aware that Pro Tools is largely written in Ukraine; Putin really fucked up my beta schedule, albeit only for about four months.

So. Are these firms going to trade their offshore development farms for AI? Maybe if it saves them money and time, but we've got bugs that we know what they are and we know where they are and we know they aren't getting squished, because for the past fifteen years those bugs have been a consequence of legacy code that supports this one studio that can't change out this other piece of code, and bloody hell, if you lose that one studio you're sunk, so the entire rest of the world deals with this one rare error that pops up all of a sudden. Are... you going to explain to the AI fleet why that bug has to stay?

I know four CRMs. How wretched is that? Two of them I know because there have been hints of APIs that allow me to talk to the two CRMs we run, and the two CRMs we run are that most wretched of CRMs, known as the EHR. Yeah. We run two EHRs. I know.
Because one is good for naturopathic medicine and the other is unparalleled for midwifery.

One of the EHRs? It's got two coders. Neither of them likes me. They were willing to give me access to their API for long enough to get Zoho running for $10k. I opted out, since they said "we support Zoho" when in fact they meant "our db will theoretically interface with Zoho's db." The other has one coder. His name is Mohammed. I had a great conversation with them once where I explained what an API was. They thought it sounded like a cool idea. They've been working on mobile (yes, I know) for six years now; interfacing with my phone system sounded like science fiction to them.

Now - I know what you're thinking. Mohammed needs an AI! Goddamn right. Just think of what Mohammed could do with a $50k "fleet" of AI agents, other than burn through six months of budget that needs to be passed along to a legion of independent shoestring-budget alpha females. Why, he could create a mobile app! He could integrate with phone systems! He could integrate teleprescription!

I can tell you what he did do with AI. He changed the text box in one of the description fields into an RTF box and broke everyone's records going back to the dawn of the system; everyone's records are now full of garbage characters everywhere. Took two days to roll things back.

Now - I know a guy. Has worked for Salesforce for like 20 years. His job? Assess your needs, assess Salesforce's stack and custom-build a Salesforce CRM for your organization. His department has been shrinking gradually because Salesforce wanted $150 per user per month to glue my phone system to my EHRs. Now they want $80. "Hey OpenAI, configure this pre-existing Salesforce stack to work with this pre-existing accounting software" is something an AI should be able to do, particularly if you can blame the customer if it doesn't work. Of course, you might end up injecting garbage characters everywhere and having to revert.

And we've got a computer science master's student on our network right now. Her thesis is going to be about tricking the API of one of our EHRs into coughing up useful data we can use to show insurance companies. These are metrics the software is required by law to collect - that whole "HIPAA" thing? It's the Health Insurance PORTABILITY and ACCOUNTABILITY Act, there's no "privacy" in there anywhere - and yet, what she gets is blanks and garbage. Master's student. Thesis project. She's salty that we aren't running EPIC because why wouldn't you run EPIC? Because at my size, EPIC is $30k a seat to buy and $3k a month to operate. And every hospital pays it. Because they have human coders, who solve your problems, who make it all work, and make it so your data can be sent anywhere. EPIC? Epic isn't gonna fill your text fields with garbage.

_________________________________________

Look. You think I might be able to fumblefuck my way through "vibe-coding?" I'll bet I could. I know how to trick Google into giving me what I want, and what limited entertainment I've derived from OpenAI has turned out exactly how I wanted it to. I could certainly do worse. And my phone system has an API, and one of my EHRs has an API, and I'll bet I could "vibe-code" some glue-ware that would open up my EHR when the phone rings, scan for a known phone number, feed a hit into the phone system so the name pops up automatically and open the EHR so that my receptionist can get some deetz about who she's talking to. This is literally the ORIGINAL SIN of CRMs. It's what they were CREATED to do. It is the only reason they fucking exist - CUSTOMER RELATIONSHIP MANAGEMENT.
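And for the record, that glue-ware really is weekend-sized. A rough sketch of the shape of it - every URL, field name, and token below is made up, because neither my phone system's API nor my EHR's API looks exactly like this:

```python
# Hypothetical screen-pop glue: phone-system webhook -> EHR lookup -> caller ID push.
# All endpoints, field names, and auth here are placeholders, not any real vendor's API.
from flask import Flask, request
import requests

app = Flask(__name__)

EHR_BASE = "https://ehr.example.com/api/v1"     # assumed EHR REST endpoint
PHONE_BASE = "https://phones.example.com/api"   # assumed phone-system endpoint
HEADERS = {"Authorization": "Bearer REDACTED"}

@app.route("/incoming-call", methods=["POST"])
def incoming_call():
    caller = request.json.get("caller_number", "")
    # Ask the EHR whether we know this number.
    resp = requests.get(f"{EHR_BASE}/patients", params={"phone": caller},
                        headers=HEADERS, timeout=5)
    matches = resp.json().get("patients", []) if resp.ok else []
    if not matches:
        return {"chart_url": None}
    patient = matches[0]
    # Push the name back to the phone system so it pops on the receptionist's screen...
    requests.post(f"{PHONE_BASE}/caller-id",
                  json={"number": caller, "display_name": patient["name"]},
                  headers=HEADERS, timeout=5)
    # ...and hand back a deep link so the front desk lands on the right chart.
    return {"chart_url": f"https://ehr.example.com/chart/{patient['id']}"}
```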
But what I know? Is that if my finger into Google Workspace drops below a threshold of $5, it will stop returning API calls until the key is regenerated, and when that happens not only does voice transcription fail, the callback to another service I use fails, which fails something else, and the whole voicemail system goes down. Ask me how I know. Better yet, ask me how long it took to get an answer about this out of Google, because the answer is "never," because there's no documentation on any of this shit. And you think I'm going to hand over this much mission-critical shit to Microsoft Copilot or some shit when Mohammed brought down an entire EHR by asking for RTF? Mohammed? Who codes for a living, rather than to just eke out a little extra time to hang out with my wife?

You know what I don't need? A tunnel borer. Because there's buried electrical and natural gas in there and I don't need a fucking tunnel. I need two guys who know how to put in drip irrigation, and I will pay them, and I will manage them, because if you just YOLO into this shit you get surprise landscaping. Coders? And people who write about code? Suck bawlz at considering where the code lives. What the code does. And I don't know that we'll ever know what knocked out Telefonica? But my money is on "vibe-coding."

And don't get cocky and try to push it too hard. A coding agent is like a big-ass tunnel borer machine when you've been using power shovels. It is strong, sure, hella strong. But it is expensive, it can still get stuck badly, and you need to guide it carefully at all times. And it's not that fast - it's not going to bore through the English Channel in a day. So don't set unrealistic expectations going in. Just focus on how different this stuff is from 2 years ago when ChatGPT came out, and then marvel at how different it is from 2 months ago when the best we had was chat.
For you CIO-types, fleets will enable your developers to spend thousands of dollars a day. Even if inference costs plummet, the Jevons Paradox will result in higher usage offsetting those costs. If you don’t believe that, go ask to see your bug backlog; it’s basically infinite.
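"Thousands of dollars a day" is worth a sanity check. A back-of-the-envelope - every number below is an assumption for illustration, not anything from the article:

```python
# Back-of-the-envelope: what a fleet of always-on coding agents could burn per developer per day.
# All prices and token counts are assumptions, chosen only to show the order of magnitude.
agents_per_dev = 10              # parallel agents per developer (assumed)
tokens_per_agent_hour = 2e6      # input+output tokens an agent churns through per hour (assumed)
hours_per_day = 24               # fleets don't sleep
price_per_million_tokens = 10.0  # blended $/1M tokens (assumed)

daily_cost = (agents_per_dev * tokens_per_agent_hour * hours_per_day
              * price_per_million_tokens / 1e6)
print(f"${daily_cost:,.0f} per developer per day")  # -> $4,800 with these numbers
```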
There are two anecdotes that color my opinion here.

1. If you'll recall I worked for a while inside a tech startup building a mobility app. Dev work was split into a small onshore and a large offshore team in India. The onshore team was, essentially, the product manager and four coders of varying seniority. The offshore team was something like 20-30 fulltime developers? If there's anything I'd want to get built or get changed I had a feedback Slack that the onshore devs would process into Jira. If there was anything big that I needed to be implemented, I'd get the onshore junior dev to essentially parse my request into Jira for me, often after a few meetings because chopping my idea into baby-sized steps is never a straightforward task. It'd be well-documented what the desired change was, and what the steps were to get there before it went to offshore. Then a week or so later they'd be working on it and have a bunch of extra questions. Then I'd get a new TestFlight version of the app, I'd do a bunch of testing with that, and would often come back with half a dozen edge cases and misinterpretations. Back and forth, testing, questions, back and forth, testing and then it would usually be fine.

2. A good friend of mine works for a Dutch competitor to EPIC. She graduated CS just over a year ago. Basically landed the job as a college intern and was allowed to stay. The first months of her job, she was not allowed to write code, instead just having to review other people's code in order to initiate osmosis for the inner workings of EHRs because there is, essentially, no documentation. The entire company from what I gather seems to operate on tribal knowledge, the elders passing down quirks and edge cases that stay in. It is also company policy to forbid writing any documentation in the code. Instead of documentation, they've created a layer cake of processes that code has to go through to be reviewed and checked. The processes are well-defined, but what goes through it could be whateverthefuck. Most of the time, the changes are very small. They're scared to accidentally break anything in production (and hey, rightly so!) but that also means they rarely change anything large, so the codebase is never refactored and is a layer cake too, of cruft accumulated by hundreds of individual developers making changes and writing code in their little corner of the monolith in their particular way.

You don't think you can include that in the prompt? Or documentation it reads? I had to regularly instruct the offshore team to do things in a particular way in the app, because I was using a third party tool for our analytics and that tool needs data in a specific way to work. "Yeah I know this is not ideal but can you please define sessionID as text, even though it always consists of numbers only?"

The timeline in the past half year or so has been wild, in my humble opinion. When the term vibecoding was coined this January, Claude 3.5 was the best we had, which is (still) quite good at writing a few dozen lines of code but gets lost as soon as it's even slightly bigger than that. My first vibecoding experiences were...rough. The improvements in the past months have been incredibly significant, but specifically for coding. For my first vibecoded app I had a prototype in 2 minutes and would spend a good hour or two debugging by "there's a new error, go fix it" again and again. For the OVguesser app I mentioned in Pubski?
I had a prototype in 2 minutes and...essentially only one or two bugs, despite it being a more complex app with 50+ files spread across client and server. I could go straight to "this is great, let me tweak it until I'm happy."

Adding reasoning, better prompts, and most of all tool use (you tell the AI how it can Google something, how it can grep a file, how it can use any tool you can imagine) has dramatically improved its ability to do the vast majority of software work. I used to be able to tell AI-gen code apart from regular code. I lost that ability - there are no longer six fingers on the hand, the lighting isn't "off" anymore, especially not with the right prompt. When I showed my EHR friend the code Sonnet 4 produces this week, she too could not find anything bad about it. It genuinely just writes decent code now. The size of what it can write has dramatically increased, from autocompleting your sentence in an IDE to writing functions for you to, now, being able to write an entire app out of nowhere that sometimes actually works.

Now - that does not mean Jesus can take the wheel. I know. The main issue, now, is that the agents are not very good at exploring the solution space. When the code base becomes larger, they often struggle to take the logic of line 265 in file A into account when writing line 1,038 in file B. Or they are too eager to jump to the first solution that sounds remotely like it could work. So you end up with short-sighted solutions that break something else somewhere. It really, really needs checks and balances now to prevent the garbage-character incidents from happening, but let's be honest, do you realize how often shit breaks in normal software development? Are the blanks you are getting from the EHR API any better than the garbage characters because those shit sausages were made differently?

Even if there is not a single inch of progress in the models, I'm fairly certain we'll still see progress in the coming years in the ability of AI to improve software engineering. What I didn't know, until reading this article, is what that future could look like. I don't think the blog is a blueprint? There's every chance it will be kneecapped in multiple ways? But I'd be surprised if this is not the direction we'll be heading in for the next few years. Right now we're in the It Moves Fast And Breaks Things era. But that era will end, and I'm intrigued slash terrified slash in awe of what that might look like.

In a way, some of this is already here and working, just in small pockets. NotebookLM's podcast feature is a single button to the user, but behind the scenes it is basically an agent cluster that takes a document, creates a podcast script, refines the script to add uuhs and ahs and other vocal nuances, and text-to-speeches it. Not just in a single pipeline, but even on-the-fly when you "call into" the podcast. You take a complex task, break it up into its constituent parts, and refine an agent with a specific agent-prompt and toolset. Then you tell the smartest AI you can afford "these are your minions, go do this overarching task" and have only that one talk to the human in the loop.
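If that sounds abstract, the plumbing isn't. Here's a sketch of the shape of it - the model call, the tool wiring, and the plan format are all stand-ins, not any particular vendor's SDK:

```python
# Sketch of an "agent cluster": one orchestrator, several minions with narrow prompts and tools.
# call_llm() is a placeholder for whatever chat-completion API you actually use.
from dataclasses import dataclass, field
from typing import Callable

def call_llm(system_prompt: str, user_msg: str) -> str:
    raise NotImplementedError("plug in your model of choice here")

@dataclass
class Minion:
    name: str
    system_prompt: str                       # narrow, task-specific instructions
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)

    def run(self, task: str) -> str:
        # Real agents loop (model asks for a tool, you run it, feed the result back);
        # here we just advertise the toolset and make a single call.
        tool_list = ", ".join(self.tools) or "none"
        return call_llm(self.system_prompt + f"\nTools available: {tool_list}", task)

# Minions with specific prompts and toolsets; the tools are dummy lambdas here.
minions = {
    "researcher": Minion("researcher", "Summarize the relevant docs.", {"web_search": lambda q: "..."}),
    "coder":      Minion("coder", "Write the patch. Small diffs only.", {"grep": lambda q: "..."}),
    "reviewer":   Minion("reviewer", "Find what the patch breaks."),
}

def orchestrator(overarching_task: str) -> str:
    # The smartest model you can afford does the decomposition; only it talks to the human.
    plan = call_llm("Break this task into steps, one per line, as '<minion>: <step>'. "
                    "Minions: " + ", ".join(minions), overarching_task)
    results = []
    for line in plan.splitlines():
        name, _, step = line.partition(": ")
        if name in minions:
            results.append(minions[name].run(step))
    return call_llm("Merge these results into one answer for the human.", "\n".join(results))
```

The NotebookLM podcast pipeline is, roughly, this same skeleton with a script writer, a banter-adder, and a text-to-speech step as the minions.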
From my experience, both this concept of agent clusters and vibecoding feel eerily similar to the way I worked at that mobility app. The onshore dev I worked with the most was a junior dev. He wasn't particularly bright or experienced (he was the same age as I was at the time) but he had a) a few tools at his disposal, b) he knew what the architecture of the app looked like and c) he was somewhat good at reasoning about code (but I'd often be just as good). His task was to pour my request into the molds of the Jira processes they were used to, and he would delegate everything else to his offshore colleagues to actually do. Along the entire process of going back and forth with the junior dev, and at times with the Indian devs, I'd do nothing different from what the article describes as agent babysitting. And the result of that process was often just, like, implementing 1 new class or function call in the backend.

I also don't think my friend at Not-EPIC should be unworried. She knows how wretched the way they develop is. She doesn't know what her code does IRL or what workflow it could break. This fall she has a surgery coming up which will knock her out for a good three months. That means for three months, there are exactly zero people available to deal with problems in her corner of the monolith. When people leave the company (because of course they under-pay and over-ask) it often results in a plethora of problems, because the next person put on that bit of the system breaks a bunch of shit because there is nothing documented.

My expectation is that management will, sometime in the next few years, realize that they can actually fix the fundamental problems with their organization and get more done. Get everyone to talk through the code they manage, record and transcribe it all, autogenerate the mother of all documentation, and mandate doc updates from then on in the code change process. Now you're no longer dependent on tribal knowledge and expensive senior devs. Then, talk to every client about all of the wishes they have, for as long as the client wants. Record and transcribe it all and generate the mother of all backlogs and requirements. You give every mid-level or even junior dev tasks from that list and assign them an agent cluster, only requiring the senior devs for the aforementioned checks and balances. They could even take all 8 hours of a work day just for picking the tasks and for checks and balances, and have the agent cluster work throughout the night to provide new code to check and balance. But I highly doubt all of that will require more devs.
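The "mandate doc updates in the code change process" part is at least the easy half to automate. A sketch of the kind of CI tripwire that does it - assuming a git checkout with src/ and docs/ directories, which is my assumption and not their repo layout:

```python
# Sketch of a CI check: fail the build if code changed but docs didn't.
# Assumes a git checkout and src/ + docs/ directories; adjust paths to taste.
import subprocess
import sys

def changed_files(base: str = "origin/main") -> list[str]:
    out = subprocess.run(["git", "diff", "--name-only", base, "HEAD"],
                         capture_output=True, text=True, check=True)
    return [f for f in out.stdout.splitlines() if f]

files = changed_files()
code_touched = any(f.startswith("src/") for f in files)
docs_touched = any(f.startswith("docs/") for f in files)

if code_touched and not docs_touched:
    sys.exit("Code changed under src/ but nothing under docs/ - update the docs or get a waiver.")
print("Docs check passed.")
```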
The upcoming wave, which I'm calling "agent clusters" – the chariot I hinted at in the last section – should make landfall by Q3. This wave will enable each of your developers to run many agents at once in parallel, every agent working on a different task: bug fixing, issue refinement, new features, backlog grooming, deployments, documentation, literally anything a developer might do.
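Mechanically, "many agents at once in parallel" is just a dispatch loop. A sketch, with run_agent() standing in for whatever coding agent you'd actually invoke:

```python
# Sketch of the "fleet" mechanics: one developer, many agents, each chewing on a different task.
# run_agent() is a placeholder for a real coding-agent invocation.
import asyncio

async def run_agent(task: str) -> str:
    await asyncio.sleep(1)  # pretend the agent worked for a while
    return f"[{task}] -> draft PR ready for human review"

async def fleet(backlog: list[str], max_parallel: int = 8) -> list[str]:
    sem = asyncio.Semaphore(max_parallel)  # don't melt the API bill all at once
    async def guarded(task: str) -> str:
        async with sem:
            return await run_agent(task)
    return await asyncio.gather(*(guarded(t) for t in backlog))

backlog = ["fix login bug", "refine issue #412", "groom backlog", "write deploy docs"]
print(asyncio.run(fleet(backlog)))
```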
We have had this very discussion about self-driving cars. We have had this very discussion about AI in audio. We have gotten to the point where AI is tentatively shuttling passengers around urban hubs, effectively turning an open network into a closed system. Waymo and a couple others have gotten to the point where they can replace a fresh-to-the-country Uber driver, under ideal conditions, within a closed environment. But all the companies that proposed a wide-open adaptive environment are either (A) gone, like Uber, or (B) killing people with aplomb, like Tesla.

We have gotten to the point where AI is tentatively adding chorus effects and synthesizing voices, but it still can't mix worth a shit. We had a lengthy discussion about how to do a simple recording, and not only did you not hear the chair squeaks, but they were so bad I couldn't do anything with them. AI can't either, of course. iZotope tried to prove that my entire industry was doomed in 2010 and released a tool that went so badly they scrubbed it from the Internet. I'm about to cut a DJ session. You would think that beat-matching and pitch-shifting and crossfading would be the sort of thing an AI could do the shit out of. And yes! Rekordbox has an automix. It's got some fuckin' "AI" tag on it too. And if you want to hear the worst mixes you can imagine, engage it. Fuckin' AI can't even tell the difference between a chorus and a verse.

You're not a coder. Neither am I. I know enough to know that I hate it, but that's based a lot on learning to code in fucking Fortran and Turbo Pascal. You know enough that you'd rather code than pick up a soldering iron, but you're very much at the "genie, make me a thing" phase of the adventure. You never got to try a self-driving car in the beginning; you likely would have survived but also likely wouldn't have pushed it into the corner-cases necessary to ensure life-safety for everyone on the road. "Push it into the corner cases" is the thing every AI booster refuses to do.

kk in this scenario? For Pro Tools? I'm the onshore dev. Their problem is a lack of documentation, not the imminent threat of AI. The Birth Center has a 500pp binder of documentation on every fucking thing we do, not because we really like documentation but because we're required to have all this shit documented for certification. Because if we fuck up in the clutch, people may die.

here's the problem in a nutshell:

- Be this company
- Be ready to join the 20th century
- Be stopping down for eight months to fucking document everything
- Be ready to join the 21st century

OR

- Be this company
- Be ready to join the 20th century
- Be feeding your codebase to an AI to generate documentation
- Be spending three years on pins and needles as you spend eighteen months proofreading the documentation and then eighteen months breaking things you missed

From your addendum: FUCKING HELL DUDE.

Which is faster - writing it yourself or playing minesweeper with someone else's code? If you answered "playing minesweeper" you just tattled on yourself. Part of being a senior developer is making less-able coders productive, be they fleshly or algebraic. Using agents well is both a skill and an engineering project all its own, of prompts, indices, and (especially) tooling. LLMs only produce shitty code if you let them.

Here's a game I play regularly: "Hey receptionist - I would like a slide for the billboard that says this." (crap slide) "Great. Now apply the discussions we've had about whitespace and readability."
(Vaguely less-crap slide) "Okay awesome. Can you mess with the color palette a little?" (heinously more-crap slide) "Okay try these RGB values" (less-crap slide, copy changes, receptionist goes on crying jag) "No no you're doing great. Er... do this." (less-crap slide, let's ship it)

I play this game because it's good for her self-esteem. Her roommate is a designer, so she fancies herself a designer. SHE IS NOT A DESIGNER. We've had all sorts of discussions about rule-of-thirds, read-three-times, don't-flash, etc. About a third of it is accessible to her at any given time. But she has such pride in seeing her work parking-lot sized that it's worth it to me for morale to let her pretend she's designing things, rather than whipping that shit out on my own in a third the time. I give no fux about Gemini's self-esteem.

Your expectation is that everyone will go "well, it'll make it that last 20 percent no problem so we should adopt it now, and damn the consequences." Mine is, too. The difference is I don't think it'll work out.

If there's anything I'd want to get built or get changed I had a feedback Slack that the onshore devs would process into Jira. If there was anything big that I needed to be implemented, I'd get the onshore junior dev to essentially parse my request into Jira for me, often after a few meetings because chopping my idea into baby-sized steps is never a straightforward task. It'd be well-documented what the desired change was, and what the steps were to get there before it went to offshore. Then a week or so later they'd be working on it and have a bunch of extra questions. Then I'd get a new TestFlight version of the app, I'd do a bunch of testing with that, and would often come back with half a dozen edge cases and misinterpretations. Back and forth, testing, questions, back and forth, testing and then it would usually be fine.
The first months of her job, she was not allowed to write code, instead just having to review other people's code in order to initiate osmosis for the inner workings of EHRs because there is, essentially, no documentation. The entire company from what I gather seems to operate on tribal knowledge, the elders passing down quirks and edge cases that stay in. It is also company policy to forbid writing any documentation in the code. Instead of documentation, they've created a layer cake of processes that code has to go through to be reviewed and checked.
Reading other people’s code is part of the job. If you can’t metabolize the boring, repetitive code an LLM generates: skills issue! How are you handling the chaos human developers turn out on a deadline?
Does an intern cost $20/month? Because that’s what Cursor.ai costs.
My expectation is that management will, sometime in the next few years, realize that they can actually fix the fundamental problems with their organization and get more done.