They're probably talking about some point after the capabilities of LLMs started to become clear.
It's why Codex, Claude Code, Gemini CLI etc. were developed at all - it was clear that if you wanted a concrete application of LLMs with clear productivity benefits, coding was low-hanging fruit, so all the AI vendors jumped on that and started hyping it.
Sure, but jumping from "it's amazing these things work for code at all" to "software engineering is solved" is something only grifters or those drunk on the Kool-Aid did.
I do agree that it was thought that these LLM agents would be extremely useful and that is why they were developed, and I happen to believe they in fact are extremely useful (without disagreeing that much of the stuff in the article definitely does happen).
I just sort of resent the framing that it was supposed to be X but actually failed, when not only is there only minor evidence that it failed, but it was also only supposed to be X for a brief period of time.
I can't deny that this might be a trend in practice, but at companies with reasonably self-aware practices, it isn't, or doesn't need to be.
There's this weird thing that happens with new tools where people seem to surrender their autonomy to them, e.g. "welp, I just get pings from [Slack|my phone|etc] all the time, nothing I can do but be interrupted constantly." More recently, it's "this failed because Claude chose..." No, Claude didn't choose; the person who submitted the PR chose to accept it.
It's possible to use tools responsibly and effectively. It's also possible to encourage and mentor employees to do that. The idea that a dev has to be effectively on call because they're pushing AI slop is just wrong on so many levels.
Well, if management wants more AI, they're gonna get more AI, and no, I am not gonna run around making sure their dreams work smoothly under my human supervision; I am gonna let it go all the way they want. In the meantime I'll focus on improving my skills.
> It's possible to use tools responsibly and effectively. It's also possible to encourage and mentor employees to do that.
It's not in the company's interest to stop employees from overworking. Having people overwork for the same pay under pressure is the desired outcome, actually.
There are many companies which don’t operate like this. What you’re describing is the rather uniquely American ultracapitalist perspective. But as an employee, you have a choice to work for a company like that, or work for one that’s run by humans.
This is a real thing. I spent all of January doing greenfield development using Claude (I had already finished the requirements), and all I can say is thank goodness I had the Max 5x plan and not the 20x, as I got breaks once the tokens were used up until the next cycle. I was forced to get up and do something else: biking, rowing, walking. My productivity had never been higher, but at what cost? My health? No thanks. So I'm glad I'm using the time until the token reset for my health. I time it perfectly: I walk, row, or bike for an hour, and as I arrive back the tokens are reset. I get about 3 hours of nonstop use per token batch with the 5x plan. I've been thinking about going 20x but am scared...
I don't get this, tbh. I use Claude too, and my issue is the opposite: too many small breaks. Every time I hit enter, my brain wants to check out because the agent just spins while it generates thousands of tokens and churns on the subject. Even if it's only 2 minutes, that's 2 minutes where my mind has nothing to work on.
Hard to stay in flow and engaged.
Feels weirdly similar to being interrupted over slack.
Yes, agreed. I'm running 3-5 parallel Claude instances at once, with requirements as the input. My prompt is something very specific, like "work on section 5.1". Then I'm monitoring the work across all instances.
You are correct that flow is not achieved, as this is not programming; it's more like system design, architecture, QA, and Product Owner work. It's using the swarm as your own dev team.
But it's also programming as you have to study outputs to ensure they're correct. Some (it seems many) don't do this, and then their outputs usually aren't correct.
In my experience, QA is something like ensuring it responds correctly to input. This is similar, but not the same as code review. I would more liken QA to dynamic review rather than static. Note though that code review can still be a form of QA. (Formal proofing especially.)
That’s what QA departments in software companies do. In many other contexts they examine things produced by machines to ensure they meet the specs and functional requirements for that piece, and if not, either adjust it, have someone else adjust it, or have the adjusted machine spit out another one. They might design tests, fixtures to measure things, etc etc etc but they do not make the things directly.
To be fair, ensuring that machines produce the correct outputs (even by making someone else fix it) is still the kind of process I'm talking about. After all, that's also how it works in software.
It depends on what the machines are supposed to do. I’ve never worked in software QA, but worked as a developer for over a decade and currently work in manufacturing. Is mass-manufacturing totally different? Sure. QA engineers in small high-complexity single-run prototyping shops? It’s not much different.
> It's using the swarm as your own dev team.
Managing high-performance dev/ops teams is its own form of a state of flow. In fact, for me it's much more addictive than any other, as the outcomes are usually many multiples of any IC role you could have. It's even crazier when you have a "follow the sun" team involved, where the work just gets sequentially handed off and is always in constant motion.
I imagine AI coding is like this for a lot of folks.
I fix this by manually prompting many small changes (with a very fast model — smaller models are fine for simple changes / additions, and you can iterate fast).
So I can still work way faster than programming manually, but I stay in flow. And most importantly, my mental model stays synced the whole time. There's no catching up to do.
Hypothesis: limiting usage / tokens could have a positive effect on project quality, since it forces the developer to think more carefully about the problems they're working on. When you're forced to stop and slow down, you try to be more deliberate with token usage. But if you have unlimited tokens you can just keep generating infinite lines of code without thinking as hard about the problem.
I've seen people on social media bragging about how they're able to produce a mountain of code as if this was praiseworthy.
Sorry, I wasn't giving a serious answer. It's just not as amusingly worded as I thought.
Seriously, though, your question is one of those “how long is a piece of string” sort of questions. Just like any other software quality question, it depends on context, competence, goals, market dynamics, organizational culture, project timelines, team expertise, etc.
Do people pay their bills on time? Do people wear seatbelts? Do people brush their teeth for the full two recommended minutes? Depends, depends, depends.
Sorry, I didn't realise you weren't the OP. I was really asking the OP as they said they had large productivity gains from using AI to code. But if you're a professional developer, the same question can be answered by you: do you specifically review all AI generated production code?
In my own case 100% of my code is reviewed by humans (generally me), and that IMO is the only sensible option, and has been the standard since I started coding commercially 33 years ago. I don't use AI to generate code though, other than a few experiments, as I don't really need to write much code these days.
This resonates completely—AI coding assistants are flooding queues with high volumes of code, simply shifting the bottleneck from writing to reviewing. Human reviewers get burned out trying to catch subtle security misconfigurations or performance regressions hidden in AI-generated PRs. We built CloudThinker's code review module to act as a rigorous first pass, catching those vulnerabilities with 96% accuracy so developers only spend time reviewing architectural decisions.
If you’re going to spam your product, you could have the decency to have literally any other contributions on your account. It’s literally just advertisements and I’ll bet a straw penny they’re generated by a chatbot.
Selection bias? The early adopters who are motivated to adopt tools to deliver more typically were also working more to start with, and may have already been struggling with their rate of output.
In my experience, code validation (unit testing, code review, manual testing, etc.) was more of a bottleneck than producing code for the most part. This means that faster code generation wouldn't produce significant gains in throughput unless the code validation speeds up too. In my workplace, I've seen evidence that the people showing the biggest productivity gains from AI coding are now shipping enormous commits that are barely getting any validation. Given the Zeitgeist, others are for some reason more lenient towards that than they normally would be (or should be).
1. LLMs allow devs to be more productive, so freed-up time is seen as an opportunity for more work. People overshoot and just work more.
2. Generalized tooling makes devs seem more replaceable, putting downward pressure on job security (i.e., work harder or we'll get someone who will, oh, and for less money).
3. LLMs allow for more "multitasking" (debatable) via many running background tasks, so more opportunities to "just finish one more thing".
Personally, I make a lot more "out of hour" commits than I used to because I'll batch up low priority tasks throughout the day and let the computer chug on them at night when I'm elsewhere. Commits are coming in at all hours, but I'm not actually looking at them until the next morning.
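That batching workflow can be sketched roughly as a day queue plus a nightly drain. This is only an illustration of the pattern: `run_agent` here is a hypothetical stand-in for whatever agent CLI or API you would actually invoke.

```python
# Rough sketch of the "batch it up, let it run overnight" workflow:
# low-priority tasks accumulate in a queue during the day, and a single
# nightly pass drains them so results can be reviewed the next morning.
from collections import deque

queue = deque()

def defer(task):
    """Park a low-priority task instead of context-switching to it now."""
    queue.append(task)

def run_agent(task):
    # Placeholder: in reality this would shell out to an agent CLI.
    return f"done: {task}"

def nightly_drain():
    """Process everything queued today; look at the commits tomorrow."""
    results = []
    while queue:
        results.append(run_agent(queue.popleft()))
    return results

defer("bump flaky test timeouts")
defer("update README badges")
print(nightly_drain())
```

The point of the sketch is the decoupling: queuing is cheap during focused work, and all the agent churn happens off-hours.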
I use it every day and I'm taking off weekends for the first time in a decade. It's done wonders for my mental health. I think teams should pay more attention to the value of pumping the brakes vs. incessant redlining. We may actually be able to have a healthy relationship with AI then.
When people talk about AI increasing developer productivity, they usually focus on the coding part.
In my experience, the bigger change happens after the code is written.
When you move from writing code to supervising agents, your output increases — but your cognitive load increases too.
Instead of writing every line yourself, you're now monitoring systems:
Did the agent go off-script?
Did it retry 50 times while I was asleep?
What did that run actually cost?
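For a concrete (if simplified) picture, that kind of after-the-fact supervision can be sketched as a log scan. Everything below is illustrative: the JSONL fields (`retries`, `cost_usd`, `id`) and the thresholds are assumptions, not any real agent tool's schema.

```python
# Hypothetical sketch: scan an agent's run log (assumed JSONL format)
# and flag runs that exceeded a retry count or budget threshold while
# nobody was watching.
import json

MAX_RETRIES = 10
MAX_COST_USD = 5.00

def flag_suspicious_runs(jsonl_lines):
    """Return (run_id, reason) pairs for runs that need a human look."""
    flagged = []
    for line in jsonl_lines:
        run = json.loads(line)
        if run.get("retries", 0) > MAX_RETRIES:
            flagged.append((run["id"], f"retried {run['retries']} times"))
        if run.get("cost_usd", 0.0) > MAX_COST_USD:
            flagged.append((run["id"], f"cost ${run['cost_usd']:.2f}"))
    return flagged

log = [
    '{"id": "run-1", "retries": 2, "cost_usd": 0.40}',
    '{"id": "run-2", "retries": 50, "cost_usd": 12.75}',
]
print(flag_suspicious_runs(log))
```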
The strange part is that the mental burden doesn't disappear just because the agent is autonomous.
In some ways it gets worse, because failures become harder to notice early and harder to contain once they start.
It starts to feel less like programming and more like running operations for a team of extremely fast, extremely literal junior developers.
Curious if others are seeing the same shift.
Yeah, I'm not sure many people are gonna hang around for this; I'm not sure I want to do this role. I like building and delivering, and AI is a great help, but I will not be happy supervising agents; there are better jobs. Unless the money is too good to refuse.
That's a very real concern. For a long time programming felt like a craft: you build something, run it, improve it. Agent workflows introduce a different kind of work. You're not just building anymore, you're supervising. Some people enjoy that shift toward orchestration. Others really don't. I suspect we'll eventually see tools that try to restore the "build and run" feeling instead of turning developers into supervisors.
That really sounds like micromanaging jr. developers.
I wonder if the interface for this kind of thing might be better presented as a sort of JIRA ticket system. Define a dependency graph of work with the ability to break down any ticket into more tickets or change priority or relationships etc.
Though I think the micromanage part still doesn't fit into that model. You'd need the code-level view and not just a ticket covering the tests that satisfy the spec and performance goals.
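As a rough illustration of that ticket-graph idea, here is a minimal sketch (all names hypothetical, not a real JIRA integration): tickets carry dependencies, any ticket can be broken down into subtickets, and only tickets whose dependencies are all done are workable.

```python
# Minimal sketch of a ticket board as a dependency graph. A ticket is
# workable only when everything it depends on is complete; splitting a
# ticket makes it depend on its new subtickets.
from collections import defaultdict

class Board:
    def __init__(self):
        self.deps = defaultdict(set)   # ticket -> tickets it depends on
        self.done = set()

    def add(self, ticket, depends_on=()):
        self.deps[ticket] |= set(depends_on)

    def split(self, ticket, subtickets):
        """Break a ticket down: it now depends on its new subtickets."""
        for sub in subtickets:
            self.add(sub)
        self.deps[ticket] |= set(subtickets)

    def workable(self):
        """Tickets whose dependencies are all complete."""
        return sorted(t for t in self.deps
                      if t not in self.done and self.deps[t] <= self.done)

board = Board()
board.add("ship-feature", depends_on=["write-tests", "implement"])
board.add("implement")
board.split("implement", ["api", "ui"])
board.add("write-tests")
print(board.workable())  # leaf tickets with no unmet dependencies
```

An agent (or human) would repeatedly pull from `workable()`, mark tickets done, and pull again; the code-level review the comment asks for would still have to live outside this model.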
> That really sounds like micromanaging jr. developers.
That's how I tend to describe AI to a lot of non-technical people (I actually generally say it's like having a really fresh intern who can read technical docs insanely fast but needs a lot of supervision).
That's a really good analogy.
The interesting part is that the "intern" is not only fast, but also extremely confident. A human intern usually hesitates, asks questions, or signals uncertainty when they are unsure. Agents often produce very clean-looking output even when the reasoning behind it is shaky. So part of the supervision isn't just checking the result, but trying to detect when the confidence is misleading.
I think a lot of people feel this tension. Programming used to be mostly about building things directly: you write code, run it, fix it, repeat. With agents it starts to shift toward supervision: define the task, watch the output, correct the drift. It's a different kind of work. Sometimes it feels less like programming and more like managing a very fast team that never gets tired but also never really understands the goal unless you spell it out extremely carefully. I suspect a lot of developers still enjoy the "building" part more than the "supervising" part.
> I'm 8x more productive than I was in 2022... And I jokingly say "I'm probably not going to have a job in 1 or 2 years"...
> We are going to create incredible value to humanity. 8x rate. I don't know what our hourly will be.
Do we have 8x more demand for software than current developers can currently supply? No, we don’t. Many developers will soon have a very bitter pill to swallow once they realize that developers are the beneficiaries of good market dynamics rather than some precious intellectual elite whose skills are monetizable in any other context.
Their hourly will be whatever DoorDash pays them to deliver pizza and egg rolls. Grad school isn’t any safer, and frankly, many of the soft, arrogant and maladroit people I’ve seen try to enter blue collar trades fail very quickly once they realize how hard the road is to get to those high salaries they always hear about.
>Do we have 8x more demand for software than current developers can currently supply? No, we don’t.
Demand for software has been pretty elastic historically. In 1970, the quantity of software that's written now would have seemed absurd (and demand for that amount of software certainly didn't exist back then). I don't know if there'll be another huge increase in the amount of software that gets written or not, but it doesn't seem so implausible. For example, there's a long tail of Excel spreadsheets waiting to be turned into applications. The cheaper it becomes to do that, the more of these spreadsheets become suitable candidates for appification.
And what good does today's software demand do for late-career developers laid off during the dot-com bust? Just in time to have the derivatives shenanigans kill their retirement accounts right when they needed them? The fact that companies making more money than ever are laying off software engineers indicates that at the very least, this demand isn't imminent. I don't think it even makes sense to imagine there's going to be another internet revolution in the near future.
I don’t care about the health of the industry — that will do fine. I’m more concerned with the wellbeing of people pretty likely about to get run over by the productivity train. People’s mortgages, cancer treatments, tuition for kids, and retirement plans don’t care if Microsoft is doing great in 10 years if there’s 8 years until the market needs them again.
This isn't the first time in history that companies have laid off software engineers. Late career software engineers are in a much better position economically than most people.
I feel totally the opposite. I feel like I'm better able to have more work-life balance. Our predictions are more accurate. I'm enjoying working on actual problems rather than boilerplate. These tools are amazing
Prior to the rise of LLM coding, developers had to, from time to time, spend time deep-diving through large amounts of code they didn't write: hundreds of thousands of lines, or even millions. This might happen when starting a new job, changing projects, being tasked with evaluating or integrating some third-party tech, or taking over something previously owned by another developer.
During such phases of work, it's not unusual to put in some long hours in order to get up to speed.
With LLMs, it is possible to perpetually experience hundreds of thousands of lines of third party code, on about a weekly basis.
But this is not the same. The code is not known by anyone, anywhere else. It exists nowhere else, and so has no track record of deployment. No documentation, nothing. It's not something where you can concentrate on making a small modification, while trusting the rest of it to be working.
I'm waiting for a large incident to occur, something on the level of the Concorde crash, which ended any further attempts at going faster in commercial air travel.
furyofantares | a month ago
At what point in time? Did anyone foresee coding being one of the best and soonest applications of this stuff?
cejast | a month ago
I can relate to this, unfortunately these tools are becoming a very convenient way to offload any kind of responsibility when something goes wrong.
MattGaiser | a month ago
At least in my case, flow is gone. It’s all context switching now.
butILoveLife | a month ago
My 6-year-old is doing my job.
The best I can hope for is that HN article that said the word "Context".
I know the magic words "Make me a single page html js web app"... or "Install Virtual Box with Fedora Cinnamon using CLI"....
I'm 8x more productive than I was in 2022... And I jokingly say "I'm probably not going to have a job in 1 or 2 years"...
We are going to create incredible value to humanity. 8x rate. I don't know what our hourly will be.
wvxf | a month ago
May take a while. But it'll come eventually.