They absolutely do; the CEO has said that a few engineers have told him they don't even write code by hand anymore. To some people that sounds horrifying, but a good engineer wouldn't just take the code blindly; they'd read it and refine it using Claude, while still saving hundreds of man-hours.
I would love to hear/see a definitive answer for this, but I read somewhere that the relationship between MS and \A is such that the copilot version of the \A models has a smaller context window than through CC.
This would explain the "secret sauce", if it's true. But perhaps it's not and a lot is LLM nondeterminism mixing with human confirmation bias.
Agreed. I was an early adopter of Claude Code, and at work we only had Copilot. But the Copilot CLI isn't too bad now. You've got slash commands, Agents.MD and skills.md files for controlling your context, and access to Sonnet & Opus 4.5.
Maybe Microsoft is just using it internally, to finish copying the rest of the features from Claude Code.
Much like the article states, I use Claude Code beyond just its coding capabilities...
Same situation, once I discovered the CLI and got it set up, my happiness went up a lot. It's pretty good, for my purposes at work it's probably as good as Claude Code.
I'm amazed that a company that's supposedly one of the big AI stocks seemingly won't spare a single QA position for a major development tool. It really validates Claude's CLI-first approach.
Microsoft have a goal that states they want to get to "1 engineer, 1 month, 1 million lines of code." You can't do that if you write the code yourself. That means they'll always be chasing the best model. Right now, that's Opus 4.5.
I've not heard that goal before. If true, it makes me sad to hear that once again, people confuse "More LOC == More Customer Value == More Profit". Sigh.
I've written a C recompiler in an attempt to build homomorphic encryption. It doesn't work (it's not correct), but it can translate 5 lines of working code into 100,000 lines of almost-working code.
Any MBAs want to buy? For the right price I could even fix it ...
Totally agreed. The numbers are silly. My only point is that you don't need 100k engineers if you're letting Claude dump all that code into production.
> “My goal is to eliminate every line of C and C++ from Microsoft by 2030,” Microsoft Distinguished Engineer Galen Hunt writes in a post on LinkedIn. “Our strategy is to combine AI and Algorithms to rewrite Microsoft’s largest codebases.”
Why stop at Rust? Let's have one Windows version written in Python, another in Crystal, and another in Java. At least the generated code will be readable and maintainable! /s
Which is a bald-faced lie written in response to a PR disaster. The original claims were not ambiguous:
> My goal is to eliminate every line of C and C++ from Microsoft by 2030. Our strategy is to combine AI and Algorithms to rewrite Microsoft’s largest codebases. Our North Star is “1 engineer, 1 month, 1 million lines of code”.
Obviously, "every line of C and C++ from Microsoft" is not contained within a single research project, nor are "Microsoft's largest codebases".
The authentic quote, “1 engineer, 1 month, 1 million lines of code”, presented as some kind of goal that makes sense, even just for porting/rewriting, is embarrassing enough from an OS vendor.
As @mrbungie says on this thread: "They took the stupidest metric ever and made a moronic target out of it"
The original claims were not ambiguous; it's "My" goal, not "Microsoft's" goal.
The fact that it's a "PR disaster" for a researcher to have an ambitious project at one of the biggest tech companies on the planet, or to talk up their team on LinkedIn, is unbelievably ridiculous.
One supposes, when a highly senior employee publicly talks about project goals in recruitment material, that they are not fancifully daydreaming about something that can never happen but are in fact actually talking about the work they're doing that justifies their ~$1,000,000/yr compensation in the eyes of their employer.
Talking about rewriting Windows at a rate of 1 million lines of code per engineer per month with LLMs is absolutely going to garner negative publicity, no matter how much you spin it with words like "ambitious" (do you work in PR? it sounds like it's your calling).
You suppose that there are no highly-paid researchers on the planet working on AGI? Because there are, and that's less proven than "porting one codebase to another language" is. What about Quantum Computers, what about power-producing nuclear fusion? Both less proven than porting code. What about all other blue-sky research labs?
Why would you continue supposing such a thing when both the employee, and the employer, have said that your suppositions are wrong?
Sure, there are plenty of researchers working on fanciful daydreams. They pursue those goals at the behest of their employer. You attempted to make a distinction between the employer's and the employee's goals, as though a Distinguished Engineer at Microsoft were just playing around on a whim, doing hobby projects for fun. If Microsoft is paying him $1m annually to work on this, plus giving him a team to pursue the goal of rewriting Windows, it is not inaccurate to state that Microsoft's goal is to completely rewrite Windows with LLMs, and they will earn negative publicity for making that fact public. The project will likely fail given how ridiculous it is, but it is still a goal they are funding.
It is kind of funny that throughout my career, there has always been pretty much a consensus that lines of code are a bad metric, but now with all the AI hype, suddenly everybody is again like “Look at all the lines of code it writes!!”
I use LLMs all day every day, but measuring someone or something by the number of lines of code produced is still incredibly stupid, in my opinion.
It all comes from "if you can't measure it you can't improve it". The job of management is to improve things, and that means they need to measure things, so they go looking for measures. On an assembly line there are lots of things to measure and improve, and improving many of those things has shown great value.
They want to expand that value into engineering and so are looking for something they can measure. I haven't seen anyone answer what can be measured to make a useful improvement though. I have a good "feeling" that some people I work with are better than others, but most are not so bad that we should fire them - but I don't know how to put that into something objective.
Yes, the problem of accurately measuring software "productivity" has stymied the entire industry for decades, but people keep trying. It's conceivable that you might be able to get some sort of more-usable metric out of some systematized AI analysis of code changes, which would be pretty ironic.
Ballmer hasn’t been around for a long long time. Not since the Red Ring of Death days. Ever since Satya took the reins, MBAs have filled upper and middle management to try to take over open source so that Sales guys had something to combat RedHat. Great for open source. Bad for Microsoft. However, Satya comes from the Cloud division so he knows how to Cloud and do it well. Azure is a hit with the enterprise. Then along comes AI…
Microsoft lost its way with Windows Phone, Zune, Xbox360 RRoD, and Kinect. They haven’t had relevance outside of Windows (Desktop) in the home for years. With the sole exception being Xbox.
They have pockets of excellence. Where great engineers are doing great work. But outside those little pockets, no one knows.
Ironically, AI may help get past that. In order to measure "value chunks" or some other metric where LoC is flexibly multiplied by some factor of feature accomplishment, quality, and/or architectural importance, an opinion of the section in question is needed, and an overseer AI could maybe do that.
I believe the "look at all the lines of code" argument for LLMs is not a way to showcase intelligence, but more so a way to showcase time saved. Assuming the output is a correct solution, it's a way to say "look at all the code I would have had to write; it saved so much time".
It's all contextual. Sometimes, particularly when it comes to modern frontends, you have inescapable boilerplate and lines of code to write. That's where it saves time. Another example is scaffolding out unit tests for a series of services. There are many such cases where it just objectively saves time.
it's still a bad metric and OP is also just being loose by repeating some marketing / LinkedIn post by a person who uses bad metrics about an overhyped subject
I wonder if we can use the compression ratio that an LLM-driven compressor could generate to figure out how much entropy is actually in the system and how much is just boilerplate.
Of course then someone is just going to pregenerate a random number lookup table and get a few gigs of 'value' from pure garbage...
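The compression idea above is easy to sketch. As a crude stand-in for an LLM-driven compressor, a general-purpose compressor like zlib already separates the two cases: repetitive boilerplate compresses to a small fraction of its raw size, while genuinely high-entropy content barely compresses at all. (This is only an illustration of the ratio-as-boilerplate-detector idea, not a real productivity metric.)

```python
import os
import zlib

def compression_ratio(text: str) -> float:
    """Compressed size / raw size; lower means more redundancy (boilerplate)."""
    raw = text.encode()
    return len(zlib.compress(raw, level=9)) / len(raw)

# 200 copies of the same trivial getter: classic generated boilerplate.
boilerplate = "def get_field_a(self):\n    return self._field_a\n\n" * 200

# Hex of random bytes: a high-entropy stand-in for "real" information.
dense = os.urandom(2000).hex()

# Boilerplate collapses; the dense text stays near its entropy ceiling.
print(compression_ratio(boilerplate))  # much smaller than...
print(compression_ratio(dense))        # ...this
```

As the parent comment notes, the metric is still gameable: a pregenerated random lookup table would score as pure "entropy" while carrying zero value.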
I used to work at a place that had the famous Antoine de Saint-Exupéry quote painted near the elevators where everyone would see it when they arrived for work:
Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.
> "Microsoft have a goal that states they want to get to "1 engineer, 1 month, 1 million lines of code.""
No. One researcher at Microsoft made a personal LinkedIn post saying his team were using that as their 'North Star' for porting and transpiling existing C and C++ code, not writing new code. When the internet hallucinated that he meant Windows and new code, and started copy-pasting this as "Microsoft's goal", the post was edited and Microsoft said it isn't the company's goal.
That's still writing new code. Also, it's kind of an extremely bad idea to do that, because how are you going to test it? If you have to rewrite anything (hint: you probably don't), it's best to do it incrementally over time because of the QA and stakeholder-alignment overhead. You cannot push things into production unless they work as users expect and do exactly what stakeholders expect as well.
No no, you're talking common sense and logic. You can't think like that. You have to think "How do I rush out as much code as possible?" After all, this is MS we're talking about, and Windows 11 is totally the shining example of amazing and completely stable code. /s
I mean, if 1% out of 8 billion is "top" and that applies to lines of code too, then ... more code contains more quality ... by their logic, I guess ...
Wow, such bad practice. Using lines of code as a performance metric was shown to be terrible practice decades ago. For a software company to do this now...
1. Classic Coding (Traditional Development)
In the classic model, developers are the primary authors of every line.
Production Volume: A senior developer typically writes between 10,000 and 20,000 lines of code (LOC) per year.
Workflow: Manual logic construction, syntax memorization, and human-led debugging using tools like VS Code or JetBrains IDEs.
Focus: Writing the implementation details. Success is measured by the quality and maintainability of the hand-written code.
2. AI-Supported Coding (The Modern Workflow)
AI tools like GitHub Copilot and Cursor act as a "pair programmer," shifting the human role to a reviewer and architect.
Production Volume: Developers using full AI integration have seen a 14x increase in code output (e.g., from ~24k lines to over 810k lines in a single year).
Work Distribution: Major tech leaders like AWS report that AI now generates up to 75% of their production code.
The New Bottleneck: Developers now spend roughly 70% of their time reviewing AI-generated code rather than writing it.
I think a realistic 5x to 10x is possible. 50,000-200,000 LOC per YEAR! Would it be good code? We will see.
I have found that Claude Code is better in every way I've used it. I like to use LLMs just as an advanced refactoring tool, especially where plain string search isn't enough. Anyway, my first experience with Copilot was it plainly lying that it had deleted the files I asked it to, then insisting a file no longer existed (it did).
I installed Claude Code yesterday after the quality of VSCode Copilot Chat kept getting worse with every release. I can't tell yet whether Claude Code is better or not, but VSCode Copilot Chat has become completely unusable. It started making mistakes, which doubled the requests to Claude Opus 4.5, which in January was the only model that would work at all. I spent $400 in tokens in January.
I'll know better in a week. Hopefully I can get better results with the $200 a month plan.
Claude Code’s subscription pricing is pretty ridiculously subsidized compared to their API pricing if you manage to use anywhere close to the quota. Like 10x I think. Crazy value if you were using $400 in tokens.
I just upgraded to the $100 a month 5x plan 5 minutes ago.
Starting in October, with VSCode Copilot Chat it was $150, $200, $300, $400 per month with the same usage. I thought they were just charging more per request without warning. The last couple of weeks it seemed that VSCode Copilot was just fucking up, making useless calls.
Perhaps, it wasn't a dark malicious pattern but rather incompetence that was driving up the price.
Not my experience at all. Copilot launched as a useless code complete and is now basically the same as anything else. It's all converging. The features are converging, but the features barely matter when Opus is just doing all the heavy lifting anyway. It just 1-shots half the stuff. Copilot's payment model, where you pay by the prompt and not by the token, is highly abusable; no way this lasts.
I would agree. I've been using VSCode Copilot for the past (nearly) year, and it has gotten significantly better. I also use CC and Antigravity privately, and got access to Cursor (on top of VSCode) at work a month ago.
CC is, imo, the best. The rest are largely on par with each other. The benefit of VSCode and Antigravity is that they have the most generous limits. I ran through Cursor's $20 limits in 3 days, whereas the same-tier VSCode subscription can last me 2+ weeks.
Explains why Windows updates have been more broken than usual lately.
But I guess having my computer randomly stop working because a billion dollar corporation needs to save money by using a shitty text generation algorithm to write code instead of hiring competent programmers is just the new normal now.
I switched to Ubuntu last week for my desktop. First time in my 25+ year career I’ve felt like Microsoft was wasting my time more than administering a Linux desktop would take. The slop effect is real.
I wasn't making an argument. It was a prediction that all major software, (including the major linux distros) will eventually be majority (>50%) AI generated. Software that is 100% human generated will be like getting a hand knitted sweater at a farmers market. Available, but expensive and only produced at very small scale.
On what reasoning do you make this prediction? Just because corporations are mandating their employees to use AI right now does not mean it will continue.
Any new software developers entering the field from this point on will have to know how to use, and will be expected to use, AI code-gen tools to get employment. Moving forward, eventually all developers will use these tools routinely. There will come a point where no one left working has ever coded anything complex from scratch without AI tools. Therefore, all* code will involve AI code-gen, as all* developers will be using it.
* 'all' means 'nearly all', as of course there will be exceptions.
So eventually, doesn't the KPI move from "more code" to "better code"? The pendulum will have to swing the other way eventually; it seems like Microsoft is just accelerating that process.
> doesn't the KPI move from "more code" to "better code"?
I would love for this to be true. But another scenario that could play out is that this process accelerates software bloat that was already happening with human coded software. Notepad will be a 300GB executable in 2035.
And this will cause what I'm talking about -- When nobody can afford memory because it's all going into the ocean-boiling datacenters, all of a sudden someone selling a program that fits into RAM will have a very attractive product
I am not getting what that linked url is supposed to mean. It is a very decent business page where ubuntu is selling consulting for "your" projects and telling why ubuntu is great for developing AI systems.
I've used Kubuntu for several years, and my wife does too now. It's an official, supported flavor of Ubuntu using the KDE desktop instead of GNOME. It gives a more Windows-like or CDE (Common Desktop Environment, from UNIX systems) feel, whereas GNOME feels more Mac-like.
Do you have "Get the latest updates as soon as they're available" enabled? This automatically installs preview releases, so you may unwittingly be doing QA for Microsoft.
Tldr: Copilot has 1% marketshare among web chatbots and 1.85% of paid M365 users bought a subscription to it.
As much as I think AI is overrated already, Copilot is pretty much the worst performing one out there from the big tech companies. Despite all the Copilot buttons in office, windows, on keyboards and even on the physical front of computers now.
We have to use it at work but it just feels like if they spent half the effort they spend on marketing on actually trying to make it do its job people might actually want to use it.
Half the time it's not even doing anything. "Please try again later" or the standard error message Microsoft uses for every possible error now: "Something went wrong". Another pet peeve of mine, those useless error messages.
Yeah but it's the mainstream public that was just blown away with the LLM party trick. If it sounds like a human it must be smart like a human. So that's what everyone wants to sell :(
PS: When I say party trick I don't deny it has its uses but it's currently used like the jesus-AI that can do anything.
They put it into the Azure portal, and I tried to get it to tell me what the open resource cost us in storage. It seemed hopeless at first, but then I realized it didn't have access to know what I had opened or anything.
Until MS makes sure their models get the necessary context, I don't even care to click on them.
Hmm, 8M paid M365 Copilot users leaked in August, and at last week's earnings call the number was 15M.
Assuming the leak was accurate, almost doubling usage in 4 months for an enterprise product seems like pretty fast growth?
Its growth trajectory seems to be on par with Teams so far, another enterprise product bundled with their M365 suite, though to be fair Teams was bundled for free: https://www.demandsage.com/microsoft-teams-statistics/
Notable inflection point right around the time unlimited data became an afterthought and every piece of software decided it “needs” to spy on—— I mean needs to offer Fulfilling Connected Experiences at all times.
I try GitHub Copilot every once in a while, and just last month it still managed to produce diffs with unbalanced curly braces, or tried to insert (what should be) a top-level function into the middle of another function and screw up everything. This wasn’t on a free model like GPT 4.1 or 5-mini, IIRC it was 5.2 Codex. What the actual fuck? Only explanation I can come up with is that their pay-per-request model made GHC really stingy with using tokens for context, even when you explicitly ask it to read certain files it ends up grepping and adding a couple lines.
You're not using the good models and then blaming the tool? Just use Claude models.
Copilot's main problem seems to be people don't know how to use it. They need to delete all their plugins except the vscode, CLI ones, and disable all models except anthropic ones.
The Claude Code reputation diff is greatly exaggerated beyond that.
What, 5.2 Codex isn’t a good model? Claude 4.5 and Gemini 3 Pro with Copilot aren’t any better, I don’t have enough of a sample of Opus 4.5 usage with Copilot to say with confidence how it fares since they charge 3x for Opus 4.5 compared to everything else.
If Copilot is stupid uniquely with 5.2 Codex then they should disable that instead of blaming the user (I know they aren’t, you are). But that’s not the case, it’s noticeably worse with everything. Compared to both Cursor and Claude Code.
I had my first go at using it (GitHub Copilot) last week, for a simple refactoring task. I'd say I specified it reasonably, yet it still managed to fail to delete a closing brace when it removed the opening block as specified.
That was using the Claude Sonnet 4.5 model, I wonder if using the Opus 4.5 model would have managed to avoid that.
True story: a lot of the Microsoft engineers I interact with actually do use Apple hardware. Admittedly, I only interact with the devs in the .NET (and related technologies) departments.
Specifically WHY they use Apple hardware is something I can only speculate on. Presumably it's easier to launch Windows on Mac than the other way around, and they would likely need to do that as .NET and its related technologies are cross platform as of 2016. But that's a complete guess on my part.
Am *NOT* a Microsoft employee, just an MVP for Developer Technologies.
I still don't understand how Microsoft lets standby remain broken. I can never leave the PC in my bedroom in standby because it will randomly wake up and blast the coolers.
Sadly even if Microsoft had a few lineups of laptops that they'd use internally and recommend, companies would still get the shitty ones, if it saves them $10 per device.
S3 sleep was a solved problem until Microsoft decided that your laptop must download ads^Wsuggestions in the background, and deprecated it. On firmware that still supports S3, it works perfectly.
Sleep used to work perfectly fine up until, I don't know, 10 years ago. I doubt hardware/firmware/BIOS got worse since then, this is 100% a Microsoft problem.
Consumers _do not care_ if it is the firmware or Windows.
Dell was one of the earlier brands, and biggest, to suffer these standby problems. Dell has blamed MS and MS has blamed Dell, and neither has been in any hurry to resolve the issues.
I still can't put my laptop in my backpack without shutting it down, and as a hybrid worker, having to tear down and spin up my application context every other day is not productive.
Yeah I hear you. One of the reasons I’m still inclined towards Mac laptops for “daily drivers” is precisely because it’s disruptive to have to do a full shutdown that obliterates my whole workspace. Other manufacturers can be fine for single-use machines (e.g. a study laptop that only ever has Anki and maybe a browser and music app open), but every step beyond that brings increased friction.
Maybe the most tragic part is that this drags down Linux and plagues it with these hardware rooted sleep issues too.
To be fair, this was also my experience with Macbooks. This "smart sleep" from modern OS manufacturers is the dumbest shit ever, please just give me a hibernate option.
I used to have trouble with sleep on M-series macs on occasion, but after turning off wake on LAN they’ve all slept exactly as expected for the past several years.
100% true story - until a couple of months ago, the best place to talk directly to Microsoft senior devs was on the macadmins Slack. Loads of them there. They would regularly post updates, talk to people about issues, discuss solutions, even happily engage in DMs. All posting using their real names.
The accounts have now all gone quiet, guess they got told to quit it.
Because Windows' UX is trash? Anyone with leverage over their employer can and should request a Mac. And in a hot market, developers/designers did have that leverage (maybe they still do) and so did get their Macs as requested.
Only office drones who don't have the leverage to ask for anything better or don't know something better exists are stuck with Windows. Everyone else will go Mac or Linux.
Which is why you see Windows becoming so shit, because none of the culprits actually use it day-to-day. Microsoft should've enforced a hard rule about dogfooding their own product back in the Windows 7 days when the OS was still usable. I'm not sure they could get away with it now without a massive revolt and/or productivity stopping dead in its tracks.
Am a software engineer at Microsoft using a M3 MBP, opinions are my own and all. Honestly one (of many) reasons I opted to go through the exception process to request a macbook was the screen brightness. The fact you can run software to boost the screen to HDR brightness levels for SDR content is insanely useful for working outside.
For one reason or another everyone seems to be sleeping on Gemini. I have been exclusively using Gemini 3 Flash to code these days and it stands up right alongside Opus and others while having a much smaller, faster and cheaper footprint. Combine it with Antigravity and you're basically using a cheat code.
Yeah I don't understand why everyone seems to have forgotten about the Gemini options. Antigravity, Jules, and Gemini CLI are as good as the alternatives but are way more cost effective. I want for nothing with my $20/mo Google AI plan.
Yeah I'm on the $20/mo Google plan and have been rate limited maybe twice in 2 months. Tried the equivalent Claude plan for a similar workload and lasted maybe 40 minutes before it asked me to upgrade to Max to continue.
It's crazy that we're having such different experiences. I purchased the Google AI plan as an alternative to my ChatGPT (Codex) daily driver. I use Gemini a fair amount at work, so I thought it would be a good choice to use personally. I used it a few times but ran into limits the first few projects I worked on. As a result I switched to Claude and so, far, I haven't hit any limits.
I think Gemini is an excellent model, it's just not a particularly great agent. One of the reasons is that its code output is often structured in a way that looks like it's answering a question, rather than generating production code. It leaves comments everywhere, which are often numbered (which not only is annoying, but also only makes sense if the numbering starts within the frame of reference of the "question" it's "answering").
It's also just not as good at being self-directed and doing all of the rest of the agent-like behaviors we expect, i.e. breaking down into todolists, determining the appropriate scope of work to accomplish, proper tool calling, etc.
Yeah, you may have nailed it. Gemini is a good model, but in the Gemini CLI, with a prompt like "I'd like to add <feature x> support. What are my options? Don't write any code yet", it will skip right past telling me my options and go ahead and implement whatever it feels like. Afterward it will print out a list of possible approaches and then tell you why it did the one it did.
Codex is the best at following instructions IME. Claude is pretty good too but is a little more "creative" than codex at trying to re-interpret my prompt to get at what I "probably" meant rather than what I actually said.
Can you (or anyone) explain how this might be? The "agent" is just a passthrough for the model, no? How is one CLI/TUI tool better than any other, given the same model that it's passing your user input to?
I am familiar with copilot cli (using models from different providers), OpenCode doing the same, and Claude with just the \A models, but if I ask all 3 the same thing using the same \A model, I SHOULD be getting roughly the same output, modulo LLM nondeterminism, right?
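As I understand it, these tools are not pure passthroughs: each harness wraps the model with its own system prompt, tool definitions, and context-gathering strategy, so the "same" user message reaches the model as a quite different request. A toy sketch of that idea, with a stubbed model call and entirely hypothetical prompts and tool names:

```python
# Toy illustration: two "harnesses" forward the same user input to the same
# (stubbed) model, but wrap it differently, so the effective requests diverge.

def stub_model(messages: list[dict], tools: set[str]) -> str:
    # Stand-in for a real API call: just echoes what the model would see.
    return f"system={messages[0]['content']!r} tools={sorted(tools)}"

def run_harness(system_prompt: str, tools: set[str], user_input: str) -> str:
    # Each CLI tool bakes in its own system prompt and tool set.
    messages = [{"role": "system", "content": system_prompt},
                {"role": "user", "content": user_input}]
    return stub_model(messages, tools)

a = run_harness("You are a careful coding agent. Plan before editing.",
                {"read_file", "edit_file", "run_tests"}, "add feature X")
b = run_harness("Answer briefly. Minimize tokens used.",
                {"grep", "edit_file"}, "add feature X")
print(a == b)  # same model, same input, yet different effective requests
```

So on top of LLM nondeterminism, the harness changes what the model is actually asked to do, which tools it can call, and how much context it is fed, and that can swamp the model's raw capability.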
I've had the exact opposite experience. After including in my prompt "don't write any code yet" (or similar brief phrase), Gemini responds without writing code.
My go-to models have been Claude and Gemini for a long time. I have been using Gemini for discussions and Claude for coding and now as an agent. Claude has been the best at doing what I want to do and not doing what I don’t want to do. And then my confidence in it took a quantum leap with Opus 4.5. Gemini seems like it has gotten even worse at doing what I want with new releases.
I think counter to the assumption of myself (and many), for long form agent coding tasks, models are not as easily hot swappable as I thought.
I have developed decent intuition on what kinds of problems Codex, Claude, Cursor(& sub-variants), Composer etc. will or will not be able to do well across different axes of speed, correctness, architectural taste, ...
If I had to reflect on why I still don't use Gemini, it's because they were late to the party and I would now have to be intentional about spending time learning yet another set of intuitions about those models.
I feel like "prompting language" doesn't translate over perfectly either. It's like we become experts at operating a particular AI agent.
I've been experimenting with small local models and the types of prompts you use with these are very different than the ones you use with Claude Code. It seems less different between Claude, Codex, and Gemini but there are differences.
It's hard to articulate those differences but I think that I kind of get in a groove after using models for a while.
Maybe it's the types of projects I work on but Gemini is basically unusable to me. Settled on Claude Code for actual work and Codex for checking Claude's work.
If I try to mix in Gemini it will hallucinate issues that do not exist in code at very high rate. Claude and Codex are way more accurate at finding issues that actually exist.
For me it just depends on the project. Sometimes one or the other performs better. If I am digging into something tough and I think it's hallucinating or misunderstanding, I will typically try another model.
I've never, ever had a good experience with Gemini (3 Pro). It's been embarrassingly bad every time I've tried it, and I've tried it lots of times. It overcomplicates almost everything, hallucinates with impressive frequency, and needs to be repeatedly nudged to get the task fully completed. I have no reason to continue attempting to use it.
Same. Sometimes even repeated nudges don't help. The underlying 3.0 Pro model is great to talk and ideate with, but its inability to deliver within the Gemini CLI harness is ... almost comical.
Oddly enough, as impressive as Gemini 3 is, I find myself using it infrequently. The thing Gemini 2.5 had over the other models was dominance in long context, but GPT5.2-codex-max and Opus 4.5 Thinking are decent at long context now, and collectively they're better at all the use cases I care about.
For all the hype I see about Gemini, we integrated it with our product (an AI agent) and it consistently performs worse[0] than Claude Sonnet, Opus, and ChatGPT 5.2
This comment is a bit confusing and surprising to me because I tried Antigravity three weeks ago and it was very undercooked. Claude was actually able to identify bugs and get the bigger picture of the project, while Gemini 3 with Antigravity often kept focusing on unimportant details.
My default everyday model is still Gemini 3 in AI Studio, even for programming-related problems. But for agentic work, Antigravity felt like very early-stage beta-ware when I tried it.
I will say that at least Gemini 3 is usually able to converge on a correct solution after a few iterations. I tried Grok for a medium-complexity task and it quickly got stuck changing minor details without being able to get itself out.
Do you have any advice on how to use Antigravity more effectively? I'm open to trying it again.
Ask it to verify stuff in the browser. It can open a special Chrome instance, browse URLs, click and scroll around, inspect the DOM, and generally do whatever it takes to verify that the problem is actually solved, or it will go back and iterate more. That feedback loop IMO makes it very powerful for client-side or client-server development.
I've mentioned this before, but I think Gemini is the smartest raw model for answering programming questions in chatbot mode, but these CC/Codex/gemini-cli tools need more than just the model, the harness has to be architected intelligently and I think that's where Google is behind for the moment.
I've used Gemini CLI a fair amount as well—it's included with our subscription at work. I like it okay, but it tends to produce "lies" a bit too often: language that reads as overconfident that it's found a problem or solution. That either costs me extra work to verify its claims or extra time because I believed them. In my experience Claude Code does this quite a bit less.
I'm also using Gemini and it's the only option that consistently works for me so far. I'm using it in chat mode with copy&paste and it's pleasant to work with.
Both Claude and ChatGPT were unbearable, not primarily because of a lack of technical ability but because of their conversational tone. Obviously, it's pointless to take things personally with LLMs, but they were so passive-aggressive and sometimes maliciously compliant that they started to get to me even though I was conscious of it and know very well how LLMs work. If they had been new hires, I would have fired both of them within 2 weeks. In contrast, Gemini Pro just "talks" normally, task-oriented and brief. It also doesn't reply with files that contain changes in completely unrelated places (including changing comments somewhere), which is the worst such a tool could possibly do.
Edit: Reading some other comments here I have to add that the 1., 2., 3. numbering of comments can be annoying. It's helpful for answers but should be an option/parameterization.
I think you’re highlighting an aspect of agentic coding that’s undervalued: what to do once trust is breached… ?
With humans you can categorically say ‘this guy lies in his comments and copy pastes bullshit everywhere’ and treat them consistently from there out. An LLM is guessing at everything all the time. Sometimes it’s copying flawless next-level code from Hacker News readers, sometimes it’s sabotaging your build by making unit tests forever green. Eternal vigilance is the opposite of how I think of development.
I feel like this is exactly the use case for things like Hooks and Skills. Which, if you don't want to write them yourself, I get it. But I do think we can get the tool to do it; sounds like you want it doing that a little more actively/out-of-the-box?
I've heard Opus 4.5 might have an edge especially in long running agentic coding scenarios (?) but personally yes Gemini 3 series is what I was expecting GPT-5 to be.
I'm also mostly on Gemini 3 Flash. Not because I've compared them all and found it the best bar none, but because it fulfills my needs and then some, and Google has a surprisingly little-noted family plan for it. Unlike OpenAI, unlike Anthropic. IIRC it's something like 5 shared Gemini Pro subs for the price of 1. Even as just a couple sharing it, it's a fantastic deal. My wife uses it for her studies, I use it professionally for coding, and I've never run into limits.
They should just acquire one of the many agent code harnesses. Something like opencode works just as well as claude-code and has only been around half as long.
I used opencode happily for a while before switching to Copilot CLI. Been a minute, but I don't detect a major quality difference since they added Plan mode. Seems pretty solid, and it's first-party if that matters to your org.
I read that a few times but from my personal observations, Claude Opus 4.5 is not significantly different in GitHub Copilot. The maximum context size is smaller for sure, but I don’t think the model remembers that well when the context is huge.
Anthropic's models are better though. They may not "perform" as well on LLM task benchmarks, but Claude is the only one that actually gives semi-intelligent responses and seems aligned with human wants. And yes, they definitely have much better execution. It's the only one I considered shelling out 20 bucks for.
That's a "business" model, not a language model, which I believe is what the poster is referring to. In any case though, MS does have a number of models, most notably Phi. I don't think anyone is using them for significant work though.
Which is kind of a bummer - it'd have helped the standards-based web to have an actual powerful entity maintain a distinct implementation. Firefox is on life support and is basically taking code from Blink wholesale, and WebKit isn't really interested in making a browser that's particularly compliant with web standards.
MS's calculus was obvious - why spend insane amounts of engineering effort to make a browser engine that nobody uses - which is too bad, because if I remember correctly they were not too far behind Chrome in either perf or compatibility for a while.
Then they took their eyes off the ball. Whether it was protecting the Windows fort (why create an app with all the functionality of an OS and give it away for free - mostly on Windows, some Mac versions, but no Linux support - when people are paying for Windows) or just diverting the IE devs to some other "hot" product, browser progress stagnated, even with XMLHttpRequest.
We love to hate on Microsoft here, but the fact is they are one of the most diversified tech companies out there. I would say they are probably the most diversified, actually. Operating systems, dev tools, business applications, cloud, consumer apps, SaaS, gaming, hardware. They are everywhere in the stack.
I don’t plan on using the feature and I don’t plan on using Windows much longer in the first place, but I find that going beyond the ragebait headlines and looking at the actual offering and its privacy policy and security documentation makes it look a lot more reasonable.
Microsoft is very explicit in detailing how the data stays on device and goes to great lengths to detail exactly how it works to keep data private, as well as having a lot of sensible exceptions (e.g., disabled for incognito web browsing sessions) and a high degree of control (users can disable it per app).
On top of all this it’s 100% optional and all of Microsoft’s AI features have global on/off switches.
Until those switches come in the crosshairs of someone's KPIs, and then magically they get flipped in whatever direction makes the engagement line go up. Unfortunately we live in a world where all of these companies have done this exact thing, over and over again. These headlines aren't ragebait, they're prescient.
Well, now you’re just doing the same exact thing I described. You’re basically making up hypothetical things that could happen in the future.
I’ll agree with you the moment Microsoft does that. But they haven’t done it. And again, I’m not their champion, I’m actively migrating away from Microsoft products. I just don’t think this type of philosophy is helpful. It’s basically cynicism for cynicism’s sake.
Fun fact: I used to automatically screenshot my desktop every few minutes eons ago. This would occasionally save me when I lost some work and could go back to check the screenshots.
I only gave it up because it felt like a liability and, ahem, it was awkward to review screenshots and delete inopportune ones.
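For anyone curious what that habit looks like in practice, here's a minimal sketch of a periodic-screenshot loop. It assumes a Linux desktop with the `scrot` CLI installed; the interval, output directory, and filename pattern are all arbitrary choices, not anything the commenter described exactly:

```python
import datetime
import subprocess
import time
from pathlib import Path

def snapshot_name(now: datetime.datetime) -> str:
    # Timestamped filename so screenshots sort chronologically.
    return now.strftime("desk-%Y%m%d-%H%M%S.png")

def capture_loop(out_dir: str, interval_s: int = 300) -> None:
    # Every few minutes, grab the whole desktop with `scrot`
    # (any CLI screenshot tool works) into out_dir.
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    while True:
        target = out / snapshot_name(datetime.datetime.now())
        subprocess.run(["scrot", str(target)], check=False)
        time.sleep(interval_s)
```

Pruning old or "inopportune" screenshots is the awkward part, as the commenter notes; a real setup would want a retention policy on `out_dir`.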
Long time ago I had a script that would regularly screenshot my desktop… and display the latest screenshot on a page in my `public_html`, on the public web. Just because I thought it would be fun.
I mean. Ask any gamer if the original Xbox One announcement needing a Kinect and persistent internet connection was a feature request from them or a three letter org.
As someone that was there, we saved the Xbox brand by bullying Microsoft out of normalizing spying on kids and their whole families.
Microsoft really needs to get a better handle with the naming conventions.
There is Microsoft Copilot, which replaced Bing Chat, Cortana and uses OpenAI’s GPT-4 and 5 models.
There is Github Copilot, the coding autocomplete tool.
There is Microsoft 365 Copilot, what they now call Office with built in GenAI stuff.
There is also a Copilot cli that lets you use whatever agent/model backend you want too?
Everything is Copilot. Laptops sell with Copilot buttons now.
It is not immediately clear what version of Copilot someone is talking about. 99% of my experience is with the Office Copilot, and it 100% fails to do the thing it was advertised to do 2 years ago when work initially got the subscription: point it at a SharePoint/OneDrive location with a handful of Excel spreadsheets and PDFs/Word docs and tell it to make a PowerPoint presentation based on that information.
It cannot do this. It will spit out nonsense. You have to hold it by the hand and tell it everything to do, step by step, to the point that making the PowerPoint presentation yourself is significantly faster because you don't have to type out a bunch of prompts and edit its garbage output.
And now it’s clear they aren’t even dogfooding their own LLM products so why should anyone pay for Copilot?
>Microsoft really needs to get a better handle with the naming conventions
Microsoft cannot and will not ever get better at naming things. It is said the universe will split open and an eldritch beast will consume the stars the day Microsoft stops using inconsistent and overlapping names for different and conflicting products.
About a year ago I had to buy a new Xbox. It took me time to figure out what model I had and what the new models are. It’s the least intuitive marketing on the market.
Nadella has the golden ship taking on water right now. He has entirely botched AI top to bottom. He has screwed that up to such a degree that it would be difficult to overstate. If he doesn't correct these mistakes extremely soon, he'll unravel much of the progress he made for Microsoft and they'll miss this generation of advancement (which will be the end of their $3 trillion market cap - as the market has recently perked up to).
There is no tech giant that is more vulnerable than Microsoft is at this moment.
Most document origination will begin out of, or adjacent to, LLM sessions in the near future, as everything blurs into collaborating with AI agents. Microsoft has no footing there (or worse, their position is terrible courtesy of Copilot) and is vulnerable to death by inflection point. Windows 11 is garbage, and Google + Linux may finally be coming for their desktop (no different than what AMD has managed in unwinding the former Intel monopoly in PCs).
Someone should be charging at them with a new take on Office, right now. This is where you slice them in half. Take down Office and take down Windows. They're so stupid at present that they've opened the gates to Office being destroyed, which has been their moat for 30 years.
I am no big fan of MS, and especially not a fan of W11, but you're operating under the false assumption that their users are still their most important customers.
MS's bottom line doesn't depend on how happy users are with W11, especially not power users like ourselves. W11 is just a means of selling subscriptions (office, ai, etc). The question isn't 'are users happy' it's 'will OEMs and business continue to push it?'. The answer to that is almost certainly yes. OEMs aren't going to be selling most pcs with ubuntu included any time soon. Businesses are not going to support libreoffice when MS office is the established standard.
Maybe Apple could make inroads here, but they don't seem willing to give up their profit margins on overpriced hardware, and I don't think I've ever seen them release anything "office"-related that was anywhere near feature parity with MSO, especially not cross-platform.
If their whole business is based around being an established standard and making users happy is not a relevant goal, then why do anything at all? They already are an established standard, so why would they bother taking any further actions whatsoever, making any changes or rolling out any new products? Clearly they are trying to achieve something, right? So what is it?
It is about making specific high value users happy. If the rest of us are unhappy - we don't matter. They know for most people ubuntu or whatever isn't a realistic option and so they can take whatever money they can get from those people. Sure a few people like me will run *BSD or linux, but we are a footnote not worth their time.
The only danger is every once in a while one of those little footnotes becomes large enough to be a problem and you lose the market of those who do matter as well. While there are many obvious examples of where that happened, there are also a lot of cases where it didn't.
The confusion is when I say "I have a terrible time using Copilot, I don't recommend using it" and someone chimes in with how great their experience with GitHub Copilot is, a completely different product, and how I must be "holding it wrong" when that is not the same Copilot. Microsoft has like 5 different products all using Copilot in the name; even people in this very comment section are only saying "Copilot", so it is hard to know what product they are talking about!
I mean, sure. But aside from the fact that everything in AI gets reduced to a single word ("Gemini", "ChatGPT", "Claude") [1], it's clearly not an excuse for misrepresenting the functionality of the product when you're writing a post broadly claiming that their AI products don't work.
Github Copilot is actually a pretty good tool.
[1] Not just AI. This is true for any major software product line, and why subordinate branding exists.
I specifically mention that my experience is with the Office 365 Copilot and how terrible that is and in online discussions I mention this and then people jump out of the woodwork to talk about how great Github Copilot is so thank you for demonstrating that exact experience I have every time I mention Copilot :)
> No, there is Github Copilot, the AI agent tool that also has autocomplete, and a chat UI.
When it came out, Github Copilot was an autocomplete tool. That's it. That may be what the OP was originally using. That's what I used... 2 years ago. That they change the capabilities but don't change the name, yet change names on services that don't change capabilities further illustrates the OP's point, I would say.
That's silly. Gmail is a wildly different product than it was when it launched, but I guess it doesn't count since the name is the same?
Microsoft may or may not have a "problem" with naming, but if you're going to criticize a product, it's always a good starting place to know what you're criticizing.
To be fair, Github Copilot (itself a horrible name) has followed the same arc as Cursor, from AI-enhanced editor with smart autocomplete, to more of an IDE that now supports agentic "vibe coding" and "vibe editing" as well.
I do agree that conceptually there is a big difference between an editor, even with smart autocomplete, and an agentic coding tool, as typified by Claude Code and other CLI tools, where there is not necessarily any editor involved at all.
GitHub Copilot is available from website https://github.com/copilot together with services like Spark (not available from other places), Spaces, Agents etc.
This absolutely sucks, especially since tool calling burns through tokens really, really fast sometimes. Feels like a not-so-gentle nudge toward using their "official" tooling (read: vscode), even though there was a recent announcement about how GHCP works with opencode: https://github.blog/changelog/2026-01-16-github-copilot-now-...
No mention of it being severely gimped by the context limit in that press release, of course (tbf, why would they lol).
However, if you go back to aider, 128K tokens is a lot, same with web chat... not a total killer, but I wouldn't spend my money on that particular service with there being better options!
That was back when they were going wild naming everything "Active". Active Desktop made sense, but Active Directory? What made that "Active"? ActiveMovie? It's just a video playing framework... ActiveX?? X?? ActiveSync, I don't want my sync to be active. ActiveStore was apparently a thing?
Not that I disagree, but this is nothing compared to the ".NET" craze in the early 2000s. Everything had to have ".NET" in its name even if it had absolutely nothing to do with the actual .NET technology.
There was also "Active" before that, but .NET was next level crazy...
My colleague works in a functional role for a medium sized SaaS company(1000-5000 employees), working with banks, family offices, hedge funds. They use teams and copilot, they all hate it.
One thing that I don't know about is if they have an AI product that can work on combining unstructured and databases to give better insights on any new conversation? e.g. like say the LLM knows how to convert user queries to the domain model of tables and extract information? What companies are doing such things?
This would be something that can be deployed on-prem/ their own private cloud that is controlled by the company, because the data is quite sensitive.
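The "convert user queries to the domain model of tables" idea described above is essentially text-to-SQL. A toy sketch, where `ask_llm` is a hypothetical stub standing in for an on-prem model call and the `trades` table is an invented example schema:

```python
import sqlite3

# The model is given the schema and asked to emit SQL, which is then
# executed against the company's own database (keeping data on-prem).
SCHEMA = "CREATE TABLE trades (id INTEGER, fund TEXT, notional REAL);"

def ask_llm(question: str, schema: str) -> str:
    # Stub for a self-hosted model; a real one would translate the
    # natural-language question into SQL constrained to `schema`.
    return "SELECT fund, SUM(notional) FROM trades GROUP BY fund;"

def answer(question: str, conn: sqlite3.Connection):
    sql = ask_llm(question, SCHEMA)
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany("INSERT INTO trades VALUES (?, ?, ?)",
                 [(1, "alpha", 100.0), (2, "alpha", 50.0), (3, "beta", 75.0)])
print(sorted(answer("total notional per fund?", conn)))
# [('alpha', 150.0), ('beta', 75.0)]
```

In a sensitive deployment the hard parts are validating the generated SQL (read-only access, allow-listed tables) rather than the generation itself.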
This isn't a Microsoft thing, it's a big dumb corporation thing. Most big corporations are run by dumb executives who are 100% out of touch with the customer (though even if they were in touch, they wouldn't care). Their only consideration is the stock price. If adding new names to things, chanting the magic spell "AI" over and over, and claiming the new name will make them more money can cause the stock price to increase, that's what they'll do. (Making customers happy doesn't make the stock price rise; if it did, we'd all be a lot less depressed and a lot richer)
Like Microsoft Defender, which is now Defender Antivirus, or Defender for Endpoint if you have a real license. You will also get Defender for Identity, and maybe Defender for Office 365, which is probably not ASR. And Defender for Cloud, not to be confused with Defender for Cloud Apps.
> Microsoft really needs to get a better handle with the naming conventions.
They really won't, though; Microsoft just does this kind of thing, over and over and over. Before everything was named "365", it was all "One", before that it was "Live"... 20 years ago, everything was called ".NET" whether it had anything to do with the Internet or not. Back in the '90s they went crazy for a while calling everything "Active".
There’s got to be solid reasons why they do this and have done so for so damn long. At the very least institutional reasons. At best, actual research that suggests they make more money this way. But as a consumer, I hate it.
Marketing has too much power. They get some harebrained scheme to goose the numbers and just slam a mandate all the way down the org.
Is "Copilot" not getting enough clicks? Make every button say "copilot", problem solved. Marketing doesn't know or care what was there before, someone needs numbers up to get their promotion.
>> Is "Copilot" not getting enough clicks? Make every button say "copilot", problem solved. Marketing doesn't know or care what was there before, someone needs numbers up to get their promotion.
So Microsoft isn't bringing copilot to all these applications? It's just bringing a copilot label to them? So glad I don't use this garbage at home.
This is actually one of their smart decisions. "Copilot" is currently going through our corporate regulators, who know nothing about technology, and I can't buy it until they say everything is legal.
So once we have signoff then my counterpart in Sharepoint/M365 land gets his "Copilot" for Office, while my reporting and analytics group gets "Copilot" for Power BI, while my coding team gets "Copilot" for llm assisted development in GitHub.
In the meantime everybody just plugs everything into ChatGPT and everybody pretends it isn't happening. It's not unlawful if the lawyers can't see it!
It's because Microsoft isn't a software company. They're a marketing company that happens to make software and a few other bits.
We're now on the back end of that, where Microsoft must again make products with independent substance, but are instead drowning in their own infrastructural muck.
To further your argument, look at the XBOX. It is impossible to tell which is the latest model by name alone. Where the playstation is simple, the latest is the 5, the previous was the 4, and the one before that was the 3.
It's over 19 years old, but this video is a brutal but hilarious commentary on Microsoft's inherent dysfunction when it comes to product naming and packaging. Still on point decades later.
That is exactly what IBM thought too when they allowed Bill Gates to license the new OS they were supposed to be making for IBM. They had no competition, who are these kids going to sell their OS to?
Nintendo's strategy isn't the absolute worst. They mostly just give new names to new console designs, with modifiers to signal next-gen-without-major-changes. So the SNES was a next-gen NES, the N64 was its own thing, the GameCube was its own thing, the Game Boy, Game Boy Color, and Game Boy Advance were iterations on the same thing, and the DS, DSi, and 3DS were all generation steps. The Wii U was a next-gen Wii, and the Switch 2 is a next-gen Switch.
They probably should have called the WiiU the Super Wii or Wii 2 or something, but on the whole they've got a mostly coherent naming convention.
I don’t think Nintendo’s scheme was ever that great as it blurred the difference between variant form factors (Game Boy Pocket vs Game Boy, Game Boy Micro/SP vs Game Boy Advance, DS Lite, 2/3DS XL, Wii Mini), pro models with limited exclusives (Game Boy Color, DSi, New 2/3DS), and full on new generations (Game Boy Advance, 3DS, Wii U).
Some musings from someone who has not worked in microsoft but has in big tech.
This often happens because the people inside are incentivized to build their own empire.
If someone comes in and wants to get promoted/become an exec, there's a ceiling if they work under an existing umbrella, plus the politics of introducing a feature that requires dealing with an existing org.
So they build something new.
And the next person does the same.
And so you have 365, One, Live, .Net, etc
Google Plus was the same. Lots of unrelated google products were temporarily branded as part of google plus for some reason, including your google account and google hangouts (meet).
That was a very intentional strategy. In hindsight, not a good one, of course, but Plus and its integration across the whole company was blessed by Page and Brin, who were quietly panicking that Facebook could eat Google's lunch by becoming the "start page of the Internet" the moment they integrated search. Which they never did and never appear to have wanted.
The Dev Tools division had Quick- prefix for some tools before settling on Visual- once VB took off.
Then there's DirectX and its subs - though Direct3D had more room for expanded feature set compared to DXSound or DXInput so now they're up to D3D v12.
> Microsoft really needs to get a better handle with the naming conventions.
AI really should be a freaking feature, not the identity of their products. What MS is doing now is like renaming Photoshop to Photoshop Neural Filter.
That's a great analogy, but could be taken one step further. Because Adobe would also have to rename the rest of their products to come close to what MS is doing.
By the way, why is app lowercase in "the Microsoft 365 Copilot app"? Is it not part of the trademark but even they couldn't deal with how confusing that was?
'app' isn't part of that trademark, but on other products (Windows App) it is.
Searching the store or a company portal for one of these rebranded apps returns dozens of hits because 'windows', 'copilot', '365' and 'app' are all common words in most application descriptions.
>There is Github Copilot, the coding autocomplete tool.
There is also Github Copilot, the subscription, that lets you use Anthropic, OpenAI and Google models.
People already do pay for it: Office 365. It's just like getting cloud storage with the subscription. OneDrive has been one of the better cloud storage options for consumers.
Also, a great use is Microsoft Forms I was surprised with the AI features. At first I just used it to get some qualitative feedback but ended up using copilot to enter questions Claude helped me create and it converted them into the appropriate forms for my surveys!
Objectives -> Claude -> Surveys (markdown) -> Copilot -> MS Forms -> Emailed.
Insights and analysis can use copilot too.
Main thing to remember is the models behind the scenes will change and evolve, Copilot is the branding. In fact, we can expect most companies will use multiple AI solutions/pipelines moving forward.
The craziest thing was how Microsoft took the super established brand from decades, and renamed Microsoft Office to Microsoft 365.
I'm not sure if it's named Microsoft 365 Copilot nowadays, or if that's an optional AI addon? I thought it was renamed once more, but they themselves claim simply "Microsoft 365" (in a few various tiers) sans-Copilot. https://www.microsoft.com/microsoft-365/buy/compare-all-micr...
>>Point it a SharePoint/OneDrive location, a handful of excel spreadsheets and pdfs/word docs and tell it to make a PowerPoint presentation based on that information. It cannot do this. It will spit out nonsense. You have to hold it by the hand tell it everything to do step by step to the point that making the PowerPoint presentation yourself is significantly faster because you don’t have to type out a bunch of prompts and edit it’s garbage output.
Everyone I know who uses AI day-to-day is just using Copilot to do things like add a transition animation to a PowerPoint slide or format a Word document to look nice. The only problem these LLM products seem to solve is giving normal people an easy way to interact with terrible software processes and GUIs. A better solution to that problem would be for developers to actually observe how the average user interacts with both a computer and their program in particular.
Can somebody give me a TLDR on what the "copilot button" does? I've never had one of those laptops and have never understood that. Does it just start the AI front-end? Does it power up the NPU?
It is just a button. By default it starts the Copilot app, which is really the Office app that already existed but now with the Copilot tab preselected. Also, that Copilot runs in the cloud and doesn't use your NPU.
The only things I've found using the NPU so far are the built-in blur, auto-frame and eye-focus modes for the webcam.
I got one of those Asus ROG laptops that bragged about having an NPU and I kept trying to figure out how to access it until I realized they just meant the integrated Ryzen GPU (which is also responsible for getting anything rendered by the discrete Nvidia GPU to the actual display). Also its alleged 16G of video memory included a mapped 8G of system RAM. I miss when vendors at least pretended to be honest.
They have been unstable for decades. Does anyone still use self-hosted (running in a basement) Windows servers? Running a Windows machine feels about as reliable as fast-food order accuracy. Most of the time, sure, but I hope you can afford to miss out sometimes.
Crazy to think that Github Copilot was the first mainstream AI coding tool. It had all the hype and momentum in the world, and Microsoft decided to do...absolutely nothing with it.
Did it have all the hype and momentum, though? It was pretty widely viewed as a low- to negative-value addition, and honestly when I see someone on here talking about how useless AI is for coding, I assume they were tainted by Github copilot and never bothered updating their priors.
Just my experience of course, but it had a lot of hype. It got into a lot of people's workflows and really had a strong first-mover advantage. The fact that they supported neovim as a first-class editor surely helped a ton. But then they released their next set of features without neovim support and only (IIRC) supported VS Code. That took a lot of wind out of the sails. Then, combined with them for some reason being on older models (or with thinking turned down or whatever), the results got less and less useful. If Copilot had made their agent stuff work with neovim and with a CLI, I think they'd be the clear leader.
My first experience was with cursor and my entire team went through a honeymoon period before it got kind of sidelined. Average usage was giving an agent a couple shots at a problem but usually solving it ourselves ultimately. Internal demos were lackluster. Team was firmware though so might not be a great topic for GenAI yet.
I use Copilot in VSCode at work, and it's pretty effective. You can choose from quite a few models, and it has the agentic editing you'd expect from an IDE based AI development tool. I don't know if it does things like browser integration because I don't do frontend work. It's definitely improved over the last 6 months.
There's also all the other Copilot branded stuff which has varying use. The web based chat is OK, but I'm not sure which model powers it. Whatever it is it can be very verbose and doesn't handle images very well. The Office stuff seems to be completely useless so far.
Have you tried any other popular agentic coding tool? Like Claude Code, Cursor, Opencode, or Codex, or something else? Because I've used all of these and Copilot in anger in the last three months, and Copilot wasn't even in the same league as the others. Comparatively it just plain sucked: slow, and gave poor results. All the others I mentioned are within spitting distance of each other from what I can tell from my usage.
I have found Copilot to be very noisy, to the point where I have had to turn it off and then uninstall it in both IntelliJ and VSCode. I have found generating code via your favourite coding agent and then reviewing the output to be less taxing, since the review burden is reduced. With agentic workflows, you often have to review a bunch of code, but it's usually very close to the spec. Reviews are then easier.
They launched GitHub Codespaces, a free containerized dev environment with VScode & Copilot, and it's broken six ways from Sunday. VScode/Copilot extensions are constantly breaking and changing. The GitHub web interface is now much harder to use, to the point I've just stopped browsing it. Nobody over there cares if these things work. (But weirdly, the Copilot CLI works 4x better than the Copilot VSCode extension at actually writing code)
The "smart autocomplete" part of Github Copilot is still the most useful AI coding thing for me at the moment. I continue to subscribe to it just for that.
It really says something that MS/Github has been trying to shovel Copilot down our throats for years, and Anthropic just builds a tool in a short period of time and it takes off.
It's interesting to think back, what did Copilot do wrong? Why didn't it become Claude Code?
It seems for one thing its ambition might have been too small. Second, it was tightly coupled to VS Code / Github. Third, a lot of dumb big org Microsoft politics / stakeholders overly focused on enterprise over developers? But what else?
It's pretty clear that Microsoft had "Everything must have Copilot" dictated from the top (or pretty close). They wanted to be all-in on AI but didn't start with any actual problems to solve. If you're an SWE or a PM or whatever and suddenly your employment/promotion/etc prospects depend on a conspicuously implemented Copilot thing, you do the best you can and implement a chat bot (and other shit) that no one asked for or wants.
I don't know Anthropic's process but it produced a tool that clearly solves a specific problem: essentially write code faster. I would guess that the solution grew organically given that the UI isn't remotely close to what you'd expect a product manager to want. We don't know how many internal false-starts there were or how many people were working on other solutions to this problem, but what emerged clearly solved that problem, and can generalize to other problems.
In other words, Microsoft seems to have focused on a technology buzzword. Anthropic let people solve their own problems and it led to an actual product. The kind that people want. The difference is like night and day.
Who knows what else might have happened in the last 12 months if C-suites were focused more on telling SWEs to be productive and less on forcing specific technology buzzwords because they were told it's the future.
They have copilot-cli, which is something like Claude Code. It's actually pretty effective, at least more effective than Copilot+VSCode.
I think in the end it's branding. They want people to think "Copilot = AI" but the experience is anywhere from fairly effective to absolute trash. And the most visible applications are absolute trash. It really says something when Ethan Mollick is out there demonstrating that OpenAI is more effective at working with Excel than the built in AI.
There was an article posted here yesterday that said "MS has a lot to answer for with Copilot", and that was the point: MS destroyed their AI brand with this strategy.
I have to deal with C) at $BigTech, where multiple ML teams reported to someone who never worked with ML. For machine learning it is especially problematic, since even being a good engineer requires you to understand the algorithms on some fundamental level. That's hard if you have never done anything remotely ML-related in your life.
Microsoft could just get one of their devs to build a coding agent, but instead all of these companies are bowing down to Anthropic just because Anthropic is selling execs a dream where they can fire most of the devs. None of the other coding agents are any worse than CC; Gemini & Crush are even better, Codex is decent, and even something like Opencode is catching up.
Nah, Claude Code really is that much better. I should know: every few months I try to move away from Claude Code, only to come running back to it.
Gemini CLI (not the model) is trash. I wish it weren't so, but I only have to use it for a short time before I give up. It regularly retains stale file contents (even after re-prompting), constantly crashes, doesn't show diffs properly, etc, etc.
I recently tried OpenCode. It's gotten a bit better, but I still hit all kinds of API errors with the models. I also have no way to scroll back properly to earlier commands, and its edit-acceptance and permissions interface is wonky.
And so on. It's amazing how Claude Code just nails the agentic CLI experience from the little things to the big.
Advice to agentic CLI developers: just copy Claude Code's UX exactly; that's your starting point. Then add stuff that makes the user's life even easier and more productive. There's a ton of improvements I'd like to see in Claude Code:
- I frequently use multiple sessions. It's kinda hard to remember the context when I come back to a tab. Figure out a way to make it immediately obvious.
- Allow me to burn tokens to keep enough persistent context. Make the agent actually read my AGENTS.md before every response. Ensure the agent gets closer and closer to matching the way I'd like it to work as the session progresses (and even across sessions).
- Add a Replace tool, like the Read tool, that is reliable and prevents the agent from having to make changes manually one by one, or worse, using sed (I've banned my agents from using sed because of the havoc they wreak with it).
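A minimal sketch of what the reliable replace tool described in the last bullet might look like (this is hypothetical, not Claude Code's actual tooling): require the target snippet to match exactly once, so the agent fails loudly instead of clobbering the wrong location the way a loose sed pattern can.

```python
from pathlib import Path

def replace_exact(path: str, old: str, new: str) -> None:
    """Replace an exact snippet in a file, refusing to act unless
    the match is unique. Hypothetical sketch of a 'Replace tool'."""
    text = Path(path).read_text()
    count = text.count(old)
    if count == 0:
        raise ValueError("snippet not found; file may have changed since it was read")
    if count > 1:
        raise ValueError(f"snippet is ambiguous ({count} matches); give more context")
    Path(path).write_text(text.replace(old, new, 1))
```

The uniqueness check is the point: an agent that gets a hard error can re-read the file and retry, whereas a silent wrong-place edit only surfaces at review time.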
I think big corporations are just structurally unable to create products people actually want to use. They have too much experience with locked-in customers, and with switching costs keeping them locked in. Anthropic needed a real product to win mind-share first; they'll start enshittifying later (by some accounts they already have). The best thing a big corporation can do with a nascent technology like this is to make it available everywhere and then acquire the startup that converts it into a winner first. Microsoft even fumbled that.
There is an entire generation of devs that TFS ruined for version control. I've had to essentially rehabilitate folks and heal old TFS wounds to get them properly using git (so many copies of repos on their filesystem...).
The tools or the models? It's getting absurdly confusing.
"Claude Code" is an interface to Claude, Cursor is an IDE (I think?! VS Code fork?), GitHub Copilot is a CLI or VS Code plugin to use with ... Claude, or GPT models, or ...
If they are using "Claude Code" that means they are using Anthropic's models - which is interesting given their huge investment in OpenAI.
But this is getting silly. People think "CoPilot" is "Microsoft's AI" which it isn't. They have OpenAI on Azure. Does Microsoft even have a fine-tuned GPT model or are they just prompting an OpenAI model for their Windows-builtins?
When you say you use CoPilot with Claude Opus people get confused. But this is what I do everyday at work.
A 2-week-old post that feels like part of the other weirdly promotional "Claude is everywhere right now" pieces that were going around. Someone called it an advertising carpet-bombing run.
A.I. Tool Is Going Viral. Five Ways People Are Using It
Anthropic is known for this, they are purposefully putting these stories on the websites execs and managers read. They even astroturf HN every chance they get with 2 blog posts a week (sometimes even more).
I'm one of those really odd beasts that feels some sort of loyalty to Microsoft, so I started out on Copilot and was very reluctant to try Claude Code. But as soon as I did, I figured out what the hype was about. It's just able to work over larger code bases and over longer time horizons than Copilot. The last time I tried Copilot, just to compare, I noticed that it would make some number of tool calls (not even involving tokens!) and then decide, "Nah, that's too many. We're just not going to do any work for a while." It was bizarre. And sometimes it would decide that a given bog-standard tool call (like read a file or something) needed to get my permission every. single. time. I couldn't do anything to convince it otherwise. I eventually gave up. And since then, we've built all our LLM support infrastructure around Claude Code, so it would be painful to go back to anything else.
I don't really like how Claude Code kind of obscures the actual code from you. I guess that's why people keep putting out articles about how certain programmers have absolutely no idea what's going on inside the code.
It's truly more capable, but still not capable enough that I'm comfortable blindly trusting the output.
This is not a problem when you assume the role of an architect and a reviewer and leave the entirety of the coding to Claude Code. You'll pretty much live in the Git Changes view of your favorite IDE leaving feedback for Claude Code and staging what it managed to get right so far. I guess there is a leap of faith to make because if you don't go all the way and you try to code together with Claude Code, it will mess with your stuff and undo a lot of it and it's just frustrating and not optimal. But if you remove yourself from the loop completely, then indeed you'll have no idea what's going on. There still needs to be a human in the loop, and in the right part of it, otherwise you're just vibe coding garbage.
That's the big difference for me. I use Github Copilot because I want to see the output and work with it. For people who are fine just shooting a prompt out and getting code back, I'm sure Claude Code is better.
> Claude Code kind of obscures the actual code from you
not sure what you mean, I have vscode open and make code changes in between claude doing its thing. I have had it revert my changes once which was amusing. Not sure why it did that, I've also seen it make the same mistake twice after being told not to.
What I don't understand is why so few people talk about AugmentCode. It uses Claude (and others) but builds context of your project and tends to understand your repos better.
It's accelerated the slopification of Microsoft. And what are they doing to fix it? More AI, the thing that is clearly making it worse. I think every division except Azure and Office is losing money at this point.
To this day I cannot wrap my head around why Microsoft allowed a culture to grow inside the company (whether through hiring or through despondence) that at best is indifferent towards the company's products and at worst openly despises them.
I'm sure no other tech company is like this.
I think technologies like the Windows kernel and OS, the .NET framework, their numerous attempts to build a modern desktop UI framework with XAML, their dev tools, were fundamentally good at some point.
Yet they can't or won't hire people who would fix Windows rather than just maintain it, really push for modernization, and make .NET actually cool and something people want to use.
They'd rather hire folks who were taught at school that Microsoft is the devil and Linux is superior in all ways, who don't know the first thing about the MS tech stack, and who would rather write React on their MacBooks (see the Start menu incident) than touch anything made by Microsoft.
It seems somehow the internal culture allows this. I'm sure if you forced devs to use Copilot, and provided them with the tools and organizational mandate to do so, it would become good enough eventually to not have to force people to use it.
My main complaint I keep hearing about Azure (which I do not use at work)
Because the products have become terrible, and they keep using more AI to solve it when AI is the problem with Microsoft. Microsoft execs are just riding Azure's success; the rest of the orgs are completely useless.
At the beginning of my career, sometime around 1999 or 2000, I was at Microsoft with our team because we were trying to integrate our product with this absolute piece of junk called Microsoft Biztalk.
It simply didn’t work. I complained about it and was eventually hauled into a room with some MS PMs who told me in no uncertain terms that indeed, Biztalk didn’t work and it was essentially garbage that no one, including us, should ever use. Just pretend you’re doing something and when the week is up, go home. Tell everyone you’ve integrated with Biztalk. It won’t matter.
I work for Microsoft/Azure and my incentives are (roughly in descending order): minimize large/long outages, ship lots of stuff (with some concern for customer utility, but not too much), don't get yelled at for missing mandated work (security, compliance, etc.) I'd love to improve product quality, but incentives for that are negative. We're running a tight ship, and every second I spend on quality is a second I don't spend on the priorities above. Since there isn't any slack in the system, that means my performance assessment will drop, which I obviously don't want. Multiply that by 200k employees, and you get the current state of quality across the whole product portfolio.
My experience in the Teams org is the same. It's all about security, compliance, and recently AI. Fixing bugs and similar "non-flashy" work is a sure way of postponing one's promotion indefinitely.
Reading about ubiquitous Claude Code use inside of Apple and Microsoft, and not Codex, makes me very worried about forthcoming software quality.
Claude Code is fun, full of personality, many features to hack around model shortcomings, and very quick, but it should not be let anywhere near serious coding work.
That's also why OpenClaw uses Claude for personality, but its author (@steipete) disallows any contribution to it using Claude Code and uses Codex exclusively for its development. Claude Code is a slop producer with illusions of productivity.
I don't know about Apple, but Microsoft is completely consumed by this new AI coding wave. Apple probably still has some reasonable use policy, but Microsoft has lost it entirely. I don't see myself using any Microsoft software anytime soon.
Yes, the product's secret sauce is out and it's becoming a commodity.
But OpenAI is still innovating with new subcategories, and even in cases where it did not innovate (Claude Code came first and OpenAI responded with Codex), it outdoes its competitors. Codex is widely preferred by the most popular vibecode devs, notably Moltbook's dev, but also Jess Fraz.
In terms of pricing, OAI holds by far the most expensive product, so it's still positioned as the quality option. To give an example, most providers have a three-tier price for API calls (per million output tokens):
- Anthropic: $1 / $3 / $5
- Gemini: $3 / $12 (two tiers)
- OpenAI: $2 / $14 / $168
So the competitors are mainly competing on price in the API category.
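To make the gap concrete, a quick back-of-the-envelope using the top-tier figures quoted above (these are the commenter's numbers, not verified list prices):

```python
# Top-tier output prices quoted in the thread, in dollars per million
# output tokens (the commenter's figures, not verified list prices).
price_per_mtok = {"Anthropic": 5, "Gemini": 12, "OpenAI": 168}

output_mtok = 10  # cost of generating 10 million output tokens
costs = {name: p * output_mtok for name, p in price_per_mtok.items()}
# By these numbers, OpenAI's top tier is over 30x Anthropic's.
```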
To give another datapoint, Google only released multimodal (image input) models 1 or 2 months ago, while this has been in ChatGPT for almost a year now.
Learning about tech folklore is the best part of Hacker News, there's stuff you can't learn from books or tutorials (well maybe you could, but you are unlikely to reach it on your own.)
So, is claude code really better than codex with latest gpt model, or do people just hate on openai so much that no one (but me apparently) is using them? I am asking this question seriously because if so I will make the switch, but codex seems to be quite good to me so I don't want to waste time switching.
I used to use Claude Code with Opus exclusively because of how good it is IME. Then Anthropic banned me, so I switched to OpenCode. I really want OpenCode to win, but there's a long way to go for it to get the same polish in the UX department (and to get a handle on the memory leaks). I'm 100% sure Claude Code is hacks upon hacks internally, but on the surface it works quite well (not that they've fixed the flashing issue). With OpenCode I also switched to GPT-5.2-Codex, and I have to say it's fairly garbage IME. I can't get it to keep working; it takes every opportunity to either tell me what I should do next for it, or to tell me it figured out a particular piece of the larger puzzle and that it can continue if I want. It is not nearly as independent as Opus is. Now I'm on the Codex CLI with GPT-5.2, as I figured maybe the harness was the issue, but it is not very good either.
I feel like I must be missing something, but I just cannot understand the hype around Claude Code. Don't get me wrong, I'm fully bought in on using AI for development and am super happy to use Copilot or Cursor, but as an experienced developer, just chatting with the terminal feels so wrong. I've tried to switch so many times and I can't get into it.
Can anyone else share what their workflow with CC looks like? Even if I never end up switching I'd like to at least feel like I gave it a good shot and made a choice based on that, but right now I just feel like I'm doing something wrong.
I'll take a crack at it. I liked using Cursor and it was my first introduction, but my main editor is Emacs and I like Emacs. It has a bunch of configuration that has built up like barnacles on the bottom of a ship, so it was kind of hard using VS Code. I use a project package (projectile) that allows me to quickly move between different projects (git repos, TRAMP sessions, anything really), and I can open a CC terminal there that I can have pop in and out as I need it. Really, it's pretty similar to how I used Cursor.
Workflow is this:
- I have emacs open for code editing/reviews/git.
- Separate terminal emulator with 1-3 claudes
- I work on a story by splitting it into small steps ("Let's move this email logic to the email.service.ts", "here's the fields I'd need to add to the request, create a schema validation in a separate file, and update the router and controller")
- I mostly watch claude, and occasionally walk through the code in emacs whenever I feel like I want to review code.
- I handle external tools like git or db migrations myself not letting LLMs near them.
In essence, this is pretty much how you'd run a group of juniors: you'd sit on Slack and Jira divvying up work and doing code reviews.
> I work on a story by splitting it into small steps
It's funny because that's basically the approach I take in GH Copilot. I first work with it to create a plan broken up into small steps and save that to an md file and then I have it go one step at a time reviewing the changes as it goes or just when it's done.
I understand that you're using Emacs to keep an eye on the code as it goes, so maybe what I wasn't grokking was that people were using terminal-based code editors to see the changes it was making. I assumed most people were just letting it do its thing and then trying to review everything at the end, which felt like an anti-pattern given how much we (the dev community) push for small PRs over gigantic 5k-line PRs.
At that point though isn't it just as fast/easy to cut/paste the code yourself? That was my conclusion after spending a week breaking things down - I was able to get good code out of the AI, but only after spending as much time writing the prompt as if I just did it myself. (note that this was my first attempt at using an agent, maybe I'll learn to do it better)
I use it like having a bunch of L3/L4 engineers. I give them a description of the changes I want to be made, sometimes chat a bit with it to help them design the features and then tell them to have a go at it. Then I create PRs and review them and have them clean up/improve the code and merge it. I try to balance giving it enough stuff to build so I can switch to another agent, and not giving them too much so that they make a weird assumption and run really far with it.
I got really good at reviewing code efficiently from my time at Google and others, which helps a lot. I'm sure my personal career experience influences a lot how I'm using it.
FWIW, I use Codex CLI, but I assume my flow would be the same with Claude Code.
One thing I really like it for is if you have a lot of something similar - let's say plugins. I can then just go to the plugins directory, and tell claude something as simple as "this is the plugins directory where plugins live. I want to add one called 'sample' that samples records". Note that I don't even have to tell it what that means usually.
It will read the existing plugins, understand the code style/structure/how they integrate, then create a plugin called "sample" AND code that is usually what you wanted without telling it specifically, and write 10 tests for it.
In those cases it's magic. In large codebases, asking it to add something into existing code or modify a behavior I've found it to be...less useful at.
iterm and talk to Claude, command+tab to vscode fix/adjust things, command+tab back to iterm and talk more to Claude. Not the most technically advanced setup but it works pretty well for me. I don't like the turbo auto-complete in vscode, it's very distracting. If i want an agent's help I tab over and ask claude.
Also, use the Superpowers plugin for Claude. That really helps for larger tasks but it can over do it hah. It's amusing to watch the code reviewer, implementor, and tester fight and go back and forth over something that doesn't even really matter.
A friend of mine over there told me their VP put out a mandate that everyone should install and use Claude Code and write a weekly report on their usage (what they did, what worked, etc.). They also track token usage and have a leaderboard of who uses the most tokens.
I think Copilot is a platform or marketplace more than anything, and Microsoft doesn't really need to care about what models are being used. They don't need a secret sauce as much as they need to make the entire ecosystem easy to use. They've had a lot of success over the years with VSC, and this seems to build on that.
I think it's funny, because it's not the first time I've heard about Microsoft employees not using the company's products.
I worked on a project with some Microsoft engineers to create a chatbot plugin for Salesforce, using Microsoft Power Virtual Agent, and the communication tool they used was Slack, not Teams. Meanwhile I was obligated to use Teams because of the consulting company I worked at at the time.
Also, the version control they used at the time was, I think, SVN, not TFS.
After their gaming stock crash, making Windows 11 completely useless, not to mention Copilot adoption going nowhere, this was just a matter of time.
Windows 11 falling apart after AI adoption tells you their AI vibe coding is not going as planned.
If you saw their latest report claiming to focus on fixing trust in Windows: it's a little too late. Even newbies have moved to Linux, and with AMD driver support, gaming is no longer an excuse.
A lot of Claude love in here. I have used Claude on the web (free tier) well over a year ago and had good results with it, but I need good integration with IntelliJ since I work almost exclusively in Kotlin. Can anyone attest to it? The reviews on the plugin are awful, but so are the reviews for the Copilot plugin. I find Copilot pretty good in there, though the tooling is a little second-class, and it often gets "stuck" in the terminal.
The rapid adoption of AI coding agents raises important questions about trust boundaries. When an agent like Claude Code needs to handle sensitive operations - API keys, credentials, database connections - how do you prevent those secrets from ending up in the model's context or logs?
We ran into this building a password automation tool (thepassword.app). The solution: the AI orchestrates browser navigation, but actual credential values are injected locally and never enter the model's reasoning loop. Prompt injection can't exfiltrate what's not in the context.
As these tools move into enterprise settings, I expect we'll see more architectural patterns emerge for keeping sensitive data out of agentic workflows entirely.
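A minimal sketch of the injection pattern described above (the placeholder syntax and names are illustrative, not thepassword.app's actual API): the model plans with opaque tokens only, and a local step substitutes real values just before execution, so credentials never appear in prompts, completions, or logs.

```python
import os
import re

# The model only ever sees opaque {{SECRET:NAME}} placeholders; real
# values live in a local vault and are injected outside the model's
# reasoning loop. Prompt injection can't exfiltrate what isn't there.
PLACEHOLDER = re.compile(r"\{\{SECRET:([A-Z0-9_]+)\}\}")

def inject_secrets(agent_action: str, vault: dict) -> str:
    """Replace {{SECRET:NAME}} tokens with values from a local vault."""
    def lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in vault:
            raise KeyError(f"unknown secret placeholder: {name}")
        return vault[name]
    return PLACEHOLDER.sub(lookup, agent_action)

# The model plans with placeholders only:
planned = "mysql --user=app --password={{SECRET:DB_PASSWORD}} inventory"

# The injector runs locally, after the model has finished reasoning:
local_vault = {"DB_PASSWORD": os.environ.get("DB_PASSWORD", "hunter2")}
runnable = inject_secrets(planned, local_vault)
```

Failing hard on an unknown placeholder matters here: silently passing the literal token through would leak the fact that injection failed into whatever command runs next.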
The embarrassing part isn't that Microsoft employees prefer Claude Code. It's that Microsoft had every advantage: the OpenAI partnership, the distribution, the enterprise relationships, the $13B investment, and it still built a product its own engineers don't want to use. That's not a model problem. That's a product-taste problem. Anthropic built Claude Code with like 30 engineers; Microsoft has tens of thousands. At some point you have to accept that no amount of investment compensates for not actually understanding what developers need.
azaras | 17 hours ago
https://github.com/features/copilot/cli
hpdigidrifter | 17 hours ago
CC has some magic secret sauce and I'm not sure what it is.
My company pays for both too, I keep coming back to Claude all-round
danw1979 | 16 hours ago
https://x.com/bcherny/status/2007179832300581177
spwa4 | 14 hours ago
Any MBAs want to buy? For the right price I could even fix it ...
bondarchuk | 17 hours ago
"Microsoft has over 100,000 software engineers working on software projects of all sizes."
So that would mean 100 000 000 000 (100 billion) lines of code per month. Frightening.
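The arithmetic above holds; a one-liner sanity check of the "1 engineer, 1 month, 1 million lines" target applied to the quoted headcount:

```python
# 100,000 engineers each producing 1 million lines of code per month,
# per the "1 engineer, 1 month, 1 million lines of code" North Star.
engineers = 100_000
loc_per_engineer_per_month = 1_000_000
monthly_loc = engineers * loc_per_engineer_per_month
assert monthly_loc == 100_000_000_000  # 100 billion lines per month
```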
wolvoleo | 15 hours ago
One of the many reasons why it's such a bad practice (overly verbose solutions is another one, of course).
sarchertech | 16 hours ago
That’s 200 Windows’ worth of code every month.
petcat | 10 hours ago
Claude doesn't require paying payroll tax, health insurance, unemployment, or take family leave.
copilot_king_2 | 17 hours ago
they're fucked
reactordev | 12 hours ago
Take some arbitrary scalar and turn it into a mediocre metric, for some moronic target.
anonymous908213 | 17 hours ago
> My goal is to eliminate every line of C and C++ from Microsoft by 2030. Our strategy is to combine AI and Algorithms to rewrite Microsoft’s largest codebases. Our North Star is “1 engineer, 1 month, 1 million lines of code”.
Obviously, "every line of C and C++ from Microsoft" is not contained within a single research project, nor are "Microsoft's largest codebases".
sarchertech | 16 hours ago
The fact that there are distinguished engineers at MS who think that is a reasonable goal is frightening though.
coldtea | 16 hours ago
As @mrbungie says on this thread: "They took the stupidest metric ever and made a moronic target out of it"
jodrellblank | 15 hours ago
The fact that it's a "PR disaster" for a researcher to have an ambitious project at one of the biggest tech companies on the planet, or to talk up their team on LinkedIn, is unbelievably ridiculous.
anonymous908213 | 15 hours ago
Talking about rewriting Windows at a rate of 1 million lines of code per engineer per month with LLMs is absolutely going to garner negative publicity, no matter how much you spin it with words like "ambitious" (do you work in PR? it sounds like it's your calling).
jodrellblank | 15 hours ago
Why would you continue supposing such a thing when both the employee, and the employer, have said that your suppositions are wrong?
smoe | 17 hours ago
I use LLMs all day every day, but measuring someone or something by the number of lines of code produced is still incredibly stupid, in my opinion.
bluGill | 15 hours ago
They want to expand that value into engineering and so are looking for something they can measure. I haven't seen anyone answer what can be measured to make a useful improvement, though. I have a good "feeling" that some people I work with are better than others, and most are not so bad that we should fire them, but I don't know how to put that into something objective.
groundzeros2015 | 15 hours ago
Most models of productivity look like factories with inputs, outputs, and processes. This is just not how engineering or craftsmanship happen.
Findecanor | 15 hours ago
(I think it is from "Triumph of the Nerds" (1996), but I can't find the time code)
reactordev | 12 hours ago
Microsoft lost its way with Windows Phone, Zune, Xbox360 RRoD, and Kinect. They haven’t had relevance outside of Windows (Desktop) in the home for years. With the sole exception being Xbox.
They have pockets of excellence. Where great engineers are doing great work. But outside those little pockets, no one knows.
randusername | 14 hours ago
Totally agree. I see LOC as a liability metric. It amazes me that so many other people see it as an asset metric.
snovv_crash | 9 hours ago
Of course then someone is just going to pregenerate a random number lookup table and get a few gigs of 'value' from pure garbage...
m4rtink | 16 hours ago
Generating billions of lines of code that are unmaintainable and buggy should easily achieve that. ;-)
HumblyTossed | 15 hours ago
This will lead to so much enshittification.
jodrellblank | 15 hours ago
No, one researcher at Microsoft made a personal LinkedIn post saying his team was using that as their "North Star" for porting and transpiling existing C and C++ code, not writing new code. When the internet hallucinated that he meant Windows and new code, and started copypasting this as "Microsoft's goal", the post was edited and Microsoft said it isn't the company's goal.
scrlk | 15 hours ago
https://www.folklore.org/Negative_2000_Lines_Of_Code.html
heliumtera | 14 hours ago
So with this level of productivity Windows could completely degrade itself and collapse in one week instead of 15 years.
progx | 12 hours ago
1. Classic Coding (Traditional Development): in the classic model, developers are the primary authors of every line.
2. AI-Supported Coding (The Modern Workflow): AI tools like GitHub Copilot and Cursor act as a "pair programmer," shifting the human role to reviewer and architect. I think 5x to 10x is realistic: 50,000-200,000 LOC per YEAR! Would it be good code? We will see.
lloydatkinson | 17 hours ago
The difference between the two is stark.
dataviz1000 | 17 hours ago
I'll know better in a week. Hopefully I can get better results with the $200 a month plan.
dataviz1000 | 16 hours ago
Starting in October, with VS Code Copilot Chat it was $150, $200, $300, $400 per month with the same usage. I thought they were just charging more per request without warning. The last couple of weeks it seemed that VS Code Copilot was just fucking up, making useless calls.
Perhaps it wasn't a dark, malicious pattern but rather incompetence that was driving up the price.
dktp | 14 hours ago
CC is, imo, the best. The rest are largely on par with each other. The benefit of VSCode and Antigravity is that they have the most generous limits: I ran through Cursor's $20 limits in 3 days, whereas a same-tier VSCode subscription can last me 2+ weeks.
bakugo | 17 hours ago
But I guess having my computer randomly stop working because a billion dollar corporation needs to save money by using a shitty text generation algorithm to write code instead of hiring competent programmers is just the new normal now.
Eddy_Viscosity2 | 14 hours ago
* all mean 'nearly all' as of course there will be exceptions.
Eddy_Viscosity2 | 10 hours ago
I would love for this to be true. But another scenario that could play out is that this process accelerates software bloat that was already happening with human coded software. Notepad will be a 300GB executable in 2035.
bigfishrunning | 9 hours ago
And this will cause what I'm talking about -- When nobody can afford memory because it's all going into the ocean-boiling datacenters, all of a sudden someone selling a program that fits into RAM will have a very attractive product
DowsingSpoon | an hour ago
calgoo | 17 hours ago
pjmlp | 17 hours ago
https://ubuntu.com/ai
eklavya | 16 hours ago
pjmlp | 16 hours ago
unlimit | 16 hours ago
vv_ | 15 hours ago
newsoftheday | 13 hours ago
endemic | 8 hours ago
wcoenen | 17 hours ago
pjmlp | 17 hours ago
wolvoleo | 15 hours ago
Tldr: Copilot has 1% marketshare among web chatbots and 1.85% of paid M365 users bought a subscription to it.
As much as I think AI is overrated already, Copilot is pretty much the worst performing one out there from the big tech companies. Despite all the Copilot buttons in office, windows, on keyboards and even on the physical front of computers now.
We have to use it at work but it just feels like if they spent half the effort they spend on marketing on actually trying to make it do its job people might actually want to use it.
Half the time it's not even doing anything. "Please try again later" or the standard error message Microsoft uses for every possible error now: "Something went wrong". Another pet peeve of mine, those useless error messages.
pjmlp | 14 hours ago
Improve the workflows that would benefit "AI" algorithms, image recognition, voice control, hand writing, code completion, and so on.
No need to put buttons to chat windows all over the place.
wolvoleo | 11 hours ago
PS: When I say party trick I don't deny it has its uses but it's currently used like the jesus-AI that can do anything.
tomjen3 | 13 hours ago
Until MS makes sure their models get the necessary context, I don't even care to click on them.
keeda | 10 hours ago
Assuming the leak was accurate, almost doubling usage in 4 months for an enterprise product seems like pretty fast growth?
Its growth trajectory seems to be on par with Teams so far, another enterprise product bundled with their M365 suite, though to be fair Teams was bundled for free: https://www.demandsage.com/microsoft-teams-statistics/
Lammy | an hour ago
Notable inflection point right around the time unlimited data became an afterthought and every piece of software decided it “needs” to spy on—— I mean needs to offer Fulfilling Connected Experiences at all times.
oefrha | 16 hours ago
zzbzq | 15 hours ago
Copilot's main problem seems to be people don't know how to use it. They need to delete all their plugins except the vscode, CLI ones, and disable all models except anthropic ones.
The Claude Code reputation diff is greatly exaggerated beyond that.
oefrha | 14 hours ago
If Copilot is stupid uniquely with 5.2 Codex then they should disable that instead of blaming the user (I know they aren’t, you are). But that’s not the case, it’s noticeably worse with everything. Compared to both Cursor and Claude Code.
howdareme9 | 13 hours ago
dfawcus | 12 hours ago
That was using the Claude Sonnet 4.5 model, I wonder if using the Opus 4.5 model would have managed to avoid that.
kcb | 16 hours ago
GaProgMan | 15 hours ago
Specifically WHY they use Apple hardware is something I can only speculate on. Presumably it's easier to launch Windows on Mac than the other way around, and they would likely need to do that as .NET and its related technologies are cross platform as of 2016. But that's a complete guess on my part.
Am *NOT* a Microsoft employee, just an MVP for Developer Technologies.
arcologies1985 | 15 hours ago
https://youtu.be/OHKKcd3sx2c
m-schuetz | 15 hours ago
cosmic_cheese | 14 hours ago
mrweasel | 13 hours ago
Nextgrid | 13 hours ago
m-schuetz | 12 hours ago
beart | 10 hours ago
Consumers _do not care_ if it is the firmware or Windows.
Dell was one of the earlier brands, and biggest, to suffer these standby problems. Dell has blamed MS and MS has blamed Dell, and neither has been in any hurry to resolve the issues.
I still can't put my laptop in my backpack without shutting it down, and as a hybrid worker, having to tear down and spin up my application context every other day is not productive.
cosmic_cheese | 10 hours ago
Maybe the most tragic part is that this drags down Linux and plagues it with these hardware rooted sleep issues too.
kibwen | 14 hours ago
cosmic_cheese | 14 hours ago
einsteinx2 | 13 hours ago
taude | 14 hours ago
frozenlettuce | 8 hours ago
khkjhkjiug | 15 hours ago
The accounts have now all gone quiet, guess they got told to quit it.
epolanski | 15 hours ago
He didn't dislike it, but got himself a Macbook nonetheless at his cost.
koakuma-chan | 14 hours ago
stoobs | 14 hours ago
mkl | 7 hours ago
Nextgrid | 13 hours ago
Because Windows' UX is trash? Anyone with leverage over their employer can and should request a Mac. And in a hot market, developers/designers did have that leverage (maybe they still do) and so did get their Macs as requested.
Only office drones who don't have the leverage to ask for anything better or don't know something better exists are stuck with Windows. Everyone else will go Mac or Linux.
Which is why you see Windows becoming so shit, because none of the culprits actually use it day-to-day. Microsoft should've enforced a hard rule about dogfooding their own product back in the Windows 7 days when the OS was still usable. I'm not sure they could get away with it now without a massive revolt and/or productivity stopping dead in its tracks.
benkaiser | 5 hours ago
fragmede | 16 hours ago
make3 | 14 hours ago
hxugufjfjf | 13 hours ago
falloutx | 12 hours ago
Claude still can't do half the things Crush can do.
Plus: you can use Kimi 2.5 with Crush soon
paxys | 14 hours ago
catlover76 | 14 hours ago
GPT-5.2 sometimes does this too. Opus-4.5 is the best at understanding what you actually want, though it is ofc not perfect.
jckahn | 14 hours ago
paxys | 14 hours ago
codazoda | 11 hours ago
riku_iki | 10 hours ago
Zopieux | 5 hours ago
All providers are opt-out. The moat is the data, don't pretend like you don't know.
riku_iki | 4 hours ago
ralusek | 14 hours ago
It's also just not as good at being self-directed and doing all of the rest of the agent-like behaviors we expect, i.e. breaking down into todolists, determining the appropriate scope of work to accomplish, proper tool calling, etc.
freedomben | 14 hours ago
Codex is the best at following instructions IME. Claude is pretty good too but is a little more "creative" than codex at trying to re-interpret my prompt to get at what I "probably" meant rather than what I actually said.
michaelcampbell | 13 hours ago
I am familiar with copilot cli (using models from different providers), OpenCode doing the same, and Claude with just the \A models, but if I ask all 3 the same thing using the same \A model, I SHOULD be getting roughly the same output, modulo LLM nondeterminism, right?
taylorius | 10 hours ago
phainopepla2 | 10 hours ago
It won't make any changes until a detailed plan is generated and approved.
PantaloonFlames | 10 hours ago
Using Gemini 2.5 or 3, flash.
sutterd | 13 hours ago
satvikpendem | 14 hours ago
causal | 13 hours ago
whalee | 14 hours ago
I have developed decent intuition on what kinds of problems Codex, Claude, Cursor(& sub-variants), Composer etc. will or will not be able to do well across different axes of speed, correctness, architectural taste, ...
If I had to reflect on why I still don't use Gemini, it's because they were late to the party and I would now have to be intentional about spending time learning yet another set of intuitions about those models.
codazoda | 11 hours ago
I've been experimenting with small local models and the types of prompts you use with these are very different than the ones you use with Claude Code. It seems less different between Claude, Codex, and Gemini but there are differences.
It's hard to articulate those differences but I think that I kind of get in a groove after using models for a while.
qaq | 14 hours ago
mfro | 14 hours ago
pRusya | 14 hours ago
bastawhiz | 14 hours ago
JoshMandel | 12 hours ago
CuriouslyC | 14 hours ago
OsrsNeedsf2P | 13 hours ago
[0] based on user Thumbs up/Thumbs down voting
psyclobe | 13 hours ago
notatoad | 12 hours ago
TZubiri | 12 hours ago
It's on the top of most leaderboards on lmarena.ai
TheAceOfHearts | 12 hours ago
My default everyday model is still Gemini 3 in AI Studio, even for programming-related problems. But for agentic work Antigravity felt like very early-stage beta-ware when I tried it.
I will say that at least Gemini 3 is usually able to converge on a correct solution after a few iterations. I tried Grok for a medium-complexity task and it quickly got stuck changing minor details without being able to get itself out.
Do you have any advice on how to use Antigravity more effectively? I'm open to trying it again.
paxys | 11 hours ago
Analemma_ | 11 hours ago
8note | 2 hours ago
the tools it's built with seem to suck, but it can cook with serena mcp.
the flash models seem to get better results than the pro ones as far as I've seen, but there's not a big difference
codazoda | 12 hours ago
jonathanstrange | 12 hours ago
Both Claude and ChatGPT were unbearable, not primarily because of a lack of technical ability but because of their conversational tone. Obviously, it's pointless to take things personally with LLMs, but they were so passive-aggressive and sometimes maliciously compliant that they started to get to me even though I was conscious of it and know very well how LLMs work. If they had been new hires, I would have fired both of them within two weeks. In contrast, Gemini Pro just "talks" normally, task-oriented and brief. It also doesn't reply with files that contain changes in completely unrelated places (including changing comments somewhere), which is the worst such a tool could possibly do.
Edit: Reading some other comments here I have to add that the 1., 2. ,3. numbering of comments can be annoying. It's helpful for answers but should be an option/parameterization.
bonesss | 10 hours ago
With humans you can categorically say ‘this guy lies in his comments and copy pastes bullshit everywhere’ and treat them consistently from there out. An LLM is guessing at everything all the time. Sometimes it’s copying flawless next-level code from Hacker News readers, sometimes it’s sabotaging your build by making unit tests forever green. Eternal vigilance is the opposite of how I think of development.
aantix | 10 hours ago
It fails to be pro-active. "Why didn't you run the tests you created?"
I want it to tell me if the implementation is working.
Feels lazy. And it hallucinates solutions frequently.
It pales in comparison to CC/Opus.
zhengyi13 | 10 hours ago
jug | 9 hours ago
I'm also mostly on Gemini 3 Flash. Not because I've compared them all and I found it the best bar none, but because it fulfills my needs and then some, and Google has a surprisingly little noted family plan for it. Unlike OpenAI, unlike Anthropic. IIRC it's something like 5 shared Gemini Pro subs for the price of 1. Even being just a couple sharing it, it's a fantastic deal. My wife uses it during studies, I professionally with coding and I've never run into limits.
tylerchilds | 14 hours ago
“What do we actually need to be productive?”
Which is how Anthropic pulled ahead of Microsoft, that prioritized
checks notes
Taking screenshots of every windows user’s desktop every few seconds. For productivity.
paxys | 14 hours ago
satvikpendem | 14 hours ago
bhadass | 14 hours ago
w0m | 7 hours ago
formerly_proven | 13 hours ago
speedgoose | 13 hours ago
doomslayer999 | 30 minutes ago
satvikpendem | 28 minutes ago
pixl97 | 14 hours ago
Attempt to build a product... Fail.
Buy someone else's product/steal someone else's product... Succeed.
Octoth0rpe | 13 hours ago
pixl97 | 13 hours ago
I mean they fought the browser war for years, then just used Chrome.
torginus | 13 hours ago
MS's calculus was obvious - why spend insane amounts of engineering effort to make a browser engine that nobody uses - which is too bad, because if I remember correctly they were not too far behind Chrome in either perf or compatibility for a while.
Nevermark | 12 hours ago
torginus | 10 hours ago
canucker2016 | 9 hours ago
Then they took their eyes off the ball. Whether it was protecting the Windows fort (why create an app with all the functionality of an OS and give it away for free, mostly on Windows with some Mac versions and no Linux support, when people are paying for Windows?) or simply diverting the IE devs to some other "hot" product, browser progress stagnated, even with XMLHttpRequest.
icedchai | 13 hours ago
bee_rider | 14 hours ago
tylerchilds | 10 hours ago
Gud | 10 hours ago
iAMkenough | 10 hours ago
jug | 9 hours ago
bobsmooth | 14 hours ago
jacquesm | 14 hours ago
Nevermark | 12 hours ago
For some reason, people have great cognitive difficulty with defensive trust. Charlie Brown, Sally.
dangus | 12 hours ago
Microsoft is very explicit in detailing how the data stays on device and goes to great lengths to detail exactly how it works to keep data private, as well as having a lot of sensible exceptions (e.g., disabled for incognito web browsing sessions) and a high degree of control (users can disable it per app).
On top of all this it’s 100% optional and all of Microsoft’s AI features have global on/off switches.
Dusseldorf | 10 hours ago
dangus | 10 hours ago
I’ll agree with you the moment Microsoft does that. But they haven’t done it. And again, I’m not their champion, I’m actively migrating away from Microsoft products. I just don’t think this type of philosophy is helpful. It’s basically cynicism for cynicism’s sake.
tylerchilds | 10 hours ago
https://www.cbsnews.com/news/google-voice-assistant-lawsuit-...
https://www.cbsnews.com/news/lopez-voice-assistant-payout-se...
dangus | 7 hours ago
2. Settlements are just that: settlements. You can be sued frivolously and still decide to settle because it’s cheaper/less risky.
tylerchilds | 6 hours ago
https://www.bcs.org/articles-opinion-and-research/crowdstrik...
2. Settlements also avoid discovery because the impact is likely way worse than checks notes less than one day of profits per company, respectively.
luddit3 | 13 hours ago
tylerchilds | 10 hours ago
I hate how I’ve had a web site with my name on it since 2008 and when you google my name verbatim it says “did you mean Tyler Childers”
Such shade from the algorithm, I get it, I get it, software is lamer than music.
halapro | 13 hours ago
I only gave it up because it felt like a liability and, ahem, it was awkward to review screenshots and delete inopportune ones.
Sharlin | 10 hours ago
zamadatix | 9 hours ago
rustyhancock | 9 hours ago
Part of me wonders if Microsoft knew it would appeal to governments.
https://arstechnica.com/tech-policy/2025/12/uk-to-encourage-...
tylerchilds | 8 hours ago
As someone that was there, we saved the Xbox brand by bullying Microsoft out of normalizing spying on kids and their whole families.
kemotep | 14 hours ago
There is Microsoft Copilot, which replaced Bing Chat, Cortana and uses OpenAI’s GPT-4 and 5 models.
There is Github Copilot, the coding autocomplete tool.
There is Microsoft 365 Copilot, what they now call Office with built in GenAI stuff.
There is also a Copilot cli that lets you use whatever agent/model backend you want too?
Everything is Copilot. Laptops sell with Copilot buttons now.
It is not immediately clear what version of Copilot someone is talking about. 99% of my experience is with the Office one, and it 100% fails to do the thing it was advertised to do 2 years ago when work initially got the subscription: point it at a SharePoint/OneDrive location with a handful of Excel spreadsheets and PDFs/Word docs and tell it to make a PowerPoint presentation based on that information.
It cannot do this. It will spit out nonsense. You have to hold it by the hand and tell it everything to do, step by step, to the point that making the PowerPoint presentation yourself is significantly faster, because you don't have to type out a bunch of prompts and edit its garbage output.
And now it’s clear they aren’t even dogfooding their own LLM products so why should anyone pay for Copilot?
pixl97 | 14 hours ago
Microsoft cannot and will not ever get better at naming things. It is said the universe will split open and an eldritch beast will consume the stars the day Microsoft stops using inconsistent and overlapping names for different and conflicting products.
Isn't that right .Net/dotnet
twisteriffic | 14 hours ago
anal_reactor | 13 hours ago
neogodless | 13 hours ago
Tempest1981 | 13 hours ago
throw20251220 | 12 hours ago
akiselev | 12 hours ago
estimator7292 | 11 hours ago
neogodless | 11 hours ago
anonymars | 12 hours ago
Seriously, how?
lazzurs | 12 hours ago
phkahler | 11 hours ago
josephg | 9 hours ago
anal_reactor | 10 hours ago
But I actually had in mind the Windows app named "Xbox".
gosub100 | 11 hours ago
i80and | 13 hours ago
Paradigma11 | 13 hours ago
simplyinfinity | 13 hours ago
ksec | 13 hours ago
Nadella might have fixed a few things, but Microsoft still have massive room for improvement in many areas.
adventured | 12 hours ago
There is no tech giant that is more vulnerable than Microsoft is at this moment.
Most document originations will begin out of or adjacent to LLM sessions in the near future, as everything will blur in terms of collaborating with AI agents. Microsoft has no footing (or worse, their position is terrible courtesy of Copilot) and is vulnerable to death by inflection point. Windows 11 is garbage and Google + Linux may finally be coming for their desktop (no different than what AMD has managed in unwinding the former Intel monopoly in PCs).
Someone should be charging at them with a new take on Office, right now. This is where you slice them in half. Take down Office and take down Windows. They're so stupid at present that they've opened the gates to Office being destroyed, which has been their moat for 30 years.
rayiner | 12 hours ago
wing-_-nuts | 11 hours ago
MS's bottom line doesn't depend on how happy users are with W11, especially not power users like ourselves. W11 is just a means of selling subscriptions (office, ai, etc). The question isn't 'are users happy' it's 'will OEMs and business continue to push it?'. The answer to that is almost certainly yes. OEMs aren't going to be selling most pcs with ubuntu included any time soon. Businesses are not going to support libreoffice when MS office is the established standard.
Maybe apple could make inroads here, but they don't seem willing to give up their profit margins on overpriced hardware, and I don't think I've ever seen them release anything 'office' related that was anywhere near feature parity with MSO, and especially not cross platform.
shawnz | 10 hours ago
2snakes | 10 hours ago
bluGill | 8 hours ago
The only danger is every once in a while one of those little footnotes becomes large enough to be a problem and you lose the market of those who do matter as well. While there are many obvious examples of where that happened, there are also a lot of cases where it didn't.
canucker2016 | 9 hours ago
HPsquared | 12 hours ago
DrTung | 12 hours ago
HPsquared | 12 hours ago
https://www.youtube.com/watch?v=EUXnJraKM3k
imglorp | 12 hours ago
christophilus | 12 hours ago
butlike | 9 hours ago
anonymars | 12 hours ago
Related: https://www.cnet.com/tech/tech-industry/windows-servers-iden...
estimator7292 | 12 hours ago
Completely impossible. The search is bad to begin with, but it explicitly ignores anything that isn't a-9.
anonymars | 12 hours ago
stackghost | 11 hours ago
You mean Microsoft Career Copilot 365?
vlowther | 11 hours ago
josephg | 9 hours ago
anonymars | 11 hours ago
semi-extrinsic | 9 hours ago
nobodyandproud | 12 hours ago
pradeeproark | 11 hours ago
jslaby | 8 hours ago
timr | 14 hours ago
No, there is Github Copilot, the AI agent tool that also has autocomplete, and a chat UI.
I understand your point about naming, but it's always helpful to know what the products do.
jacquesm | 14 hours ago
timr | 14 hours ago
jacquesm | 14 hours ago
timr | 14 hours ago
Retric | 13 hours ago
Leaving Microsoft’s ecosystem a few years ago has been a great productivity boost, saved quite a bit of cash, and dramatically reduced my frustration.
kemotep | 14 hours ago
timr | 14 hours ago
Github Copilot is actually a pretty good tool.
[1] Not just AI. This is true for any major software product line, and why subordinate branding exists.
kemotep | 14 hours ago
nananana9 | 13 hours ago
mgkimsal | 14 hours ago
When it came out, Github Copilot was an autocomplete tool. That's it. That may be what the OP was originally using. That's what I used... 2 years ago. That they change the capabilities but don't change the name, yet change names on services that don't change capabilities further illustrates the OP's point, I would say.
timr | 14 hours ago
Microsoft may or may not have a "problem" with naming, but if you're going to criticize a product, it's always a good starting place to know what you're criticizing.
adastra22 | 14 hours ago
kortilla | 14 hours ago
HarHarVeryFunny | 14 hours ago
I do agree that conceptually there is a big difference between an editor, even with smart autocomplete, and an agentic coding tool, as typified by Claude Code and other CLI tools, where there is not necessarily any editor involved at all.
almosthere | 10 hours ago
georgeven | 10 hours ago
mirekrusin | 14 hours ago
GitHub Copilot is a service, you can buy subscription from here https://github.com/features/copilot.
GitHub Copilot is available from website https://github.com/copilot together with services like Spark (not available from other places), Spaces, Agents etc.
GitHub Copilot is VSCode extension which you can download at https://marketplace.visualstudio.com/items?itemName=GitHub.c... and use from VSCode.
New version has native "Claude Code" integration for Anthropic models served via GitHub Copilot.
You can also use your own, i.e. local llama.cpp-based, provider (if your GitHub Copilot subscription has it enabled / allows it at the enterprise level).
GitHub Copilot CLI is available for download here https://github.com/features/copilot/cli and is a command-line interface.
Copilot for Pull Requests https://githubnext.com/projects/copilot-for-pull-requests
Copilot Next Edit Suggestion https://githubnext.com/projects/copilot-next-edit-suggestion...
Copilot Workspace https://githubnext.com/projects/copilot-workspace/
Copilot for Docs https://githubnext.com/projects/copilot-for-docs/
Copilot Completions CLI https://githubnext.com/projects/copilot-completions-cli/
Copilot Voice https://githubnext.com/projects/copilot-voice/
GitHub Copilot Radar https://githubnext.com/projects/copilot-radar/
Copilot View https://githubnext.com/projects/copilot-view/
Copilot Labs https://githubnext.com/projects/copilot-labs/
This list doesn't include project names without Copilot in them like "Spark" or "Testpilot" https://githubnext.com/projects/testpilot etc.
Octoth0rpe | 14 hours ago
> GitHub Copilot is a service
and maybe, the api behind
> GitHub Copilot is VSCode extension
???
What an absolute mess.
eulers_secret | 12 hours ago
This absolutely sucks, especially since tool calling can burn through tokens really fast. Feels like a not-so-gentle nudge toward using their 'official' tooling (read: vscode); even though there was a recent announcement about how GHCP works with opencode: https://github.blog/changelog/2026-01-16-github-copilot-now-...
No mention of it being severely gimped by the context limit in that press release, of course (tbf, why would they lol).
However, if you go back to aider, 128K tokens is a lot, same with web chat... not a total killer, but I wouldn't spend my money on that particular service with there being better options!
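For a sense of how fast a 128K-token window fills up, here's a hypothetical back-of-envelope sketch. The ~4 characters/token ratio and the helper names are my own assumptions for illustration, not any vendor's tokenizer or API:

```python
# Rough estimate of whether a set of files fits a 128K-token context window,
# using the common ~4 chars/token heuristic for English text and code.
# Real tokenizers will differ, so treat this as a sanity check only.

CONTEXT_LIMIT = 128_000  # tokens, per the limit discussed above

def estimate_tokens(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token."""
    return max(1, len(text) // 4)

def fits_in_context(texts: list[str], reserve: int = 8_000) -> bool:
    """Check total estimated tokens against the limit, reserving
    headroom for the system prompt and the model's reply."""
    total = sum(estimate_tokens(t) for t in texts)
    return total + reserve <= CONTEXT_LIMIT

# Example: a single 400KB source dump is already ~100K estimated tokens,
# close to the cap once you reserve room for the reply.
big_file = "x" * 400_000
print(estimate_tokens(big_file))    # 100000
print(fits_in_context([big_file]))  # True, but with little room to spare
```

The point of the exercise: one medium-sized repo dump plus a long conversation history can exhaust 128K tokens quickly, which is why the truncated context matters so much for agentic tool-calling workloads.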
perryh2 | 9 hours ago
3acctforcom | 11 hours ago
Put together a nice and clean price list for your friends in the purchasing department.
I dare you.
almosthere | 10 hours ago
dec0dedab0de | 14 hours ago
bluedino | 13 hours ago
rvnx | 12 hours ago
kalleboo | 5 hours ago
whobre | 13 hours ago
There was also "Active" before that, but .NET was next level crazy...
anonymars | 12 hours ago
Office.com is now "Welcome to Microsoft 365 Copilot"
adamrezich | 13 hours ago
Is it the context menu key? Or did they do another Ctrl+Alt+Shift+Win+L thing?
lionkor | 12 hours ago
adamrezich | 11 hours ago
itissid | 13 hours ago
One thing I don't know about: do they have an AI product that can combine unstructured data and databases to give better insights in any new conversation? E.g., an LLM that knows how to convert user queries into the domain model of the tables and extract information. What companies are doing such things?
This would be something that can be deployed on-prem/ their own private cloud that is controlled by the company, because the data is quite sensitive.
ajcp | 13 hours ago
0xbadcafebee | 13 hours ago
Foobar8568 | 13 hours ago
rubslopes | 13 hours ago
It's unbelievable how badly they failed at this. If you do the same with Claude or ChatGPT via the simple web interface, the results are miles ahead.
dobin | 12 hours ago
hoppp | 12 hours ago
mikkupikku | 12 hours ago
The execs buying Microsoft products are presumed to be as clueless as the execs naming Microsoft products.
boredatoms | 12 hours ago
codethief | 12 hours ago
marssaxman | 12 hours ago
They really won't, though; Microsoft just does this kind of thing, over and over and over. Before everything was named "365", it was all "One", before that it was "Live"... 20 years ago, everything was called ".NET" whether it had anything to do with the Internet or not. Back in the '90s they went crazy for a while calling everything "Active".
moomin | 12 hours ago
estimator7292 | 12 hours ago
fluidcruft | 12 hours ago
phkahler | 12 hours ago
So Microsoft isn't bringing copilot to all these applications? It's just bringing a copilot label to them? So glad I don't use this garbage at home.
Sharlin | 10 hours ago
Nevermark | 12 hours ago
If a large company has bought into "Co-Pilot", they want it all right? Or not, but let's not make carving anything out easy.
Just a thought.
3acctforcom | 11 hours ago
So once we have signoff then my counterpart in Sharepoint/M365 land gets his "Copilot" for Office, while my reporting and analytics group gets "Copilot" for Power BI, while my coding team gets "Copilot" for llm assisted development in GitHub.
In the meantime everybody just plugs everything into ChatGPT and everybody pretends it isn't happening. It's not unlawful if the lawyers can't see it!
Nevermark | 11 hours ago
> In the meantime everybody just plugs everything into ChatGPT
I believe you meant "everyone plugs everything into ChatGPT for Co-Pilot"! A statement with its own useful ambiguities.
It is comical, but I can now make a serious addition to Sun Tzu's maxims.
“All warfare is based on deception.”
“To subdue the enemy without fighting is the acme of skill.”
"Approval is best co-opted with a polysemous brand envelope."
cornonthecobra | 8 hours ago
We're now on the back end of that, where Microsoft must again make products with independent substance, but are instead drowning in their own infrastructural muck.
hightrix | 11 hours ago
coffeebeqn | 10 hours ago
Sharlin | 10 hours ago
CubityFirst | 9 hours ago
https://www.xbox.com/en-US/gaming-copilot
Sharlin | 8 hours ago
banku_brougham | 10 hours ago
semi-extrinsic | 9 hours ago
jnaina | 6 hours ago
https://www.youtube.com/watch?v=EUXnJraKM3k
debugnik | 9 hours ago
pipes | 7 hours ago
Also, it is possibly the worst console name of all time.
I don't even know what Xbox is now, is it a service, is it a console, I'm not even joking really.
Also, Visual Studio Code vs. full-fat Visual Studio. Thanks, Microsoft, you just made it more difficult to web-search both products.
Full-fat .NET vs. .NET Core vs. .NET Standard, or is that just .NET now?
lovich | 4 hours ago
pezezin | 9 hours ago
elzbardico | 7 hours ago
System/360, OS/2, DB2, MQ Series, PC.
It's like IBM just refused to entertain the idea of having competitors. Why should they name a database anything other than DB?
adityeah | 7 hours ago
OkayPhysicist | 5 hours ago
They probably should have called the WiiU the Super Wii or Wii 2 or something, but on the whole they've got a mostly coherent naming convention.
somat | 3 hours ago
Macha | 3 hours ago
joegibbs | 5 hours ago
"Oh you mean the original one?"
No the one that came after the 360.
"The third one?"
No that was the second one, the One was the third.
"OK what are they on now?"
The Series series.
"The Series series?"
Yeah the X and S. Don't confuse that with the Xbox One X or S, or the 360 S.
"Right but what's the difference?"
The X is better than the S because X is a bigger letter. But they run the same games, but they're different. They're the same though.
ted_bunny | 4 hours ago
wasmainiac | 11 hours ago
binsquare | 10 hours ago
This often happens because the people inside are incentivized to build their own empire.
If someone comes in and wants to get promoted/become an exec, there's a ceiling if they work under an existing umbrella, plus the politics of introducing a feature that touches an existing org.
So they build something new. And the next person does the same. And so you have 365, One, Live, .Net, etc
josephg | 9 hours ago
atombender | 5 hours ago
canucker2016 | 9 hours ago
Then there's DirectX and its subs - though Direct3D had more room for expanded feature set compared to DXSound or DXInput so now they're up to D3D v12.
raincole | 12 hours ago
AI really should be a freaking feature, not the identity of their products. What MS is doing now is like renaming Photoshop to Photoshop Neural Filter.
beart | 10 hours ago
ikr678 | an hour ago
Searching the store or a company portal for one of these rebranded apps returns dozens of hits, because 'windows', 'copilot', '365' and 'app' are all common words in most application descriptions.
rdsubhas | 11 hours ago
major505 | 11 hours ago
Marketing needs as much supervision as a toddler in a crystal store.
Oras | 11 hours ago
GaryBluto | 10 hours ago
basch | 10 hours ago
I think I could clean up their existing mess if they want help.
Jedd outlines my credentials well here https://news.ycombinator.com/item?id=17522649#17522861
JumpCrisscross | 10 hours ago
skfiehcusjcn | 10 hours ago
Also, a great use is Microsoft Forms; I was surprised by the AI features. At first I just used it to get some qualitative feedback, but I ended up using Copilot to enter questions Claude helped me create, and it converted them into the appropriate forms for my surveys!
Objectives -> Claude -> Surveys (markdown) -> Copilot -> MS Forms -> Emailed.
Insights and analysis can use copilot too.
Main thing to remember is the models behind the scenes will change and evolve, Copilot is the branding. In fact, we can expect most companies will use multiple AI solutions/pipelines moving forward.
rustyhancock | 10 hours ago
I have 2TB with OneDrive too via a Family Office account and I've got no good reason to have the large gapps space.
A ChatGPT account and pay for two Claude accounts.
Netflix, Disney+, Prime.
How did this happen to me?
Perhaps I should sign up to one of those companies that will help me close accounts I keep seeing advertised on YouTube?
racl101 | 9 hours ago
jug | 9 hours ago
I'm not sure if it's named Microsoft 365 Copilot nowadays, or if that's an optional AI addon? I thought it was renamed once more, but they themselves claim simply "Microsoft 365" (in a few various tiers) sans-Copilot. https://www.microsoft.com/microsoft-365/buy/compare-all-micr...
isk517 | 9 hours ago
Everyone I know who uses AI day-to-day is just using Copilot to do things like add a transition animation to a PowerPoint slide or format a Word document to look nice. The only problem these LLM products seem to solve is giving normal people an easy way to interact with terrible software processes and GUIs. And a better solution to that problem would be for developers to actually observe how the average user interacts with both a computer and their program in particular.
krzyk | 9 hours ago
It is a coding everything: autocomplete, ask, edit files, and an agent (Claude Code-like).
hbn | 9 hours ago
It's also an LLM chat UI, I don't know if it's because of my work but it lets me select models from all of the major players (GPT, Claude, Gemini)
https://github.com/copilot
p0w3n3d | 8 hours ago
NuclearPM | 4 hours ago
They suck at names.
bandrami | 4 hours ago
sebazzz | an hour ago
The only things I've found using the NPU until now are the built-in blur, auto-frame, and eye-focus modes for the webcam.
bandrami | 25 minutes ago
felixg3 | an hour ago
moi2388 | 14 hours ago
Well, that might explain why all their products are unusable lately.
Supermancho | 13 hours ago
paxys | 14 hours ago
llm_nerd | 14 hours ago
freedomben | 14 hours ago
interestpiqued | 9 hours ago
ecshafer | 14 hours ago
eloisant | 14 hours ago
leoedin | 13 hours ago
There's also all the other Copilot-branded stuff, which is of varying usefulness. The web-based chat is OK, but I'm not sure which model powers it; whatever it is, it can be very verbose and doesn't handle images very well. The Office stuff seems to be completely useless so far.
Sammi | 12 hours ago
another_twist | an hour ago
0xbadcafebee | 13 hours ago
LeoPanthera | 8 hours ago
softwaredoug | 14 hours ago
It's interesting to think back, what did Copilot do wrong? Why didn't it become Claude Code?
It seems for one thing its ambition might have been too small. Second, it was tightly coupled to VS Code / Github. Third, a lot of dumb big org Microsoft politics / stakeholders overly focused on enterprise over developers? But what else?
firemelt | 14 hours ago
meanwhile MS and GitHub are waiting for any breadcrumbs that ChatGPT leaves behind
adastra22 | 14 hours ago
moregrist | 13 hours ago
It's pretty clear that Microsoft had "Everything must have Copilot" dictated from the top (or pretty close). They wanted to be all-in on AI but didn't start with any actual problems to solve. If you're an SWE or a PM or whatever and suddenly your employment/promotion/etc prospects depend on a conspicuously implemented Copilot thing, you do the best you can and implement a chat bot (and other shit) that no one asked for or wants.
I don't know Anthropic's process but it produced a tool that clearly solves a specific problem: essentially write code faster. I would guess that the solution grew organically given that the UI isn't remotely close to what you'd expect a product manager to want. We don't know how many internal false-starts there were or how many people were working on other solutions to this problem, but what emerged clearly solved that problem, and can generalize to other problems.
In other words, Microsoft seems to have focused on a technology buzzword. Anthropic let people solve their own problems and it led to an actual product. The kind that people want. The difference is like night and day.
Who knows what else might have happened in the last 12 months if C-suites were focused more on telling SWEs to be productive and less on forcing specific technology buzzwords because they were told it's the future.
softwaredoug | 8 hours ago
(A) doesn’t align with some important person’s vision, and that person is incentivized to have their finger on whatever change comes about
(B) might step on a lot of adjacent stakeholders, and the employee’s stakeholder may be risk averse and want to play nice.
(C) higher-up stakeholders fundamentally don’t understand the domain they’re leading
(D) the creators don’t want to fight an uphill battle for their idea to win.
yesiamyourdad | 4 hours ago
I think in the end it's branding. They want people to think "Copilot = AI" but the experience is anywhere from fairly effective to absolute trash. And the most visible applications are absolute trash. It really says something when Ethan Mollick is out there demonstrating that OpenAI is more effective at working with Excel than the built in AI.
There was an article posted here yesterday that said "MS has a lot to answer for with Copilot", and that was the point: MS destroyed their AI brand with this strategy.
another_twist | an hour ago
falloutx | 12 hours ago
prmph | 10 hours ago
Gemini CLI (not the model) is trash. I wish it weren't so, but I only have to try to use it for a short time before I give up. It regularly retains stale file contents (even after re-prompting), constantly crashes, doesn't show diffs properly, etc., etc.
I recently tried OpenCode. It's gotten a bit better, but I still hit all kinds of API errors with the models. I also have no way to scroll back properly to earlier commands, and its edit-acceptance and permissions interface is wonky.
And so on. It's amazing how Claude Code just nails the agentic CLI experience from the little things to the big.
Advice to agentic CLI developers: just copy Claude Code's UX exactly; that's your starting point. Then add stuff that makes the user's life even easier and more productive. There's a ton of improvements I'd like to see in Claude Code:
- I frequently use multiple sessions. It's kinda hard to remember the context when I come back to a tab. Figure out a way to make it immediately obvious.
- Allow me to burn tokens to keep enough persistent context. Make the agent actually read my AGENTS.md before every response. Ensure the agent gets closer and closer to matching the way I'd like it to work as the session progresses (and even across sessions).
- Add a Replace tool, like the Read tool, that is reliable and prevents the agent from having to make changes manually one by one or, worse, using sed (I've banned my agents from using sed because of the havoc they cause with it).
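Such a Replace tool could be as simple as an exact-match, single-occurrence edit that refuses ambiguous input instead of guessing. A minimal sketch (hypothetical; `replace_once` is my own name, not an existing Claude Code tool):

```python
from pathlib import Path

def replace_once(path: str, old: str, new: str) -> None:
    """Replace `old` with `new` in the file at `path`, but only if `old`
    occurs exactly once. Refusing missing or ambiguous matches avoids the
    silent-corruption failure mode of sed-style edits."""
    text = Path(path).read_text()
    count = text.count(old)
    if count != 1:
        raise ValueError(f"expected exactly 1 match for {old!r}, found {count}")
    Path(path).write_text(text.replace(old, new))
```

The key design choice is failing loudly on `count != 1`, which forces the agent to supply a larger, unique context string rather than apply an edit to the wrong site.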
llmslave | 12 hours ago
tomashubelbauer | 12 hours ago
jonathanoliver | 14 hours ago
pluralmonad | 12 hours ago
MarcelOlsz | 12 hours ago
firemelt | 14 hours ago
EMM_386 | 14 hours ago
The tools or the models? It's getting absurdly confusing.
"Claude Code" is an interface to Claude, Cursor is an IDE (I think?! VS Code fork?), GitHub Copilot is a CLI or VS Code plugin to use with ... Claude, or GPT models, or ...
If they are using "Claude Code" that means they are using Anthropic's models - which is interesting given their huge investment in OpenAI.
But this is getting silly. People think "CoPilot" is "Microsoft's AI", which it isn't. They have OpenAI on Azure. Does Microsoft even have a fine-tuned GPT model, or are they just prompting an OpenAI model for their Windows built-ins?
When you say you use CoPilot with Claude Opus people get confused. But this is what I do everyday at work.
shrug
ChrisArchitect | 14 hours ago
A.I. Tool Is Going Viral. Five Ways People Are Using It
https://www.nytimes.com/2026/01/23/technology/claude-code.ht...
Claude Is Taking the AI World by Storm, and Even Non-Nerds Are Blown Away
https://www.wsj.com/tech/ai/anthropic-claude-code-ai-7a46460...
falloutx | 12 hours ago
veryfancy | 14 hours ago
smithkl42 | 13 hours ago
torginus | 12 hours ago
It's truly more capable, but still not capable enough that I'm comfortable blindly trusting the output.
tomashubelbauer | 12 hours ago
MattGrommes | 11 hours ago
chasd00 | 9 hours ago
Not sure what you mean. I have VS Code open and make code changes in between Claude doing its thing. It reverted my changes once, which was amusing; not sure why it did that. I've also seen it make the same mistake twice after being told not to.
keithnz | 8 hours ago
gloomyday | 13 hours ago
falloutx | 12 hours ago
torginus | 13 hours ago
I'm sure no other tech company is like this.
I think technologies like the Windows kernel and OS, the .NET framework, their numerous attempts to build a modern desktop UI framework with XAML, their dev tools, were fundamentally good at some point.
Yet they can't or won't hire people who would fix Windows rather than just maintain it, really push for modernization, and make .NET actually cool and something people want to use.
They'd rather hire folks who were taught at school that Microsoft is the devil and Linux is superior in all ways, who don't know the first thing about the MS tech stack, and who would rather write React on their MacBooks (see the Start menu incident) than touch anything made by Microsoft.
It seems somehow the internal culture allows this. I'm sure if you forced devs to use Copilot, and provided them with the tools and organizational mandate to do so, it would become good enough eventually to not have to force people to use it.
My main complaint I keep hearing about Azure (which I do not use at work)
falloutx | 12 hours ago
Terretta | 12 hours ago
cgh | 12 hours ago
It simply didn’t work. I complained about it and was eventually hauled into a room with some MS PMs who told me in no uncertain terms that indeed, Biztalk didn’t work and it was essentially garbage that no one, including us, should ever use. Just pretend you’re doing something and when the week is up, go home. Tell everyone you’ve integrated with Biztalk. It won’t matter.
anonymars | 12 hours ago
coffeemug | 12 hours ago
alternatex | 9 hours ago
wahnfrieden | 13 hours ago
Claude Code is fun, full of personality, quick, and has many features to hack around model shortcomings, but it should not be allowed anywhere near serious coding work.
That's also why OpenClaw uses Claude for personality, but its author (@steipete) disallows any contribution to it using Claude Code and uses Codex exclusively for its development. Claude Code is a slop producer with illusions of productivity.
falloutx | 12 hours ago
wahnfrieden | 11 hours ago
(Also a signal for why devs should not bother with their shoddy Xcode AI work - Apple devs are not using it)
TZubiri | 13 hours ago
disqard | 11 hours ago
chasd00 | 9 hours ago
https://www.wsj.com/tech/ai/the-100-billion-megadeal-between...
TZubiri | 8 hours ago
But OpenAI is still innovating with new subcategories, and even in cases where it did not innovate (Claude Code came first and OpenAI responded with Codex), it outdoes its competitors. Codex is widely preferred by the most popular vibe-code devs, notably Moltbook's dev, but also Jess Fraz.
In terms of pricing, OAI has by far the most expensive product, so it's still positioned as a quality option. To give an example, most providers have a three-tier price for API calls:
- Anthropic: $1 / $3 / $5 (per output MTokens)
- Gemini: $3 / $12 (two tiers)
- OpenAI: $2 / $14 / $168
So the competitors are mainly competing on price in the API category.
To give another datapoint, Google released multimodal (image input) models only 1 or 2 months ago; this has been in ChatGPT for almost a year now.
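Taking the per-output-MToken prices quoted above at face value (unverified, and the tier ordering is as listed in the comment), the spread is easy to quantify:

```python
# Per-output-MToken prices as quoted above (unverified; tiers in listed order).
PRICES = {
    "Anthropic": [1, 3, 5],
    "Gemini": [3, 12],
    "OpenAI": [2, 14, 168],
}

def output_cost(provider: str, tier: int, output_tokens: int) -> float:
    """Dollar cost of generating `output_tokens` tokens at the given tier index."""
    return PRICES[provider][tier] * output_tokens / 1_000_000

# At the top tier, 10M output tokens cost $50 (Anthropic), $120 (Gemini),
# and $1680 (OpenAI) -- a >33x spread between cheapest and priciest top tiers.
```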
endemic | 8 hours ago
https://www.folklore.org/I'll_Be_Your_Best_Friend.html
TZubiri | 7 hours ago
kachapopopow | 12 hours ago
kachapopopow | 12 hours ago
cake_robot | 12 hours ago
wendgeabos | 12 hours ago
Freedumbs | 12 hours ago
dboon | 12 hours ago
tomashubelbauer | 12 hours ago
songodongo | 12 hours ago
superfrank | 12 hours ago
Can anyone else share what their workflow with CC looks like? Even if I never end up switching I'd like to at least feel like I gave it a good shot and made a choice based on that, but right now I just feel like I'm doing something wrong.
gganley | 12 hours ago
theflyinghorse | 12 hours ago
In essence, this is pretty much how you'd run a group of juniors: you'd sit on Slack and Jira, divvying up work and doing code reviews.
superfrank | 11 hours ago
It's funny, because that's basically the approach I take in GH Copilot. I first work with it to create a plan broken up into small steps, save that to an md file, and then have it go one step at a time, reviewing the changes as it goes or just when it's done.
I understand that you're using Emacs to keep an eye on the code as it goes, so maybe what I wasn't grokking was that people were using terminal-based code editors to see the changes it was making. I assumed most people were just letting it do its thing and then trying to review everything at the end, which felt like an anti-pattern given how much we (the dev community) push for small PRs over gigantic 5k-line PRs.
1899-12-30 | 11 hours ago
PantaloonFlames | 10 hours ago
mimischi | 9 hours ago
https://github.com/xenodium/agent-shell
bluGill | 8 hours ago
strongpigeon | 12 hours ago
I got really good at reviewing code efficiently from my time at Google and others, which helps a lot. I'm sure my personal career experience influences a lot how I'm using it.
FWIW, I use Codex CLI, but I assume my flow would be the same with Claude Code.
silisili | 10 hours ago
It will read the existing plugins, understand the code style/structure/how they integrate, then create a plugin called "sample" with code that is usually what you wanted without being told specifically, and write 10 tests for it.
In those cases it's magic. In large codebases, asking it to add something into existing code or modify a behavior I've found it to be...less useful at.
chasd00 | 9 hours ago
Also, use the Superpowers plugin for Claude. That really helps for larger tasks, but it can overdo it, hah. It's amusing to watch the code reviewer, implementor, and tester fight and go back and forth over something that doesn't even really matter.
strongpigeon | 12 hours ago
It reminds me of this [0] Dilbert comic, but heh.
[0]: https://x.com/idera_software/status/573165928264810496
chasd00 | 9 hours ago
ddtaylor | 11 hours ago
major505 | 11 hours ago
I worked on a project with some Microsoft engineers to create a chatbot plugin for Salesforce using Microsoft Power Virtual Agent, and the communication tool they used was Slack, not Teams. Meanwhile, I was obligated to use Teams because of the consulting company I worked for at the time.
And the version control they used at the time was, I think, SVN, not TFS.
winnie_ua | 11 hours ago
throwappleaway | 8 hours ago
h4kunamata | 8 hours ago
Windows 11 falling apart after AI adoption tells me their AI vibe coding is not going as planned.
If you saw their latest report claiming to focus on restoring trust in Windows: it's a little too late. Even newbies have moved to Linux, and with AMD driver support, gaming is no longer an excuse.
gurrkin | 6 hours ago
pietz | 6 hours ago
8note | 2 hours ago
Vaslo | 3 hours ago
sweetrabh | 3 hours ago
We ran into this building a password automation tool (thepassword.app). The solution: the AI orchestrates browser navigation, but actual credential values are injected locally and never enter the model's reasoning loop. Prompt injection can't exfiltrate what's not in the context.
As these tools move into enterprise settings, I expect we'll see more architectural patterns emerge for keeping sensitive data out of agentic workflows entirely.
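A minimal sketch of that separation, under my own assumptions (function names and the placeholder format are hypothetical, not the actual thepassword.app implementation): the model only ever sees an opaque placeholder, and a local injector resolves it into the browser action just before execution.

```python
# Local secret store -- lives outside the agent loop, never serialized
# into a prompt or logged alongside model I/O.
SECRETS = {"{{CRED:github_password}}": "hunter2"}

def build_prompt(page_text: str) -> str:
    """What the model sees: the page plus a placeholder, never the secret."""
    return f"Fill the password field on this page using {{{{CRED:github_password}}}}.\n{page_text}"

def inject_locally(action: str) -> str:
    """Resolve placeholders in the model's proposed action right before it
    is sent to the browser; the resolved value never returns to the model."""
    for token, value in SECRETS.items():
        action = action.replace(token, value)
    return action

model_action = 'type("#password", "{{CRED:github_password}}")'
browser_action = inject_locally(model_action)
# The secret reaches the browser action, but it never entered the context
# window, so a prompt injection has nothing to exfiltrate.
```

The design choice mirrors the comment's claim: because resolution happens after the model emits its action, even a fully compromised prompt can only leak the placeholder token.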
srinath693 | an hour ago