I wonder what the average career tenure of the userbase here is now, because Github was slow and flaky well before Microsoft got involved.
Maybe it wasn't as noticeable when Github had fewer features, but our CI runners and other automation using the API a decade ago always had weekly issues caused by Github being down/degraded.
GitHub goes down at least once a week as I said before. [0] thanks to Copilot, Tay.ai and Zoe chatbots wrecking the platform instead of humans maintaining it.
If there was a prediction market for when GitHub experiences an outage every week, then you would make a lot of money.
>GitHub goes down at least once a week as I said before. [0] thanks to Copilot, Tay.ai and Zoe chatbots wrecking the platform instead of humans.
there are tens of thousands of stupid scripts hosted on github itself that have scheduled programmatic pushes or pulls to repos via cron jobs with millions and millions of users -- yeah LLMs accelerate the fire but let's not pretend that GH was some bastion of real-user-dom somehow at some point.
One strategy to convince is to get someone less technical than you to sit by you while you try and trace everything from one error'd HTTP request from start to finish to diagnose the problem. If they see it takes half a day to check every call to every internal endpoint to 100% satisfy a particular request sometimes that can help.
Also sometimes they just think "this is a bunch of nerd stuff, why are you involving me?!" So it's not foolproof.
Oh, my non-technical boss agrees with me already. It's actually the engineers who've convinced themselves it's a good setup. Nice guys but very unwilling to change. Seems they're quite happy to have become 'experts' in this mess over the last 5-10 years. Almost like they're in retirement mode.
The real solution is probably to leave, but the market sucks at the moment. At least AI makes the 10-repos-per-tiny-feature thing easier.
There are so many failures in microservices that just can't happen with a local binary. Inter-service communication over network is a big one with a failure rate orders of magnitude higher than running a binary on the same machine. Then you have to do deploys, monitoring, etc. across the whole platform.
You will basically need to employ solutions for problems only caused by your microservices arch. E.g. take reading the logs for a single request. In a monolith, just read the logs. For the many-service approach, you need to work out how you're going to correlate that request across them all.
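One common answer to the "correlate that request across them all" problem is a request ID that every service logs and forwards. A minimal sketch, assuming a hypothetical `X-Request-ID` header and helper names (`start_request`, `outgoing_headers`, `make_logger`) that are made up for illustration, not taken from any particular framework:

```python
import contextvars
import logging
import uuid

# Hypothetical header name; any name all services agree on would do.
REQUEST_ID_HEADER = "X-Request-ID"

# Holds the ID of the request currently being handled.
_request_id = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Attach the current request ID to every log record."""

    def filter(self, record):
        record.request_id = _request_id.get()
        return True


def start_request(incoming_headers):
    """Reuse the caller's ID if one was sent, otherwise mint a new one."""
    request_id = incoming_headers.get(REQUEST_ID_HEADER) or uuid.uuid4().hex
    _request_id.set(request_id)
    return request_id


def outgoing_headers():
    """Headers to attach to every downstream call so the ID propagates."""
    return {REQUEST_ID_HEADER: _request_id.get()}


def make_logger(service_name):
    """Build a logger whose lines all carry request_id=<id>."""
    logger = logging.getLogger(service_name)
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(name)s request_id=%(request_id)s %(message)s")
    )
    handler.addFilter(RequestIdFilter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

With something like this in every service, "read the logs for a single request" becomes a search for one ID across all of them. That is still more machinery than a monolith needs.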
Even the aforementioned network failures require a lot of design, and there's no standardization. Does the calling service retry? Does the callee have a durable queue and pick back up? What happens if a call/message gets 'too old'?
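To make those questions concrete, here is a rough sketch of one possible set of answers: the caller retries with exponential backoff, and gives up once the message is "too old". The names (`call_with_retry`, `MessageExpired`) and the default limits are assumptions for illustration; there is no standard here, which is rather the point.

```python
import time


class MessageExpired(Exception):
    """Raised when a message is older than the caller is willing to act on."""


def call_with_retry(send, message, created_at, max_age_s=30.0, attempts=4,
                    base_delay_s=0.1, now=time.monotonic, sleep=time.sleep):
    """Retry send(message) with exponential backoff until it succeeds,
    the attempts run out, or the message exceeds max_age_s."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        # Refuse to act on a message that has been waiting too long.
        if now() - created_at > max_age_s:
            raise MessageExpired(f"message older than {max_age_s}s")
        try:
            return send(message)
        except ConnectionError as exc:  # retry only transport-level failures
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay_s * 2 ** attempt)
    raise last_error
```

Every service pair needs a decision like this, and the callee needs its own (idempotency, queueing, dead letters). None of it exists when the call is a function call inside one binary.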
Also, from the other end, command line utils are typically made by entirely different people with entirely different philosophies/paradigms, so the encapsulation makes sense. That's not true when you're the one writing all the services, especially not at small-to-mid-size companies.
Plus, you already can do the single-concern thing in a monolith, just with modules/interfaces/etc.
> GitHub has recently seen more outages, in part because its central data center in Virginia is indeed resource-constrained and running into scaling issues. AI agents are part of the problem here. But it’s our understanding that some GitHub employees are concerned about this migration because GitHub’s MySQL clusters, which form the backbone of the service and run on bare metal servers, won’t easily make the move to Azure and lead to even more outages going forward.
Age-old lesson: change the tires on the moving vehicle that is your business when it's a Geo Metro, not when it's a freight train.
I'm sure the people with the purse strings didn't care, though, and just wanted to funnel the GH userbase into Azure until the wheels fell off, then write off the BU. Bought for $7.5B, it used to make $250M, but now makes $2B, so they could offload it and make a profit. I wonder who'll buy it. Prob Google, Amazon, IBM, Oracle, or a hedge fund. They could choose not to sell it, but it'll end up a writeoff if the userbase jumps ship.
I've been using "slopocalypse". People already know AI is responsible, but slop existed before — e.g. conventionally generated SEO spam. It's just... so much worse now.
> Any massive infra migration is going to cause issues.
What? No, no it's not. The entire discipline of Infrastructure and Systems engineering is dedicated to doing these sorts of things. There are well-worn paths to making stable changes. I've done a dozen massive infrastructure migrations, some at companies bigger than Github, and I've never once come close to this sort of instability.
This is a botched infrastructure migration, onto a frankly inferior platform, not something that just happens to everyone.
I assume this is all of the pains of going from "GHA is sorta kinda on Azure", which was a bad state, to "GHA is going full Azure", which is a painful state to get to but presumably simplifies things.
I've been sitting here waiting for a critical deploy to happen via GitHub Actions (I know, our fault, we should have left ages ago). My patience for this bullshit is gone, I'm going to be pushing very hard to get us off of GitHub entirely except for public code mirrors going forward.
Edit: oh look, their site says all good, but I still have jobs stuck. What a pile of garbage.
Sorry, I realise this comment isn't up to HN's usual standards for thoughtfulness and it is perhaps a bit inflammatory but... look, I'd bet the majority of us on this site rely on GitHub and I can't be the only one becoming incredibly frustrated with its recent unreliability[0]?
(And, yes, I did enough basic data analysis to confirm that it IS indeed getting worse versus a year, two years, and three years ago, and is particularly bad since the start of this year.)
[0] EDIT: clearly not from looking at the rest of the comments in this discussion.
@KaiserPro has pasted the link to someone else's heatmap, which is really good. Mine was just an Excel spreadsheet with a graph that I'd intended to write a blog about but then got demotivated on because I was too busy with other things and I saw that heatmap as well. Maybe I will do a proper write up next time GitHub has an outage and I'm blocked by it.
It hosts all the repositories backing applycreatures, we ran dozens of git projects on the same instance, have teams, you guys did phenomenal work. I would say it's even easy to customise.
And a ton of the top end ruby staff have left. Many of them ended up at shopify. There is a growing amount of non ruby/rails code at github, but most of the systems that people think of when they think github are ruby/rails.
I don't remember that happening so much (if ever) in, say, 2016. But the frequency of noticeable incidents seemingly has been rising steadily since around 2023. The Azure migration apparently only exacerbated it.
I remember it going down semi-regularly in the 2013+ era, and seeing HN posts about it. Especially if you were using a package manager reliant on GitHub like Cocoapods. It seems to me it is more "impactful" on the dev community now that they have gone past just being a centralized Git server for the team, to being the thing that does deploys and all sorts of other things.
I remember seeing unicorn daily and "webhook delivery delayed" weekly. I think it got better, but also they got more traffic, now millions of agents read files separately over and over again.
It was not nearly as bad... I remember our company migrating to github.com, and believe it or not, it was significant performance/uptime benefit over our self-hosted instance.
(And the first thing to go was occasional 500's on github-hosted files.. the core service itself - git, PR, actions - were pretty stable until recently)
I’m still baffled that Minecraft is doing so well, despite the whole Bedrock thing. At this point I think Microsoft just forgot that they bought Mojang.
It's had its fair share of outages and outrageous changes that overreach the bounds as well. It's more stable than github is but it's had at least 2 sessions of downtime this year that I recall and they were both quite long (day length).
They'd lose a whole lot of users if they killed Java edition, since the modded community is so large. They'd quickly find one of the Minecraft clones reaching feature parity. And there's no good reason for it - it's not like Java is a threat anymore.
Exactly. So why isn't Microsoft doing just that? Isn't that how Microsoft usually handles things? Just look at Xbox. They essentially screwed up everything they could and then some.
They don’t enforce or even default to 2fa to change the account email. In addition, they have no process to get a human to reverse account takeovers. Just a web form that tells you to call a number that redirects you back to a web form
On the other hand, they aggressively log out legitimate users, and require the master Microsoft account password to log back in (because your kids need access to your one drive settings, etc).
I think they largely let Mojang do its own thing, occasionally forcing them to make some dumb change that usually stays exclusive to their "bedrock edition". The Mojang people capitulate, since the original version (the one they actually develop for) is largely untouched by Microsoft's decision making: the backlash for dumb decisions there would lose infinitely more money than just letting it continue to be a cash cow.
The worst part of all this is that GitHub's CTO and VP of Engineering sent out the usual "here's what we'll do to fix things" letter to their larger customers and, without exaggeration, it boiled down to: 1) "Here's a bunch of stuff we already did!" which... clearly isn't working, and 2) "We're continuing our Azure migration." also clearly not working.
So needless to say, if you depend on GitHub for critical business operations, you need to start thinking about what a world without GitHub looks like for your business and start working your way toward that. I know my confidence in GitHub's engineering leadership is at rock bottom.
I could sorta see a situation where the reality is "we're in the middle of a miserable transition and it'll clean up when we're done" but I don't think anyone has confidence that's all it is at this point.
Even that doesn’t really make sense to me, unless they’ve done it in a way where everything has to move at once.
Everywhere I’ve worked, if a migration is causing this much downtime then you kill the migration or slow it down. If every change has a 10% chance of bringing the site down, you only do a change every week or two until you can work out the kinks.
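The arithmetic behind slowing down is easy to sketch. Assuming each change independently has the same chance of causing an outage (a simplification; real changes are not independent), the function names below being made up for illustration:

```python
def p_at_least_one_outage(p_per_change, n_changes):
    """Chance that at least one of n independent changes causes an outage."""
    return 1 - (1 - p_per_change) ** n_changes


def expected_outages(p_per_change, n_changes):
    """Expected number of outages across n changes."""
    return p_per_change * n_changes


# At 10% per change, compare a slow month with a busy one.
slow_month = p_at_least_one_outage(0.10, 2)    # a change every two weeks, about 19%
busy_month = p_at_least_one_outage(0.10, 20)   # a change every working day, about 88%
```

So the pace of change, not just the per-change risk, decides how often users see the site fall over.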
Here are some relevant excerpts from an October 2025 article[1]:
> In a message to GitHub’s staff, CTO Vladimir Fedorov notes that GitHub is constrained on capacity in its Virginia data center. “It’s existential for us to keep up with the demands of AI and Copilot, which are changing how people use GitHub,” he writes.
> The plan, he writes, is for GitHub to completely move out of its own data centers in 24 months. “This means we have 18 months to execute (with a 6 month buffer),” Fedorov’s memo says. He acknowledges that since any migration of this scope will have to run in parallel on both the new and old infrastructure for at least six months, the team realistically needs to get this work done in the next 12 months.
If you consider that six month parallel window to have started from the time of the October memo (written presumably at the start of October), then that puts us currently or past the point where they would have cut off their old DC and defaulted to Azure only.
Whether plans or timelines changed, I have no idea of course but the above does make for a convenient timeline that would explain the recent instability. Of course, it could also just be symptomatic of increased AI usage generally and the same problems might have surfaced at a software level regardless of whether they were in a DC or on Azure.
Putting that nuance aside, personally I like the idea that Azure is simply a giant pile of shit operated by a corporation with no taste.
>It’s existential for us to keep up with the demands of AI and Copilot
if by chance the CTO reads this, as a user of GitHub I would find it really existential if GitHub continues functioning as a reliable hub for git workflows (hence the name), and I have the strong suspicion nobody except for the shareholders gives a lick about copilot or 'AI' if it makes the core service the site was designed for unusable
Incorrect. They need to appease/trick/threaten/etc those that are paying for their services. Shareholders just demand they do so at the greatest (often short term) rate.
Why? What is the correlation between profit and shareholder sentiment (besides the fact that shareholders want said profits)? They don't really influence the operation of the business meaningfully.
Sure, but I think it's the wrong way around. Appeasing shareholders doesn't make you profitable, being profitable appeases shareholders. I think there is a wealth of evidence that appeasing shareholders actually impedes profits overall.
i heard that they asked LinkedIn to do this too, and LinkedIn refused, either outright or because their systems were too complex. Maybe that explains why LI availability seems ok
I remember back in the early Windows XP era when things got so bad that Microsoft basically had to make a hard pivot towards security and reliability.
I think they may need to do that once again. Almost every product of theirs feels like a dumpster fire. GitHub is down constantly, Windows 11 is a nightmare and instead of patching things they're adding stupid features nobody asked for. I think they need to stop and really look closely at what they're prioritizing.
I remember. My GitHub user ID is #5907, account created 2008-04-08T20:27:36Z. I think it is inevitable that all good things come to an end, but it's still a bummer to see.
It's starting to really look like the AI effect. It might be coincidence but I've noticed a lot more downtime and bad software lately. The last Nvidia drivers gave me a blue screen (last week or so), and speaking about Windows, I froze updates last year because it was clear they were introducing a bunch of issues with every update (not to mention unwanted features).
I like AI but actually not for coding because code quality is correlated to how well you understand the underlying systems you're building on, and AI is not really reasoning on this level at all. It's clearly synthesizing training data and it's useful in limited ways.
GitHub has been unreliable since before AI. Though it's definitely gotten far worse.
Seemingly the decline started with the Microsoft acquisition in 2018, and subsequent "unlimited private repository" change in 2019 (to match Gitlab's popular offer)
One example is the search being broken for CI logs. It takes over your browser's search hotkey too. What happens is every stage of the log is collapsed so the search doesn't work until you trigger the expansion but if you attempt to search before expanding the search will never work after it's been initialized. It's pretty infuriating when you're trying to find something in a giant build log.
Interesting how many people "Like AI" because it's good at all the jobs other than the one they happen to make a living doing.
Did you hear about the screenwriters school in which the professors said to avoid AI for writing, but it's great for storyboards. And the storyboard school where the professors said the opposite?
The reality is that AI isn't actually "good" at anything. It produces passable ersatz facsimiles of work that can fool those not skilled in the art. The second reality of AI is that everyone is busy cramming it into their products at the expense of what their products are actually useful for.
Once people realise (1), and stop doing (2), the tech industry has a chance of recovering.
Yeah, I think I heard about that. Within certain domains it is certainly a useful tool. I would say things like online search are much nicer now (in that asking an AI is equivalent to searching online but it summarizes it for you). Online search fits the strengths of LLMs nicely, but right now it's being sold as a silver bullet, which it's not.
The nvidia thing makes sense. If you get AI to write code for a platform you no longer really have an incentive to care that much about (windows) for a purpose you increasingly don't care about (actually drawing things to a screen), you're probably not going to test it as thoroughly as you used to
Man, a while ago I thought: "It happens often, alright, but every 2 weeks? Sounds like a slight exaggeration." But it really is every 2 weeks, isn't it? If I imagine in a previous job anything production being down every 2 weeks ... phew, would have had to have a few hard talks and course corrections.
i once fixed a site going down several times a year with two t1.micro instances in the same region as the majority of traffic. Instantly solved the problem for what, $20/month?
Another site was constantly getting DDoS'd by Russians who were mad we took down their scams on forums. That had to go through verisign back then, not sure who they're using now. They may have enough aggregate pipe that it doesn't matter at this point
So am I the only one thinking that maybe GitHub is succumbing to the weight of AI slop that's coming in from all the vibecoding, clawbots, and other AI workflows?
Github CEO must be on HN, right? If so, any comments?
They have not even bothered to implement Entra login when they have had their competitors' logins for years. Do they even know what their product is? Or are you just a middleman for slop?
Does anyone else ever think "that code I just pushed into my repo just took down all of github..." whenever it goes down around the same time you sync your changes?
Just moved a project of mine to Gitlab. Created this very simple component with codex that will keep a mirror updated on GitHub for me, so I can focus development on Gitlab.
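The component itself isn't shown here, but the core of a one-way mirror can be sketched with plain git commands. This is an assumption about how such a job could work, not the commenter's code; the function names, URLs and working directory are placeholders:

```python
import subprocess
from pathlib import Path


def mirror_commands(source_url, mirror_url, workdir):
    """Return the git commands that refresh a bare mirror clone and push it."""
    repo = Path(workdir)
    if repo.exists():
        # Existing mirror clone: fetch everything and drop deleted refs.
        fetch = ["git", "-C", str(repo), "remote", "update", "--prune"]
    else:
        # First run: create a bare clone that tracks all refs.
        fetch = ["git", "clone", "--mirror", source_url, str(repo)]
    # Overwrite the mirror's refs with whatever the source has.
    push = ["git", "-C", str(repo), "push", "--mirror", mirror_url]
    return [fetch, push]


def sync_mirror(source_url, mirror_url, workdir):
    """Run the mirror commands, raising if any of them fails."""
    for cmd in mirror_commands(source_url, mirror_url, workdir):
        subprocess.run(cmd, check=True)
```

Run `sync_mirror(...)` from cron or a CI job on the primary host. Note that `push --mirror` force-updates and deletes refs on the target, so the mirror should be treated as read-only.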
I'm surprised nobody has tried to throw together a commercial alternative to GitHub. 50% of it is available as FOSS, the other 50% you can vibecode in a month (you can vibecode reliably, Microsoft/Google just suck at it). Afaict, reason we all keep using GitHub is it has a million features and isn't as ugly, difficult and slow as GitLab. (sorry GitLab, I love your handbook, hate your UX)
Why don't companies with chronic outages mimic their stack from top to bottom (i.e. starting with a new domain), then before making a change, make the change on the duplicate stack and blast it with mock requests.
Might catch 90% of problems before they make it into the real stack?
E.g. every step of GitHub's migration to Azure could be mimicked on the duplicate stack before it's implemented on the primary stack. Is this just considered too much work? (I doubt cost would be the issue, because even if it costs millions, it would pay for itself in reduced reputational damage from outages).
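As a toy version of "blast it with mock requests": replay a set of recorded requests against both stacks and collect the differences. The two stacks are modelled as plain callables here (stand-ins for HTTP clients), and `replay_and_compare` is a made-up name. It also assumes the requests are read-only, since replaying writes against the primary would not be safe.

```python
def replay_and_compare(requests, primary, shadow):
    """Send each recorded request to both stacks and return the mismatches
    as (request, primary_response, shadow_response) tuples."""
    mismatches = []
    for request in requests:
        expected = primary(request)
        try:
            actual = shadow(request)
        except Exception as exc:  # a crash on the duplicate stack is a finding too
            actual = f"error: {exc!r}"
        if actual != expected:
            mismatches.append((request, expected, actual))
    return mismatches


# Tiny demonstration with fake stacks.
primary_stack = lambda path: f"200 {path}"
shadow_stack = lambda path: f"200 {path}" if path != "/search" else "500 /search"
report = replay_and_compare(["/", "/repos", "/search"], primary_stack, shadow_stack)
```

The comparison loop is the easy part. Keeping the duplicate stack's data and traffic mix realistic is where most of the effort would go.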
EDIT: downvotes - why? - I think this is a good idea (I'd do it for my sites if outages were an issue).
> EDIT: downvotes - why? - I think this is a good idea (I'd do it for my sites if outages were an issue).
Because that's a monumental amount of work, and extraordinarily difficult to retrofit into a system that wasn't initially designed that way. Not to mention the unstated requirement of mirroring traffic to actually exercise that system (given the tendency of bugs to not show up until something actually uses the system).
Agree, but look at the alternative; GitHub is constantly being savaged by users who (quite reasonably) expect uptime. Ignoring impacts on morale and reputation, damage to their bottom line alone might be tens (hundreds?) of millions per year.
> mirroring traffic
yeah, I agree that's difficult, but it doesn't need to be exact to still be useful.
Downvotes are probably because that is what companies without chronic outages do.
If you'd ever worked on a codebase as terrible as I imagine GH's internals are and looked at the git history, you'd find two things:
1) fixing it would require rolling back 100's-1000's of engineer-years of idiocy that make things like testing or refactoring untenable
2) many prior engineers got part of the way through such improvements before leaving or being kicked out. Their efforts mostly just made it worse, because now you never know what sort of terribleness to expect when you open an unfamiliar file.
How much of this is due to Microsoft Culture of not innovating and buying leading companies with their revenue from windows/office and slowly destroying the aspects of companies that made them great in the first place?
Is all the recent GitHub downtime entirely attributable to GitHub AI Copilot related development? How hard can it be to reduce the blast radius of new AI features so they don't affect the core parts of hosting repositories? Because of Copilot everywhere, the UX has become bad and I have had to click all over the place and on my profile to find repositories.
I do not care about much of it other than the git and API. I also sometimes use the Issues, although only with the API. But if it stops working sometimes, that is not too significant a problem since the files can be sent after they start to work again; it does not have to be immediately.
[OP] MattIPv4 | 22 hours ago
inaros | 22 hours ago
amarant | 22 hours ago
corvad | 22 hours ago
bartread | 22 hours ago
Would you like help?
- Get help with developing the software
- Just develop the software without help
[ ] Don't show me this tip again"
hedora | 15 hours ago
insin | 13 hours ago
Waterluvian | 21 hours ago
kenhwang | 21 hours ago
jmtulloss | 20 hours ago
morkalork | 21 hours ago
rvz | 22 hours ago
[0] https://news.ycombinator.com/item?id=47487881
serf | 22 hours ago
pak9rabid | 22 hours ago
Imustaskforhelp | 22 hours ago
dylan604 | 21 hours ago
Imustaskforhelp | 2 hours ago
ahstilde | 22 hours ago
Imustaskforhelp | 22 hours ago
abound | 22 hours ago
brookst | 22 hours ago
munchler | 22 hours ago
the_real_cher | 21 hours ago
mememememememo | 12 hours ago
1 - 10 ^ -N (multiply by 100 for percent)
So 9% is 0.09 for the calc
1 - 10 ^ -N = 0.09
So
10 ^ -N = 0.91
So
N = -log10 0.91
So 0.09 (9%) reliability is 0.0409586077 of a nine.
And running it thru... a tenth of a nine is 0.2056717653 or about 20.57% reliability
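The derivation above can be checked in a few lines. This is just the same formula written out; the function names are made up here:

```python
import math


def nines_to_availability(n):
    """Availability as a fraction for n nines: 1 - 10^-n."""
    return 1 - 10 ** (-n)


def availability_to_nines(availability):
    """Inverse of the above: n = -log10(1 - availability)."""
    return -math.log10(1 - availability)


nines_for_9_percent = availability_to_nines(0.09)   # about 0.041 of a nine
tenth_of_a_nine = nines_to_availability(0.1)        # about 0.2057, i.e. 20.57%
```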
mememememememo | 22 hours ago
99.99
99.90
99.00
90.00
msandford | 21 hours ago
the_real_cher | 21 hours ago
nuker | 19 hours ago
0x3f | 21 hours ago
Currently consulting somewhere with 30 services per engineer. I cannot convince them this is hell. Maybe that makes it my personal hell.
msandford | 21 hours ago
0x3f | 21 hours ago
KaiserPro | 21 hours ago
In that every night you're playing murder mystery, and it's never fun.
0x3f | 18 hours ago
NooneAtAll3 | 19 hours ago
how is such service spam different from unix "small functions that do one thing only" culture?
why in unix case it is usually/historically seen as nice, while in web case it makes stuff worse?
0x3f | 18 hours ago
anotherjesse | 20 hours ago
rdtsc | 21 hours ago
corvad | 22 hours ago
mememememememo | 22 hours ago
That helps with Git, not so much with issues etc.
corvad | 22 hours ago
steeleduncan | 22 hours ago
voidfunc | 22 hours ago
altairprime | 22 hours ago
To explain this one-word comment for those unfamiliar, see previously:
GitHub will prioritize migrating to Azure over feature development (5 months ago) https://news.ycombinator.com/item?id=45517173
In particular:
> GitHub has recently seen more outages, in part because its central data center in Virginia is indeed resource-constrained and running into scaling issues. AI agents are part of the problem here. But it’s our understanding that some GitHub employees are concerned about this migration because GitHub’s MySQL clusters, which form the backbone of the service and run on bare metal servers, won’t easily make the move to Azure and lead to even more outages going forward.
0xbadcafebee | 19 hours ago
smartmic | 22 hours ago
bartread | 22 hours ago
zahlman | 21 hours ago
bartread | 20 hours ago
At any rate, it seems like GitHub is back up now, so we'll see how long that lasts.
KaiserPro | 21 hours ago
adzm | 20 hours ago
yoyohello13 | 22 hours ago
qudat | 22 hours ago
seneca | 21 hours ago
pixelesque | 22 hours ago
cyanydeez | 21 hours ago
paxys | 21 hours ago
staticassertion | 21 hours ago
dec0dedab0de | 21 hours ago
the_real_cher | 21 hours ago
Artificial intelligence, Azure integration, many other things.
pera | 21 hours ago
https://www.forbes.com/sites/bernardmarr/2025/07/08/microsof...
GiorgioG | 22 hours ago
olivia-banks | 22 hours ago
packetlost | 22 hours ago
I'm so sick of this.
mememememememo | 22 hours ago
karim79 | 22 hours ago
xtracto | 21 hours ago
FTFY. (I've read AWS word it like that)
mememememememo | 20 hours ago
odiroot | 22 hours ago
anematode | 20 hours ago
bartread | 22 hours ago
zahlman | 21 hours ago
> And, yes, I did enough basic data analysis to confirm
Perhaps you'd consider showing us that analysis? That sounds like it would make a pretty substantive, thoughtful comment.
KaiserPro | 21 hours ago
Gaze upon the tapestry in which github paints its failure with a thin copper red thread:
https://www.githubstatus.com/
bartread | 20 hours ago
dsm4ck | 22 hours ago
workfromspace | 21 hours ago
kraemahz | 21 hours ago
nasretdinov | 22 hours ago
sc__ | 22 hours ago
hirako2000 | 22 hours ago
mfenniak | 21 hours ago
hirako2000 | 21 hours ago
https://foja.applycreatures.com
Edit: it has a wonderful API so I posted the link it may tempt some to ditch MS/Azure hub.
paxys | 22 hours ago
xeonmc | 21 hours ago
cm2187 | 21 hours ago
https://trends.google.com/trends/explore?date=all&geo=GB&q=s...
htrp | 21 hours ago
workfromspace | 21 hours ago
carlosft | 20 hours ago
ambicapter | 19 hours ago
merlindru | 21 hours ago
nine_k | 21 hours ago
gbear605 | 21 hours ago
georgel | 20 hours ago
0x457 | 19 hours ago
IMO it's much better now.
bandrami | 18 hours ago
theamk | 16 hours ago
AndroTux | 21 hours ago
7777332215 | 21 hours ago
Biganon | 20 hours ago
PaulKeeble | 21 hours ago
spauldo | 21 hours ago
AndroTux | 20 hours ago
cedws | 20 hours ago
joshribakoff | 18 hours ago
hedora | 15 hours ago
hedora | 15 hours ago
I just use an offline server, so I wouldn't notice if they had GitHub levels of availability.
mghackerlady | 11 hours ago
guywithabike | 21 hours ago
packetlost | 21 hours ago
Eji1700 | 21 hours ago
everforward | 21 hours ago
acedTrex | 21 hours ago
shrikant | 20 hours ago
suriya-ganesh | 21 hours ago
sysworld | 21 hours ago
cyanydeez | 21 hours ago
spondyl | 21 hours ago
[1]: https://thenewstack.io/github-will-prioritize-migrating-to-a...
Barrin92 | 21 hours ago
if by chance the CTO reads this, as a user of GitHub I would find it really existential if GitHub continues functioning as a reliable hub for git workflows (hence the name), and I have the strong suspicion nobody except for the shareholders gives a lick about copilot or 'AI' if it makes the core service the site was designed for unusable
jwoq9118 | 20 hours ago
conception | 20 hours ago
denkmoon | 20 hours ago
kevin_thibedeau | 17 hours ago
denkmoon | 16 hours ago
ncruces | 19 hours ago
kleene_op | 21 hours ago
pm90 | 21 hours ago
comice | 20 hours ago
I wonder if the extended downtime is just due to the on-call engineers waiting for their azure auth tokens to refresh within azure's own damn network.
ryukoposting | 21 hours ago
trvz | 21 hours ago
justinko | 18 hours ago
AustinDev | 17 hours ago
"The evidence is clear: Either you embrace AI, or get out of this career." -Github CEO
"Sooner than later, 80% of the code is going to be written by Copilot. And that doesn’t mean the developer is going to be replaced." -Github CEO
zombot | 3 hours ago
rileymichael | 21 hours ago
esafak | 21 hours ago
duped | 21 hours ago
pbkompasz | 21 hours ago
pylua | 21 hours ago
I can’t be specific but we are constantly complaining.
keithnz | 21 hours ago
pylua | 21 hours ago
sysworld | 21 hours ago
overgard | 21 hours ago
I think they may need to do that once again. Almost every product of theirs feels like a dumpster fire. GitHub is down constantly, Windows 11 is a nightmare and instead of patching things they're adding stupid features nobody asked for. I think they need to stop and really look closely at what they're prioritizing.
ekropotin | 21 hours ago
whalesalad | 21 hours ago
Freedom2 | 21 hours ago
rdedev | 21 hours ago
wrqvrwvq | 19 hours ago
jeppester | 21 hours ago
proc0 | 21 hours ago
I like AI but actually not for coding because code quality is correlated to how well you understand the underlying systems you're building on, and AI is not really reasoning on this level at all. It's clearly synthesizing training data and it's useful in limited ways.
newbish | 21 hours ago
qbane | 21 hours ago
holoduke | an hour ago
someperson | 21 hours ago
Seemingly the decline started with the Microsoft acquisition in 2018, and the subsequent "unlimited private repository" change in 2019 (to match GitLab's popular offer).
hparadiz | 20 hours ago
ivanjermakov | 19 hours ago
davebranton | 18 hours ago
Did you hear about the screenwriters school in which the professors said to avoid AI for writing, but it's great for storyboards. And the storyboard school where the professors said the opposite?
The reality is that AI isn't actually "good" at anything. It produces passable ersatz facsimiles of work that can fool those not skilled in the art. The second reality of AI is that everyone is busy cramming it into their products at the expense of what their products are actually useful for.
Once people realise (1), and stop doing (2), the tech industry has a chance of recovering.
proc0 | 18 hours ago
danillonunes | 17 hours ago
thewhitetulip | 16 hours ago
mghackerlady | 11 hours ago
stevepotter | 21 hours ago
kace91 | 21 hours ago
RevEng | 20 hours ago
philipallstar | 20 hours ago
ransom1538 | 21 hours ago
wenbin | 21 hours ago
s_u_d_o | 21 hours ago
cyanydeez | 21 hours ago
zelphirkalt | 21 hours ago
genewitch | 21 hours ago
Another site was constantly getting DDoSed by Russians who were mad we took down their scams on forums. That had to go through Verisign back then, not sure who they're using now. They may have enough aggregate pipe that it doesn't matter at this point.
newbish | 21 hours ago
jrm4 | 21 hours ago
TimReynolds | 21 hours ago
tholford | 20 hours ago
https://about.gitea.com/
messe | 20 hours ago
I've been considering it for a while, but I'm definitely now pitching a move away from GitHub at our organization.
jbmilgrom | 20 hours ago
justinholmes | 20 hours ago
lousken | 20 hours ago
They have not even bothered to implement Entra login when they have had their competitors' logins for years. Do they even know what their product is? Or are you just a middle man for slop?
mayhemducks | 20 hours ago
butterlesstoast | 20 hours ago
gchamonlive | 20 hours ago
https://gitlab.com/gabriel.chamon/ci-components/-/tree/main/...
0xbadcafebee | 20 hours ago
nomilk | 19 hours ago
Might catch 90% of problems before they make it into the real stack?
E.g. every step of GitHub's migration to Azure could be mimicked on the duplicate stack before it's implemented on the primary stack. Is this just considered too much work? (I doubt cost would be the issue, because even if it costs millions, it would pay for itself in reduced reputational damage from outages).
EDIT: downvotes - why? - I think this is a good idea (I'd do it for my sites if outages were an issue).
worik | 18 hours ago
drewbug01 | 18 hours ago
Because that's a monumental amount of work, and extraordinarily difficult to retrofit into a system that wasn't initially designed that way. Not to mention the unstated requirement of mirroring traffic to actually exercise that system (given the tendency of bugs to not show up until something actually uses the system).
nomilk | 18 hours ago
Agree, but look at the alternative: GitHub is constantly being savaged by users who (quite reasonably) expect uptime. Ignoring impacts on morale and reputation, damage to their bottom line alone might be tens (hundreds?) of millions per year.
> mirroring traffic
yeah, I agree that's difficult, but it need not be exact to still be useful.
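One way "not exact" could look in practice is mirroring only a sample of requests to the duplicate stack. The sketch below is purely illustrative and says nothing about how GitHub routes traffic; `should_mirror`, the request ID format, and the 5% rate are all assumptions.

```python
import hashlib


def should_mirror(request_id: str, sample_rate: float) -> bool:
    """Deterministically pick roughly `sample_rate` of requests to mirror.

    Hashing the ID (rather than rolling a random number) means the same
    request always gets the same decision, which makes replays easier to debug.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash onto a fraction between 0 and 1.
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate


# Hypothetical request IDs; roughly 5% of them should be selected.
mirrored = sum(should_mirror(f"req-{i}", 0.05) for i in range(100_000))
print(f"mirrored {mirrored} of 100000 requests")
```

A sampled mirror will miss rare request shapes, so it reduces risk rather than removing it, which fits the "catch most problems" framing above.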
hedora | 15 hours ago
If you'd ever worked on a codebase as terrible as I imagine GH's internals are and looked at the git history, you'd find two things:
1) fixing it would require rolling back 100's-1000's of engineer-years of idiocy that make things like testing or refactoring untenable
2) many prior engineers got part of the way through such improvements before leaving or being kicked out. Their efforts mostly just made it worse, because now you never know what sort of terribleness to expect when you open an unfamiliar file.
rco8786 | 19 hours ago
gverrilla | 18 hours ago
swed420 | 18 hours ago
nichos | 15 hours ago
sathish316 | 18 hours ago
Is all the recent GitHub downtime entirely attributable to GitHub AI Copilot related development? How hard can it be to reduce the blast radius of new AI features to not affect the core parts of hosting repositories? Because of Copilot everywhere, the UX has become bad and I had to click all over the place and on my profile to find repositories.
jiveturkey | 15 hours ago
zzo38computer | 15 hours ago
zombot | 3 hours ago
ygritte | 3 hours ago