The comparison is somewhat skewed, since they went from an (expensive) virtual server to a cheaper dedicated server (hardware).
One of the new risks is that if anything critical happens with the hardware, network, switch etc., then everything is down until someone at Hetzner goes and fixes it.
With a virtual server it'll just get started on a different host straight away. Usually hypervisors also have 2 or more network connections etc.
And hopefully they also have some backups set up.
It's still a huge amount of savings and I'd probably do the same if I were in their shoes, but there are tradeoffs when going from virtual to dedicated hardware.
> We need more competition across the board. These savings are insane and DO should be sweating, right?
As the other person already said here, this blog post comparison is skewed.
BUT
EU cloud providers are much better value for money than the US providers.
The US providers will happily sit there nickel-and-diming you, often with deliberately obscure price sheets (hello AWS ;).
EU cloud provider pricing is much clearer and generally you get a lot more bang for your buck than you would with a US provider. Often EU providers will give you stuff for free that US providers would charge you for (e.g. various S3 API calls).
Therefore even if this blog post is skewed and incorrect, the overall argument still stands that you should be seriously looking at Hetzner or Upcloud or Exoscale or Scaleway or any of the other EU providers.
In addition there is the major benefit of not being subject to the US CLOUD and PATRIOT acts, which, despite what the sales-droids will tell you, still apply to the fake-EU regions offered by the US providers.
When some component in OP's dedicated server fails, they will find out what that extra DO money was going toward. The DO droplet will live migrate to a healthy server. OP gets to take an extended outage while they file a Hetzner service ticket and wait for a human to perform the hardware replacement. Do some online research and see how long this often takes. I don't believe this Hetzner dedicated server model even has redundant PSUs.
Anyone who thinks DO and Hetzner dedicated servers are fungible products is making a mistake. These aren't the same service at all. There are savings to be had but this isn't a direct "unplug DO, plug in Hetzner" situation.
Hetzner also offers a VPS with superior specs to their old DO server for €374.99/month, or €0.6009/hour. They could just switch to a VPS temporarily while waiting for the hardware fix.
Although since they were running a LEMP server stack manually and did their migration by copying all files in /var/www/html via rsync and ad-hoc python scripts, even a DO droplet doesn't have the best guarantee. Their lowest-hanging fruit is probably switching to infrastructure as code, and dividing their stack across multiple cheaper servers instead of having a central point of failure for 34 applications.
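For reference, that kind of rsync migration is roughly the following (a hedged sketch; host names and paths are placeholders, not necessarily what OP actually ran):

```bash
# First pass while the site is still live; -a preserves permissions and
# timestamps, -H hard links, -A/-X ACLs and extended attributes.
rsync -aHAX --delete --info=progress2 \
  /var/www/html/ deploy@new-server.example.com:/var/www/html/

# Second, much faster pass after stopping writers, to catch anything
# that changed during the long first copy.
rsync -aHAX --delete \
  /var/www/html/ deploy@new-server.example.com:/var/www/html/
```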
I moved from Hetzner to DO because my Hetzner IPs kept getting spoofed and then Hetzner would shut down my servers for "abuse". This hasn't happened once on DO, and I'm happy to pay a little more.
It's tough to work with these publicly traded companies. They need to boost prices to show revenue growth. At some point, they become a bad deal. I've already migrated from DO. Not because of service or quality, but solely because of price.
Yeah, that's just how it is, even outside of the cloud. At some point nearly all companies try to take advantage of inertia and vendor lock-in; if you are willing to undertake the pain of switching, it's almost always a savings.
I suspect with that money you could get a full-time customer support person for your business. Now think about it: what's creating more value for your customers, having your infra on DigitalOcean or having better customer support?
It's a nice chunk of change, which you could use for other purposes. It might not make or break the company, but it could pay for something that actually generates business.
If you only have Rs. 100 in your pocket, you will think hard before spending Rs. 10. If you have Rs. 1000 in your pocket, you will not mind spending Rs. 10. That said, even if you are financially sound, why in the world would you want to pay $14k extra for a similar service that is available cheaper? That money could be better utilised elsewhere.
When I save money on something without losing performance or reliability, I feel like a real hacker, and the money saved is just the cherry on top of the self-accomplishment I feel.
I've had excellent experiences with Percona XtraBackup for MySQL migration and backups in general. It runs live with almost no performance penalty on the source. It works so well that I always wait for them to release a new matching version before upgrading to a new MySQL version.
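For anyone curious, the basic flow looks something like this (a minimal sketch assuming default paths and credentials in ~/.my.cnf; adjust to taste):

```bash
# Take a live, non-blocking physical backup of the running server.
xtrabackup --backup --target-dir=/data/backup/full

# Apply the copied redo log so the backup is transactionally consistent.
xtrabackup --prepare --target-dir=/data/backup/full

# On the destination host (MySQL stopped, empty datadir), restore it.
xtrabackup --copy-back --target-dir=/data/backup/full
chown -R mysql:mysql /var/lib/mysql
```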
I moved two servers, one from Linode and the other from DO, to Hetzner a few months ago, with similar savings. The best part was that the two servers had tens of different sites running, implemented in different languages, with obsolete libraries, MySQL and Redis instances. A total mess. Well: Claude Code migrated it all, sometimes rewriting parts when the libraries were no longer available. Today complex migrations are much simpler to perform, which, I believe, will increase mobility across providers a lot.
I have just seen, with my own eyes, Claude astroturfing on a gamedev subreddit from a botting account that was picked up by Google, so I could see a few of its other comments. This account was going around development subs complaining about how good Claude's latest model is and how awful it is to be afraid of losing one's job to AI.
I know your comment is tongue-in-cheek and the poster here is kinda known, but this kind of astroturfing is a new low and it's everywhere on forums such as these.
I was really confused, then realized the person you're replying to misspelled "ad" as "add", and that you're moving forward with the premise that GP's comment is an ad and this HN submission is an ad. Then you share that you saw a Reddit account on a gamedev subreddit complaining that AI is too good and that they're worried they won't have a job, and you believe that Reddit account must have been an ad for AI.
The whole internet is like this now, and it's only just getting started. Makes me sick tbh, and I am still questioning if this is the kind of industry I want to work in.
For those who remember Digg: they recently relaunched a new version and shut it down almost immediately. They were getting hammered by AI bots when it was realized that Digg apparently still has good SEO. They explain it right on the homepage.
I see a lot of these posts on Reddit, too, but I don't think it's actually Anthropic or Claude doing it. It's the same old Reddit karma farmers picking up on the latest trends. They've always combined headlines with ragebait to build karma and now LLM bots make it easier than ever.
It's too bad Reddit allows accounts to hide their comment history now. That used to be an easy way to identify bot accounts.
I've been warning people about Anthropic's astroturfing for a while now. The number of "insert latest model/Claude Code is scary, I'm worried about my job" posts, followed by doom-ridden writing about how their job was automated, 30 dudes got fired, and the person is pivoting into plumbing or working at McDonald's, is just too suspicious not to note. Sometimes it's more covert: they don't mention any provider/model, or there's a subtle insert somewhere in the body, Opus, Claude, etc.
It's not necessarily astroturfing. There is a seismic shift under way regarding how things get done in this business, and if you don't acknowledge it, that's weird in itself.
I've certainly noticed a seismic shift in how bad support and updates have gotten with some 3rd party vendors we use, and the answer they come back with is always that they're experimenting with AI. Not saying AI isn't part of the job now, but it is getting seriously over hyped and over extended.
It's absolutely not seismic. If you've used AI for a little bit, you'll realize it's good at writing boilerplate code. Any complex logic, and you better re-read and correct the code a few times until you trust it.
Of course, if all you do is "host a wordpress website" (like 80% of what "webdevs" do), it will work. The issue is that the last 20% is the hardest to cover, and current AI methods will not get there (you need much more complex methods, like being able to integrate logic with learning-based ML, to do this).
I mean if it were anyone else, yeah I might agree, but I think Salvatore is being genuine here (and have seen Claude do a similarly surprising job fixing ops issues).
I don't think so. I think he's clearly abusing language (saying "Claude Code migrated the stuff" rather than "I migrated the stuff after using Claude to help write boilerplate, then I went on double-checking it, testing it, and then running it").
I don't think you've nailed it either. He SHOULD be saying "54 days ago, I powered on my computer and opened a terminal. From my editor I reviewed my code files and realized I had quite a mess on my hands. Realizing it was the year A.D. 2026, I decided to fire up a modern tool. I typed "claude" into my terminal. As it launched I told it I wanted help taking my running programs and moving them from the virtual private servers I was running in Linode (inc) and Digital Ocean (co) to Hetzner (LLC). As Claude used its tool-use abilities, it read the files and made suggestions on how to do the migrations; it indicated that it could go ahead and copy the files and run the needed commands but I would need to give it permission first. I granted it permission. Once it said the services were running, I instructed it to test that they were accessible and reliable while I reviewed the glowing new code it had written. In summary, with the help of Claude Code I was able to redeploy 37 services in Hetzner."
I think the parent has a point. For how many other accomplishments is the tool framed as the responsible party? We don't say "cranes built the skyscraper", people did. Why do we shift accountability when it comes to AI?
On Monday a crane company announces it’s pivoting to AI, followed by a quick 600% boost to its stock price. I wouldn’t even be surprised at this point.
It would seem that way for sure, if it was just a random anon posting it, but the person you're replying to is the creator of Redis so I feel it's more likely a genuine opinion/experience rather than a Claude ad...
Excuse my ignorance, but how is that migration (especially of older libraries that are apparently being rewritten) not just a copy/paste action from one server to the other? When I build software to deploy, it includes everything it requires, library-wise. At least for the few things I've deployed so far.
Sometimes you need library version X, which uses a compiled binary for the platform, which requires C library version Y, which requires glibc version Z, which is deprecated on the current version of the OS, etc etc etc.
Or you can update the app to remove the dependency on the library.
But honestly, this is what containers or VMs are built for in the first place.
You have to copy data across, and confirm that everything worked correctly, and if you're being fancy about it you need to freeze writes to the old server while you are migrating and then unfreeze after you've directed traffic to the new server. It's not trivial.
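For a MySQL-style stack, the freeze step can be as small as this (a hedged sketch; note that admin accounts bypass read_only unless super_read_only is also set):

```bash
# Stop application writes on the old primary during the final sync.
mysql -e "SET GLOBAL super_read_only = ON;"

# ... final rsync / let replication catch up, repoint traffic ...

# Then allow writes on the new server.
mysql -h new-server.example.com \
  -e "SET GLOBAL super_read_only = OFF; SET GLOBAL read_only = OFF;"
```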
Now imagine you can do that with a local model. You're basically breaking lockin on _Every_ end. Simply beautiful. A digital guillotine for the digital elite!
Yeah, at the last job there was a single outdated external wiki server left sitting in DO for those kinds of reasons, while everything up-to-date and internal had already moved (if not twice). If it hadn't become such a security risk it would never have been moved.
The problem is a lot of this glue is proprietary by design at the various cloud services. I realize there are open source and alternative abstractions for a lot of the same services, but there's still quite a bit of glue if you're on AWS, for example, and looking to move to bare metal.
But maybe I’m just thinking of the current capabilities of agents, and if we fast forward a couple years, even removing these abstractions or migrating will be very low friction.
But you can run most of the glue on your own dedicated instances.
I run k8s on a bunch of dedicated servers that are super cheap and I have all the bells and whistles; just tell your coding agent to do it. You can literally design the thing you would never build yourself, and it works brilliantly.
Postgres running on dedicated hardware, replicated and with WAL backups? Easy, just tell codebuff (my harness of choice) to do it. Then any number of firewalls, load balancers, bastion servers, etc. If you can imagine it, codebuff will implement it.
Sure, and then you realize it deleted the db to "simplify the migration" lol
Obviously I agree that AI can be useful to write boilerplate, but it's in no way something you should use blindly when trying to do a migration or anything touching prod
So, to be more precise: no, "Claude Code didn't migrate it all". Claude Code helped you write boilerplate so that you could migrate
I too, am bravely using Claude for more DevOps. I run all of my virtual machines on proxmox atop bare metal servers I own and I’m just blown away at how quickly Claude can optimize and set up entire new networks across all of these machines. Truly feels like a coworker or well paid sysadmin.
Linode is going to lose my business in the next couple months as well. Been there over a decade, have referred countless customers to them, but they’ve kept bumping up prices over and over and I can get a dedicated server at Hetzner or other places with 8x the memory, dedicated NVMe disks, dedicated CPU for cheaper.
Sure you lose a little of the benefit of a “virtual” server which can be migrated but Hetzner’s support has always been super fast and capable, should I wind up in a situation where I’ve got downtime.
Each has their trade offs. AWS absolutely has a high premium but Hetzner has some quirks.
Recently we had several of our VMs offline because they apparently have these large volume storage pools they were upgrading and suddenly disks died in two large pools. It took them 3 days to resolve.
Hetzner has no integrated option to back up volumes; it's roll-your-own :/ You also can't control volume distribution across their storage nodes for redundancy.
I don't think it's fair to call AWS a scam. It's complicated and powerful and it charges a lot for many services compared to a DIY approach. But you can see the prices transparently on its site, it provides a free tier to try most services out, it is fairly good about long term support for services and how it handles forced upgrades when they become necessary, and generally it has an OK reputation for customer support even if something unexpected and very bad happens. You're certainly paying a price for the convenience and the brand but I don't think that's a scam if you're making an informed choice. If you want to save money then you can replace RDS with Postgres running on VMs but the trade off is then you have to manage your database infrastructure yourself.
No, they just don't know what value AWS provides. And honestly you'll never know until you roll out your own Dedicated servers and later you'll wonder why you never did it sooner.
Cloud used to be marketed for scalability. "Netflix can scale up when people are watching, and scale down at night".
Then the blogosphere and astroturfing got everyone else on board. How can $5 on amazon get you less than what you got from almost any VPS (VDS) provider 10 years ago?
That’s like saying Mercedes is a scam because you’re fine with a Honda Civic. It’s a totally legitimate preference but not being in the target market doesn’t make something a scam.
AWS ain't no Mercedes. Mercedes feels premium and isn't full of bugs.
AWS and Azure are charging an arm and a leg, but the offered quality is mostly perceived. Most of the bits and bobs they charge for don't provide much value for the vast majority of businesses. I won't even go into the complete lack of ergonomics in their portals.
I see you've never actually owned or worked on a German car, especially in relation to even modest Japanese models. Maybe they were a little nicer inside in the 80s and maybe 90s, but "German car" and frankly "European make" is basically synonymous with "big expensive pile of shit that's an expensive pain in the ass when things start falling apart (which they seem to with increasing rapidity)." It's like the disease that plagued British cars for the longest time got contaminated with the German propensity to build overly complex monstrosities.
What are you doing for DB backups? Do you have a replica/standby? Or is it just hourly or something like that?
Because with a single-server setup like this, I'd imagine that hardware (e.g. SSD) failure brings down your app, and in the case of SSD failure, you then have hours or days downtime while you set everything up again.
Hetzner normally advertises their hardware servers as 2x 1 TB SSD, because it's strongly recommended to run them in software RAID1 for a net 1 TB. (Their image installer will default to that.)
Once the first SSD fails after some years, and your monitoring catches that, you can either migrate to a new box, find another intermediate solution/replica, or let them hot-swap it while the other drive carries the load.
Of course, going to physical servers loses the redundancy of the cloud, but that's something you need to price in when looking at the savings and deciding your risk model.
And yes, running this without at least daily snapshotting/backup to remote storage is insane; that applies to cloud as well, albeit easier to set up there.
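The remote-backup part doesn't have to be fancy; something like restic to any S3-compatible bucket, run daily from cron, covers a lot (a sketch, with placeholder repository and credentials):

```bash
# Placeholder credentials for an S3-compatible bucket + repo password.
export AWS_ACCESS_KEY_ID=xxx AWS_SECRET_ACCESS_KEY=xxx RESTIC_PASSWORD=xxx

restic -r s3:s3.example.com/my-backups init      # once, to create the repo
restic -r s3:s3.example.com/my-backups backup /etc /var/www /var/backups/db
restic -r s3:s3.example.com/my-backups forget \
  --keep-daily 7 --keep-weekly 4 --prune          # rotate old snapshots
```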
For over a decade I ran a small scale dedicated and virtual hosting business (hundreds of machines) and the sort of setup you describe works very well. Software RAID across 2 devices, redundant power supplies, backups. We never had a significant data loss event that I recall (significant = beyond user accidentally removing files).
For quite a while we ran single power supplies because they were pretty high quality, but then Supermicro went through a ~6 month period where basically every power supply in machines we got during that time failed within a year, and replacements were hard to come by (because of high demand, because of failures), and we switched to redundant. This was all cost savings trade-offs. When running single power supplies, we had in-rack Auto Transfer Switches, so that the single power supplies could survive A or B side power failure.
But, and this is important, we were monitoring the systems for drive failures and replacing them within 24 hours. Ditto for power supplies. If you don't monitor your hardware for failure, redundancy doesn't mean anything.
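Concretely, the drive-failure monitoring side is not much more than this (a hedged sketch; device names are illustrative, not our exact tooling):

```bash
cat /proc/mdstat          # healthy mirrors show [UU]; a dead member shows [U_]
mdadm --detail /dev/md0   # per-array state, including failed devices

# After the failed disk is physically swapped: copy the partition table
# from the surviving disk, then re-add the new member to the mirror.
sfdisk -d /dev/sda | sfdisk /dev/sdb
mdadm --manage /dev/md0 --add /dev/sdb1
```

mdadm can also mail you on failure (mdadm --monitor), which is the sort of alerting that makes a 24-hour replacement window realistic.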
> Because with a single-server setup like this, I'd imagine that hardware ...
Yeah. This blog post reads like it was written by someone who didn't think things through and just focused on hyper-aggressive cost-cutting.
I bet their DigitalOcean vm did live migrations and supported snapshots.
You can get that at Hetzner but only in their cloud product.
You absolutely will not get that with Hetzner bare metal. If your HD or another component dies, it dies. Hetzner will replace the HD, but it's up to you to restore from scratch. Hetzner are very clear about this in multiple places.
I'm not going to re-write it, the TL;DR is they are making an Apples and Oranges comparison.
Yes they "saved money" but in no way, shape or form are the two comparable.
The polite way to put it is ... they saved as much money as they did because they made very heavy-handed "architectural decisions". "Decisions" that they appear to be unaware of having made.
Can you elaborate? I'm coming up with similar designs recently (static site plus redundant servers) but my designs so far assume no database and ephemeral interactions. (Realtime multiplayer arcade games.)
Curious what the delta to pain-in-ass would be if I want to deal with storing data. (And not just backups / migrations, but also GDPR, age verification etc.)
A database isn't hard to run HA; it's actually very easy to do any of this.
I already design with Auto Scaling Groups in mind, and we run on spot instances, which tend to be much cheaper. Spot instances can be reclaimed anytime, so you need to keep this in mind.
I also have data blobs which are memory-mapped files, swapped with no downtime by pulling a manifest from a GCS bucket each hour and swapping out the mmapped data.
I use replicas, with automatic voting-based failover.
I've used Mongo with replication and automatic failover for a decade in production with no downtime and no data loss.
Recently I got into Postgres; so far so good. Before that I always used RDS or another managed solution like Datastore, but they cost so much compared to running your own stuff.
Health checks start a new server in no time; even if my Hetzner server goes out, or the whole of Hetzner goes out, my system will launch Digital Ocean nodes which will start soaking up all requests.
Surely you must've noticed that pretty much all of their bare metal offerings ("dedicated" and the stuff on "auction") have multiple disks, allowing for various RAID configurations?
> Surely you must've noticed that pretty much all of their bare metal offerings ("dedicated" and the stuff on "auction") have multiple disks, allowing for various RAID configurations?
I don't know where to start with this comment. Do I really need to spell out the difference between cloud and bare metal ?
A few examples...
- Live migration ? Cloud only.
- Snapshots ? Cloud only.
- Want to increase disk space ? Tick box in cloud vs. replace disks (or move to different machine) and re-install/restore in bare metal....
- Want to increase RAM ? Tick box in cloud vs. shutdown, pull out of rack, install new chips (or move to different machine and re-install/restore)....
- Want to upgrade to a beefier processor ? Tick box in cloud vs move to a completely different machine and re-install/restore
> Well you did say your data is lost when a disk fails, which is not true.
Well, technically it's still a possibility.
I am old enough to have seen issues with RAID1 setups not being able to restore redundancy, as well as RAID controller failures and software RAID failures.
Also, frankly, you are being somewhat pedantic. My broader point was about cloud. I gave HD failure as one example, randomly selected by my brain ... I could have equally randomly chosen any of the other items, but this time my brain chose HD.
You can get snapshots and live migrations working on-prem. The cloud isn't magic, it's just servers with hypervisors and software running on top of them. You can run that same software.
Also, with something like Hetzner you would not be going in and physically doing anything. You also just tick a box for a RAM upgrade, and then migrate over or do active/passive switch.
The cloud does have advantages, mostly in how "easy" it is to do some specific workflows, but per-compute it's at least 10x the cost. Some will argue it's less than that, but they forget to factor in just how slow virtual disks and CPU are. Cloud only makes sense for very small businesses, in which the operational cost of colocation or on-prem hosting is too expensive.
It's possible no one will care much if it's down even for that long. I couldn't care less if my HOA mobile app was down even for a week for example. We don't need constant uptime for everything.
Don’t forget that integrity matters as much as availability in many applications. You might not mind if your HOA takes time to bring a server back up but you’d care a lot more if they lost the financial records or weren’t able to recover from a ransomware attack.
If that's the tradeoff they're willing to make, who are you to say that they're doing it wrong?
Not every app needs 24/7 availability. The vast majority of websites out there will not suffer any serious consequences from a few hours of downtime (scheduled or otherwise) every now and then. If the cost savings outweigh the risk, it can be a perfectly reasonable business decision.
A more interesting question would be what kind of backup and recovery strategy they have, and which aspects of it (if any) they had to change when they moved to Hetzner.
You have to deal with a lot more stuff. You have to order/pay for a server (capex), mount it somewhere, wire up lights-out management and recovery, and do a few more tasks that the provider has already done.
Then, say if the motherboard gives up, you have to do quite a bit of work to get it replaced, you might be down for hours or maybe days.
For a single server I don't think it makes sense. For 8 servers, maybe. Depends on the opportunity cost.
Have you done this yourself? If you haven't I think you'd discover server hardware is actually shockingly reliable. You could go years without needing to physically touch anything on a single machine. I find that people who are used to cloud assume stuff is breaking all the time. That's true at scale, but when you have a handful of machines you can go a very long time between failures.
If you have failover redundancy of services across your systems of some kind to mitigate then great. With proper setup no worries. I guess it depends how much you want to take on vs hand off.
Yes, having done this for decades, it happens often enough that you need to plan for it. You need to have redundancy, spare parts, and staffing or you are basically gambling. All of this has to be tested, too, or you might find that your failover mechanism has dependencies you didn’t plan for or unexpected failure modes (I’ve twice experienced data center hard outages due to the power distribution system failing oddly when switching between mains and UPS power, or UPS and generator).
Using something like AWS can make it easy to assume that servers don’t fail often but that’s because the major players have all of that behind the scenes, heavily tested, and will migrate VMs when prefail indicators trigger but before stuff is done.
“Your own server in a colo” means going to the colo to swap RAM or an SSD when something goes wrong. When you rent a server, the benefit is that the provider has spare parts on hand and staff to swap them out.
The problem with actually owning hardware is that you need a lot of it, and need to be prepared to manage things like upgrading firmware. You need to keep on top of the advisories for your network card, the power unit, the enterprise management card, etc. etc. If something goes wrong someone might need to drive in and plug in a keyboard.
Eventually we admitted to ourselves we didn't want those problems.
Sharing this migration is admirable and useful teaching, thank you!
I see the DigitalOcean vs Hetzner comparison as a tradeoff that we make in different domains all day long, similar to opening DoorDash or UberEats instead of making your own dinner (and the cost ratio is similar too).
I work in all 3 major clouds, on-prem, the works. I still head to the DigitalOcean console for bits and pieces type work or proof of concept testing. Sometimes you just want to click a button and the server or bucket or whatever is ready and here's the access info and it has sane defaults and if I need backups or whatnot it's just a checkbox. Your time is worth money too.
> Sometimes you just want to click a button and the server or bucket or whatever is ready and here's the access info and it has sane defaults and if I need backups or whatnot it's just a checkbox. Your time is worth money too.
You're describing Hetzner Cloud, which has been like this for many years. At least 6.
Hetzner also offers Hetzner Cloud API, which allows us to not have to click any button and just have everything in IaC.
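For example, with the official hcloud CLI (server type and image names are illustrative and change over time):

```bash
hcloud context create my-project   # prompts for an API token, once
hcloud server create --name web-1 \
  --type cx22 --image ubuntu-24.04 \
  --ssh-key my-key                 # assumes an SSH key already uploaded
hcloud server list
```

The same API has a Terraform provider, so the whole fleet can live in version control.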
Cute; I'd somehow missed ever seeing that one. The omitted con of electric engines (costs way more to build batteries than a gas tank so you're likely to have more expensive storage AND less of it) makes the XKCD joke miss. BUT... since there's probably something that Digital Ocean offers that Hetzner doesn't, that might actually be a very appropriate XKCD for the situation, precisely because there's a tradeoff the XKCD didn't mention. (I haven't used Hetzner so I don't know firsthand what the tradeoff is, but a quick search suggests Hetzner doesn't do Kubernetes so that might be the tradeoff for some people. Or it might be something else, everybody has their own situation).
One is about all the steps of zero downtime migration. It's widely applicable.
The other is the decision to replace a cloud instance with bare metal. It saves a lot in costs, but the loss of fast failover and managed backups has to be priced in.
If I were doing this, I would run a hot spare for an extra $200 and switch the primary every few days, to guarantee that both copies work well and that the switchover is easy. It would be a relatively low price for a massive reduction in the risk of catastrophic failure.
I had my fair share of Hyperscaler -> $something_else migrations during the past year. I agree, especially with rented hardware the price-difference is kind of ridiculous.
The issue, though, is that you lose the managed part of the whole cloud promise. For ephemeral services this is not a big deal, but for persistent stuff like databases, where you would like your data to be safe, it is kind of an issue, because it shifts additional effort (and therefore cost) onto your operations team.
For smaller setups (attention, shameless self-promotion incoming) I am currently working on https://pellepelster.github.io/solidblocks/cloud/index.html which lets you deploy managed services to the Hetzner Cloud from a Docker-Compose-like definition, e.g. a PostgreSQL database with automatic backup and disaster recovery.
I wish we had something like Hetzner dedicated near us-east-1.
They do offer VPS in the US and the value is great. I was seriously looking at moving our academic lab over from AWS but server availability was bad enough to scare me off. They didn't have the instances we needed reliably. Really hoping that calms down.
As such, I doubt the noted price reduction is reproducible. Combine this with Hetzner's sudden deletions of user accounts and services without warning, and it's a bad proposition. Search r/hetzner, and search r/vps for "hetzner", together with these words: banned, deleted, terminated; there are many reports. What should stun you even more is that Hetzner could ostensibly be closely spying on user data and workloads, even offline workloads, without which they wouldn't even know whom to ban.
The only thing that Hetzner might potentially be good for is to add to an expendable distributed compute pool, one that you can afford to lose, but then you might as well also use other bottom-of-the-barrel untrustworthy providers for it too, e.g. OVH.
You could have loaded the Hetzner pricing page and checked - the server in the article is currently listed around $30/month higher. Not enough to materially change the equation
That's a trend which is more and more common nowadays.
I wish the industry would adopt more zero-knowledge methods in this regard. They exist and are mathematically proven, but it seems there is no real adoption.
- OpenAI wants my passport when topping up 100 USD
- Bolt wanted recently my passport number to use their service
- Anthropic seems to want passports for new users too
- Soon age restriction in OS or on websites
I wish there were a law (in Europe and/or the US) to minimize or forbid this kind of identity verification.
I want to support companies in not allowing misuse of their platforms, but at the same time my full passport photo is not their concern, especially in B2B business, in my opinion.
I'm not a legal expert/lawyer but I do think a lot of this is not the company just randomly wanting to do it, but lawyer driven development. No company wants to introduce more friction for no reason, unless somehow there's precedent or risk involved in not doing it. Curious to know what legal precedents or laws have changed recently.
The only possible non legally driven reason I can think of would be if they think the tradeoff of extra friction (and lost customers) is more than offset by fraud protection efforts. This seems unlikely cause I don't see how that math could have changed in the last few years.
I don't. I'm happy the grift economy has some controls on it. As much as I love open source and all the efforts at collectives without government interference, some security is required; otherwise we'll just invite more grift-based economics.
It's bad enough living in America without the rest of the world adopting the grift economy.
It's partially because the internet only grants us free storage (noun), not free compute (verb).
Which is fundamental to so many XY problems, including why cloud services are so byzantine instead of just providing isolated secure shells with full root access within them. And why distrust is a growing force in the world instead of, say, unconditional love.
I always dreamed of winning the internet lottery so that I could help dismantle the systems of control which currently dominate our lives. Which starts with challenging paradigms from first principles. That looks like asking why we only have multicore computing in the cloud and not on our desktops (which could be used to build our own cloud servers).
When we're missing an abstraction layer, that creates injustice and a power drain from the many to the few. Some examples:
- CPU -> multicore MIMD (missing) -> GPU (based on the subset SIMD instead of MIMD upon which graphics libraries could be built)
- UDP -> connectionless reliable stream (missing) -> TCP (should have been a layer above UDP, not beside it)
- UDP/TCP -> P2P (NAT and other limitations block this and were inherited by IPv6 as generational trauma) -> WebRTC (redundant if we had P2P that "just works")
- internet connection -> symmetric upload/download speed (blocked for legal reasons under the guise of overselling to reduce cost) -> self-hosted web servers (rare due to antitrust issues stemming from said legal reasons)
- internet connection -> multicast (missing due to suppression of content-addressable memory/hash trees/DHTs) -> self-hosted streaming (negates the need for regions and edge caching)
I had high hopes for Google and even Tesla (for disrupting the physical world). But instead of open standards, they gave us proprietary vendor lock-in: Google Workspace (formerly G Suite) and NACS instead of J1772 (better yet both). Because of their refusal to interoperate at the lowest levels, there is little hope that they will do the real work of solving the hard problems at the highest levels.
For example, I just heard that China has built thousands of battery swap stations to provide effectively instant charging for electric vehicles, whereas that's something that Tesla can't accomplish because they chose to build Supercharger stations instead.
Once we begin to see the world this way, it's impossible to unsee it. It calls into question the fundamentals (like scarcity) which capitalism is based upon, and even the concept of profit itself.
From a spiritual perspective, I believe that this understanding is what blocks me from using my talents to use the system for personal gain to win the internet lottery. The people who own the systems of control don't have this understanding, and even view its basis in empathy as a liability. So we sacrifice the good of the many for the good of the few and call that progress.
They have to operate within the laws of the countries they’re physically located in. Those countries want to know that they’re not hosting illegal content, providing services to crime rings, Russia or North Korea, etc.
If Hetzner allows you to host something and you use it for illegal acts, they aren’t going to jail to shield you for €10/month.
Hetzner is like 1/10th the cost of ripoffs like AWS now, the passport data is deleted after verification and I can actually trust this claim coming from an EU company under GDPR that doesn't have any use for my personal data. You can also just bypass the passport requirement entirely by making a €20 Paypal deposit to the account.
You just hear too many horror stories of data being leaked. Even if Hetzner uses a 3rd party system to do the verification - that 3rd party probably has to store your pics for some time.
But at least if there is an alternative then great.
They do not. I've never had to present any documentation whatsoever to Hetzner and have been a happy customer for many years.
As I understand it, they ask only for accounts that check several boxes for common cases of abuse. So basically, personal accounts (as opposed to business accounts) from poor countries (by per-capita income, so e.g. India qualifies as poor).
I did the same this year. I really liked Digital Ocean though, compared to more complex cloud offerings like AWS. AWS feels like spending more for the same complexity. At least DO feels like it does save time and mental bandwidth. Still, the performance of cloud VPS is abysmal for the price. I'm now on Hetzner + K3s + Flux CD, with Cloudflare for file storage (R2) and caching. I run Postgres on the same machine with frequent dump backups. If I ever need realtime read replicas, I'll likely just migrate the DB to Neon or something and keep Hetzner with snapshots for running app containers.
This is something we've[0] done a number of times for customers coming from various cloud providers. In our case we move customers onto a multi-server (sometimes multi-AZ) deployment in Hetzner, using Kubernetes to distribute workloads across servers and provide HA. Kubernetes is likely a lot for a single node deployment such as the OP, but it makes a lot more sense as soon as multiple nodes are involved.
For backups we use both Velero and application-level backup for critical workloads (i.e. Postgres WAL backups for PITR). We also ensure all state is on at least two nodes for HA.
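For illustration, WAL archiving for PITR can look roughly like this with wal-g, one common tool for the job (a sketch; storage configuration such as WALG_S3_PREFIX and credentials is omitted, and this is not necessarily our exact setup):

```bash
# In postgresql.conf, ship every completed WAL segment off-box:
#   archive_mode = on
#   archive_command = 'wal-g wal-push %p'

wal-g backup-push "$PGDATA"   # periodic base backup (e.g. nightly cron)
wal-g backup-list             # verify what is actually restorable

# Recovery: wal-g backup-fetch "$PGDATA" LATEST, then replay WAL up to
# the desired recovery target.
```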
We also find bare metal to be a lot more performant in general. Compared to AWS we typically see service response times halve. It is not that virtualisation inherently has that much overhead; rather, it is everything else. E.g., bare metal offers:
- Reduced disk latency (NVMe vs network block storage)
- Reduced network latency (we run dedicated fibre, so inter-az is about 1/10th the latency)
- Less cache contention, etc [1]
Anyway, if you want to chat about this sometime just ping me an email: adam@ company domain.
Moving around k8s deployments is really nice. Very little vendor lockin compared to many of the cloud things you can buy.
My entire stack is... k8s, hosted Postgres, S3-type storage. I can always host my own Postgres, so it really comes down to k8s and S3. I think Hetzner has some kind of S3 storage but I haven't looked into it, and I assume moving in 100 TB is a process…
> We also find bare metal to be a lot more performant in general
I measured this several years back and never looked at virtual servers again. Since CPU time isn't reserved (like RAM is), the performance is abysmal compared to real hardware.
The "where's the HA?" comments are missing that this was a single DO droplet before. The migration didn't reduce redundancy, it just moved the single point of failure from one provider to another for 1/6 the cost. The HA conversation is worth having, but it's a separate conversation from this migration.
And DigitalOcean customer support is non-existent. I had a mail server down and they cut service instead of trying to contact me in any other way. But worse, when they do that, they immediately destroy your data without any possibility to restore. Or at least that's what they told me with their bog standard, garbage support replies. I was a customer for nearly a decade. After it happened, I realized that never would have happened on GCP, AWS, etc. Because they take billing seriously with multiple contact info, a recovery period, etc. All the things a company would be expected to do to maintain good relationships with customers during a billing issue that lasts a few weeks. That was a couple of years ago, so maybe they fixed some stuff. But the complete lack of support and unprofessional B2B practices was an eye opener.
DigitalOcean absolutely is just not an enterprise solution. Don't trust it with your data.
Oh, and did I mention I had been paying the upcharge for backups the entire time?
Given the premise that zero day exploits are going to be frequent going forward, I feel like there is a new standard for secure deployment.
Namely, all remote access (including serving HTTP) must be managed by a major player big enough to be part of private disclosure (e.g. Project Glasswing).
That doesn't mean we have to use AWS et al for everything, but some sort of zero trust solution actively maintained by one of them seems like the right path. For example, I've started running on Hetzner with Cloudflare Tunnels.
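The tunnel setup is pleasantly small (a sketch; tunnel name, hostname and origin port are placeholders):

```bash
cloudflared tunnel login                             # authorize against your zone
cloudflared tunnel create my-app
cloudflared tunnel route dns my-app app.example.com
cloudflared tunnel run --url http://localhost:8080 my-app
```

With that in place you can drop all inbound firewall rules on the Hetzner box; the origin is only reachable through the outbound tunnel.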
Am I missing something? I'm genuinely surprised it was not deployed from the start on a dedicated server. Don't you make a cost analysis before deploying? And if the cost analysis was OK at initial deploy, why wait for such a difference in cost before migrating? How much money goes wasted in such situations?
Managed services have value. It's less to set up, less to maintain, and less worrying about waking up at 3am when something breaks.
I've spent time eating the costs of things like DigitalOcean or SaaS products because my time is better spent growing my revenue than reducing infrastructure costs. But at some point, costs can grow large enough that it's worthwhile to shift focus to reducing infrastructure spend.
Just watch out that Hetzner don't fail to take a payment from their end, then proceed to flag your account for non-payment, all while communicating absolutely nothing about this to you, and arrive at the conclusion that they will delete all your servers and ban your account and identity from ever using them again.
Happened to me.
I now advise people to avoid clown-led services like Hetzner and stick to more reputable, if not as cheap, options.
I also have used DO for years, and was very happy with the quality of their service. Until I found the alternative prices. Not as easy to use, but much better performance for much lower prices.
I started with DO in 2013 when they offered 20GB SSD, 512MB RAM for $5/mo. For some reason I paid no VAT then, but I do now. Their $4/mo option now is still 512MB, still 1 vCPU, but 10GB SSD. So it's as if the last decade of technological progress in RAM, CPU and storage, which should have led to price cuts or spec bumps, never happened. And yeah, DO got expensive before AI bought up all the memory.
Every time I see this kind of article, no one really bothers about db/server redundancy, load balancers, etc. Are we OK with just one big server that may fail and bring several services down?
You saved a lot of money but you'll spend a lot of time in maintenance and future headaches.
I wondered the same! FWIW I'm currently migrating from managed Postgres to self-managed on Hetzner with autobase (https://autobase.tech/). Though of course for high availability it requires more than one server.
It depends on the service and how critical that website is.
Sometimes it's completely acceptable that a server will run for 10 years with say 1 week or 1 month of downtime spread over those 10 years, yes. That's the sort of uptime you can see with single servers that are rarely changed and over-provisioned as many on Hetzner are. Some examples:
Small businesses where the website is not core to operations and is more of a shop-front or brochure for their business.
Hobby websites too don't really matter if they go down for short periods of time occasionally.
Many forums and blogs just aren't very important too and downtime is no big deal.
There are a lot of these websites, and they are at the lower end of the market for obvious reasons, but they are probably the majority of websites, in fact: the long tail of low-traffic websites.
Not everything has to be high availability and if you do want that, these providers usually provide load balancers etc too. I think people forget here sometimes that there is a huge range in hosting from squarespace to cheap shared hosting to more expensive self-hosted and provisioned clouds like AWS.
Also, in general, you can architect your application to be more friendly to migration. It used to be a normal thing to think about and plan for.
VMware has a conversion tool that converts bare metal into images.
One could image, then do regular snapshots, maybe centralize a database being accessed.
Sometimes it's possible to create a migration script that you run over and over to the new environment for each additional step.
Others can put a backup server in between so as not to put load on the production drive.
Digital Ocean makes it impossible to download your disk image backups, which is a grave sin for which they can never be forgiven. They used to support this to some extent.
Still, a few commands can back up the running server to an image, and stream it remotely to another server, which in turn can be updated to become bootable.
This is the tip of the iceberg in the number of tasks that can be done.
Someone with experience can even instruct LLMs to do it and build it, and someone skilled with LLMs could probably work to uncover the steps and strategies for their particular use case.
Well why have downtime if you can avoid it with a bit of work?
But I do agree the poster should think about this. I don't think it's 'off' or misleading; they just haven't encountered a hardware error before. If they had one on this single box with 30 databases and 34 Nginx sites, it would probably be a bad time, and yes, they should perhaps think about that a bit more.
They describe a db follower for cutover for example but could also have one for backups, plus rolling backups offsite somewhere (perhaps they do and it just didn't make it into this article). That would reduce risk a lot. Then of course they could put all the servers on several boxes behind a load-balancer.
But perhaps if the services aren't really critical it's not worth spending money on that, depends partly what these services/apps are.
To be fair, a lot of people still run this way and just have really good backups, or have an offline / truly on-prem server where they can flip the DNS switch in case of a true outage.
Yes and for many services that is totally fine. As long as you have backups of data and can redeploy easily. It's not how I personally do things usually but there is definitely a place for it.
I run internal services on DO that I've considered moving to Hetzner for cost savings.
Could I take it down for the afternoon? Sure. Or could I wait and do it after hours? Also sure. But would I rather not have to deal with complaints from users that day and still go home by 5pm? Of course!
Respectfully, this type of "high availability" strawman is a dated take.
This is a general response to it.
I have run hosting on bare metal for millions of users a day, tens of thousands of concurrent connections. It can scale way up by doing the same thing you do in a cloud: provision more resources.
For "downtime" you do the same thing with metal, as you do with digital ocean, just get a second server and have them failover.
You can run hypervisors to split and manage a metal server just like Digital Ocean does. Except you're not vulnerable to shared-memory and CPU exploits on shared hosting like Digital Ocean: when Intel CPU or memory flaws or kernel exploits come out, as they have, one VM user can read the memory and data of all the other processes belonging to other users.
Both Digital Ocean and IaaS/PaaS providers are still running similar Linux technologies to do the failover. There are tools that even handle it automatically, like Proxmox. This level of production-grade failover and simplicity was point-and-click 10 years ago. Except no one's kept up with it.
The cloud is convenient. Convenience can make anyone comfortable. Comfort always costs way more.
It's relatively trivial to put the same web app on a metal server, with a hypervisor/IaaS/PaaS behind the same Cloudflare, to access "scale".
Digital Ocean and Cloud providers run on metal servers just like Hetzner.
The software to manage it all is becoming more and more trivial.
I'm not arguing for cloud or against bare metal hosting, just saying there is a broad range of requirements in hosting and not everyone needs or wants load balancers etc - it clearly will cost more than this particular poster wants to pay as they want to pay the bare minimum to host quite a large setup.
> This level of production grade fail over and simplicity was point and click, 10 years ago.
While some of the tools are _designed_ for point and click, they don't always work. Mostly because of bugs.
We run Ceph clusters under our product, and have seen a fair share of non-recoveries after temporary connection loss [1], kernel crashes [2], performance degradations on many small files, and so on.
Similarly, we run HA Postgres (Stolon), and found bugs in its Go error handling that cause failures to recover from crashes and full-disk conditions [3] [4]. This week, we found that full-disk situations will not necessarily trigger failovers. We also found that if DB connections are exhausted, the daemon that's supposed to trigger Postgres failover cannot connect to do so (currently testing the fix).
I believe that most of these things will be more figured out with hosted cloud solutions.
I agree that self-hosting HA with open-source software is the way to go. This software is good, and the more people use it, the fewer bugs it will have.
But I wouldn't call it "trivial".
If you have large data, it is also brutally cheaper; we could hire 10 full-time sysadmins for what hosting on AWS would cost versus doing our own Hetzner HA with free software, and we only need ~0.2 sysadmins. And it still has higher uptime than AWS.
It is true that Proxmox is easy to set up and operate. For many people it will probably work well for a long time. But when things aren't working, it's not so easy anymore.
I feel like 95% of the web falls into this category. Like, have you ever said "That's it, I am never gonna visit this page again!" because of temporary downtime? Unless you are Amazon and every minute costs you bazillions, you are likely gonna get the better deal by not worrying about availability and scalability. That 250€/m root server is a behemoth, complete overkill for most anything. As a bonus, you are gonna be in the half of the internet that still works when someone at AWS or Cloudflare touches DNS.
> Like, have you ever said "That's it, I am never gonna visit this page again!", because of temporary downtime?
That's a strawman version of what happens.
There have been times when I've tried to visit a webshop to buy something but the site was broken or down, so I gave up and went to Amazon and bought an alternative.
I've also experienced multiple business situations where one of our services went down at an inconvenient time, a VP or CEO got upset, and they mandated that we migrate away from that service even if alternatives cost more.
If you think of your customers or visitors as perfectly loyal with infinite patience then downtime is not a problem.
> Unless you are Amazon and every minute costs you bazillions, you are likely gonna get the better deal not worrying about availability and scalability. That 250€/m root server is a behemoth. Complete overkill for most anything.
You don't need every minute of downtime to cost "bazillions" to justify a little redundancy. If you're spending 250 euros/month on a server, spending a little more to get a load balancer and a pair of servers isn't going to change your spend materially. Having two medium size servers behind a load balancer isn't usually much more expensive than having one oversized server handling it all.
There are additional benefits to having the load balancer set up for future migrations, or to scale up if you get an unexpected traffic spike. If you get a big traffic spike on a single server and it goes over capacity you're stuck. If you have a load balancer and a pair of servers you can easily start a 3rd or 4th to take the extra traffic.
> There have been times when I've tried to visit a webshop to buy something but the site was broken or down, so I gave up and went to Amazon and bought an alternative.
Great. So how much did the webshop lose in that hour of maintenance (which realistically would be in the middle of the night for their main audience) and how much would they have paid for redundancy? Also a bit hard to believe you repeatedly ran into the situation of an item sold at a self-hosted webshop and Amazon alike. Are you sure they haven't just messed up the web dev biz? You could totally do that with AWS too...
> If you're spending 250 euros/month on a server, spending a little more to get a load balancer and a pair of servers isn't going to change your spend materially.
Of course, but that's not the argument. It's implied you can just double the 250€/m server for redundancy, as you would still get an offer at the fraction of cloud prices. But really that server needs no more optimization in terms of hardware diversification. As I said, it's complete overkill. Blogs and forums could easily be run on a 30€/m recycled machine.
Exactly. I've never not bought something because the website was temporarily down. I've even bought from b&h photo!
Even if Amazon was down, if I was planning to buy, I'd wait. heck, I got a bunch of crap in my cart right now I haven't finished out.
Intentional downtime lets everyone plan around it, reduces costs by not needing N layers of marginal utility which are all fragile and prone to weird failures at times you don't intend.
For me at least, the only thing where availability really matters is main personal communication services. If Signal was down for an hour, I'd be a little stressed. Maybe utilities like public transportation, too, but that's because I now have to do that online.
> Intentional downtime lets everyone plan around it, reduces costs by not needing N layers of marginal utility which are all fragile and prone to weird failures at times you don't intend.
Quite frankly, I would manage if things were run "on-supply" with solar and would just go dark at night.
A week of downtime every decade I think still works out to a higher uptime than I've been getting from parts of GitHub lately. So I'd consider that a win.
They may be making this decision based on a long history of, in fact, never really having run into "a lot of time in maintenance and future headaches".
To be fair, I migrated a VPS from Linode to Hetzner a few years ago. Minor downtime is a non-issue: personal website and email server. I approximately halved the monthly cost, and I haven't had any downtime except what I caused myself when rebooting to upgrade the kernel every now and then.
These articles are popular where there's a mismatch between application requirements and the solution chosen. When someone over-engineers their architecture to be enterprise-grade (substitute your own definition of enterprise-grade) when really they were running a hobby project or a small business where a day of downtime every once in a while just means your customers will come back the next day, going all-out on cloud architecture is maybe not necessary. That's why you see so many comments from people arguing that downtime isn't always a big deal or that risking an outage is fine: There are a lot of applications where this is kind of true.
The confusing part about this article is the emphasis on a zero-downtime migration toward a setup that isn't really ideal for uptime. It wouldn't be that expensive to add a little bit of architecture on the Hetzner side to help with this. I guess if you're doing a migration and you're paid salary or your time is free-ish, doing the migration in a zero-downtime way is smart. It's a little funny to see the emphasis on zero downtime juxtaposed with the architecture they chose, where uptime depends on nothing ever failing.
Clever architecture will always beat cleverly trying to pick only one cloud.
Being cloud agnostic is best.
This means setting up a private cloud.
Hosted servers and managed servers are perfectly capable of near-zero downtime. This is because the "cloud" runs on the same equipment (often more consumer-grade, even) and plans for even more failure.
Digital Ocean definitely does not guarantee zero downtime. That's a lot of 9's.
It's simple to run well-established tools like Proxmox on bare metal that will do everything Digital Ocean promises, and it's not susceptible to the attacks or exploits where shared memory and CPU usage leak what customers believe is their private VPS.
"Nothing ever failing" in the case of a tool like Proxmox means: install it on two servers, connect both servers as nodes so one VPS exists on both, click high availability, and it's generally up and running. Put Cloudflare in front of it, per today's best practices.
If you're curious about this, there are some pretty eye-opening, short videos on Proxmox available on YouTube that are hard to unsee.
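For the curious, a minimal sketch of that two-node flow, driving Proxmox's real pvecm/ha-manager CLIs from Python. The cluster name, peer IPs, and VM ID are made-up placeholders, and note that a two-node cluster normally also wants a QDevice for quorum:

    import subprocess

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # On node A: create the cluster.
    run(["pvecm", "create", "mycluster"])

    # On node B (run there, not here): join node A's cluster.
    # run(["pvecm", "add", "10.0.0.1"])

    # Two-node clusters usually need an external QDevice so the
    # surviving node keeps quorum after a failure:
    # run(["pvecm", "qdevice", "setup", "10.0.0.3"])

    # Mark VM 100 as highly available so it restarts on the surviving
    # node if its host dies.
    run(["ha-manager", "add", "vm:100", "--state", "started"])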
Sadly, hardware breaks. You still need a working backup and a working failover plan, even if it's just setting up a new server and running your Terraform / Pulumi / Saltstack scripts.
Indeed, I missed the "two servers" part; a two-node mirrored config is what I suggested myself elsewhere in the thread. It's still much less expensive than anything comparable in the cloud.
What you are running on it is the only question that matters. Obviously you don't want air traffic control to go down, but some app… so what if it goes down? The backup is somewhere else, if you even need it, anyway. GitHub has uptime less than 90% according to this: https://mrshu.github.io/github-statuses/ . And the world keeps turning. Obviously we should strive for better, but also let's please not keep making an uptime fetish out of it; for the vast majority of apps it absolutely doesn't fucking matter.
To be fair, they were using a single VM on DigitalOcean, so they didn't have the perks of a cloud provider, except maybe the fact that a VM is probably more fault-tolerant than a bare-metal server.
Usually those articles describe two situations:
- they were "on the cloud" for the wrong reasons and migrating to something more physical is the right approach
- they were "on the cloud" for the right reasons and migrating to something more physical is going to be a disaster
Here they appear to be in the first situation.
If their setup was running fine on DO and they put the right DR policies in place at Hetzner, they should be fine.
Also, don't underestimate the reliability of simplicity.
I was a Linux sysadmin for many years, and I have never seen as much downtime from simpler systems as I routinely see from more complicated setups. Somewhere between theory and reality, simpler systems just come out ahead most of the time.
I agree with you. Even for the servers I am responsible for, I always make decisions like putting the DB on Supabase instead of local, hosting files on S3 with versioning/multi-region, etc., and then of course come up with a backup and snapshot system.
I was thinking the same. A managed database is pretty much set-and-forget. I do NOT miss the old days of watching my email during routine security checkups, hoping my database didn't get hacked by some script kiddie, accompanied by blackmail over email.
If you have the server setup fully scripted and automated (bash, pyinfra, ansible, etc.) and backups are in place, then recovery isn't that hard. Downtime, sure: maybe a couple of hours, during which you can point your DNS entries to a static page while you're restoring everything.
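For illustration, a tiny pyinfra deploy of the kind this comment describes; a sketch only, with an assumed Debian-ish base image and made-up package and vhost names:

    # deploy.py: one file that can rebuild the box from scratch.
    # Run with: pyinfra inventory.py deploy.py
    from pyinfra.operations import apt, files, systemd

    apt.packages(
        name="Base LEMP stack",
        packages=["nginx", "mariadb-server", "php-fpm"],
        update=True,
    )

    files.put(
        name="Nginx vhost",
        src="files/example.com.conf",
        dest="/etc/nginx/sites-enabled/example.com.conf",
    )

    for svc in ("nginx", "mariadb"):
        systemd.service(
            name=f"Enable {svc}",
            service=svc,
            running=True,
            enabled=True,
        )

    # Recovery on a fresh server is the same command, plus restoring
    # the latest backups.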
DO doesn't do high-availability droplets, and their migration policy amounts to "we'll try, if we detect poor health of the server before it fails".
If someone starts thinking about redundancy and load balancers, then DO's solution is to rent a second similar-sized droplet and add their load-balancing service. If you do those things with Hetzner instead, you would still be spending less than you did with DigitalOcean.
Personally, what is keeping me on DO is that no single droplet I have is large enough to justify moving on its own, and I'm not prepared to deal with moving everything.
To be fair, modern dedicated servers at Hetzner have two power supplies and come with a redundant SSD/HDD RAID-1 config. AFAIK both the SSDs and the power supplies are hot-pluggable, so if either fails it can be replaced with zero downtime.
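As a sketch of what that hot swap looks like on a software RAID-1 (my illustration, not Hetzner documentation; device names are placeholders):

    # Fail and remove the dead member, have the data center swap the
    # drive, then re-add it; the array stays online throughout.
    import subprocess, time

    def run(cmd):
        subprocess.run(cmd, check=True)

    run(["mdadm", "/dev/md0", "--fail", "/dev/sda1"])
    run(["mdadm", "/dev/md0", "--remove", "/dev/sda1"])

    # ...physical swap via support ticket, then partition the new
    # drive to match and re-add it...
    run(["mdadm", "/dev/md0", "--add", "/dev/sda1"])

    # Watch the rebuild finish; reads/writes keep being served.
    while "recovery" in open("/proc/mdstat").read():
        time.sleep(60)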
Given the downtimes we saw in the past year(s) (AWS, Cloudflare, Azure - the latter even down several times), I would argue moving to any of the big cloud providers gives you not much of a better guarantee.
I myself am a Hetzner customer with a dedicated vServer, meaning it is a shared virtual server but with dedicated CPUs (read: still oversubscribed, but some performance guarantee) and had zero hardware-based downtime for years [0]. I would guess their vservers are on similar redundant hardware where the failing components can be hotswapped.
[0] = Once within the last 3 years they sent me an email that they had to update a router, which would affect network connectivity for the vServer; the notification came weeks in advance and the maintenance lasted about 15 minutes. No reboot/hardware failure on my vServer though.
The vast majority of services are actually alright with a little downtime here and there. In exchange, maintenance is a lot simpler with fewer moving parts.
People underestimate how far you can go with one or two servers. In fact, what I have seen in my career is many examples of services that should have been running on one or two servers and instead went for a hugely complex microserviced approach, all-in on cloud providers, with crazy reliability requirements for a scale that would never come.
People also tend to underestimate how much compute these dedicated servers have compared to cloud offerings, and what that feels like without 100 layers of management abstraction in between. You are likely never going to choke a plenty-cored, generously-RAMed root server at a fraction of your cloud costs. This overkill resource estate can be the answer to a lot of scalability worries. It's always there, no sharing at all.
I had like... less than 10 minutes downtime on Hetzner in years (funny enough, that makes my personal containers more reliable than productionized AWS and GCP deployments with their constant partial outages).
So perhaps all that complexity (beyond maybe a backup container) isn't really necessary for companies where a bit of downtime doesn't really affect revenue?
Like, I know Leetcode tells you otherwise, but most companies really don't need a full FAANG stack with 99.999% uptime. A day of outage in a few years isn't going to affect bottom lines.
I don't know about Hetzner but with Upcloud and Vultr my single VPS setups have been more reliable than multiregion with redundancy setups with other providers like Fly.
A few weeks ago, I tested deploying Rails apps to Hetzner and Vultr for the first time, using Hatchbox. I'm still supporting clients on Heroku, but there are potential new projects in the coming months that I might deploy elsewhere. Render is decent in some cases, but you can get a lot of bang for your buck deploying on Vultr, and Hatchbox makes it easy to do, whether you have one instance or a cluster. Hatchbox also helps with putting multiple apps/domains on a single server, a concept I had to give up long ago on Heroku. I've thought about deploying to DO plenty of times over the years, but there was always Heroku, and if I had to find a new home for Rails 8, I think I'd skip it in favor of a more powerful Vultr server. Hatchbox can provision Postgres for you, but Vultr has managed Postgres, which is appealing to me. Or if you're just using SQLite with Rails 8, that's easy to do with Hatchbox but not on Render, since Render has an ephemeral file system.
Downtime happens in so many different contexts of life that a website/service being knocked offline is far down the priority list for most people.
It's amusing that the US government can shut down for days/weeks/months over budget reasons and no adult discussion takes place about fixing the cause, yet the latest HN demo that 100 people will use needs all-nines reliability and gets hundreds of responses.
I already made a comment here about testing Hatchbox. You point it to your servers and it can set up a cluster and load balancer with a few button clicks.
I think DigitalOcean is not something where I would worry about costs. I would prefer a provider like Hetzner, but I don't think DO's costs are such that we need to move.
Plus, this is not what DHH was doing; he was not saving a few bucks, but unlocking potential for his company to thrive.
Ah yes, create db replica, promote replica to primary. Seems so simple!
When I’ve seen this work well, it’s either built into the product as an established feature, or it’s a devops procedure that has a runbook and is done weekly.
Doing it with low level commands and without a lot of experience is pretty likely to have issues. And that’s what happened here.
> Old server: CentOS 7 — long past its end-of-life, but still running in production. New server: AlmaLinux 9.7 — a RHEL 9 compatible distribution and the natural successor to CentOS.
So they made the same mistake all over again. Debian or Ubuntu would just upgrade in place and migrate.
Congrats on doing this successfully, but your setup is amateur. This would have been infinitely easier if you were using IaC (Terraform/Ansible) and containerized applications (not already doing that is madness), and had a high-availability cluster in place already. It sounds like avoiding downtime is important to you, yet there's no redundancy in the existing stack at all, and everything is done by hand.
This isn't something others should use as an example.
> Old server nginx converted to reverse proxy We wrote a Python script that parsed every server {} block across all 34 Nginx site configs, backed up the originals, and replaced them with proxy configurations pointing to the new server. This meant that during DNS propagation, any request still hitting the old IP was silently forwarded. No user would see a disruption.
What was the config on the receiving side to support this? Did you whitelist the old server IP to trust the forwarding headers? Otherwise you’d get the old server IP in your app logs. Not a huge deal for an hour but if something went wrong it can get confusing.
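A minimal sketch of the kind of rewrite script the article describes (not their actual code), plus the receiving-side directives that would answer the question above; the new IP and paths are placeholders:

    import pathlib, re, shutil

    NEW_IP = "203.0.113.10"  # hypothetical new server IP
    SITES = pathlib.Path("/etc/nginx/sites-enabled")

    PROXY_TMPL = """server {{
        listen 80;
        server_name {domain};
        location / {{
            proxy_pass https://{new_ip};
            proxy_ssl_server_name on;
            proxy_set_header Host $host;
            # Forward the real client address to the new server.
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }}
    }}
    """

    for conf in SITES.glob("*.conf"):
        text = conf.read_text()
        m = re.search(r"server_name\s+([^;]+);", text)
        if not m:
            continue
        shutil.copy(conf, conf.with_suffix(".bak"))  # keep the original
        conf.write_text(PROXY_TMPL.format(domain=m.group(1).split()[0],
                                          new_ip=NEW_IP))

    # On the *new* server, trust X-Forwarded-For only from the old IP
    # (ngx_http_realip_module), so app logs keep real client addresses:
    #     set_real_ip_from <old-server-ip>;
    #     real_ip_header X-Forwarded-For;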
Does anyone else start to wonder about these companies issuing VPSes/online space with no hardening and no warning?
You can basically go on Hetzner and spin up a Linux VPS that is exposed to the open internet with open ports and default user security, and within a few hours it's been hacked. There is no warning pop-up that says "if you do this your server will be pwned".
I especially wonder what will happen with all the AI-provisioned VPSes and Postgres DBs.
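For anyone nodding along, a first-pass hardening sketch in pyinfra. The package names and firewall rules are the usual Debian/Ubuntu ones and are my assumptions, not provider defaults:

    from pyinfra.operations import apt, files, server, systemd

    apt.packages(name="Install ufw and fail2ban",
                 packages=["ufw", "fail2ban"], update=True)

    files.line(name="Disable SSH password auth",
               path="/etc/ssh/sshd_config",
               line=r"^#?PasswordAuthentication",
               replace="PasswordAuthentication no")

    systemd.service(name="Restart sshd", service="ssh", restarted=True)

    server.shell(name="Default-deny firewall, allow SSH/HTTP/HTTPS",
                 commands=[
                     "ufw default deny incoming",
                     "ufw allow OpenSSH",
                     "ufw allow 80/tcp",
                     "ufw allow 443/tcp",
                     "ufw --force enable",
                 ])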
> 30 MySQL databases (248 GB of data)
> 34 Nginx virtual hosts across multiple domains
> GitLab EE (42 GB backup)
> Neo4J Graph DB (30 GB graph database)
> Supervisor managing dozens of background workers
> Gearman job queue
> Several live mobile apps serving hundreds of thousands of users
He's doing all of that on a single server?!
I'm not against vertical scaling and stuff, but 30 db instances in one server is just crazy.
It's an average of 8 GB per database. I guess he serves multiple clients and decided to "segregate" each client on its own instance. If it's acceptable for the business, there's nothing wrong with his setup.
If you're migrating a large MySQL database and you're not using mydumper/myloader, you're doing it the hard way.
If you aren't using xtrabackup, you are doing it wrong. I recently migrated a database with 2 TB of data from 5.7 to 8.4 with about 15 seconds of downtime. It wouldn't have been possible without xtrabackup. mydumper requires a global write block; I wouldn't call blocking writes for hours a "zero downtime migration".
Correct me if I'm wrong, but done with a proxy in between that can "pause" requests, you could have done the move with zero seconds of downtime and no rejected requests, and I don't think mydumper/myloader/xtrabackup matters for that. The "migration" would be spinning up a new database, making it catch up, then switching over. If you can pause/hang in-flight requests while switching, not a single one needs to fail :)
The "making it catch up" is the tricky part. You need an initial backup for that. xtrabackup can take that backup "hot" without blocking read/writes. mysqldumper will block writes for whatever time that initial backup takes, for 2TB of data that's going to be hours.
Once you have that initial back up you can set your replica and make it catch up , then you switch. I choose to take the few seconds of downtime doing the switch because for my use case that was acceptable.
If you want a consistent backup that you can use to setup a replica you need to block writes while the backup is taken, take the backup while the database is shutdown OR use xtrabackup.
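To make the flow concrete, a hedged sketch of the hot-backup-then-replica approach, shelling out to the real xtrabackup CLI (paths, hosts, and credentials are placeholders):

    import subprocess

    def run(cmd):
        subprocess.run(cmd, check=True)

    # 1. Hot backup on the old primary: no global write lock.
    run(["xtrabackup", "--backup", "--target-dir=/backup/base"])

    # 2. Prepare it so the data files are consistent.
    run(["xtrabackup", "--prepare", "--target-dir=/backup/base"])

    # 3. After restoring on the new server, read the binlog coordinates
    #    that xtrabackup recorded at backup time...
    with open("/backup/base/xtrabackup_binlog_info") as f:
        binlog_file, binlog_pos = f.read().split()[:2]

    # 4. ...and start replication from those coordinates (run on the
    #    new server's MySQL):
    #    CHANGE REPLICATION SOURCE TO SOURCE_HOST='old-primary',
    #        SOURCE_LOG_FILE='<file>', SOURCE_LOG_POS=<pos>, ...;
    #    START REPLICA;
    print(binlog_file, binlog_pos)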
I have experience in migrating large DBs with replication and the article not discussing write blocks made my ears perk up as well.
Aside from the blocking you mentioned during the initial snapshot, you'd need to block writes to the old DB before the cutover as well. There's no way to guarantee in-flight writes to the old DB aren't lost when promoting the replica to a primary otherwise. I'm surprised the author didn't go into more detail here. Maybe it was fine given their workload, but the key issue I see is that they promoted the new DB to a primary before stopping the old application. During that gap, any data written to the old DB would be lost.
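A safer cutover order for the concern raised above, sketched with pymysql (hosts and credentials are made up); the idea is simply to stop writes on the old primary before promoting:

    import time
    import pymysql

    old = pymysql.connect(host="old-db", user="root", password="...")
    new = pymysql.connect(host="new-db", user="root", password="...")

    # 1. Block writes on the old primary FIRST, so nothing lands there
    #    after the replica is promoted.
    with old.cursor() as c:
        c.execute("SET GLOBAL super_read_only = ON")

    # 2. Wait for the replica to apply everything in flight.
    #    (A stricter check would compare GTID sets.)
    while True:
        with new.cursor(pymysql.cursors.DictCursor) as c:
            c.execute("SHOW REPLICA STATUS")
            status = c.fetchone()
        if status and status["Seconds_Behind_Source"] == 0:
            break
        time.sleep(1)

    # 3. Promote the replica and open it for writes.
    with new.cursor() as c:
        c.execute("STOP REPLICA")
        c.execute("RESET REPLICA ALL")
        c.execute("SET GLOBAL read_only = OFF")
    # 4. Now repoint the application at new-db.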
If I remember correctly (it has been a while since I looked), although Hetzner is a lot cheaper on the price sheet, they're European-region by default, and if you look to get US-region servers at Hetzner, the pricing is a lot higher and similar to DigitalOcean. Is that still the case?
For OP, though, who is a Turkey-based company and wants European-region servers anyway, it might make sense.
For what I use Hetzner for, and OP from the article, Hetzner only has dedicated servers in Europe, so there really isn't anything to compare to :) If I need dedicated servers in the US, I'd probably go with Vultr.
I think Hetzner makes most sense (for myself, and OP seemingly too) because they have dedicated servers, and they're in Europe. Extra bonus is the unmetered connection, but primarily just good and cheap servers :)
They're great but I wish Hetzner had a US (or CA) east coast presence, the latency of going across the ocean is really troublesome. They have some presence for their cloud offering, so they at least have some experience with the idea.
In the big corporate world, this would be a $600m budget, creating multiple VPs, thousands of positions, multi-cloud and multi-dc kubernetes, tons of highly paid consultants, the migration would take 9 - 12 years, create so many success stories, lessons learnt, promotions, etc etc.
A zero-downtime migration to a single database server? Power fails, disks fail, even CPU fans sometimes fail and bring a single server to a halt. Somehow I would have expected at least a high-available database cluster with multiple machines for applications "serving hundreds of thousands of users".
Did this about a year ago; it went smoother than expected, tbh. The main gotcha for us was DO's managed Postgres: we had to dump/restore manually since there's no direct migration path to Hetzner's managed DBs. We ended up just self-hosting Postgres on a separate box, which has been fine, maybe even better.
I assume a VM on DO is HA-protected. Also, storage might live on a cluster. Did you consider a second dedicated server, or do you just accept the risk of a longer failover time and data-loss window (RPO) when recovering to a newly provisioned server? Would love to know your thoughts on this, especially as the migration was well designed and executed.
I’m formulating plans to switch from AWS to Hetzner. Amazon gets you by charging high prices (sometimes 20x more than competitors) and forcing you to make long-term commitments in order to get the prices to somewhere more reasonable. Then they make it exorbitantly expensive to migrate your data anywhere else. It’s a very customer-hostile approach that I’m tired of at this point.
Amazon might think that they’re locking people in with the egress fees. But they’re also locking people out. As soon as you switch one part to a competitor, the high egress forces you to switch over everything.
It’s going to be complicated to switch, but it’s made easier by the fact that I didn’t fall into the trap of building my platform on Amazon-specific services.
I'm not trying to convince you to stay (I work for neither anymore!), just wanted to note that you can technically request a waiver. I'm not sure how this works in practice though. Like, if you want to leave Athena and move to something on-premise is that enough to have just that workload? Maybe!
Edit: I also didn't follow this at the time, but the AWS wording suggests that the "EU Data Act" is also involved.
This doesn't actually work as advertised. I attempted free data egress from AWS in December. It took them 31 days to respond to my initial ticket. At which point they gave me a multi-page questionnaire to determine eligibility and they also told me I could not begin DTO until 60 days had passed from approval of the questionnaire.
By the time I was allowed "free egress" my cumulative S3 storage charges over the prior 100 days would have roughly matched the cost of egress if I just did so originally.
I'm in the US so the EU Data Act protections don't apply.
Have you tried to use the DTO? I did. They make you fill in a form saying you'll migrate all services (despite the blog post saying that isn't necessary), and then they take up to 12 weeks to make a decision. In my case they rejected it on a formality after 2 weeks and said to try again (the timer starts again).
So in my case that would have been 14 weeks plus the time to migrate away. The egress costs are equivalent to around 17 weeks storage cost. So you save around 1c/gb if they don't find some reason to reject it.
Hard to read this article, as it was written by Claude as a report on the migration that Claude did for you.
If an LLM helped you migrate and save this much money, kudos. But if you decide to write about it, at least proofread it and remove the redundant parts and LLM storytelling.
A few months ago, I looked into AWS alternatives for my small SaaS side project. My main motivations were to save money and maybe support some EU cloud providers. At first, I planned to go with Hetzner and accepted that I would need to do a lot of things myself.
However, the dealbreaker for me was that Hetzner IPs have a bad reputation. At work, I learned that one of the managed AWS firewall rules blocks many (maybe all) of their IPs.
I can’t even open a website hosted on a Hetzner IP from my work laptop because it’s blocked by some IT policy (maybe this is not an issue for you if you are using CloudFlare or similar).
I've read online that the DDoS protection is very bad as well.
So in the end, I picked DO App Platform in one of the EU regions. Having the option to use a managed DB was a big plus as well.
Not sure what firewall rules you're referring to, but I'm genuinely surprised to see DO being trusted more than Hetzner. I often see DO's ASN when looking at scrapers/hackers, so I'd say it's only a matter of time until they're blocked as well.
It looks like Hetzner is Tor (and Tor-adjacent) friendly. I suggested this might affect IP reputation; two users responded that they had no IP reputation issues. But it looks like that wasn't quite the whole story.
Yeah, well, be careful with Hetzner. I used to love them, but I just migrated away. They shut off all of our VMs over a $36 billing dispute (~30 VMs we were using for our CI/CD pipeline). We provided them evidence with records of the payment in totality from our bank; they refused to look at it or discuss the dispute, even when we were communicating urgently, and ultimately just shut off all our access. We're on Scaleway now.
Hm. Hetzner's billing stuff is highly automated, but they usually give you about a month to pay your bill if the credit card payment failed for some reason.
Have had some hiccups with payments not going through myself that ended up in server IPs being restricted but they were very helpful on the phone and service was restored in about 30 minutes after the call. Decidedly not ideal but has been easily manageable since.
I know they've been bought out by Akamai or whatever but I've been using Linode for over 10 years and I still go to them if I need a VPS. I don't have extreme needs, but they seem to be always improving or adding features comparable to other providers and the UI is consistent so I don't see a reason to change. Any time there has been an issue they've migrated me to a new host automatically without even needing to do anything. I combine it with Dokploy now and just deploy most of my projects via Docker Compose and private GitHub repos.
It's a pity that Hetzner does not have a monitoring agent like DO's. In DO you can set alerts and view all metrics. It's this one thing that keeps me from migrating, because I don't want to install custom monitoring solutions.
>Skyrocketing inflation and a dramatically weakening Turkish Lira against the US dollar
This reasoning does not add up. They could simply say they needed to move somewhere cheaper, like Hetzner. Inflation is still high but getting lower. The weakened-Turkish-Lira part is not correct, because the dollar has been artificially suppressed for a very long time.
> The key: proxy_ssl_verify off — the new server’s SSL cert is valid for the domain, not for the IP address. Disabling verification here is fine because we control both ends.
Yeah - no, it's not. They made the MitM attack possible with this change. The exposure was limited to those 5 minutes, but it should have been a known risk.
Also, I'm not certain how they could check the apps on the new server with the read-only database while it was still a replica?
Still, nice to hear it succeeded, the reasons sound very familiar.
> The key: proxy_ssl_verify off — the new server’s SSL cert is valid for the domain, not for the IP address. Disabling verification here is fine because we control both ends.
Not really, a MITM could do anything here. It's not very likely to happen here, but I think this comment shows a misunderstanding of what certificates and verification does.
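For reference, verification could have stayed on: nginx can pin the expected certificate name even when proxy_pass targets an IP. A sketch with placeholder values, written out from Python in the spirit of the article's config scripts:

    SNIPPET = """
    location / {
        proxy_pass https://203.0.113.10;      # new server's IP
        proxy_ssl_verify on;
        proxy_ssl_name example.com;           # name the new cert carries
        proxy_ssl_server_name on;             # send SNI as well
        proxy_ssl_trusted_certificate /etc/ssl/certs/ca-certificates.crt;
    }
    """

    with open("/etc/nginx/snippets/verified-proxy.conf", "w") as f:
        f.write(SNIPPET)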
Hetzner oversells hardware which means your neighbors are a drag on your performance. If your server is mostly idle, this might be a good move. If not, it probably won't be worth it.
Really interesting sharing, thanks! Why lower the TTL to 300 instead of something like 60 or 30, to make the switch even faster? The nameservers were DO's, so they should've been more than able to handle the increased load.
BTW, I've been a client of Hetzner (Cloud, Object Storage, and Storage Box) for a few years now, very happy with them!
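If you want to watch a cutover propagate rather than guess, something like this works; a sketch using dnspython (my library choice, not the article's), with a placeholder domain and IP:

    import time
    import dns.resolver

    RESOLVERS = ["1.1.1.1", "8.8.8.8", "9.9.9.9"]
    DOMAIN, NEW_IP = "example.com", "203.0.113.10"

    def answers(resolver_ip):
        r = dns.resolver.Resolver(configure=False)
        r.nameservers = [resolver_ip]
        return {a.address for a in r.resolve(DOMAIN, "A")}

    while True:
        seen = {ip: answers(ip) for ip in RESOLVERS}
        if all(v == {NEW_IP} for v in seen.values()):
            print("fully propagated")
            break
        print(seen)
        time.sleep(30)  # with a 300s TTL, stale answers age out fast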
I considered going to hetzner at one point but I read a lot of stories around hetzner that didn't inspire confidence. Primarily that they're not really that much cheaper than going to other companies offering something similar.
If some people can chime in with their positive experiences I might switch.
Not everyone likes wasting money.
Certainly the former is more predictable than the latter though.
Choosing to do neither is wasting money.
Moving away from the US also felt great.
How deep does this go?
So it's a Claude ad inside a Hetzner ad inside a decent grammar ad.
Btw this type of grammar error can be found by proofreading your posts with ChatGPT powered OpenClaw assistant.
I know your comment is tongue-in-cheek and the poster here is kinda known, but this kind of astroturfing is a new low and it's everywhere on forums such as these.
https://digg.com/
I'm not. I stick around for the popcorn, and I'm not gonna miss the schadenfreude in a few years.
It's too bad Reddit allows accounts to hide their comment history now. That used to be an easy way to identify bot accounts.
They're incredibly exposed to investor sentiment and were likely panicking around sept/oct/nov time when AI bubble stories were trending.
These posts were really consistent and repetitive - similar language about "scary good" models and fear of losing jobs.
Of course, if all you do is "host a WordPress website" (like 80% of what "webdevs" do), it will work. Now the issue is that the last 20% is the hardest to cover, and current AI methods will not get there (you need much more complex methods, like being able to integrate logic with learning-based ML, to do this).
https://en.wikipedia.org/wiki/Salvatore_Sanfilippo
This whole thread is hilarious.
I don't see it as much different from "I used script X to do it" or something.
Or you can update the app to remove the dependency on the library.
But honestly, this is what containers or VMs are built for in the first place.
What's exciting is how simple CLI tools can be so impactful to dev workflows.
If agents eat that glue, the moat gets thin fast.
No wonder they hallucinate :)
But maybe I’m just thinking of the current capabilities of agents, and if we fast forward a couple years, even removing these abstractions or migrating will be very low friction.
I run k8s on a bunch of dedicated servers that are super cheap and I have all bells and whistles - just tell your coding agent to do it. You can literally design the thing you would never do yourself and it works brilliantly.
Postgres running on dedicated hardware, replicated and with WAL backups: easy, just tell codebuff (my harness of choice) to do it. Then any number of firewalls, load balancers, bastion servers, etc.; if you can imagine it, codebuff will implement it.
Obviously I agree that AI can be useful to write boilerplate, but it's in no way something you should use blindly when trying to do a migration or anything touching prod
So, to be more precise: no, "Claude Code didn't migrate it all". Claude Code helped you write boilerplate so that you could migrate.
And, recent research suggests that anthropomorphization may actually be positively correlated with intelligence.
Syntax did a nice episode on this topic recently. They went over where it works well, and where it does not work well.
https://syntax.fm/show/992/migrating-legacy-code-just-got-ea...
Sure you lose a little of the benefit of a “virtual” server which can be migrated but Hetzner’s support has always been super fast and capable, should I wind up in a situation where I’ve got downtime.
It's worse than Oracle and they don't even use lawyery contracts.
The technology itself is the tendrils.
Recently we had several of our VMs offline because they apparently have these large volume storage pools they were upgrading and suddenly disks died in two large pools. It took them 3 days to resolve.
Hetzner has no integrated option to back up volumes, and it's roll-your-own :/ You also can't control volume distribution across their storage nodes for redundancy.
Sure, it cost me £6/mo to serve ONE lambda on AWS (and perhaps 500 requests per month). Sure it was awesome and "proper". But crazy expensive.
I host it now (and 5 similar things) for free on Cloudflare.
But if you need what AWS provides, you'll get that. And that means sometimes it's not the most cost-effective place.
On the other hand, we have dozens production workloads on Lambda handling thousands of requests daily and we spend like $50/mo on Lambda.
I'm really intrigued by what you did to get to those figures!
Cloud used to be marketed for scalability. "Netflix can scale up when people are watching, and scale down at night".
Then the blogosphere and astroturfing got everyone else on board. How can $5 on amazon get you less than what you got from almost any VPS (VDS) provider 10 years ago?
AWS and Azure are charging an arm and a leg, but the offered quality is mostly perceived. Most of the bits and bobs they charge for don't provide much value for the vast majority of businesses. I won't even go over the complete lack of ergonomics in their portals.
And Mercedes is just like AWS in dumb charges: new tires, EUR 1000+ for a set; replacing car keys, EUR 1000+.
I see you've never actually owned or worked on a German car, especially in relation to even modest Japanese models. Maybe they were a little nicer inside in the 80s and maybe 90s, but "German car" and frankly "European make" is basically synonymous with "big expensive pile of shit that's an expensive pain in the ass when things start falling apart (which they seem to with increasing rapidity)." It's like the disease that plagued British cars for the longest time got contaminated with the German propensity to build overly complex monstrosities.
Maaan, I have some bad news for you...
Because with a single-server setup like this, I'd imagine that hardware (e.g. SSD) failure brings down your app, and in the case of SSD failure, you then have hours or days downtime while you set everything up again.
Once the first SSD fails after some years, and your monitoring catches that, you can either migrate to a new box, find another intermediate solution/replica, or let them hot-swap it while the other drive takes over.
Of course, going to physical servers loses the redundancy of the cloud, but that's something you need to price in when looking at the savings and deciding your risk model.
And yes, running this without at least daily snapshotting/backup to remote storage is insane; that applies to the cloud as well, albeit easier to set up there.
For quite a while we ran single power supplies because they were pretty high quality, but then Supermicro went through a ~6 month period where basically every power supply in machines we got during that time failed within a year, and replacements were hard to come by (because of high demand, because of failures), and we switched to redundant. This was all cost savings trade-offs. When running single power supplies, we had in-rack Auto Transfer Switches, so that the single power supplies could survive A or B side power failure.
But, and this is important, we were monitoring the systems for drive failures and replacing them within 24 hours. Ditto for power supplies. If you don't monitor your hardware for failure, redundancy doesn't mean anything.
Have at least 2x servers, then invest in proper monitoring.
Servers can fail without disk failures.
Yeah. This blog post reads like it was written by someone who didn't think things through and just focused on hyper-aggressive cost-cutting.
I bet their DigitalOcean vm did live migrations and supported snapshots.
You can get that at Hetzner but only in their cloud product.
You absolutely will not get that in Hetzner bare metal. If your HD or other component dies, it dies. Hetzner will replace the HD, but it's up to you to restore from scratch. Hetzner are very clear about this in multiple places.
They could, but they didn't and instead they wrote that blog post which, even being generous is still kinda hard to avoid describing as misleading.
I would not have written the post I did if they had presented a multi-node bare-metal cluster or whatever more realistic config.
What do you feel was misleading?
Erm. I already spelt it out in my original post?
I'm not going to re-write it, the TL;DR is they are making an Apples and Oranges comparison.
Yes they "saved money" but in no way, shape or form are the two comparable.
The polite way to put it is... they saved as much money as they did because they made very heavy-handed "architectural decisions". "Decisions" that they appear to be unaware of having made.
They don't.
And reading the article, they don't seem to understand that.
I agree with the other poster, this is fine for a toy site or sites but low quality manual DR isn't good for production.
Curious what the delta in pain-in-the-ass would be if I wanted to deal with storing data. (And not just backups/migrations, but also GDPR, age verification, etc.)
I already design with Auto Scaling Groups in mind; we run on spot instances, which tend to be much cheaper. Spot instances can be reclaimed anytime, so you need to keep this in mind.
I also have data blobs which are memory-mapped files; they are swapped with no downtime by pulling a manifest from a GCS bucket each hour and swapping out the mmapped data.
I use replicas, with automatic voting-based failover.
I've used Mongo with replication and automatic failover for a decade in production with no downtime and no data loss.
Recently, I got into Postgres; so far so good. Before that I always used RDS or another managed solution like Datastore, but they cost so much compared to running your own stuff.
Healthchecks start a new server in no time; even if my Hetzner server goes out, or the whole of Hetzner goes out, my system will launch DigitalOcean nodes which will start soaking up all requests.
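My reading of that mmap swap, as a sketch (paths are placeholders and the GCS download is elided): download the new blob next to the old one, atomically rename it into place, then re-map.

    import mmap, os

    DATA = "/srv/blobs/index.bin"

    def remap(path):
        f = open(path, "rb")
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

    current = remap(DATA)

    def refresh(tmp_path):
        """Swap in a freshly downloaded blob with no downtime."""
        global current
        os.replace(tmp_path, DATA)   # atomic on the same filesystem
        old, current = current, remap(DATA)
        old.close()                  # new readers use the new mapping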
I don't know where to start with this comment. Do I really need to spell out the difference between cloud and bare metal ?
A few examples...
Yeah you pay for and get additional stuff with cloud. Nobody disputed that.
Well, technically its still a possibility.
I am old enough to have seen issues with RAID1 setups not being able to restore redundancy, as well as RAID controller failures and software RAID failures.
Also, frankly you are being somewhat pedantic. My broader point was regarding cloud. I gave HD Failure as one example, randomly selected by my brain ... I could have equally randomly chosen any of the other items ... but this time, my brain chose HD.
Also, with something like Hetzner you would not be going in and physically doing anything. You just tick a box for a RAM upgrade, and then migrate over or do an active/passive switch.
The cloud does have advantages, mostly in how "easy" it is to do some specific workflows, but per unit of compute it's at least 10x the cost. Some will argue it's less than that, but they forget to factor in just how slow virtual disks and CPUs are. Cloud only makes sense for very small businesses, for which the operational cost of colocation or on-prem hosting is too expensive.
Not every app needs 24/7 availability. The vast majority of websites out there will not suffer any serious consequences from a few hours of downtime (scheduled or otherwise) every now and then. If the cost savings outweigh the risk, it can be a perfectly reasonable business decision.
A more interesting question would be what kind of backup and recovery strategy they have, and which aspects of it (if any) they had to change when they moved to Hetzner.
Recently, I did it in PostgreSQL using pg_auto_failover. I have 1 monitor node, 1 primary, and 1 replica.
Surprisingly, once you get the hang of PostgreSQL configuration and its gotchas, it’s also very easy to replicate.
I’m guessing MySQL is even easier than PostgreSQL for this.
I also achieved zero downtime migration.
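For anyone wanting to try it, the pg_auto_failover topology above boils down to a few pg_autoctl invocations; a sketch with placeholder hostnames and data directories:

    import subprocess

    def run(cmd):
        subprocess.run(cmd, check=True)

    # On the monitor node:
    run(["pg_autoctl", "create", "monitor",
         "--pgdata", "/var/lib/postgres/monitor",
         "--hostname", "monitor.internal"])

    # On each data node (the first registers as primary, the second
    # automatically becomes a replica):
    run(["pg_autoctl", "create", "postgres",
         "--pgdata", "/var/lib/postgres/data",
         "--hostname", "db1.internal",
         "--monitor",
         "postgres://autoctl_node@monitor.internal:5432/pg_auto_failover"])

    # Then `pg_autoctl run` keeps the node enrolled, and
    # `pg_autoctl perform switchover` does a controlled primary swap.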
Asking the obvious question: why not your own server in a colo?
Then, say if the motherboard gives up, you have to do quite a bit of work to get it replaced, you might be down for hours or maybe days.
For a single server I don't think it makes sense. For 8 servers, maybe. Depends on the opportunity cost.
Using something like AWS can make it easy to assume that servers don’t fail often but that’s because the major players have all of that behind the scenes, heavily tested, and will migrate VMs when prefail indicators trigger but before stuff is done.
Most expense is initial setup and automation, but once you get thru that hump and have non-spiky loads it can be massively cheaper
Have you seen what the LLM crowd have done to server prices ?
The problem with actually owning hardware is that you need a lot of it, and need to be prepared to manage things like upgrading firmware. You need to keep on top of the advisories for your network card, the power unit, the enterprise management card, etc. etc. If something goes wrong someone might need to drive in and plug in a keyboard.
Eventually we admitted to ourselves we didn't want those problems.
At one point in the early 2000's, my brother was soldering new capacitors onto dell raid cards. (I like to call that full-stack ops.)
But it's indeed cheaper with high, sustained workloads.
I see the DigitalOcean vs Hetzner comparison as a tradeoff that we make in different domains all day long, similar to opening your DoorDash or UberEats instead of making your own dinner(and the cost ratio is similar too).
I work in all 3 major clouds, on-prem, the works. I still head to the DigitalOcean console for bits and pieces type work or proof of concept testing. Sometimes you just want to click a button and the server or bucket or whatever is ready and here's the access info and it has sane defaults and if I need backups or whatnot it's just a checkbox. Your time is worth money too.
You're describing Hetzner Cloud, which has been like this for many years. At least 6.
Hetzner also offers Hetzner Cloud API, which allows us to not have to click any button and just have everything in IaC.
https://docs.hetzner.cloud/
One is about all the steps of zero downtime migration. It's widely applicable.
The other is the decision to replace a cloud instance with bare metal. It saves a lot in costs, but also the loss of fast failover and data backups is priced in.
If I were doing this, I would run a hot spare for an extra $200 and switch the primary every few days, to guarantee that both copies work well and the switchover is easy. It would be a relatively low price for a massive reduction in the risk of a catastrophic failure.
I hardly ever visit their website; everything is done from the terminal.
The issue, though, is that you lose the managed part of the whole cloud promise. For ephemeral services this is not a big deal, but for persistent stuff like databases, where you would like to have your data safe, it is kind of an issue, because it shifts additional effort (and therefore cost) onto your operations team.
For smaller setups (attention, shameless self-promotion incoming) I am currently working on https://pellepelster.github.io/solidblocks/cloud/index.html which lets you deploy managed services to the Hetzner Cloud from a Docker-Compose-like definition, e.g. a PostgreSQL database with automatic backup and disaster recovery.
They do offer VPS in the US and the value is great. I was seriously looking at moving our academic lab over from AWS but server availability was bad enough to scare me off. They didn't have the instances we needed reliably. Really hoping that calms down.
As such, I doubt the noted price reduction is reproducible. Combine this with Hetzner's sudden deletions of user accounts and services without warning, and it's a bad proposition. Search r/hetzner and r/vps for hetzner for these words: banned, deleted, terminated; there are many reports. What should stun you even more about it is that Hetzner could ostensibly be closely spying on user data and workloads, even offline workloads, without which they won't even know who to ban.
The only thing that Hetzner might potentially be good for is to add to an expendable distributed compute pool, one that you can afford to lose, but then you might as well also use other bottom-of-the-barrel untrustworthy providers for it too, e.g. OVH.
> $1,432 to $233
A 5/6 difference in price means the decision to move providers doesn't materially change, even after a 40% price increase.
I wish the industry would adopt more zero-knowledge methods in this regard. They exist and are mathematically proven, but it seems there is no real adoption.
- OpenAI wants my passport when topping up 100 USD
- Bolt wanted recently my passport number to use their service
- Anthropic seems to want passports for new users too
- Soon age restriction in OS or on websites
I wish there were a law (in Europe and/or the US) to minimize or forbid this kind of identity verification.
I want to support the companies to not allow misuse of their platforms, at the same time my full passport photo is not their concern, especially in B2B business in my opinion.
The only possible non legally driven reason I can think of would be if they think the tradeoff of extra friction (and lost customers) is more than offset by fraud protection efforts. This seems unlikely cause I don't see how that math could have changed in the last few years.
It's bad enough living in America without the rest of the world adopting the grift economy.
Which is fundamental to so many XY problems, including why cloud services are so byzantine instead of just providing isolated secure shells with full root access within them. And why distrust is a growing force in the world instead of, say, unconditional love.
I always dreamed of winning the internet lottery so that I could help dismantle the systems of control which currently dominate our lives. Which starts with challenging paradigms from first principles. That looks like asking why we only have multicore computing in the cloud and not on our desktops (which could be used to build our own cloud servers).
When we're missing an abstraction layer, that creates injustice and a power drain from the many to the few. Some examples:
- CPU -> multicore MIMD (missing) -> GPU (based on the subset SIMD instead of MIMD upon which graphics libraries could be built)
- UDP -> connectionless reliable stream (missing) -> TCP (should have been a layer above UDP, not beside it)
- UDP/TCP -> P2P (NAT and other limitations block this and were inherited by IPv6 as generational trauma) -> WebRTC (redundant if we had P2P that "just works")
- internet connection -> symmetric upload/download speed (blocked for legal reasons under the guise of overselling to reduce cost) -> self-hosted web servers (rare due to antitrust issues stemming from said legal reasons)
- internet connection -> multicast (missing due to suppression of content-addressable memory/hash trees/DHTs) -> self-hosted streaming (negates the need for regions and edge caching)
I had high hopes for Google and even Tesla (for disrupting the physical world). But instead of open standards, they gave us proprietary vendor lock-in: Google Workspace (formerly G Suite) and NACS instead of J1772 (better yet both). Because of their refusal to interoperate at the lowest levels, there is little hope that they will do the real work of solving the hard problems at the highest levels.
For example, I just heard that China has built thousands of battery swap stations to provide effectively instant charging for electric vehicles, whereas that's something that Tesla can't accomplish because they chose to build Supercharger stations instead.
Once we begin to see the world this way, it's impossible to unsee it. It calls into question the fundamentals (like scarcity) which capitalism is based upon, and even the concept of profit itself.
From a spiritual perspective, I believe that this understanding is what blocks me from using my talents to use the system for personal gain to win the internet lottery. The people who own the systems of control don't have this understanding, and even view its basis in empathy as a liability. So we sacrifice the good of the many for the good of the few and call that progress.
Absolutely no to this - reason enough to go with AWS or alternatives. And why are people willingly giving it to hosting providers?
Unnecessarily exposing yourself to identity theft if they get compromised.
If Hetzner allows you to host something and you use it for illegal acts, they aren’t going to jail to shield you for €10/month.
And if someone wants to do illegal things, what's stopping them from submitting a fake ID?
But at least if there is an alternative, great.
They still decided my information was fake and terminated my account.
I'm never going to do business with them again.
Not sure what differs in our cases, I'm based in EU.
As I understand it, they ask only for accounts that check several boxes for common cases of abuse. So basically, personal accounts (as opposed to business accounts) from poor countries (per capita, so e.g. India qualifies as poor).
Cloud is ludicrously marked up.
For backups we use both Velero and application-level backup for critical workloads (i.e. Postgres WAL backups for PITR). We also ensure all state is on at least two nodes for HA.
We also find bare metal to be a lot more performant in general. Compared to AWS we typically see service response times halve. It is not that virtualisation inherently has that much overhead, rather it is everything else. Eg, bare metal offers:
- Reduced disk latency (NVMe vs network block storage)
- Reduced network latency (we run dedicated fibre, so inter-az is about 1/10th the latency)
- Less cache contention, etc [1]
Anyway, if you want to chat about this sometime just ping me an email: adam@ company domain.
[0] https://lithus.eu
[1] I wrote more on this 6 months ago: https://news.ycombinator.com/item?id=45615867
My entire stack is... k8s, hosted Postgres, S3-type storage. I can always host my own Postgres, so it really comes down to k8s and S3. I think Hetzner has some kind of S3 storage but I haven't looked into it, and I assume moving in 100 TB is a process….
Your post was reasonable until the spam tagline.
Not cool.
High availability, in case anyone else was wondering.
I measured this several years back and never looked at virtual servers again. Since CPU time isn't reserved (like RAM is), the performance is abysmal compared to real hardware.
https://jan.rychter.com/enblog/cloud-server-cpu-performance-...
Sounds like from the requirement to live migrate you can't really afford planned downtime, so why are you risking unplanned downtime?
DigitalOcean absolutely is just not an enterprise solution. Don't trust it with your data.
Oh, and did I mention I had been paying the upcharge for backups the entire time?
Full of scanners, script kiddies and maybe worse.
Namely, all remote access (including serving HTTP) must be managed by a major player big enough to be part of private disclosure (e.g. Project Glasswing).
That doesn't mean we have to use AWS et al for everything, but some sort of zero trust solution actively maintained by one of them seems like the right path. For example, I've started running on Hetzner with Cloudflare Tunnels.
Anyone else doing something similar?
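For anyone curious, the tunnel side is only a few commands; a sketch around the real cloudflared CLI (tunnel name, hostname, and port are placeholders):

    import subprocess

    def run(cmd):
        subprocess.run(cmd, check=True)

    run(["cloudflared", "tunnel", "login"])            # one-time browser auth
    run(["cloudflared", "tunnel", "create", "myapp"])  # writes a credentials file
    run(["cloudflared", "tunnel", "route", "dns", "myapp", "app.example.com"])

    # ~/.cloudflared/config.yml maps the hostname to a local service:
    #     tunnel: myapp
    #     credentials-file: /root/.cloudflared/<tunnel-id>.json
    #     ingress:
    #       - hostname: app.example.com
    #         service: http://localhost:8080
    #       - service: http_status:404
    run(["cloudflared", "tunnel", "run", "myapp"])
    # No inbound ports are open; the tunnel dials out to Cloudflare.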
How much latency does this add?
https://slitherworld.com
My foray into multiplayer games.
I've spent time eating the costs of things like DigitalOcean or SaaS products because my time is better spent growing my revenue than reducing infrastructure costs. But at some point, costs can grow large enough that's it's worthwhile to shift focus to reducing infrastructure spend.
And I say it every time they come up: their cloud UX is brilliant and simple compared to the big ones out there.
Happened to me.
I now advise people to avoid clown-led services like Hetzner and stick to more reputable, if not as cheap, options.
So a near 44% price reduction for a 50% reduction in only one of the components. Looks like progression to me.
You saved a lot of money but you'll spend a lot of time in maintenance and future headaches.
Sometimes it's completely acceptable that a server will run for 10 years with say 1 week or 1 month of downtime spread over those 10 years, yes. That's the sort of uptime you can see with single servers that are rarely changed and over-provisioned as many on Hetzner are. Some examples:
Small businesses where the website is not core to operations and is more of a shop-front or brochure for their business.
Hobby websites too don't really matter if they go down for short periods of time occasionally.
Many forums and blogs just aren't very important too and downtime is no big deal.
There are a lot of these websites, and they are at the lower end of the market for obvious reasons, but probably the majority of websites in fact, the long tail of low-traffic websites.
Not everything has to be high availability and if you do want that, these providers usually provide load balancers etc too. I think people forget here sometimes that there is a huge range in hosting from squarespace to cheap shared hosting to more expensive self-hosted and provisioned clouds like AWS.
Also, in general, you can architect your application to be more friendly to migration. It used to be a normal thing to think about and plan for.
VMware has a conversion tool that converts bare metal into images.
One could image, then do regular snapshots, maybe centralize a database being accessed.
Sometimes it's possible to create a migration script that you run over and over to the new environment for each additional step.
Others can put a backup server in between to not put a load on the drive.
Digital Ocean makes it impossible to download your disk image backups which is a grave sin they can never be forgiven for. They used to have some amount of it.
Still, a few commands can back up the running server to an image, and stream it remotely to another server, which in turn can be updated to become bootable.
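The "few commands" version, as I understand it (a sketch, not the poster's exact method; device, host, and path are placeholders, and you'd want the filesystem quiesced or a rescue system):

    import subprocess

    # Stream a compressed block-device image over SSH to a remote box.
    subprocess.run(
        "dd if=/dev/sda bs=64K status=progress"
        " | gzip"
        " | ssh backup@remote-box 'cat > /backup/server.img.gz'",
        shell=True, check=True)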
This is the tip of the iceberg in the number of tasks that can be done.
Someone with experience can even instruct LLMs to do it and build it, and someone skilled with LLMs could probably work to uncover the steps and strategies for their particular use case.
But I do agree the poster should think about this. I don't think it's 'off' or misleading, they just haven't encountered a hardware error before. If they had one on this single box with 30 databases and 34 Nginx sites it would probably be a bad time, and yes they should think about that a bit more perhaps.
They describe a db follower for cutover for example but could also have one for backups, plus rolling backups offsite somewhere (perhaps they do and it just didn't make it into this article). That would reduce risk a lot. Then of course they could put all the servers on several boxes behind a load-balancer.
But perhaps if the services aren't really critical it's not worth spending money on that, depends partly what these services/apps are.
nine_k | 6 hours ago
BorisMelnik | 8 hours ago
grey-area | 8 hours ago
chairmansteve | 8 hours ago
anticorporate | 8 hours ago
Could I take it down for the afternoon? Sure. Or could I wait and do it after hours? Also sure. But would I rather not have to deal with complaints from users that day and still go home by 5pm? Of course!
j45 | 9 hours ago
This is a general response to it.
I have run hosting on bare metal for millions of users a day and tens of thousands of concurrent connections. It can scale way up by doing the same thing you do in a cloud: provision more resources.
For "downtime" you do the same thing with metal as you do with Digital Ocean: get a second server and have them fail over.
You can run hypervisors to split and manage a metal server just like Digital Ocean does, except you're not vulnerable to the shared-memory and CPU exploits of shared hosting. When Intel CPU flaws, memory flaws, or kernel exploits come out, as they have, one VM user can read the memory and data of processes belonging to other users.
Both Digital Ocean and other IaaS/PaaS providers are still running similar Linux technologies to do the failover. There are tools that even handle it automatically, like Proxmox. This level of production-grade failover and simplicity was point-and-click 10 years ago; people just haven't kept up with it.
The cloud is convenient. Convenience can make anyone comfortable. Comfort always costs way more.
It's relatively trivial to put the same web app on a metal server, with a hypervisor/IaaS/PaaS behind the same Cloudflare, to access "scale".
Digital Ocean and Cloud providers run on metal servers just like Hetzner.
The software to manage it all is becoming more and more trivial.
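For the "second server and failover" point, here is a toy Python sketch: a watchdog that reassigns a Hetzner Cloud floating IP to a standby when the primary stops answering health checks. The endpoint follows Hetzner's public Cloud API, but treat the details (IDs, token, and the standby already having the floating IP configured on its interface) as assumptions to verify:

    import json
    import time
    import urllib.error
    import urllib.request

    HEALTH_URL = "http://203.0.113.10/healthz"   # primary's health endpoint (example IP)
    FLOATING_IP_ID = 42                          # hypothetical floating IP id
    STANDBY_SERVER_ID = 1001                     # hypothetical standby server id
    API_TOKEN = "..."                            # Hetzner Cloud API token

    def primary_healthy() -> bool:
        try:
            with urllib.request.urlopen(HEALTH_URL, timeout=5) as resp:
                return resp.status == 200
        except (urllib.error.URLError, TimeoutError, ConnectionError):
            return False

    def promote_standby() -> None:
        # POST /floating_ips/{id}/actions/assign per the Hetzner Cloud API docs;
        # the standby must already have the floating IP configured on its NIC.
        req = urllib.request.Request(
            f"https://api.hetzner.cloud/v1/floating_ips/{FLOATING_IP_ID}/actions/assign",
            data=json.dumps({"server": STANDBY_SERVER_ID}).encode(),
            headers={"Authorization": f"Bearer {API_TOKEN}",
                     "Content-Type": "application/json"},
            method="POST",
        )
        urllib.request.urlopen(req)

    if __name__ == "__main__":
        misses = 0
        while True:
            misses = 0 if primary_healthy() else misses + 1
            if misses >= 3:          # three consecutive failures -> fail over
                promote_standby()
                break
            time.sleep(10)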
grey-area | 8 hours ago
nh2 | 5 hours ago
> This level of production grade fail over and simplicity was point and click, 10 years ago.
While some of the tools are _designed_ for point and click, they don't always work. Mostly because of bugs.
We run Ceph clusters under our product, and have seen a fair share of non-recoveries after temporary connection loss [1], kernel crashes [2], performance degradations on many small files, and so on.
Similarly, we run HA Postgres (Stolon), and found bugs in its Go error handling that cause it to fail to recover from crashes and full-disk conditions [3] [4]. This week, we found that full-disk situations will not necessarily trigger failovers. We also found that if DB connections are exhausted, the daemon that's supposed to trigger Postgres failover cannot connect to do so (we're currently testing the fix).
I believe that most of these things will be more figured out with hosted cloud solutions.
I agree that self-hosting HA with open-source software is the way to go. These tools are good, and the more people use them, the fewer bugs they will have.
But I wouldn't call it "trivial".
If you have large data, it is also brutally cheaper; we could hire 10 full-time sysadmins for the cost of hosting on AWS, vs doing our own Hetzner HA with Free Software, and we only need ~0.2 sysadmins. And it still has higher uptime than AWS.
It is true that Proxmox is easy to set up and operate. For many people it will probably work well for a long time. But when things aren't working, it's not so easy anymore.
[1]: "Ceph does not recover from 5 minute network outage because OSDs exit with code 0" - https://tracker.ceph.com/issues/73136
[2]: "Kernel null pointer derefecence during kernel mount fsync on Linux 5.15" - https://tracker.ceph.com/issues/53819
[3]: https://github.com/sorintlab/stolon/issues/359#issuecomment-...
[4]: https://github.com/sorintlab/stolon/issues/247
jijijijij | 8 hours ago
Aurornis | 8 hours ago
That's a strawman version of what happens.
There have been times when I've tried to visit a webshop to buy something but the site was broken or down, so I gave up and went to Amazon and bought an alternative.
I've also experienced multiple business situations where one of our services went down at an inconvenient time, a VP or CEO got upset, and they mandated that we migrate away from that service even if alternatives cost more.
If you think of your customers or visitors as perfectly loyal with infinite patience then downtime is not a problem.
> Unless you are Amazon and every minute costs you bazillions, you are likely gonna get the better deal not worrying about availability and scalability. That 250€/m root server is a behemoth. Complete overkill for most anything.
You don't need every minute of downtime to cost "bazillions" to justify a little redundancy. If you're spending 250 euros/month on a server, spending a little more to get a load balancer and a pair of servers isn't going to change your spend materially. Having two medium size servers behind a load balancer isn't usually much more expensive than having one oversized server handling it all.
There are additional benefits to having the load balancer set up for future migrations, or to scale up if you get an unexpected traffic spike. If you get a big traffic spike on a single server and it goes over capacity you're stuck. If you have a load balancer and a pair of servers you can easily start a 3rd or 4th to take the extra traffic.
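To make the "pair of servers behind a load balancer" shape concrete, here is a toy round-robin TCP proxy in Python. The backend addresses are made up, and in practice you'd use the provider's managed load balancer or nginx/haproxy rather than anything hand-rolled; this just shows the mechanism:

    import asyncio
    import itertools

    # The pair of app servers behind the balancer (hypothetical addresses).
    BACKENDS = [("10.0.0.11", 8080), ("10.0.0.12", 8080)]
    rr = itertools.cycle(BACKENDS)

    async def pipe(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
        try:
            while (data := await reader.read(65536)):
                writer.write(data)
                await writer.drain()
        finally:
            writer.close()

    async def handle(client_reader, client_writer):
        host, port = next(rr)                      # round-robin backend choice
        backend_reader, backend_writer = await asyncio.open_connection(host, port)
        # shuttle bytes both ways until either side hangs up
        await asyncio.gather(pipe(client_reader, backend_writer),
                             pipe(backend_reader, client_writer),
                             return_exceptions=True)

    async def main() -> None:
        server = await asyncio.start_server(handle, "0.0.0.0", 8000)
        async with server:
            await server.serve_forever()

    if __name__ == "__main__":
        asyncio.run(main())

Adding a third or fourth backend when traffic spikes is then a one-line change to the BACKENDS list, which is the flexibility being argued for.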
jijijijij | 8 hours ago
Great. So how much did the webshop lose in that hour of maintenance (which realistically would be in the middle of the night for their main audience) and how much would they have paid for redundancy? Also, it's a bit hard to believe you repeatedly ran into an item sold both at a self-hosted webshop and on Amazon. Are you sure they hadn't just messed up the web dev side of the business? You could totally do that with AWS too...
> If you're spending 250 euros/month on a server, spending a little more to get a load balancer and a pair of servers isn't going to change your spend materially.
Of course, but that's not the argument. It's implied you can just double the 250€/m server for redundancy, as you would still get an offer at the fraction of cloud prices. But really that server needs no more optimization in terms of hardware diversification. As I said, it's complete overkill. Blogs and forums could easily be run on a 30€/m recycled machine.
coryrc | 8 hours ago
Even if Amazon was down, if I was planning to buy, I'd wait. Heck, I've got a bunch of crap in my cart right now that I haven't checked out.
Intentional downtime lets everyone plan around it, reduces costs by not needing N layers of marginal utility which are all fragile and prone to weird failures at times you don't intend.
jijijijij | 7 hours ago
> Intentional downtime lets everyone plan around it, reduces costs by not needing N layers of marginal utility which are all fragile and prone to weird failures at times you don't intend.
Quite frankly, I would manage if things were run "on-supply" with solar and would just go dark at night.
thelastgallon | 8 hours ago
Spot on! People still go to Chick-fil-A, even if they are closed on Sundays!
wild_egg | 7 hours ago
daneel_w | 9 hours ago
VorpalWay | 8 hours ago
As a bonus, Hetzner is European.
Aurornis | 9 hours ago
The confusing part about this article is the emphasis on a zero-downtime migration to a service that isn't really ideal for uptime. It wouldn't be that expensive to add a little bit of architecture on the Hetzner side to help with this. I guess if you're doing the migration on salary, or your time is free-ish, doing it with zero downtime is smart. It's a little funny, though, to see the emphasis on zero downtime juxtaposed with an architecture where uptime depends on nothing ever failing.
j45 | 8 hours ago
Clever architecture will always beat cleverly trying to pick only one cloud.
Being cloud agnostic is best.
This means setting up a private cloud.
Hosted servers and managed servers are perfectly capable of near-zero downtime. This is because the "cloud" runs on the same equipment (often more consumer-grade, in fact) and plans for even more failure.
Digital Ocean definitely does not guarantee zero downtime. That's a lot of 9's.
It's simple to run well-established tools like Proxmox on bare metal that will do everything Digital Ocean promises, without being susceptible to attacks or exploits where shared memory and CPU leak what customers believe is their private VPS.
With a tool like Proxmox, "nothing ever failing" looks like this: install it on two servers, connect them as nodes so a VM exists on both, click high availability, and it's generally up and running. Put Cloudflare in front of it, per today's best practices.
If you're curious about this, there are some short, eye-opening videos on Proxmox on YouTube that are hard to unsee.
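The "click high availability" step can also be scripted against the Proxmox VE REST API. A minimal sketch follows; the host, token, and VM ID are placeholders, it assumes a certificate your client trusts, and the endpoint (POST /cluster/ha/resources) is taken from the public PVE API docs:

    import json
    import urllib.parse
    import urllib.request

    PVE_HOST = "https://pve1.example.com:8006"   # hypothetical cluster node
    API_TOKEN = "root@pam!automation=00000000-0000-0000-0000-000000000000"
    VMID = 100

    # Register the VM as an HA resource: the API equivalent of the
    # "click high availability" step in the web UI.
    req = urllib.request.Request(
        f"{PVE_HOST}/api2/json/cluster/ha/resources",
        data=urllib.parse.urlencode({"sid": f"vm:{VMID}"}).encode(),
        headers={"Authorization": f"PVEAPIToken={API_TOKEN}"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp))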
nine_k | 6 hours ago
j45 | 5 hours ago
When you have two nodes running, both mirrored and live, one can have its hardware break.
Also, hardware can provide failure notifications before it breaks, and experience teaches you to just update and upgrade before hard drives fail.
With tools like Proxmox you just add a node: rack the new hardware, mark the VM to mirror to that node, and it's taken care of.
Terraform etc. can sit below Proxmox and alleviate what you're describing:
Some examples: https://www.youtube.com/watch?v=dvyeoDBUtsU
nine_k | 2 hours ago
PunchyHamster | 9 hours ago
If you can tolerate a few hours of downtime and some data rollback/loss, a single server + robust backups can be a viable strategy
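For the "robust backups" half of that strategy, a minimal sketch of a nightly cron job in Python; the database names, paths, and offsite host are stand-ins for your own:

    import datetime
    import subprocess

    DATABASES = ["app1", "app2"]                     # hypothetical database names
    BACKUP_TARGET = "backup@offsite.example.com:/backups/"
    WEB_ROOT = "/var/www/html/"

    def run(cmd: list[str]) -> None:
        subprocess.run(cmd, check=True)              # raise on failure so cron mails you

    def backup() -> None:
        stamp = datetime.date.today().isoformat()
        for db in DATABASES:
            dump = f"/var/backups/{db}-{stamp}.sql.gz"
            # pg_dump | gzip, written locally first so a network hiccup
            # doesn't leave you with no dump at all
            with open(dump, "wb") as out:
                pg = subprocess.Popen(["pg_dump", db], stdout=subprocess.PIPE)
                subprocess.run(["gzip", "-c"], stdin=pg.stdout, stdout=out, check=True)
                pg.stdout.close()
                if pg.wait() != 0:
                    raise RuntimeError(f"pg_dump {db} failed")
        run(["rsync", "-az", WEB_ROOT, BACKUP_TARGET + "www/"])
        run(["rsync", "-az", "/var/backups/", BACKUP_TARGET + "dumps/"])

    if __name__ == "__main__":
        backup()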
chalmovsky | 9 hours ago
wiether | 9 hours ago
Usually those articles describe two situations:
Here they appear to be in the first situation. If their setup was running fine on DO and they put the right DR policies in place at Hetzner, they should be fine.
chillfox | 9 hours ago
Also, don't underestimate the reliability of simplicity.
I was a Linux sysadmin for many years, and I have never seen as much downtime from simpler systems as I routinely see from more complicated setups. Somewhere between theory and reality, simpler systems just come out ahead most of the time.
NicoJuicy | 9 hours ago
Deploying a new Docker instance, or just restoring the app from a snapshot and loading the latest DB, is enough in most cases.
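A sketch of that restore path, with a hypothetical image name, container, and dump location (it assumes dumps like those from the backup sketch above land in /var/backups):

    import glob
    import subprocess

    def restore() -> None:
        # newest dump produced by the backup job (path pattern is hypothetical)
        latest = sorted(glob.glob("/var/backups/app1-*.sql.gz"))[-1]
        # re-run the app container from its image (name/registry are placeholders)
        subprocess.run(["docker", "run", "-d", "--name", "app1",
                        "-p", "8080:8080", "registry.example.com/app1:latest"],
                       check=True)
        # gunzip -c dump | psql app1 to reload the database
        gz = subprocess.Popen(["gunzip", "-c", latest], stdout=subprocess.PIPE)
        subprocess.run(["psql", "app1"], stdin=gz.stdout, check=True)
        gz.stdout.close()

    if __name__ == "__main__":
        restore()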
BorisMelnik | 8 hours ago
ozim | 8 hours ago
I know people like FAANG LARPing. Not everyone has the budget or need to run four nines with 24/7 support and FAANG-level traffic.
Gud | 8 hours ago
jgalt212 | 8 hours ago
neya | 8 hours ago
wg0 | 8 hours ago
Not a bad tradeoff for 99.8% of shops out there.
jdboyd | 8 hours ago
If someone starts thinking about redundancy and load balancers, then DO's solution is to rent a second, similarly sized droplet and add their load-balancing service. If you do those things with Hetzner instead, you would still be spending less than you did with DigitalOcean.
Personally, what is keeping me on DO is that no single droplet I have is large enough to justify moving on its own, and I'm not prepared to deal with moving everything.
pinkgolem | 8 hours ago
If your scaling need is not that high, you can get very far with a single server
littlecranky67 | 8 hours ago
Given the downtimes we saw in the past year(s) (AWS, Cloudflare, Azure - the latter even down several times), I would argue moving to any of the big cloud providers doesn't give you much better a guarantee.
I myself am a Hetzner customer with a dedicated vServer, meaning it is a shared virtual server but with dedicated CPUs (read: still oversubscribed, but with some performance guarantee), and I've had zero hardware-based downtime for years [0]. I would guess their vServers are on similar redundant hardware where failing components can be hot-swapped.
[0] Once within the last 3 years they sent me an email that they had to update a router, which would affect network connectivity for the vServer; the notification came weeks in advance and the maintenance lasted about 15 minutes. No reboot/hardware failure on my vServer though.
supermatt | 8 hours ago
They saved money and lost nothing.
Now, if they so wish, they could use a portion of that to increase redundancy - but that wasn't the point of the article.
surgical_fire | 8 hours ago
People underestimate how far you can go with one or two servers. In fact, what I have seen in my career is many examples of services that should have been running on one or two servers but instead went for a hugely complex microserviced approach, all-in on cloud providers, with crazy reliability requirements for a scale that never came.
ahofmann | 8 hours ago
Dealing with over-engineered bullshit that behaved in strange ways and disrupted the service was far more often a problem.
So, yes, redundancy is something that can be left out, if you're comfortable being responsible for fixing things on a Saturday morning.
jijijijij | 7 hours ago
izacus | 7 hours ago
Like, I know Leetcode tells you otherwise, but most companies really don't need the full FAANG stack with 99.999% uptime. A day of outage every few years isn't going to affect bottom lines.
ozim | an hour ago
LARPing as a FAANG is a waste of money, and lots of companies don't even need three nines, let alone five.
pier25 | 6 hours ago
stephenhuey | 5 hours ago
grebc | 3 hours ago
It's amusing that the US government can shut down for days/weeks/months over budget disputes with no adult discussion about fixing the cause, yet the latest HN demo that 100 people will use needs all the nines and gets hundreds of responses.
stephenhuey | 3 hours ago
desireco42 | 9 hours ago
Plus, this is not what DHH was doing: he was not saving a few bucks, but unlocking potential for his company to thrive.
shermantanktop | 9 hours ago
When I’ve seen this work well, it’s either built into the product as an established feature, or it’s a devops procedure that has a runbook and is done weekly.
Doing it with low level commands and without a lot of experience is pretty likely to have issues. And that’s what happened here.
PunchyHamster | 9 hours ago
So they made the same mistake all over again. Debian or Ubuntu would have let them just upgrade in place and migrate.
hnarn | 9 hours ago
PunchyHamster | 8 hours ago
bunbun69 | 7 hours ago
neeraga | 9 hours ago
caymanjim | 9 hours ago
This isn't something others should use as an example.
Zopieux | 8 hours ago
Chaosvex | 7 hours ago
mlhpdx | 8 hours ago
koolba | 8 hours ago
What was the config on the receiving side to support this? Did you whitelist the old server IP to trust the forwarding headers? Otherwise you’d get the old server IP in your app logs. Not a huge deal for an hour but if something went wrong it can get confusing.
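One way to handle that during the cutover window is to trust the forwarding header only when the direct peer is the old server. A minimal WSGI middleware sketch follows (the IP is a placeholder; at the nginx level, the real_ip module's set_real_ip_from directive does the same job):

    # Honour X-Forwarded-For only when the request comes from the old server,
    # so app logs keep showing real client IPs during the proxying window.
    OLD_SERVER_IP = "203.0.113.10"   # placeholder: the old server's address

    class TrustedForwardedFor:
        def __init__(self, app):
            self.app = app

        def __call__(self, environ, start_response):
            peer = environ.get("REMOTE_ADDR", "")
            xff = environ.get("HTTP_X_FORWARDED_FOR", "")
            if peer == OLD_SERVER_IP and xff:
                # the first hop in the header is the original client
                environ["REMOTE_ADDR"] = xff.split(",")[0].strip()
            # any other peer keeps its real address: forwarding headers
            # from strangers are spoofable and must not be trusted
            return self.app(environ, start_response)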
DaedalusII | 8 hours ago
You can basically go on Hetzner and spin up a Linux VPS that is exposed to the open internet, with open ports and default user security, and within a few hours it's been hacked. There is no warning pop-up that says "if you do this your server will be pwned".
I especially wonder what will happen here with all the AI-provisioned VPSes and Postgres DBs.
_el1s7 | 8 hours ago
I'm not against vertical scaling and stuff, but 30 db instances in one server is just crazy.
leetrout | 8 hours ago
They didn't say that and the article didn't allude to that. 1 instance with 30 databases.
_el1s7 | 8 hours ago
vb-8448 | 8 hours ago
_el1s7 | 8 hours ago
It seems like he has a database for each app.
vb-8448 | 8 hours ago
jpablo | 8 hours ago
embedding-shape | 8 hours ago
jpablo | 7 hours ago
Once you have that initial backup you can set up your replica and let it catch up, then you switch. I chose to take the few seconds of downtime doing the switch because for my use case that was acceptable.
embedding-shape | 6 hours ago
jpablo | 6 hours ago
grasbergerm | 2 hours ago
Aside from the blocking you mentioned during the initial snapshot, you'd need to block writes to the old DB before the cutover as well. There's no way to guarantee in-flight writes to the old DB aren't lost when promoting the replica to a primary otherwise. I'm surprised the author didn't go into more detail here. Maybe it was fine given their workload, but the key issue I see is that they promoted the new DB to a primary before stopping the old application. During that gap, any data written to the old DB would be lost.
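A minimal sketch of that ordering with psycopg2, assuming PostgreSQL 12+ (for pg_promote), a single replica, and placeholder DSNs. Note that sessions which explicitly SET transaction_read_only = off can still write, so a stricter cutover also stops or fences the old application first:

    import time
    import psycopg2

    OLD_DSN = "host=old.example.com dbname=postgres user=postgres"  # placeholders
    NEW_DSN = "host=new.example.com dbname=postgres user=postgres"

    def cutover() -> None:
        old = psycopg2.connect(OLD_DSN); old.autocommit = True
        new = psycopg2.connect(NEW_DSN); new.autocommit = True

        # 1. Block new writes on the old primary.
        with old.cursor() as cur:
            cur.execute("ALTER SYSTEM SET default_transaction_read_only = on")
            cur.execute("SELECT pg_reload_conf()")

        # 2. Wait until the (single) replica has replayed everything flushed.
        while True:
            with old.cursor() as cur:
                cur.execute("SELECT flush_lsn = replay_lsn FROM pg_stat_replication")
                row = cur.fetchone()
            if row and row[0]:
                break
            time.sleep(1)

        # 3. Only now promote the replica and point the apps at it.
        with new.cursor() as cur:
            cur.execute("SELECT pg_promote()")

    if __name__ == "__main__":
        cutover()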
godot | 8 hours ago
For OP, though, a Turkey-based company that wants European-region servers anyway, it might make sense.
embedding-shape | 8 hours ago
I think Hetzner makes most sense (for myself, and OP seemingly too) because they have dedicated servers, and they're in Europe. Extra bonus is the unmetered connection, but primarily just good and cheap servers :)
kyledrake | 8 hours ago
l5870uoo9y | 8 hours ago
thelastgallon | 8 hours ago
l5870uoo9y | 8 hours ago
bluepuma77 | 7 hours ago
pwr1 | 7 hours ago
mitjam | 7 hours ago
donmb | 7 hours ago
ocean2 | 7 hours ago
dabinat | 6 hours ago
Amazon might think that they’re locking people in with the egress fees. But they’re also locking people out. As soon as you switch one part to a competitor, the high egress forces you to switch over everything.
It’s going to be complicated to switch, but it’s made easier by the fact that I didn’t fall into the trap of building my platform on Amazon-specific services.
boulos | 6 hours ago
AWS matched a few months later:
https://aws.amazon.com/blogs/aws/free-data-transfer-out-to-i...
I'm not trying to convince you to stay (I work for neither anymore!), just wanted to note that you can technically request a waiver. I'm not sure how this works in practice though. Like, if you want to leave Athena and move to something on-premise is that enough to have just that workload? Maybe!
Edit: I also didn't follow this at the time, but the AWS wording suggests that the "EU Data Act" is also involved.
tailscaler2026 | 5 hours ago
This doesn't actually work as advertised. I attempted free data egress from AWS in December. It took them 31 days to respond to my initial ticket. At which point they gave me a multi-page questionnaire to determine eligibility and they also told me I could not begin DTO until 60 days had passed from approval of the questionnaire.
By the time I was allowed "free egress," my cumulative S3 storage charges over the prior 100 days would have roughly matched the cost of egress had I just paid for it originally.
I'm in the US so the EU Data Act protections don't apply.
dwedge | 4 hours ago
So in my case that would have been 14 weeks plus the time to migrate away. The egress costs are equivalent to around 17 weeks of storage costs. So you save around 1c/GB if they don't find some reason to reject it.
localhoster | 6 hours ago
smallpipe | 5 hours ago
marcosscriven | 6 hours ago
None of it is mission critical - but it’s certainly something I’d use in production with a few more instances.
Networking over Tailscale works flawlessly with my Proxmox nodes at home.
bth | 6 hours ago
However, the dealbreaker for me was that Hetzner IPs have a bad reputation. At work, I learned that one of the managed AWS firewall rules blocks many (maybe all) of their IPs. I can’t even open a website hosted on a Hetzner IP from my work laptop because it’s blocked by some IT policy (maybe this is not an issue for you if you are using CloudFlare or similar).
I've read online that the DDoS protection is very bad as well.
So in the end, I picked DO App Platform in one of the EU regions. Having the option to use a managed DB was a big plus as well.
mmarian | 6 hours ago
Maledictus | 5 hours ago
napolux | 5 hours ago
source: moved away from DO for this very reason.
TZubiri | 5 hours ago
https://news.ycombinator.com/item?id=47279518
It looks like Hetzner is Tor (and Tor-adjacent) friendly. I suggested this might affect IP reputation; two users responded that they had no IP reputation issues, but it looks like that wasn't quite the whole story.
https://community.torproject.org/relay/community-resources/g...
It seems that Hetzner hosts 7% of the Tor network (if I understood the table right).
iririririr | 3 hours ago
So much so that I'm thinking of selling nothing but an AWS-and-Azure blocker as a service.
pier25 | 6 hours ago
Over the years I've heard plenty of horror stories from Hetzner customers.
Ken_At_EM | 6 hours ago
chris_engel | 5 hours ago
antoniojtorres | 5 hours ago
jcrben | 5 hours ago
tylergetsay | 5 hours ago
rob | 5 hours ago
grzes | 5 hours ago
throwaway132448 | 5 hours ago
aaa_aaa | 4 hours ago
This reasoning does not add up. They could simply say they needed to move somewhere cheaper, like Hetzner. Inflation is still high but getting lower. The weakened-Turkish-lira part is not correct, because the dollar has been artificially suppressed for a very long time.
bornfreddy | 4 hours ago
Yeah - no, it's not. They made the MitM attack possible with this change. The exposure was limited to those 5 minutes, but it should have been a known risk.
Also, I'm not certain how they could check the apps on the new server against the read-only database while it was still a replica?
Still, nice to hear it succeeded, the reasons sound very familiar.
SahAssar | 4 hours ago
Not really, a MITM could do anything here. It's not very likely to happen here, but I think this comment shows a misunderstanding of what certificates and verification do.
TurdF3rguson | 3 hours ago
faangguyindia | 24 minutes ago
Hetzner cannot oversell your CPU or SSD.
VMs on the cloud can be oversold.
BrunoBernardino | 3 hours ago
BTW, I've been a client of Hetzner (Cloud, Object Storage, and Storage Box) for a few years now, very happy with them!
written-beyond | 3 hours ago
If some people can chime in with their positive experiences I might switch.
elgertam | 3 hours ago
addybojangles | 2 hours ago
infomiho | 2 hours ago
Nice to see it used _twice_ :D
collinmanderson | 37 minutes ago
- reduce dns ttl (if not doing an ip swap)
- rsync website files
- rsync /etc/letsencrypt/ ssl certificates
- copy over database (if writes don't happen often and database is small enough, this can be done without replica, just go read_only during migration)
- test new server by putting new ip in local /etc/hosts
- turn off cron on old server
- convert old server nginx to reverse proxy to new server
- change dns (or ip swap between old and new server)
- turn on cron on new server
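A minimal scripted version of the checklist above, with placeholder hosts. The DNS change and the read-only toggle stay manual here, rsync runs on the old box (rsync can't copy remote-to-remote directly), and it assumes the old server can SSH to the new one:

    import subprocess

    OLD = "root@old.example.com"       # placeholder hosts
    NEW = "root@new.example.com"
    NEW_HOST = "new.example.com"

    def run(host: str, cmd: str) -> None:
        subprocess.run(["ssh", host, cmd], check=True)

    def sync(path: str) -> None:
        # rsync runs on the old server, pushing to the new one
        run(OLD, f"rsync -az --delete {path} {NEW_HOST}:{path}")

    def migrate() -> None:
        run(OLD, "systemctl stop cron")            # stop cron on the old server
        sync("/var/www/html/")                     # website files
        sync("/etc/letsencrypt/")                  # TLS certificates
        # small database: dump everything and load it on the new server
        # (assumes the old server can SSH to the new one)
        run(OLD, f"pg_dumpall | ssh {NEW_HOST} 'psql postgres'")
        run(NEW, "systemctl start cron")           # cron resumes on the new box
        print("next: test via /etc/hosts, flip DNS or swap IPs, and keep the "
              "old nginx reverse-proxying to the new server until DNS settles")

    if __name__ == "__main__":
        migrate()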