PostgresBench: A Reproducible Benchmark for Postgres Services

109 points by saisrirampur a day ago on hackernews | 23 comments

whitepoplar | a day ago

I'm curious to know how PlanetScale Postgres would fare.

karlmush | 23 hours ago

Me too! Were they left out on purpose?

: [OP] saisrirampur | 23 hours ago
It wasn’t left out on purpose; it was just prioritization. We chose the 5 that came up most commonly. Please feel free to submit a PR; it should be pretty straightforward.

samlambert | 22 hours ago

We have seen them benchmarking us before. I assume we won because they didn't include us.

: [OP] saisrirampur | 20 hours ago
It could be the other way round too. ;) We just limited the number of Postgres providers in the first batch. We may add you in the next one and might ping you before sharing the results publicly, just to make sure you are aligned with them.

schultzer | a day ago

While it’s important to focus on the backend itself, why doesn’t anyone focus on the clients that talk to the databases? That is where most of an application’s problems start, including performance problems.

__s | a day ago

Lack of protocol compression is painful with postgres

jelder | 23 hours ago

Untrue since the introduction of zstd support in PostgreSQL 15.

: ahoka | 21 hours ago
Are you sure?

schultzer | 5 hours ago

Sure, but that is rarely the biggest issue facing a client: the top contenders are always queueing, pooling, and decoding. And, obviously, the lack of SQL as a first-class citizen in most languages.

psc007 | a day ago

Tldr? Can I use this to benchmark my own CNPG on k8s to see if there is improvement to be made on the deployment?

: [OP] saisrirampur | a day ago
Yes, you absolutely can. It takes 15 mins to run the benchmark and get results.

aurareturn | a day ago

I was just looking for something like this. I think there is a lack of benchmarks between Postgres cloud instances across different clouds. They all offer the same features mostly and cost about the same, but we don't know how they perform.

datadrivenangel | a day ago

If I had ~$10,000k to burn, I would give Claude Code an AWS account and this benchmark and have it generate parameter spaces and optimize the configuration for transaction processing by just changing the configuration.

: paulddraper | 23 hours ago
You could do an enormous number of things for $10,000k.

cuu508 | a day ago

It would be interesting to see vanilla postgres on a VPS and vanilla postgres on a bare metal server included too. I understand it's apples and oranges, but would it be slower, would it be faster, by how much?

nwhnwh | a day ago

It doesn't matter if it is apples and oranges, as long as it is an option customers care about. So, I hope that would be included too.

: [OP] saisrirampur | 23 hours ago
Good feedback, we will aim to include them in the future.

ahachete | a day ago

My 2 cents:

> Each run lasts 10 minutes, long enough to move past warmup and capture stable throughput

But short enough not to show the effect of checkpoints (which is quite noticeable). It's not representing a realistic workload. I'd use at the very least 30mins with checkpoint_timeout=5mins so you get at least 6 checkpoint cycles.
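The arithmetic behind that suggestion, as a rough sketch. It assumes checkpoints are triggered only by `checkpoint_timeout` (in practice, WAL volume hitting `max_wal_size` can trigger them sooner), and the function name is purely illustrative.

```python
def timed_checkpoints(run_minutes: float, checkpoint_timeout_minutes: float) -> int:
    """Number of timeout-triggered checkpoint cycles that fit in a run.

    Assumption: only checkpoint_timeout triggers checkpoints; WAL-volume
    triggered checkpoints are ignored in this sketch.
    """
    if checkpoint_timeout_minutes <= 0:
        raise ValueError("checkpoint_timeout must be positive")
    return int(run_minutes // checkpoint_timeout_minutes)


# The 10-minute run quoted from the article vs. the 30-minute run suggested
# above, both with checkpoint_timeout = 5min.
print(timed_checkpoints(10, 5))  # 2
print(timed_checkpoints(30, 5))  # 6
```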

> We report average TPS, average latency, P95 latency, and P99 latency

Unless I'm mistaken, only average TPS is reported, but not TPS over time. Again, it varies a lot (especially due to checkpoints!) and you need variation over time to understand patterns and determine SLAs.
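One way to get TPS over time, sketched under the assumption that the runs are driven by pgbench with progress reporting enabled (`-P`/`--progress`), which prints one `progress:` line per interval. The sample lines below are made up for illustration and are not real benchmark output.

```python
import re
import statistics

# Matches pgbench progress lines such as:
#   progress: 10.0 s, 5000.0 tps, lat 6.4 ms stddev 1.1
PROGRESS_RE = re.compile(r"progress: ([\d.]+) s, ([\d.]+) tps, lat ([\d.]+) ms")


def tps_series(log_text):
    """Return (elapsed_seconds, tps) pairs found in pgbench progress output."""
    return [(float(m.group(1)), float(m.group(2)))
            for m in PROGRESS_RE.finditer(log_text)]


def summarize(series):
    """Summary statistics that a single average TPS figure hides."""
    values = [tps for _, tps in series]
    return {
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
        "stdev": statistics.pstdev(values),
    }


# Made-up sample: the third interval imitates a checkpoint-induced dip.
sample = """\
progress: 10.0 s, 5000.0 tps, lat 6.4 ms stddev 1.1
progress: 20.0 s, 5200.0 tps, lat 6.1 ms stddev 1.0
progress: 30.0 s, 2800.0 tps, lat 11.4 ms stddev 5.3
progress: 40.0 s, 5000.0 tps, lat 6.4 ms stddev 1.2
"""

series = tps_series(sample)
print(summarize(series))
```

In this made-up sample the mean looks healthy while the minimum is far below it, which is the kind of pattern an average alone would not reveal.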

> We tested two scale factors: 6849 (~100 GB) and 34247 (~500 GB).

I'd set them an order of magnitude apart (100GB and 1TB). 1TB+ is not unusual nowadays and therefore represents a step along the scaling path that I believe should also be compared.
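For a rough idea of what scale factor a 1TB run would need, here is a linear extrapolation from the figures quoted above. It assumes data size grows linearly with the scale factor and takes 6849 ≈ 100 GB as the reference point; the helper name is hypothetical.

```python
def scale_for_size(target_gb: float, ref_scale: int = 6849, ref_gb: float = 100) -> int:
    """Estimate the scale factor for a target data size.

    Assumption: data size is proportional to the scale factor, with the
    reference point (6849 -> ~100 GB) taken from the quoted article text.
    """
    return round(ref_scale * target_gb / ref_gb)


print(scale_for_size(500))   # 34245, close to the 34247 used in the article
print(scale_for_size(1000))  # 68490, a rough estimate for ~1 TB
```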

> While all services offer high availability, the underlying architectures differ. Some use standby replication, others use shared or distributed storage layers. Since we are focused on single-node compute and storage performance, we tested without HA enabled to isolate that.

So again, it's not realistic. The impact of HA in production is quite measurable in performance, especially if semi-sync is used.

But more importantly, you are comparing systems that are already redundant and highly available (e.g. Aurora, writing six copies of the data) with a single-instance one writing (if I'm not mistaken) to an ephemeral local NVMe. This is a watermelons-to-grapes comparison (I assume HA CH clusters talk over the network for each confirmed transaction; but here they only go to local NVMe).

> Default Postgres configuration

I believe this is a mistake. Postgres must be tuned (at least to a reasonable degree), and you should try to homogenize the configuration across the tested systems. Otherwise you are not benchmarking Postgres, but rather the conservatism of the default configurations chosen by the managed services.

> We did not compare pricing.

So you leave the fun part out?

> Systems included

I miss a self-managed EC2 cluster (with local NVMe) for comparison.

[OP] saisrirampur | 23 hours ago

Ack on most points, i.e. duration, scale factor (data size), and pricing. We will try to incorporate these in the next iterations.

In regards to HA, partly agreed. We are working on adding HA setup as we speak, should be released soon. However note that the local NVMe setup does have backups and WAL-archival to S3, which provides data-durability with RPO of 10s of seconds. Even with HA setup I expect performance difference to be similar across systems (may be slightly lesser), as round-trip to secondary is common across most other services except Neon, whose reliability/availability story hasn’t been a strong area in general.

In regards to default configs, that was intentional as default tuning is a differentiation across services and that needs to be measured. However we plan to add more configurability on postgres tuning in the future.

Appreciate your feedback, super helpful!

ahachete | 23 hours ago

> note that the local NVMe setup does have backups and WAL-archival to S3, which provides data-durability with RPO of 10s of seconds.

That's good, it provides a reasonable level of RPO, although I prefer just 0. But with this the RTO (or, equivalently, the downtime) is potentially quite large.

> Even with HA setup I expect performance difference to be similar across systems (may be slightly lesser)

I have done some benchmarks and I have observed a 24-27% performance penalty for semi-sync replication on instances using local NVMe. That's why I say this is non zero.

> In regards to default configs, that was intentional as default tuning is a differentiation across services and that needs to be measured. However we plan to add more configurability on postgres tuning in the future.

That'd be great.

: [OP] saisrirampur | 21 hours ago
Ack, your insight is very interesting.
Back at Citus/Microsoft, we typically saw around a 30% performance drop with synchronous replication on EBS-backed Postgres. I’d expect something in that ballpark for RDS and Crunchy as well. For NVMe-backed Postgres, we haven’t yet measured the impact of quorum-based replication, and it’s certainly possible the overhead ends up being higher than 30%.
That said, the single-node margins are already quite substantial, over 2× in all cases and up to 5× versus RDS in our benchmarks. Even with a meaningful HA penalty, NVMe-backed setups could still remain very compelling from a performance perspective. We’ve just started running HA benchmarks, so stay tuned.
Side note: using local NVMe-backed Postgres for performance is not new. Many enterprise companies like Datadog and Instacart run their performance-critical services on it, though self-managed.
In regard to RTO for single-node setups, it wouldn’t be great (at least minutes) in most systems, since recovery still needs to happen from backups.
Overall, very useful feedback. Thanks again for chiming in!
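A back-of-the-envelope sketch of how the single-node margins and HA penalties mentioned in this exchange combine. It deliberately takes the worst case for the NVMe-backed setup: only that side pays the HA penalty while the other system's throughput stays unchanged. The function name is illustrative, and no measured HA results are implied.

```python
def margin_after_penalty(single_node_margin: float, ha_penalty: float) -> float:
    """Throughput ratio left if only the faster system pays the HA penalty.

    Worst-case assumption: the slower system's throughput is unchanged.
    """
    if not 0.0 <= ha_penalty < 1.0:
        raise ValueError("ha_penalty must be in [0, 1)")
    return single_node_margin * (1.0 - ha_penalty)


# Margins (2x, 5x) and penalties (24%, 27%, 30%) are the figures from the
# comments above.
for margin in (2.0, 5.0):
    for penalty in (0.24, 0.27, 0.30):
        remaining = margin_after_penalty(margin, penalty)
        print(f"{margin:.0f}x margin, {penalty:.0%} penalty -> {remaining:.2f}x")
```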

karlmush | 23 hours ago

It’s a nice project, but it doesn’t seem like the open-source community has really picked it up yet. It’s been live since May, but there aren’t many GitHub stars or contributors so far.