SQL: Incorrect by Construction

21 points by chreke 16 hours ago on lobsters | 18 comments

The solution is to choose a strong enough isolation level. Yes, databases usually default to weaker isolation for performance rather than strictly serializable transactions, but at least SQL allows you to adjust it to your preferences. And you don’t have to fart around with explicit locks! The main downside is that conflicting transactions are more likely to abort, so you might want a retry loop; but naive retries can turn deadlocks into livelocks, so be careful.
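A minimal sketch of such a retry loop (the `SerializationFailure` exception here is a stand-in for whatever error your driver raises on a serialization conflict, e.g. SQLSTATE 40001 in Postgres; the jittered backoff is what keeps two conflicting transactions from retrying in lockstep and livelocking):

```python
import random
import time

class SerializationFailure(Exception):
    """Stand-in for the driver error raised on a SERIALIZABLE conflict."""

def run_with_retries(txn, max_attempts=5):
    """Retry an aborted transaction with jittered exponential backoff.

    The jitter matters: if two conflicting transactions retry on the
    same fixed schedule, they can keep aborting each other (a livelock).
    """
    for attempt in range(max_attempts):
        try:
            return txn()
        except SerializationFailure:
            if attempt == max_attempts - 1:
                raise
            time.sleep(random.uniform(0, 0.01 * 2 ** attempt))

# Toy transaction that conflicts twice, then commits.
attempts = {"n": 0}
def flaky_txn():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise SerializationFailure
    return "committed"

print(run_with_retries(flaky_txn))  # prints "committed" on the third attempt
```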

[OP] chreke | 15 hours ago

Yes, SERIALIZABLE would solve a lot of these problems; however, the issue I have with SERIALIZABLE is that, like you said, it's basically optimistic concurrency so you have to rely on an outside retry mechanism.

olegkovalov | 13 hours ago

Well, with Rust you know all the code up front; with SQL queries, not so much. At any time a new, unknown query may hit the DB, which is why optimistic concurrency makes sense.

endsofthreads | 11 hours ago

At any time a new, unknown query may hit the DB, which is why optimistic concurrency makes sense.

Everyone knows that databases need to be flexible to new and unknown queries. What I'm presupposing... maybe they shouldn't?

To elaborate, I've wanted a database that treated query plans (and, by implication, the queries themselves) as pre-compiled, deployable artifacts. That way the database can schedule transactions deterministically (à la Calvin), so you can do better than optimistic concurrency, and the planner can produce optimal plans at CI time instead of trying to JIT a plan at runtime. chreke links to a literature review on this subject, but... I wish this were more popular.
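As a toy illustration of the deterministic-scheduling idea (this is my gloss on the concept, not Calvin's actual protocol): a sequencer fixes a global order for a batch of transactions, and every replica applies them in that order, so replicas converge on the same state without locks, aborts, or retries:

```python
def sequencer(txns):
    """Assign each transaction a fixed position in a global log."""
    return list(enumerate(txns))

def replica(log, state):
    """Apply the log in order; any replica given the same log and the
    same initial state ends up with an identical result."""
    for _, txn in log:
        txn(state)
    return state

def transfer(frm, to, amount):
    """Deterministic transfer: a no-op when funds are insufficient."""
    def txn(state):
        if state[frm] >= amount:
            state[frm] -= amount
            state[to] += amount
    return txn

log = sequencer([transfer("alice", "bob", 10), transfer("bob", "alice", 5)])
r1 = replica(log, {"alice": 100, "bob": 0})
r2 = replica(log, {"alice": 100, "bob": 0})
print(r1 == r2)  # prints True: same order, identical replicas
```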

If you were to go down this path, you'd be able to treat suboptimal queries, at worst, as a failed build. It means that up-to-date statistics would need to be part of the deployable artifact, but... that's a good thing. Being too flexible to user input is a bad thing!

[OP] chreke | 10 hours ago

Yeah, Calvin is really cool; I’m working on a sequel to this blog post with my solution to the problem, and Calvin was a big inspiration

cajually | 7 hours ago

Making statistics part of the source is not a bad idea. I've seen plenty of Postgres servers badly screw up query planning because the managing team thought they were close to the limit, when in fact the load was beyond what the server could handle while both serving queries and vacuuming.

Every single case would have performed better if they had just kept the old query plans.

[OP] chreke | 12 hours ago

This is a good point, and the problems I describe in the article stem from trying to interleave procedural and declarative code. Unfortunately, in most applications this is unavoidable, so we’re stuck with having to try to glue queries together with locks and transactions.

For the record, optimistic concurrency is a completely reasonable default; my broader issue is that if you want deterministic concurrency, there is no mainstream SQL database that can provide it.

weberc2 | an hour ago

Even with SERIALIZABLE you can end up with weird things like the database failing to realize that a read and a write are unrelated, so you get failures far more often and more easily than you'd expect. It's OCC, but it's often easier to implement it yourself, at least if you want it to do the right thing reliably. Also, testing concurrency bugs is a pain, and most devs frankly aren't going to understand isolation levels (I have to refresh myself every time I touch them).
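For reference, the roll-it-yourself flavor of OCC usually means a version (or timestamp) column plus a conditional UPDATE; this sqlite3 sketch (table and column names are made up for the example) shows the shape:

```python
import sqlite3

# Hand-rolled optimistic concurrency: read the row's version, then only
# write if the version is unchanged. Zero affected rows means a
# concurrent writer won the race and the caller should re-read and retry.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (owner TEXT PRIMARY KEY, balance INT, version INT)"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100, 1)")

def debit(conn, owner, amount):
    balance, version = conn.execute(
        "SELECT balance, version FROM accounts WHERE owner = ?", (owner,)
    ).fetchone()
    cur = conn.execute(
        "UPDATE accounts SET balance = ?, version = version + 1 "
        "WHERE owner = ? AND version = ?",
        (balance - amount, owner, version),
    )
    return cur.rowcount == 1  # False: version moved under us; retry

print(debit(conn, "alice", 10))  # prints True; balance 90, version 2
```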

typesanitizer | 13 hours ago

Comment removed by author

gerikson | 16 hours ago

I have a nit: I believe the issues raised in the submission are with T-SQL (and possibly other attempts to graft procedural programming onto SQL, like PL/SQL), not SQL itself.

I mean, the distinction is hazy. Is COMMIT SQL? Well, yeah, but there are DB engines that expose a SQL interface yet don't have transactions.

[OP] chreke | 14 hours ago

This is a good point, but if you look at real-life applications, they usually do the T-SQL parts in application code instead, which only exacerbates the problem: there is more latency and more back-and-forth, which makes preemption and concurrency issues more likely.

If you want to write a money transfer procedure like the one in the post, I'm pretty sure you need something to stitch SQL queries together, so you will still have the same issues.

prussian | 14 hours ago

I noticed in your 2nd footnote you mention

Updating and checking the balance in a single UPDATE

It may help others, and me, if you could show that. I understand it isn't the point of your article, however.

ABGEO | 13 hours ago

UPDATE accounts
SET balance = balance - 10
WHERE owner = 'alice' AND balance >= 10
RETURNING id;

If RETURNING comes back with zero rows, the precondition failed and you skip the credit to Bob. There's no TOCTOU window between the read and the write because they're the same statement. The row-level write lock that UPDATE takes when it locates a matching row also blocks any concurrent UPDATE on that row until the first one commits, so the second transaction re-evaluates its WHERE clause against the post-commit state and matches zero rows. This works at READ COMMITTED with no isolation level changes.

The thing this still doesn't give you is informative errors at the app layer. You only learn "no row affected," not whether the account is missing or just underfunded. A CTE or a follow-up SELECT can distinguish those cases, but at that point you're stitching statements together again, which is the broader point of the post.
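In a driver, the zero-rows signal is just the affected-row count; here's a sqlite3 sketch of the same pattern, using `rowcount` in place of RETURNING (the idea is identical: zero affected rows means the precondition failed):

```python
import sqlite3

# Check-in-the-WHERE-clause pattern: the balance check and the debit are
# one statement, so there is no TOCTOU window between read and write.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (owner TEXT PRIMARY KEY, balance INT)")
conn.execute("INSERT INTO accounts VALUES ('alice', 5)")

cur = conn.execute(
    "UPDATE accounts SET balance = balance - 10 "
    "WHERE owner = 'alice' AND balance >= 10"
)
print(cur.rowcount)  # prints 0: alice only has 5, so the debit was skipped
```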

Well said, but you almost certainly want to do this in a DB function anyway, not at the app-code level. You want something like select transfer_money('alice','bob',10); and have it do everything on the DB side and return useful error messages if needed.

prussian | 8 hours ago

Thank you, but I was more curious about the "single update" statement case. It looks like this method would still require two individual UPDATEs.

I believe I solved this myself, and maybe the author or yourself can confirm or deny it:

WITH upd(owner, balance, newbal, sanity) AS
( SELECT a.owner, a.balance,
         CASE WHEN a.owner = N'bob' THEN a.balance + 10
              WHEN a.owner = N'alice' THEN a.balance - 10
         END,
         SUM(1) OVER () -- ignore if you assume both accts always exist
    FROM Scratch.Accounts a
   WHERE a.owner in (N'bob', N'alice')
)
UPDATE upd
   SET balance = newbal
 WHERE sanity = 2; -- see above

note: the table has a CHECK ( [balance] >= 0 ) on the balance column.

edit: someone kindly pointed out that, technically, if one of the accounts doesn't exist, bad things happen.
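Here's a sqlite3 rendering of the same sanity-check idea (an adaptation, not the T-SQL above verbatim): the UPDATE only fires when both rows exist, so a missing account debits nothing rather than making money vanish:

```python
import sqlite3

# Two-account transfer as a single statement, guarded by a row-count
# subquery so it is all-or-nothing when an account is missing.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts "
    "(owner TEXT PRIMARY KEY, balance INT CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100)")  # no 'bob' row

cur = conn.execute("""
    UPDATE accounts
       SET balance = balance + CASE owner WHEN 'alice' THEN -10 ELSE 10 END
     WHERE owner IN ('alice', 'bob')
       AND (SELECT COUNT(*) FROM accounts
             WHERE owner IN ('alice', 'bob')) = 2
""")
print(cur.rowcount)  # prints 0: bob is missing, so alice keeps her 100
```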

pmarreck | 7 hours ago

UPDATE accounts
SET balance = balance + CASE owner
                      WHEN 'alice' THEN -10
                      WHEN 'bob'   THEN  10
                    END
WHERE owner IN ('alice', 'bob')
AND (SELECT balance FROM accounts WHERE owner = 'alice') >= 10
AND (SELECT balance FROM accounts WHERE owner = 'bob') >= 0
RETURNING id, owner, balance;

And then just make sure two rows were returned; otherwise something went wrong.

[OP] chreke | 13 hours ago

Good feedback, thanks! I’ll add it as a footnote

The neat thing about NoSQL DBs like CouchDB is that they don't pretend issues like this do not exist. The user is expected to handle them via mechanisms like CRDTs or document versioning.
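A toy model of the document-versioning idea (this mimics the shape of CouchDB's _rev mechanism, not its real API; real revisions are hash strings, not integers): every write must name the revision it read, and a stale revision is rejected, so the conflict is surfaced to the client instead of being silently lost:

```python
class Store:
    """In-memory store with CouchDB-style revision checks."""

    def __init__(self):
        self.docs = {}

    def put(self, doc_id, doc, rev=None):
        current = self.docs.get(doc_id)
        if current is not None and current["_rev"] != rev:
            # The document changed since the caller read it; the caller
            # must re-read, merge, and retry.
            raise ValueError("conflict: document updated since read")
        new_rev = 1 if current is None else current["_rev"] + 1
        self.docs[doc_id] = {**doc, "_rev": new_rev}
        return new_rev

store = Store()
rev = store.put("alice", {"balance": 100})
store.put("alice", {"balance": 90}, rev=rev)
try:
    store.put("alice", {"balance": 80}, rev=rev)  # stale revision
except ValueError as e:
    print(e)  # prints "conflict: document updated since read"
```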

The example in the article is short and isolated, which makes it ideal for showing to SQL fanatics =)

enobayram | 9 hours ago

I can't imagine someone being a "SQL fanatic" and not already knowing the introduction paragraph of concurrency 101.