Forty-Bot | 13 hours ago
You have to call shutdown(SHUT_WR) before close().
But the real problem is that nginx will stop sending the request as soon as it receives any bytes of the response. This is explicitly allowed by e.g. FastCGI, but nginx does not allow it.
majaha | 9 hours ago
Why should that be fundamentally different from shutdown(SHUT_RDWR)?
Anyway, I recreated the test code in Rust.
And in my testing it intermittently fails even if you have shutdown(SHUT_WR):
server output:
sent 10000000 bytes
client output:
connected on the 11th time, total sent 1000, total received 9423712 bytes
Recv returned error: Connection reset by peer (os error 104)
shutdown(SHUT_RDWR) seems to fail every time on my Windows machine, while shutdown(SHUT_WR) fails about 10% of the time. On my Linux machine, they both seem to fail about 10% of the time.
Forty-Bot | 9 hours ago
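> Why should that be fundamentally different from shutdown(SHUT_RDWR)?
That will also work, but you may be able to shut down writes before you've finished reading, and it could reset the connection if there is still data to read (see below).
> And in my testing it intermittently fails even if you have shutdown(SHUT_WR):
You have to read the data the client sends.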
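For illustration, the advice above (half-close the write side, then read whatever the client sends before closing) could be sketched in Rust roughly as follows. This is a minimal sketch under assumptions, not the code from the blog post or from anyone in this thread: the names send_then_drain and demo, the 1,000,000-byte response, and the 1000-byte request are made up for the example, and only std::net is used.

```rust
use std::io::{self, Read, Write};
use std::net::{Shutdown, TcpListener, TcpStream};
use std::thread;

/// Hypothetical helper: send `response`, half-close, then read until the
/// peer closes its side. Returns how many request bytes were drained.
fn send_then_drain(mut stream: TcpStream, response: &[u8]) -> io::Result<u64> {
    stream.write_all(response)?;
    // Half-close: the peer sees EOF after the response, but we can still read.
    stream.shutdown(Shutdown::Write)?;

    let mut drained = 0u64;
    let mut buf = [0u8; 4096];
    loop {
        match stream.read(&mut buf) {
            Ok(0) => break, // peer closed its write side
            Ok(n) => drained += n as u64,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    // `stream` is dropped on return, closing the socket only after the
    // peer's data has been read and EOF has been seen.
    Ok(drained)
}

/// Loopback demo. Returns (bytes the server drained, bytes the client received).
fn demo() -> io::Result<(u64, usize)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    let server = thread::spawn(move || -> io::Result<u64> {
        let (stream, _) = listener.accept()?;
        send_then_drain(stream, &vec![b'x'; 1_000_000])
    });

    let mut client = TcpStream::connect(addr)?;
    client.write_all(&[b'q'; 1000])?; // request data the server must consume
    let mut received = Vec::new();
    client.read_to_end(&mut received)?; // returns once the server half-closes
    drop(client); // closing lets the server's read loop see EOF

    let drained = server.join().expect("server thread panicked")?;
    Ok((drained, received.len()))
}

fn main() -> io::Result<()> {
    let (drained, received) = demo()?;
    println!("server drained {} bytes, client received {} bytes", drained, received);
    Ok(())
}
```

Note that the read loop waits for as long as the client keeps the connection open, so a real server would want a limit on it.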
majaha | 8 hours ago
That's not what the original code does! https://movq.de/blog/postings/2026-05-05/1/server.c It doesn't explain why it succeeds 90% of the time either.
It also introduces a race condition. How do you know more data hasn't arrived between when you call recv() and when you call close()?
Forty-Bot | 8 hours ago
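Right. The original code should have this problem too. But the code in the blog post doesn't match the nginx/gunicorn situation since nginx does read the client data.
> It doesn't explain why it succeeds 90% of the time either.
Try using a Unix socket. I think that should fail all the time.
> It also introduces a race condition. How do you know more data hasn't arrived between when you call recv() and when you call close()?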
You don't. That's why TCP protocols need some kind of agreement for when to close the connection. In HTTP you know you can close the connection after you read Content-Length bytes. In SMTP you know you can close the connection after you receive QUIT\r\n.
mort | 18 hours ago
I had no idea this was a thing. In my mental model, not calling recv() could potentially cause problems due to buffers filling up, but I would never have guessed that sending data would potentially fail just because there's one unread byte in the receive buffer. Good to know!
Are there any solutions to this short of consuming all incoming data (and therefore opening yourself up to a DoS)? Is there a flag or a flush function or something else you can use to say "transmit all data now, then close the socket without regard for the receive buffer"?
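One possible middle ground, sketched here as an assumption rather than as an established answer from this thread, is to drain only up to a byte cap and a deadline before closing. A well-behaved client then gets a clean close, while a client that keeps sending or stalls is cut off (and may still see a reset). nginx's lingering_close setting appears to be similar in spirit. The names lingering_close, Drain, and demo, and the 64 KiB / 500 ms limits, are hypothetical choices for this example.

```rust
use std::io::{self, Read, Write};
use std::net::{Shutdown, TcpListener, TcpStream};
use std::thread;
use std::time::{Duration, Instant};

/// Outcome of a bounded drain (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Drain {
    PeerClosed(u64), // saw EOF after draining this many bytes
    GaveUp(u64),     // hit the byte cap or the deadline first
}

/// Send `response`, half-close, then drain at most roughly `max_bytes`
/// (may overshoot by one buffer) for at most `max_wait` before closing.
fn lingering_close(
    mut stream: TcpStream,
    response: &[u8],
    max_bytes: u64,
    max_wait: Duration,
) -> io::Result<Drain> {
    stream.write_all(response)?;
    stream.shutdown(Shutdown::Write)?;

    let deadline = Instant::now() + max_wait;
    let mut drained = 0u64;
    let mut buf = [0u8; 4096];
    while drained < max_bytes {
        let remaining = deadline.saturating_duration_since(Instant::now());
        if remaining.is_zero() {
            break; // deadline passed (a zero read timeout is rejected by std)
        }
        stream.set_read_timeout(Some(remaining))?;
        match stream.read(&mut buf) {
            Ok(0) => return Ok(Drain::PeerClosed(drained)),
            Ok(n) => drained += n as u64,
            // A read timeout surfaces as WouldBlock or TimedOut depending on the platform.
            Err(e) if matches!(e.kind(), io::ErrorKind::WouldBlock | io::ErrorKind::TimedOut) => break,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(Drain::GaveUp(drained))
}

/// Loopback demo: the client sends 1000 bytes, reads the response, and then
/// either closes right away or holds the socket open past the deadline.
fn demo(client_hangs: bool) -> io::Result<Drain> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let server = thread::spawn(move || -> io::Result<Drain> {
        let (stream, _) = listener.accept()?;
        lingering_close(stream, b"bye", 64 * 1024, Duration::from_millis(500))
    });

    let mut client = TcpStream::connect(addr)?;
    client.write_all(&[b'q'; 1000])?;
    let mut received = Vec::new();
    client.read_to_end(&mut received)?;
    if client_hangs {
        thread::sleep(Duration::from_millis(1500)); // outlast the server's deadline
    }
    drop(client);
    server.join().expect("server thread panicked")
}

fn main() -> io::Result<()> {
    println!("polite client:  {:?}", demo(false)?);
    println!("hanging client: {:?}", demo(true)?);
    Ok(())
}
```

The cap and deadline bound how much work a single client can force, which is the DoS concern raised above; what values are reasonable depends on the protocol.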
marginalia | 19 hours ago
A bit of a tangent, but Docker can cause issues like these if you have a service that is restarting (and sometimes even if you don't). It seems that under some conditions it rebuilds its virtual network devices, including messing with the firewall and routing table. This doesn't just affect the container(s), but also processes outside of Docker, because Docker is not good software.
dvogel | 13 hours ago
This is a really good write-up. I've seen the same symptoms in a similar-sounding deployment. I never had the time to dig down to this level because it wasn't causing any observable issues for users. Now I almost want to run into the problem again so I can use this new info :)