Tracked down another bug in my reverse proxy; this one is to do with me giving Python's
asynchat module bad input.
Every now and then, the proxy would stop responding to requests and suck up as much CPU time as it could get.
netstat -an would tell me that there were a bunch of connection requests waiting (
SYN_SENT state) and a bunch that had been closed remotely (
CLOSE_WAIT state), probably from timed-out browser requests.
strace -p told me that the proxy was just continually calling
Adding more debug output showed me that the program was stuck in a loop calling the collector channel class's
collect_incoming_data() method. The
gettimeofday() call was from a call to
time.time() inside that method.
The interesting thing was that it was calling it with an empty string as the
data parameter. My theory is that I had called
set_terminator(0) somewhere earlier and that was causing
asynchat to run around in the aforementioned loop. There was a place in the code where I was calling
set_terminator(content_length), so I changed that to
set_terminator(content_length or None), but that didn't seem to fix it - so the assumption there was probably wrong.
Anyway, now I've put
assert len(data) in
collect_incoming_data(), which should kill any channels that get stuck this way in future. I haven't seen any assertion failures in the log files yet, so let's see if this fixes it...
Update: I've found a failure path. If you call
set_terminator(123), for example,
asynchat will pass the next 123 bytes of data through to
collect_incoming_data, then set the terminator to 0 itself, then call
found_terminator(). So if you EVER call
set_terminator() with a numeric argument, you MUST set the terminator to something else in the next call to
found_terminator() or your program will hang on the next byte of input on the socket. Ouch!