asyncore/asynchat development gotcha
Tracked down another bug in my reverse proxy; this one is to do with me giving Python's asynchat
module bad input.
Every now and then, the proxy would stop responding to requests and suck up as much CPU time as it could get. netstat -an
would tell me that there were a bunch of connection requests waiting (SYN_SENT
state) and a bunch that had been closed remotely (CLOSE_WAIT
state), probably from timed-out browser requests. strace -p
told me that the proxy was just continually calling gettimeofday()
.
Adding more debug output showed me that the program was stuck in a loop calling the collector channel class's collect_incoming_data()
method. The gettimeofday()
call was from a call to time.time()
inside that method.
The interesting thing was that it was calling it with an empty string as the data
parameter. My theory is that I had called set_terminator(0)
somewhere earlier and that was causing asynchat
to run around in the aforementioned loop. There was a place in the code where I was calling set_terminator(content_length)
, so I changed that to set_terminator(content_length or None)
, but that didn't seem to fix it - so the assumption there was probably wrong.
Anyway, now I've put assert len(data)
in collect_incoming_data()
, which should kill any channels that get stuck this way in future. I haven't seen any assertion failures in the log files yet, so let's see if this fixes it...
Update: I've found a failure path. If you call set_terminator(123)
, for example, asynchat
will pass the next 123 bytes of data through to collect_incoming_data
, then set the terminator to 0 itself, then call found_terminator()
. So if you EVER call set_terminator()
with a numeric argument, you MUST set the terminator to something else in the next call to found_terminator()
or your program will hang on the next byte of input on the socket. Ouch!