Phillip Pearson - web + electronics notes

tech notes and web hackery from a new zealander who was vaguely useful on the web back in 2002 (see: python community server, the blogging ecosystem, the new zealand coffee review, the internet topic exchange).

2008-10-20

Thinking about embedding Ruby

The one thing that keeps me from doing a lot more with Ruby (or Python, for that matter) on the web is that I haven't found a decent way to run lots of little "play" applications on a server without stuffing it full of memory.

I'm having a play today, trying to see if it would be feasible to run Ruby the way PHP works - embed everything into Apache and recompile all user scripts on the fly.

The Ruby interpreter seems to start reasonably quickly. In a fork-and-exec loop on my laptop, I can spawn Ruby and run a nearly-empty script 54 times per second, compared with 154 times for Lua and 22 for Python.
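A fork-and-exec loop like that can be sketched in Ruby itself (a rough illustration, not the original benchmark; Process.clock_gettime and Process.spawn are modern APIs that wouldn't have existed in 2008):

```ruby
# Rough benchmark of interpreter startup cost: spawn a fresh Ruby and
# run a nearly-empty script N times, then report repetitions per second.
# RbConfig.ruby is the path to the currently running Ruby interpreter;
# substitute "lua" or "python" argv arrays to compare other languages.
require 'rbconfig'

def spawn_rate(argv, n = 10)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  n.times do
    pid = Process.spawn(*argv, out: File::NULL, err: File::NULL)
    Process.wait(pid)
  end
  n / (Process.clock_gettime(Process::CLOCK_MONOTONIC) - start)
end

rate = spawn_rate([RbConfig.ruby, "-e", "x = 1"])
puts "ruby: %.1f repetitions/s" % rate
```

The absolute numbers will vary wildly by machine; what matters is the relative cost of a cold interpreter start per request.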

I had a quick try with embedding the Ruby interpreter, and found that forking and running ruby_init() and rb_eval_string("load 'myscript.rb'") gave about the same performance (50-65 repetitions/s). Running ruby_init() then forking and running the script worked much better, as expected, consistently running slightly over 400 repetitions/s. Here's the code: embed.cc.
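The winning shape there — initialize once in the parent, then fork a fresh child per request that compiles and runs the user script and exits — looks like this in plain Ruby (a sketch of the pattern only; the real test in embed.cc does this through the C API with ruby_init() and rb_eval_string, and 'myscript.rb' here is a throwaway stand-in for the user script):

```ruby
require 'tmpdir'

failures = 0
Dir.mktmpdir do |dir|
  # Stand-in for the per-hit user script; sets a global we can check.
  script = File.join(dir, "myscript.rb")
  File.write(script, "$result = 1 + 1\n")

  3.times do
    pid = fork do
      load script                 # recompile the user script on this hit
      exit!($result == 2 ? 0 : 1) # exit! skips at_exit handlers
    end
    _, status = Process.wait2(pid)
    failures += 1 unless status.success?
  end
end
puts "#{failures} failures"
```

The key property is that each child starts from a copy of the already-initialized parent, so the only per-request cost is compiling and running the script itself.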

This is promising. If I can code up my library functions tightly enough that recompiling them on every hit wouldn't hurt too much, it should work. An alternative is to write all library functions in C++ and load them as one big module, as with PHP.

Second test: instead of a test.rb that just sets a variable, I added a require 'mysql'. This brought the speed down to 100 repetitions per second, which is still respectable, considering that I can precache important libraries like this with a rb_eval_string("require 'mysql'") call before the fork (i.e. somewhere in the Apache init process, if this were an Apache module). Trying that brought the speed back up to 350 or so repetitions/s.
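The precaching effect can be demonstrated without the mysql binding at all; here the stdlib's 'json' stands in for it (a hedged sketch: timings are noisy and the library is much lighter than a database driver, but the shape is the same):

```ruby
# Time N fork-per-request cycles where each child requires a library
# and uses it. If the parent has already required the library, the
# child's require is a cached no-op, since forked children inherit
# the parent's loaded code.
def time_forks(n = 8)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  n.times do
    pid = fork do
      require 'json'            # no-op if the parent required it first
      JSON.parse('{"a": 1}')
      exit!(0)
    end
    Process.wait(pid)
  end
  n / (Process.clock_gettime(Process::CLOCK_MONOTONIC) - start)
end

cold = time_forks               # each child loads the library itself
require 'json'                  # precache in the parent, pre-fork
warm = time_forks               # children inherit the loaded library
puts "cold: %.0f reps/s, warm: %.0f reps/s" % [cold, warm]
```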

After initializing Ruby and loading the MySQL library, then forking and running the test script 1000 times, the parent process is sitting at around 6 MB of memory usage, which is fine.
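One way to spot-check that parent memory figure on Linux (an assumption: this reads /proc, so it won't work elsewhere):

```ruby
# Report the resident set size of the current (parent) process in kB,
# parsed from the VmRSS line of /proc/self/status. Linux-only.
def rss_kb
  File.foreach("/proc/self/status") do |line|
    return line.split[1].to_i if line.start_with?("VmRSS:")
  end
  nil
end

puts "parent RSS: #{rss_kb} kB"
```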

For comparison at this point, I tried creating two Rack "hello world" apps, setting PassengerMaxPoolSize to 1, and repeatedly hitting one, then the other. This forces Passenger to unload one app and load the other on every hit. This managed about 11 requests per second (compared to about 100-250 per second for a single app).
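The forced-eviction setup is roughly this (PassengerMaxPoolSize is a real Passenger directive; the server names and paths below are hypothetical):

```apache
# Allow only one app instance in the pool, so alternating requests
# between the two vhosts force Passenger to unload one app and
# load the other every time.
PassengerMaxPoolSize 1

<VirtualHost *:80>
    ServerName app-a.example.com          # hypothetical name
    DocumentRoot /var/www/app_a/public    # hypothetical path
</VirtualHost>

<VirtualHost *:80>
    ServerName app-b.example.com
    DocumentRoot /var/www/app_b/public
</VirtualHost>
```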

My impression so far is that keeping an initialized Ruby interpreter sitting around, then forking on every request to compile and run the app code, is feasible. It will probably slow down rapidly as the app gets bigger, but hopefully there will be a natural point where it makes sense to switch from this environment to Passenger, keeping the loaded app in memory between requests (which uses an extra 20 MB or so of memory but lets you more or less forget about startup time).