Speedup "RelStorage with memcached" done right

There are several options when using RelStorage with memcached. Read how to do it right (in other words: fast).

RelStorage is a great option as a backend for ZODB. RelStorage uses a shared memcached as a second-level cache for all instances storing to the same database.

In comparison, a classical ZEO client with its ZEO server as backend uses one filesystem cache per running instance (shared by all connection pools of this instance). In both cases (ZEO and RelStorage) pickled objects are stored in the cache. ZEO writes the pickles to the filesystem, which takes time unless you're using a RAM disk. When reading them back they are probably in RAM (OS-level disk cache), but you cannot know for sure. Having enough free RAM helps here, but prediction is difficult. Also, the one-cache-per-instance limitation makes this way of caching less efficient when running two or more instances for a larger site.

Additionally, sysadmins usually hate the ZEO server (because it's exotic) and love PostgreSQL (well-documented 'standard' tech they know how to work with), which is a good reason to use PostgreSQL. On the client side there are advantages too: while the first-level connection cache is the same as with a usual ZEO client, the second-level cache is shared between all clients.

[apt-get|yum] install memcached - done. Really?

No! We need to choose between pylibmc and python-memcached. But which one is better?

Also, memcached is running on the same machine as the instances! So we can use Unix sockets instead of TCP/IP. But which is better?
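To make the four combinations concrete, here is a minimal sketch of how the server address differs per library and transport. The helper name `server_spec` and the socket path are made up for illustration. The `unix:` prefix for python-memcached and the bare path for pylibmc are what I understand the two libraries to accept, so check them against the versions you have installed.

```python
def server_spec(library, host="127.0.0.1", port=11211, socket_path=None):
    """Return the server string to hand to the client library.

    Hypothetical helper: `library` is "python-memcached" or "pylibmc".
    With socket_path=None a TCP address is built, otherwise a Unix socket one.
    """
    if library not in ("python-memcached", "pylibmc"):
        raise ValueError("unknown library: %r" % (library,))
    if socket_path is None:
        # Both libraries take plain "host:port" for TCP.
        return "%s:%d" % (host, port)
    if library == "python-memcached":
        # Assumption: python-memcached marks Unix sockets with a "unix:" prefix.
        return "unix:" + socket_path
    # Assumption: pylibmc treats a spec starting with "/" as a Unix socket.
    return socket_path


# Example path only; use whatever socket your memcached was started with.
print(server_spec("python-memcached", socket_path="/tmp/memcached.sock"))
print(server_spec("pylibmc", socket_path="/tmp/memcached.sock"))
print(server_spec("pylibmc"))
```

The resulting string is what goes into the list passed to `memcache.Client([...])` or `pylibmc.Client([...])`.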

Benchmarking FTW!

I did some benchmarks. The assumptions: we have random data to write and read under different keys, and we also want to check whether accessing non-existent keys adds overhead. I quickly put together a little script that gives me numbers. After configuring two similar memcached instances with 1024 MB each, one listening on TCP and the other on a Unix socket, I ran this script and got the following result:
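The kind of measurement described above can be sketched roughly like this. It is not the original script, just a minimal harness under my own assumptions: the names `run_benchmark` and `DictClient` are hypothetical, and the in-memory `DictClient` only stands in for a real client so the sketch runs without a memcached server. It relies only on `set(key, value)` and `get(key)`, which both python-memcached and pylibmc clients provide.

```python
import os
import time


class DictClient:
    """In-memory stand-in offering the set/get subset used below."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)


def run_benchmark(client, n_keys=1000, value_size=1024, miss_ratio=0.5):
    """Time writes, reads of existing keys and reads of non-existent keys."""
    keys = ["bench-%d" % i for i in range(n_keys)]
    values = [os.urandom(value_size) for _ in keys]  # random payloads
    missing = ["absent-%d" % i for i in range(int(n_keys * miss_ratio))]

    start = time.perf_counter()
    for key, value in zip(keys, values):
        client.set(key, value)
    write_s = time.perf_counter() - start

    start = time.perf_counter()
    hits = sum(1 for key in keys if client.get(key) is not None)
    read_s = time.perf_counter() - start

    start = time.perf_counter()
    misses = sum(1 for key in missing if client.get(key) is None)
    miss_s = time.perf_counter() - start

    return {"write_s": write_s, "read_s": read_s, "miss_s": miss_s,
            "hits": hits, "misses": misses}


if __name__ == "__main__":
    print(run_benchmark(DictClient()))
```

To compare the real setups, pass a `memcache.Client([...])` or `pylibmc.Client([...])` instance instead of `DictClient()`, once pointing at the TCP memcached and once at the socket one, and run each variant several times to smooth out noise.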

Benchmark pylibmc vs. python-memcached

In short:

  • memcached with sockets is ~30% faster than memcached with TCP
  • pylibmc is significantly faster with TCP
  • python-memcached is slightly faster with sockets

Now, this was valid on my machine. How does it behave in different environments? If you try or tweak the script and get similar or different results, please let me know!

Overall, RelStorage will be faster if configured with sockets. If this is not possible, choosing the right library will speed things up at least a bit.

Picture at top by Gregg Sloan at Flickr