Sunday, July 25, 2010

11:06 PM
Memcached is a scalable caching solution developed by Danga Interactive. One can do a lot of cool things with memcached, including spam control, online/offline detection of users, and building scalable web services. In this post, I will demonstrate and explain how to implement fast, scalable queues in PHP on top of memcached.


MEMQ: Overview
Every queue is uniquely identified by its name. Let’s consider a queue named “foo” and see how MEMQ implements it inside memcached:

  • Two keys, foo_head and foo_tail, contain meta information about the queue
  • While queuing, the item is saved in key foo_1234, where 1234 is the current value of key foo_tail
  • While de-queuing, the item saved in key foo_123 is returned, where 123 is the current value of key foo_head
  • The values of keys foo_head and foo_tail start at 1 and get incremented on every pop and push operation respectively
  • The value of key foo_head NEVER exceeds the value of foo_tail. When the values of the two meta keys are the same, the queue is considered empty.
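
The bookkeeping above can be sketched without a memcached server. What follows is a minimal illustration, not the MEMQ class itself: a plain PHP array stands in for memcached, and the function names sketch_enqueue and sketch_dequeue are made up for this example. It follows the bullets literally, writing an item at the current tail index before the tail advances and reading from the current head index before the head advances.

```php
<?php
// Illustration only: $store is a plain array standing in for memcached.
// sketch_enqueue() and sketch_dequeue() are hypothetical names, not part of MEMQ.

function sketch_enqueue(array &$store, string $queue, $item): int {
    // Both meta keys start at 1 the first time the queue is used.
    if (!isset($store[$queue . '_tail'])) {
        $store[$queue . '_tail'] = 1;
        $store[$queue . '_head'] = 1;
    }
    $id = $store[$queue . '_tail'];       // current tail value names the item key
    $store[$queue . '_' . $id] = $item;
    $store[$queue . '_tail'] = $id + 1;   // tail advances on every push
    return $id;
}

function sketch_dequeue(array &$store, string $queue) {
    $head = $store[$queue . '_head'] ?? false;
    $tail = $store[$queue . '_tail'] ?? false;
    // Equal (or missing) meta keys mean the queue is empty.
    if ($head === false || $tail === false || $head >= $tail) {
        return false;
    }
    $store[$queue . '_head'] = $head + 1; // head advances on every pop
    return $store[$queue . '_' . $head];
}

$store = [];
sketch_enqueue($store, 'foo', 'first');
sketch_enqueue($store, 'foo', 'second');
var_dump(sketch_dequeue($store, 'foo')); // string(5) "first"
var_dump(sketch_dequeue($store, 'foo')); // string(6) "second"
var_dump(sketch_dequeue($store, 'foo')); // bool(false)
```

Unlike memcached's increment, this array version is not atomic. It only shows how the key names and the two counters relate to each other.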
MEMQ: Code

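<?php

// Comma-separated list of memcached servers, each as host:port
define('MEMQ_POOL', 'localhost:11211');
// Expiry for queue keys in seconds (0 = never expire)
define('MEMQ_TTL', 0);

class MEMQ {

    // Shared Memcached connection, created on first use
    private static $mem = NULL;

    // Static-only class: no instances, no clones
    private function __construct() {}

    private function __clone() {}

    private static function getInstance() {
        if(!self::$mem) self::init();
        return self::$mem;
    }

    private static function init() {
        $mem = new Memcached;
        $servers = explode(",", MEMQ_POOL);
        foreach($servers as $server) {
            list($host, $port) = explode(":", $server);
            $mem->addServer($host, (int) $port);
        }
        self::$mem = $mem;
    }

    public static function is_empty($queue) {
        $mem = self::getInstance();
        $head = $mem->get($queue."_head");
        $tail = $mem->get($queue."_tail");

        // Missing meta keys mean the queue was never used
        if($head === FALSE || $tail === FALSE || $head >= $tail)
            return TRUE;
        else
            return FALSE;
    }

    public static function dequeue($queue, $after_id=FALSE, $till_id=FALSE) {
        $mem = self::getInstance();

        if($after_id === FALSE && $till_id === FALSE) {
            // Pop mode: claim the next slot by atomically advancing head
            $tail = $mem->get($queue."_tail");
            if(($id = $mem->increment($queue."_head")) === FALSE)
                return FALSE;

            if($id <= $tail) {
                // Head has not passed tail: return the item at the previous head index
                return $mem->get($queue."_".($id-1));
            }
            else {
                // Head would pass tail: roll back and report nothing to pop.
                // The item at index $tail itself stays queued until a later
                // enqueue advances tail.
                $mem->decrement($queue."_head");
                return FALSE;
            }
        }
        else if($after_id !== FALSE && $till_id === FALSE) {
            // Range mode without an upper bound: read up to the current tail
            $till_id = $mem->get($queue."_tail");
        }

        // Range mode: collect keys after $after_id up to and including $till_id
        $item_keys = array();
        for($i=$after_id+1; $i<=$till_id; $i++)
            $item_keys[] = $queue."_".$i;

        // php-memcached 3.x signature; older 2.x releases took a
        // by-reference $cas_tokens argument before the flags
        return $mem->getMulti($item_keys, Memcached::GET_PRESERVE_ORDER);
    }

    public static function enqueue($queue, $item) {
        $mem = self::getInstance();

        // Atomically advance tail; this fails if the key does not exist yet
        $id = $mem->increment($queue."_tail");
        if($id === FALSE) {
            if($mem->add($queue."_tail", 1, MEMQ_TTL) === FALSE) {
                // Another client created the tail key first: retry the increment
                $id = $mem->increment($queue."_tail");
                if($id === FALSE)
                    return FALSE;
            }
            else {
                // First use of this queue: tail was just created, so create head too
                $id = 1;
                $mem->add($queue."_head", $id, MEMQ_TTL);
            }
        }

        // Store the item under <queue>_<id>
        if($mem->add($queue."_".$id, $item, MEMQ_TTL) === FALSE)
            return FALSE;

        return $id;
    }

}

?>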



MEMQ: Usage

The class file provides 3 methods which can be used to implement queues:


  1. MEMQ::is_empty – Returns TRUE if a queue is empty, otherwise FALSE
  2. MEMQ::enqueue – Queue up the passed item
  3. MEMQ::dequeue – De-queue an item from the queue
Specifically, MEMQ::dequeue can run in two modes depending upon the parameters passed, as defined below:
  1. $queue: This is a MUST for dequeue to work. If the other optional parameters are not passed, the top item from the queue is returned
  2. $after_id: If this parameter is also passed, all items after $after_id up to the end of the queue are returned
  3. $till_id: If this parameter is also passed along with $after_id, dequeue acts like a popRange function
Whenever the optional parameters are passed, MEMQ does not remove the returned items from the queue.
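
To make the range mode concrete, the sketch below builds the list of item keys a range read would fetch. memq_range_keys is a hypothetical helper written for this illustration. It mirrors the loop inside MEMQ::dequeue and does not talk to memcached.

```php
<?php
// Hypothetical helper: lists the keys a range read covers.
// $after_id is exclusive, $till_id is inclusive.
function memq_range_keys(string $queue, int $after_id, int $till_id): array {
    $keys = [];
    for ($i = $after_id + 1; $i <= $till_id; $i++) {
        $keys[] = $queue . '_' . $i;
    }
    return $keys;
}

print_r(memq_range_keys('foo', 2, 5)); // foo_3, foo_4, foo_5
```

With the class from this post, MEMQ::dequeue('foo', 2, 5) would request these three keys, while MEMQ::dequeue('foo', 2) would use the current value of foo_tail as the upper bound.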

MEMQ: Is it working?

Add the following line of code at the end of the above class file (before the closing ?> tag) and load the class file in your browser, passing the queue name in the q query parameter, for example ?q=foo. You will get back the inserted item id as the response:



var_dump(MEMQ::enqueue($_GET['q'], time()));

Let’s see what the cache keys look like in memcached:


Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

get foo_head
VALUE foo_head 1 1
1
END

get foo_tail
VALUE foo_tail 1 1
2
END

get foo_1
VALUE foo_1 1 10
1265540583
END

get foo_2
VALUE foo_2 1 10
1265540585
END




MEMQ: Benchmark

Below are the benchmarking results for varying load:


  1. Queuing performance: 697.18 req/sec (n=1000, c=100) and 258.64 req/sec (n=5000, c=500)
  2. Dequeue performance: 641.27 req/sec (n=1000, c=100) and 242.87 req/sec (n=5000, c=500)

MEMQ: Why and other alternatives

There are several open source alternatives which provide a lot more scalability. However, MEMQ was written because my application doesn’t expect a load on the order of 10,000 hits/sec. Listed below are a few open source alternatives for applications expecting high load:


  1. ActiveMQ: A reliable and fast solution under the Apache Software Foundation
  2. RabbitMQ: Another reliable solution, based on the AMQP protocol
  3. Memcacheq: A mash-up of two very stable stacks, namely memcached and Berkeley DB. However, its installation is a bit tricky.

MEMQ: Mantra and Customization

At its base, the MEMQ implementation can be visualized as follows:


There is a race between two keys in memcached (foo_head and foo_tail). They are incremented on every dequeue and queue operation respectively. However, foo_tail is strong enough and never allows foo_head to exceed it. When the values of keys foo_tail and foo_head are equal, the queue is considered empty.
The above code file doesn’t yet include utility methods like MEMQ::total_items. However, writing such methods should be pretty easy, depending upon your application’s needs. Depending upon your application’s requirements, you should also take care of overflowing integer values.
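
As one example of such a utility, here is a rough sketch of the arithmetic a total_items method could use. The function name memq_count and its arguments are assumptions made for this illustration. It takes the two meta values as they would come back from memcached (FALSE when a key is missing) and returns how many items sit between head and tail, which is zero in exactly the cases where MEMQ::is_empty reports an empty queue.

```php
<?php
// Hypothetical helper: number of items between head and tail.
// $head and $tail are the values read from <queue>_head and <queue>_tail,
// or false when the key does not exist yet.
function memq_count($head, $tail): int {
    if ($head === false || $tail === false) {
        return 0; // queue was never used
    }
    return max(0, (int) $tail - (int) $head);
}

var_dump(memq_count(1, 2));         // int(1)
var_dump(memq_count(false, false)); // int(0)
```

Inside the class, this could be wired up as a static method that reads both keys with $mem->get() and passes the results in.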
