Cron Queues: Processing large amounts of data in cron
I've just discovered my new favorite feature in Drupal 7: cron queues. Previously, when we needed to process large amounts of data (like sending 10,000 emails), we were left trying to get the Batch API and cron to work together, which is not the easiest thing in the world.
Recently we needed to regularly run command-line tasks that could take quite a long time, and I needed to get this working with cron in Drupal 7. I knew there was a new queue system in Drupal 7, so I started reading up on it and looking at the code. I couldn't quite figure out how to get it to work with cron, though, until I looked at the cron code. Once I did, I realized just how awesomely brilliant it was. Here is how you can queue up data and process it with cron. I'm going to show this example as a "runner" module.
Step 1 - Create a Cron Queue and callback task
The first thing we need to do is create what is called a cron queue. These are hybrid queues with a callback function that will be called for each item in the queue. This is done with hook_cron_queue_info(), which tells cron to create a queue and which function to call for each item during the cron run. You can define multiple cron queues within a single hook_cron_queue_info() implementation.
/**
 * Implementation of hook_cron_queue_info()
 */
function runner_cron_queue_info() {
  $queues['runner'] = array(
    'worker callback' => 'runner_run', // This is the callback function for each queue item.
    'time' => 180, // This is the max run time per cron run in seconds.
  );
  return $queues;
}
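Since one hook implementation can return several queues, here is a minimal sketch of what that might look like. The module name "mymodule", the queue keys and the callback names are all made up for illustration and are not part of the runner example.

```php
/**
 * Implementation of hook_cron_queue_info() for a hypothetical "mymodule".
 */
function mymodule_cron_queue_info() {
  $queues = array();
  // Hypothetical queue for slow command-line tasks.
  $queues['mymodule_tasks'] = array(
    'worker callback' => 'mymodule_run_task',
    'time' => 180,
  );
  // Hypothetical queue for outgoing emails, with a shorter time budget.
  $queues['mymodule_mail'] = array(
    'worker callback' => 'mymodule_send_mail',
    'time' => 60,
  );
  return $queues;
}
```

Each queue gets its own worker callback and its own time budget per cron run. As far as I can tell from the Drupal 7 cron code, leaving out 'time' falls back to a default of 15 seconds.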
Step 2 - Fill the queue with data
This is the step that took me a minute to figure out, since I wasn't sure what to do during the hook_cron() phase. The hook_cron_queue_info() implementation above will automatically create a queue with the key specified, so we just need to get it and fill it with the items to process.
/**
 * Implementation of hook_cron()
 */
function runner_cron() {
  $items = array("Hello", "World");
  // Put everything in a queue for processing.
  $queue = DrupalQueue::get('runner');
  foreach ($items as $item) {
    $queue->createItem($item);
  }
}
Step 3 - Create the worker callback function
The last thing we need to do is create the function that will be called to process the data. It receives one queue item at a time and can do whatever we want with it.
/**
 * Worker Callback for the runner cron queue.
 */
function runner_run($item) {
  print $item;
}
That's it!
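You could have just iterated over the items and processed them directly in hook_cron(), but the queue protects you against memory and time overruns if your dataset gets too large. Cron stops working through the queue once the time limit is reached, and whatever is left stays in the queue for the next cron run.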
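To make the time-limit behavior a bit more concrete, here is a rough, self-contained sketch of the kind of loop cron runs for each queue. This is not Drupal's actual code: SketchQueue and sketch_process_queue() are hypothetical stand-ins I made up so the example runs without Drupal, and the real DrupalQueue is backed by the database rather than an array.

```php
// Hypothetical in-memory stand-in for a Drupal queue (not the real class).
class SketchQueue {
  private $items = array();

  public function createItem($data) {
    $this->items[] = $data;
  }

  // Returns the next item wrapped in an object, or FALSE when empty.
  public function claimItem() {
    if (empty($this->items)) {
      return FALSE;
    }
    $item = new stdClass();
    $item->data = reset($this->items);
    return $item;
  }

  // Removes the item that was just claimed (always the first one here).
  public function deleteItem($item) {
    array_shift($this->items);
  }

  public function numberOfItems() {
    return count($this->items);
  }
}

// Simplified version of what cron does for a single queue: keep claiming
// items and handing them to the worker callback until the time runs out
// or the queue is empty.
function sketch_process_queue(SketchQueue $queue, $callback, $time_limit) {
  $end = time() + $time_limit;
  $processed = 0;
  while (time() < $end && ($item = $queue->claimItem())) {
    call_user_func($callback, $item->data);
    $queue->deleteItem($item);
    $processed++;
  }
  return $processed;
}

$queue = new SketchQueue();
foreach (array('Hello', 'World') as $data) {
  $queue->createItem($data);
}

// Collect what the worker sees instead of printing it.
$seen = array();
$processed = sketch_process_queue($queue, function ($data) use (&$seen) {
  $seen[] = $data;
}, 180);
```

The important part is the while condition: the time check happens before each item is claimed, so an item that is never claimed is never lost. It just waits in the queue until cron comes around again.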
I'm very impressed with how much easier this is to implement than the same solution in Drupal 6. Kudos to chx, Crell, sun, dww, dries and all the rest who got this rolled in: http://drupal.org/node/578676
Photo credit: http://www.flickr.com/photos/olly247/2831803988/