Asynchronous Processing with PHP on App Engine

OSCON 2013 Speaker Series

Note: Amy Unruh, Google Cloud Platform Developer Relations, is just one of the many fantastic speakers we have at OSCON this year. If you are interested in attending to check out Amy’s talk or the many other cool sessions, click over to the OSCON website where you can use the discount code OS13PROG to get 20% off your registration fee.

At this year’s Google I/O, we launched the PHP runtime for Google App Engine, part of the Google Cloud Platform. App Engine is a service that lets you build web apps using the same scalable infrastructure that powers many of Google’s own applications. With App Engine, there are no servers to maintain; you just upload your application, and it’s ready to go.

App Engine’s services support and simplify many aspects of app development. One of those services is Task Queues, which lets you easily add asynchronous background processing to your PHP app, and allows you to simultaneously make your applications more responsive, more reliable, and more scalable.

The App Engine Task Queue service allows your application to define tasks, add them to a queue, and then use the queue to process them asynchronously, in the background. App Engine automatically scales processing capacity to match your queue configuration and processing volume. You define a Task by specifying the application-specific URL of a handler for the task, along with (optionally) parameters or a payload for the task, and other settings, then add it to a Task Queue.

An Example

Let’s look at how easy it is to use Task Queues in PHP App Engine apps.

Suppose we are building a photo-sharing site where users can upload their images. When a user uploads a picture, we want to do some image processing to generate and store several other sizes of the image. But we don’t need to keep users waiting while this happens. Instead, we’ll do it in the background.

To do this, we’ll use App Engine Task Queues. We’ll first define a task request handler that does the image processing. Suppose it’s called /image_processing.php, and takes two parameters: file_id, which is a string pointing to the uploaded image file, and the user_id of the user who uploaded the file. We’ll use the URL of this handler to define a Task Queue task as follows:

require_once 'google/appengine/api/taskqueue/PushTask.php';
use google\appengine\api\taskqueue\PushTask;

$task = new PushTask('/image_processing.php', ['file_id' => $fileid, 'user_id' => $userid]);

The first argument to the PushTask constructor is the URL of our task handler, and we’re passing our two parameters as the second argument. By default, tasks are run as POST requests, so these will be accessible as POST parameters.

Then, we’ll add our task to the app’s default task queue like this:

$task_name = $task->add();

When the image_processing.php handler runs—as a background task—it can access the params used in the task definition:

$file_id = $_POST['file_id'];
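Because the handler runs without a user on the other end, it helps to check its inputs before doing any work. The following is a minimal sketch of one way to do that. Note that parse_image_task_params is a hypothetical helper, not part of the App Engine SDK, and finishing normally on malformed input is an assumption about what this app would want: App Engine treats a 2xx response as success and retries the task on other statuses, so retrying the same bad input would not help.

```php
<?php
// Hypothetical helper (not part of the App Engine SDK): pulls the two
// parameters the image-processing task expects out of a POST array.
// Returns null when either one is missing, empty, or not a string.
function parse_image_task_params(array $post) {
    foreach (['file_id', 'user_id'] as $key) {
        if (!isset($post[$key]) || !is_string($post[$key]) || $post[$key] === '') {
            return null;
        }
    }
    return ['file_id' => $post['file_id'], 'user_id' => $post['user_id']];
}

// Sketch of how /image_processing.php might use it:
//
//   $params = parse_image_task_params($_POST);
//   if ($params === null) {
//       // Malformed task: log it and finish normally (HTTP 200), since
//       // retrying the same bad input would not help.
//       syslog(LOG_WARNING, 'image task is missing file_id or user_id');
//       return;
//   }
//   // ... resize the image identified by $params['file_id'] ...
```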

If there are only a few tasks in a given queue, a task in that queue will be run (nearly) right away, asynchronously from the client request/response cycle. More typically, there can be many tasks in a given task queue. App Engine automatically scales queue processing capacity to match your queue configuration and desired throughput, and deletes tasks after successful processing. If a task fails to execute, App Engine retries it based on criteria that you can configure.
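The queue configuration and retry criteria mentioned above live in a queue.yaml file deployed with the app. The fragment below is a sketch of what that might look like for the default queue; the specific numbers are illustrative assumptions, not recommendations, and would need tuning for a real workload.

```yaml
queue:
- name: default
  rate: 5/s                    # assumed value: how fast tasks are dispatched
  bucket_size: 10              # assumed value: allowed burst size
  max_concurrent_requests: 20  # assumed value: cap on tasks running at once
  retry_parameters:
    task_retry_limit: 7        # assumed value: stop retrying after 7 attempts
    min_backoff_seconds: 10
    max_backoff_seconds: 200
```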

Task Fanout and Task Scheduling

Next, suppose that for each user of the photo-sharing site, we want to regularly pull in photos from their social media feeds. With many users, who might each have many social media accounts, this is a lot of background processing! But we can do this processing with high throughput, using App Engine Task Queues and task fanout.

We first write a handler to initiate the feed ingestion, called /ingestion.php. In this handler, we launch an ingestion task for each user account. Suppose our user account-processing handler is called /user_processing.php.

The /ingestion.php code to launch all our user processing tasks might then look something like:

require_once 'google/appengine/api/taskqueue/PushTask.php';
use google\appengine\api\taskqueue\PushTask;

// ... get info on all the user accounts...
foreach ($user_accounts as $acct) {
    $task = new PushTask('/user_processing.php', ['user_account' => $acct]);
    $task_name = $task->add();
}

Here, we’re adding a large number of ‘user account’ tasks to the Task Queue. (We could also shard this step to add the tasks even faster.) Multiple user account tasks will then be processed concurrently.
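One simple way to cut the cost of adding many tasks is to batch them, so that each call to the Task Queue service adds a group of tasks rather than one. This is batching rather than sharding, and the sketch below is only one way to arrange it. It assumes the SDK's PushQueue class and its addTasks method; chunk_accounts and enqueue_user_tasks are hypothetical helper names, and the batch size of 100 is an assumption based on the service's limit on tasks per batch add.

```php
<?php
use google\appengine\api\taskqueue\PushQueue;
use google\appengine\api\taskqueue\PushTask;

// Hypothetical helper: splits the account list into groups small enough
// to enqueue with one call each. The default of 100 is an assumption
// based on the Task Queue limit on tasks per batch add.
function chunk_accounts(array $user_accounts, $batch_size = 100) {
    return array_chunk($user_accounts, $batch_size);
}

// Enqueues one /user_processing.php task per account, one batch at a time.
// Needs the App Engine PHP SDK at runtime; returns the number of tasks added.
function enqueue_user_tasks(array $user_accounts) {
    require_once 'google/appengine/api/taskqueue/PushQueue.php';
    require_once 'google/appengine/api/taskqueue/PushTask.php';
    $queue = new PushQueue();  // no argument: the default queue
    $added = 0;
    foreach (chunk_accounts($user_accounts) as $batch) {
        $tasks = [];
        foreach ($batch as $acct) {
            $tasks[] = new PushTask('/user_processing.php', ['user_account' => $acct]);
        }
        // addTasks returns the names of the tasks it added.
        $added += count($queue->addTasks($tasks));
    }
    return $added;
}
```

In /ingestion.php, the foreach loop would then be replaced by a single call such as enqueue_user_tasks($user_accounts).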

When the task for a given user runs, we will again launch a set of subtasks—this time, one for each of the user’s configured social media sources. So, the code in /user_processing.php might look like:

require_once 'google/appengine/api/taskqueue/PushTask.php';
use google\appengine\api\taskqueue\PushTask;

if (isset($_POST['user_account'])) {
  $user_account = $_POST['user_account'];
  // ... get the user's social media sources based on their account info...
  foreach ($sources as $source) {
    $task = new PushTask('/ingest_source.php', ['user_account' => $user_account, 
        'source' => $source]);
    $task_name = $task->add();
  }
}

Again, we’re adding many tasks to the queue, and multiple tasks will be processed concurrently. This fanout pattern lets us easily and quickly process a large number of tasks, and our App Engine app will scale in response, spinning up additional instances as necessary (according to the task queue configuration specs).

Finally, we’ll trigger the initiating handler—the one that starts the ingestion fanout—via a cron job that runs (say) every 2 hours. All that we need to do to set this up is to deploy a cron.yaml file with our app, with contents like this:

cron:
- description: ingestion cron task for users
  url: /ingestion.php
  schedule: every 2 hours

Once the cron is set up, we’ll be running the ingestion processing for all our users every 2 hours. App Engine will scale to handle the increased activity automatically, so we don’t have to worry that the background processing will impact the handling of our client requests.


We’ve just scratched the surface of what you can do with Task Queues, which are one of the most powerful and flexible features of App Engine. We hope you’ll explore further, and we look forward to hearing more about how you’re using them in PHP.
