Using WordPress, SQS, Lambda and Mailgun to send over 100,000 personalised transactional emails

8th March 2019

By Zac Scott

As part of a custom jobs board project built on top of WordPress, we had to build a system able to send over 100,000 personalised emails in a very short period of time, all within the WordPress ecosystem. This presented some scaling and infrastructure challenges that we wanted to share.

The Problem

The jobs board has hundreds of thousands of users, any of whom can opt in to a daily or weekly EDM (an email alert) containing new jobs that match their preferences or search parameters.

That meant that, every day, the system needed to search a database of thousands of new and old jobs and evaluate, for each user, whether they were due to receive an alert with new matching jobs.

On top of that, these emails all needed to go out at approximately the same time. Because each email contains newly matching jobs for a user's search parameters, you wouldn't want it to go out early in the morning, before advertisers have posted their job ads.

A Typical WordPress Solution

A typical WordPress-based solution to this problem would be to run the email generation process via a WP cron. This process would first generate the email content, personalised to the user, then send it out to the user via WordPress' mail function, wp_mail().

The problem with this approach is that it is a synchronous process. Each email would be generated and sent one after another, severely bottlenecking the process.

You could end up in a situation where the cron simply never finishes. A better solution would generate and send more than one email at a time, concurrently.

The sequential approach might work for lower-scale sends, but it could not possibly keep up with the throughput expected of the system, i.e. 20k sends in an hour.
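
To put that target in numbers, here is a rough back-of-envelope sketch. The 20k-per-hour figure is the one quoted above; the pool of 10 workers and the function name are illustrative assumptions only.

```python
def seconds_per_email_budget(emails_per_hour: int, workers: int = 1) -> float:
    """Longest each email may take, on average, to hit the hourly target."""
    return workers * 3600 / emails_per_hour


# 20,000 emails per hour is the target quoted above.
sequential_budget = seconds_per_email_budget(20_000)
# Assumption: 10 workers running side by side, purely for illustration.
concurrent_budget = seconds_per_email_budget(20_000, workers=10)

print(f"one at a time: {sequential_budget:.2f}s per email")
print(f"10 at a time:  {concurrent_budget:.2f}s per email")
```

A single process has roughly 0.18 seconds to search, render and send each email, while ten concurrent workers would have roughly 1.8 seconds each.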

In fact, the cron approach was the original solution that we were replacing. The old job board had been built with a single cron that sent these emails out. This worked well when first implemented, but the job board had since very much outgrown this simple approach.

Our Solution

To solve this problem we decided to leverage the concurrency offered by the use of an AWS Lambda function.

AWS Lambda functions follow a serverless model which allows one to run a script/program to execute simple jobs without the overhead of managing additional infrastructure / servers. Most importantly it simplifies running more than one of these functions at the same time to take full advantage of concurrency.

The process we decided on is as follows:

A WP CLI command generates the list of users who are due an email and pushes their user IDs to the SQS queue.

The AWS Lambda function is automatically triggered when an item is added to the SQS queue. It calls a WordPress JSON endpoint, which performs the search, generates the email and pushes it to Mailgun's queue.

Up to 10 of these Lambdas will be running at any given time. This is configurable, but it was the number we settled on after some testing.

Once 6pm rolls around, Mailgun automatically delivers the email to the user.
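
The queueing and processing steps above can be sketched roughly as follows. This is not our production code: the real producer is a WP CLI command written in PHP, and the endpoint URL, the message shape and all function names here are hypothetical. It also assumes the SQS trigger has ReportBatchItemFailures enabled, so that only failed messages are retried.

```python
import json
import urllib.request
from typing import Callable, Iterable

SQS_MAX_BATCH = 10  # SendMessageBatch accepts at most 10 entries per call


def build_batches(user_ids: Iterable[int]) -> list[list[dict]]:
    """Group user IDs into SendMessageBatch-sized lists of entries."""
    entries = [
        {"Id": str(index), "MessageBody": json.dumps({"user_id": user_id})}
        for index, user_id in enumerate(user_ids)
    ]
    return [entries[i:i + SQS_MAX_BATCH] for i in range(0, len(entries), SQS_MAX_BATCH)]


def enqueue(sqs_client, queue_url: str, user_ids: Iterable[int]) -> None:
    """Producer side. sqs_client is expected to be boto3.client("sqs")."""
    for batch in build_batches(user_ids):
        # A fuller version would also inspect the "Failed" list in the response.
        sqs_client.send_message_batch(QueueUrl=queue_url, Entries=batch)


def call_wordpress(user_id: int) -> None:
    """Ask WordPress to generate and schedule one email (hypothetical endpoint)."""
    request = urllib.request.Request(
        "https://example.com/wp-json/jobs/v1/alerts",  # assumption, not the real URL
        data=json.dumps({"user_id": user_id}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        response.read()  # urlopen raises on 4xx/5xx responses


def make_handler(generate_email: Callable[[int], None]):
    """Consumer side: wrap a per-user callable in an SQS-triggered Lambda handler."""
    def handler(event, context):
        failures = []
        for record in event["Records"]:
            try:
                generate_email(json.loads(record["body"])["user_id"])
            except Exception:
                # Report only this message as failed so SQS redelivers it alone.
                failures.append({"itemIdentifier": record["messageId"]})
        return {"batchItemFailures": failures}
    return handler


handler = make_handler(call_wordpress)
```

Keeping the WordPress call behind a plain callable means the handler can be exercised locally with a stub, without touching AWS or the site.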

We chose Mailgun for email delivery over SES, which would otherwise be the natural fit, because Mailgun has superior logging and analytics. That meant we didn't have to build as much reporting as we would have needed with SES.

Below is an overview of this process:

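To nail the 6pm delivery time we decided to use the scheduled delivery option offered by Mailgun, o:deliverytime. If you are using the Mailgun plugin, this can easily be done by hooking into the message body filter like so:

add_filter( 'mg_mutate_message_body', function( $body ) {

    // Assumption: deliver at 6pm today in the site's timezone.
    // wp_timezone() requires WordPress 5.3 or later.
    $deliver_time = new DateTime( 'today 18:00', wp_timezone() );

    // Mailgun expects the delivery time in RFC 2822 format.
    $body['o:deliverytime'] = $deliver_time->format( DATE_RFC2822 );

    return $body;

} );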
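
The value passed to Mailgun needs to be an RFC 2822 date string. As a rough illustration of how the next 6pm slot could be computed and formatted, here is a small sketch in Python rather than PHP. The fixed UTC+10 offset and the function name are assumptions for the example, not details taken from the project.

```python
from datetime import datetime, time, timedelta, timezone
from email.utils import format_datetime

# Assumption: a fixed UTC+10 offset stands in for the site's real timezone.
SITE_TZ = timezone(timedelta(hours=10))


def next_delivery_time(now: datetime, hour: int = 18) -> str:
    """Return the next occurrence of `hour`:00 as an RFC 2822 string."""
    target = datetime.combine(now.date(), time(hour), tzinfo=now.tzinfo)
    if target <= now:
        # Today's slot has already passed, so schedule for tomorrow.
        target = datetime.combine(
            now.date() + timedelta(days=1), time(hour), tzinfo=now.tzinfo
        )
    return format_datetime(target)


morning = datetime(2019, 3, 8, 9, 0, tzinfo=SITE_TZ)
print(next_delivery_time(morning))  # Fri, 08 Mar 2019 18:00:00 +1000
```

Emails generated after the cut-off roll over to the following day instead of being scheduled in the past.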

The Outcome

The solution works wonderfully. The process of generating and scheduling the emails typically takes 2-3 hours, and Mailgun sends them out right at 6pm.

Each day, our custom WP CLI command automatically populates the SQS queue, and Lambda spins up to 10 concurrent invocations of the function to work through the backlog:

Users added to SQS in bulk and processed over a few hours by the Lambda function

Here you can see the blue bars, where Mailgun has accepted an email we have generated, and the green bars, where it has sent the emails out for us at the scheduled time.

Accepted emails and scheduled send in Mailgun