Lambda@Edge: putting Serverless even closer to your users

And how to use it to kill all your web servers

Lambda at Edge illustration

Lambda@Edge was first announced in preview by AWS back in December; it hit public release in July. We’ve been using it since the preview – here are some of the things we’ve learned along the way.

What is Lambda@Edge?

Lambda is AWS’s “serverless” platform for running code as functions – instead of booting EC2 Instances (i.e. “servers”) and worrying about operating systems, supervisors, logging, deployment, scaling, or any of the other factors required to actually run your code, you can simply write the code for a function in your favourite programming language (JavaScript, Python, Java or C#), and AWS will take care of the undifferentiated heavy lifting of actually getting that code to run somewhere.

CloudFront is AWS’s content delivery network (CDN), which acts as a layer on top of your web site or assets, for distributing your content globally, so that assets (typically JavaScript, CSS or images) can be cached and served from a point of presence (or “edge location”) geographically close to your users. You can put CloudFront in front of any web origin you like, most commonly either web servers of your own or hooked directly into an S3 bucket.

Typically with a CDN like CloudFront, web content falls into two categories: static and dynamic. Static content includes assets like JavaScript or images which are the same no matter who is requesting them or how; dynamic content includes parts of your web site or app that vary based on application logic – whether or not the user is logged in etc.

In most cases you’ll set up your CDN to cache static content so it’s always readily-available at the edge closest to your user and doesn’t need re-fetching. Dynamic content, since by definition it’s not constant, can’t be cached, and the CDN has to hit your origin server for those requests. This extra request back to the origin adds extra latency, and effectively negates the benefit of having a CDN in the first place.

What Lambda@Edge allows you to do is to push some of the dynamic logic from your origin onto CloudFront’s edge servers. You can write some code, deploy it to Lambda@Edge, and have it interact with requests from your visitors at the edge location, geographically close to your users, potentially enabling you to serve some of your dynamic content requirements without incurring the full round-trip latency to your origin server.

There are some restrictions to exactly what you can implement with Lambda@Edge – you get a very limited amount of resources, so reimplementing your entire application’s logic at the edge probably isn’t feasible – but there is a wide variety of simple cases in which Lambda@Edge can help you move logic closer to your users, reducing latency and creating a faster experience.

How does it work?


Lambda@Edge effectively allows you to hook into requests as they pass through the CloudFront edge servers, and modify data in flight. There are four different stages in the request lifecycle that you can hook into: viewer request, which comes immediately after the request is received by the edge server, before hitting the cache layer; origin request, which comes when the edge server has decided it needs to fetch data from the origin server, but before it sends the request; origin response, which comes after the origin has sent its response, but before the response gets cached; and viewer response, which is called on every request just before the response is sent to your user’s browser.

At each of these stages the request/response body and headers can be modified. Your Lambda function can also call out to external sources of data to inject directly into the content. Additionally, either of the request hooks can be used to directly generate a response rather than letting the request continue to the cache or origin server.

Viewer vs. Origin handlers

There’s an important distinction between the viewer and the origin handlers described above: the viewer handlers will be called for every single request received by the CloudFront distribution, before any caching behaviour. This is useful if, for example, you want to perform an A/B test based on a cookie value by modifying the requested URL, but it’s important to bear in mind that these functions come with a (small) overhead that will add to the overall latency of the request. The origin handlers, on the other hand, are only called when the CloudFront edge does not have the requested content in the cache, or decides that the content needs to be refreshed. So if the modifications you are making to requests or responses can be cached and reused, the origin handlers may be more appropriate.

What can I do with this?

There’s no end to the things that can be achieved with Lambda@Edge, and here at GoSquared we’ve barely scratched the surface. But here are our two favourite use-cases:

Adding security headers on S3 origins

One of the best things about CloudFront since its inception is its support for plugging directly into S3 for serving static assets. You can simply put CloudFront in front of an S3 bucket and have it serve all the objects within it, without having to worry about setting up any web servers of your own.

The downside that comes with S3 origins is that S3 doesn’t give you a huge amount of control over exactly how those objects are served – most notably the HTTP headers sent along with the response data.

Many browser security features are controlled by HTTP headers such as Access-Control-Allow-Origin, Content-Security-Policy and Strict-Transport-Security, so being able to set these headers on all your resources and assets is essential nowadays. With Lambda@Edge it’s now a simple matter of adding an origin response handler to add or modify these headers before the response gets stored in the cache.

We’ve also found the fine-grained programmatic control allowed by Lambda@Edge to be useful purely for neatness as well as security. The deploy process we use for some of our S3 assets leaves useful metadata attached to these assets, which is served up as X-Amz-Meta-* headers. This metadata is useful for us and our deployments, but provides no value to the end-user, so we take advantage of Lambda@Edge to strip these headers before they are sent out.

Here’s an example of a Lambda@Edge origin response handler we might have for this sort of header-modification.

exports.handler = (event, context, callback) => {
  const response = event.Records[0].cf.response;
  const headers = response.headers;

  // Strip out unnecessary metadata headers
  Object.keys(headers).forEach(k => {
    if (/^x-amz-meta/i.test(k)) {
      delete headers[k];
    }
  });

  // Headers are keyed by lowercase name; each value is an array of
  // { key, value } objects
  function setHeader(k, v) {
    headers[k.toLowerCase()] = [ { key: k, value: v } ];
  }

  // Note: browsers reject credentialed cross-origin requests when
  // Access-Control-Allow-Origin is the '*' wildcard
  setHeader('Access-Control-Allow-Credentials', 'true');
  setHeader('Access-Control-Allow-Origin', '*');
  setHeader('Access-Control-Allow-Methods', 'OPTIONS, GET');
  setHeader('Access-Control-Allow-Headers', 'Content-Type, Depth, User-Agent, X-File-Size, X-Requested-With, If-Modified-Since, X-File-Name, Cache-Control, Cookie');
  setHeader('Strict-Transport-Security', 'max-age=31536000');
  setHeader('X-XSS-Protection', '1; mode=block');
  setHeader('Cache-Control', 'public, max-age=86400');

  callback(null, response);
};

Previously we would use a tier of web servers in between the S3 origin and CloudFront to perform all this header-modification logic. Now we can offload everything to Lambda@Edge and completely remove a whole moving part from the flow. Woo serverless!

Serving pre-compressed versions of files

In the early days of CloudFront, it did not support any kind of modification to your response data at all. This included modifications such as using gzip Content-Encoding to reduce bandwidth. This was fine if you were using an origin server which supported dynamically serving compressed or uncompressed content based on the Accept-Encoding request header. The one small problem with this was that S3 didn’t support dynamic compression like this.

Back in December 2015 CloudFront added support for gzip compression at the edge, which solved this problem for the most part. But here at GoSquared we had additional needs.

The gzip compression algorithm can be tweaked based on how heavily you want data compressed. This comes with a tradeoff where better compression (i.e. fewer output bytes) usually uses more processing power. If this compression is happening on every request then usually you choose a compression level that strikes a balance between low bandwidth and low server load. However, CloudFront doesn’t allow you to choose the compression level for automatic edge compression.

The benefit of static content, though, is that if it doesn’t change very often, then the compressed version also doesn’t change very often. That means that if you’re able to pre-compress all your content ahead-of-time, you can dial all the numbers up to maximum and not worry about any per-request overhead. And with more recent gzip-compatible algorithms like Zopfli, it’s actually possible to save a noticeable amount of bandwidth, albeit with a comparatively large (often huge) amount of upfront CPU usage.

For most use-cases this won’t make much of a difference, but for certain very-frequently-accessed assets, such as GoSquared’s JavaScript tracking snippet, which is accessed billions of times every month, a 4-5% reduction in bandwidth adds up to a noticeable cost saving.

The way we go about this is to generate two versions of every file at build-time: the normal, uncompressed version, and a gzip-compatible compressed version, generated using zopfli with maximum compression settings. We then use an origin request handler to rewrite the requested object URL based on the Accept-Encoding header:

exports.handler = (event, context, callback) => {
  const request = event.Records[0].cf.request;

  // Each header is an array of { key, value } objects
  const ae = request.headers['accept-encoding'] || [];

  // Check every Accept-Encoding value for a gzip token, ignoring
  // any parameters such as ";q=0.8"
  const gz = ae.some(h =>
    h.value.split(',').map(x => x.split(';')[0].trim()).indexOf('gzip') !== -1
  );

  // If gzip is supported, use the pre-compressed version of the file,
  // which is the same URL with .gz on the end
  if (gz) {
    request.uri += '.gz';
  }

  callback(null, request);
};
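For completeness, the build-time half of this setup can be sketched in a few lines of Node.js. Note the substitution: this example uses Node’s built-in zlib at its highest gzip level rather than zopfli, purely so that it runs without extra dependencies; a zopfli binding or CLI would slot into the same place. The precompress function name is hypothetical.

```javascript
const fs = require('fs');
const zlib = require('zlib');

// Writes "<file>.gz" next to the original, compressed at zlib's
// highest gzip level, and returns the compressed size in bytes.
// (Stand-in for zopfli, which is not part of Node's standard library.)
function precompress(file) {
  const input = fs.readFileSync(file);
  const output = zlib.gzipSync(input, {
    level: zlib.constants.Z_BEST_COMPRESSION,
  });
  fs.writeFileSync(`${file}.gz`, output);
  return output.length;
}

// Example usage in a build script:
//   precompress('dist/tracker.js');
```

When the .gz objects are uploaded to S3 they would presumably also need Content-Encoding: gzip set as object metadata, so that browsers know to decompress the body they receive.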

Cheeky side-note if anyone at AWS is reading this: we’d love to be able to use this technique to support alternative compression algorithms like brotli, but CloudFront currently rewrites the Accept-Encoding header to strip out anything other than gzip. Consider this a feature request?

What else can Lambda@Edge do?

Our own use-cases for Lambda@Edge mostly take advantage of the origin request and response handlers, but there are endless possibilities for other powerful uses of Lambda@Edge.

For example, a viewer request handler can be used for issuing redirects in front of a static S3 site. We currently have a tier of nginx web servers for handling redirects for old URLs; Lambda@Edge would allow that logic to be offloaded to CloudFront.

Another example would be to use a viewer response handler to modify the content of a response based on parameters that aren’t relevant to the cache behaviour. For example, you may have an image asset that you want to serve with a Content-Disposition header if a ?download=true query-string parameter is present, but allowing CloudFront to cache differently based on query-string parameters would be inefficient.

How are you using Lambda@Edge?

Are you using Lambda@Edge yourself? Do you have an even better use-case that we haven’t thought of yet? Let us know!
