Module: cloudfront-s3-cdn
Terraform module to provision an AWS CloudFront CDN with an S3 origin.
Usage
For a complete example, see examples/complete.
For automated tests of the complete example using bats and Terratest (which tests and deploys the example on AWS), see test.
The following will create a new s3 bucket eg-prod-app
for a cloudfront cdn, and allow principal1
to upload to
prefix1
and prefix2
, while allowing principal2
to manage the whole bucket.
module "cdn" {
source = "cloudposse/cloudfront-s3-cdn/aws"
# Cloud Posse recommends pinning every module to a specific version
# version = "x.x.x"
namespace = "eg"
stage = "prod"
name = "app"
aliases = ["assets.cloudposse.com"]
dns_alias_enabled = true
parent_zone_name = "cloudposse.com"
deployment_principal_arns = {
"arn:aws:iam::123456789012:role/principal1" = ["prefix1/", "prefix2/"]
"arn:aws:iam::123456789012:role/principal2" = [""]
}
}
The following will reuse an existing s3 bucket eg-prod-app
for a cloudfront cdn.
module "cdn" {
source = "cloudposse/cloudfront-s3-cdn/aws"
# Cloud Posse recommends pinning every module to a specific version
# version = "x.x.x"
origin_bucket = "eg-prod-app"
aliases = ["assets.cloudposse.com"]
dns_alias_enabled = true
parent_zone_name = "cloudposse.com"
name = "eg-prod-app"
}
The following will create an Origin Group with the origin created by this module as a primary origin and an additional S3 bucket as a failover origin.
module "s3_bucket" {
source = "cloudposse/s3-bucket/aws"
# Cloud Posse recommends pinning every module to a specific version
# version = "x.x.x"
attributes = ["failover-assets"]
}
module "cdn" {
source = "cloudposse/cloudfront-s3-cdn/aws"
# Cloud Posse recommends pinning every module to a specific version
# version = "x.x.x"
aliases = ["assets.cloudposse.com"]
dns_alias_enabled = true
parent_zone_name = "cloudposse.com"
s3_origins = [{
domain_name = module.s3_bucket.bucket_regional_domain_name
origin_id = module.s3_bucket.bucket_id
origin_path = null
s3_origin_config = {
origin_access_identity = null # will get translated to the origin_access_identity used by the origin created by this module.
}
}]
origin_groups = [{
primary_origin_id = null # will get translated to the origin id of the origin created by this module.
failover_origin_id = module.s3_bucket.bucket_id
failover_criteria = [
403,
404,
500,
502
]
}]
}
Background on CDNs, "Origins", S3 Buckets, and Web Servers
CDNs and Origin Servers
There are some settings you need to be aware of when using this module. In order to understand the settings, you need to understand some of the basics of CDNs and web servers, so we are providing this highly simplified explanation of how they work in order for you to understand the implications of the settings you are providing.
A "CDN" (Content Distribution Network) is a collection of servers scattered around the internet with the aim of making it faster for people to retrieve content from a website. The details of why that is wanted/needed are beyond the scope of this document, as are most of the details of how a CDN is implemented. For this discussion, we will simply treat a CDN as a set of web servers all serving the same content to different users.
In a normal web server (again, greatly simplified), you place files on the server and the web server software receives requests from browsers and responds with the contents of the files.
For a variety of reasons, the web servers in a CDN do not work the way normal web servers work. Instead of getting their content from files on the local server, the CDN web servers get their content by acting like web browsers (proxies). When they get a request from a browser, they make the same request to what is called an "Origin Server". It is called an origin server because it serves the original content of the website, and thus is the origin of the content.
As a website publisher, you put content on an Origin Server (which users usually should be prevented from accessing) and configure your CDN to use your Origin Server. Then you direct users to a URL hosted by your CDN provider, the users' browsers connect to the CDN, the CDN gets the content from your Origin Server, your Origin Server gets the content from a file on the server, and the data gets sent back hop by hop to the user. (The reason this ends up being a good idea is that the CDN can cache the content for a while, serving multiple users the same content while only contacting the origin server once.)
S3 Buckets: file storage and web server
S3 buckets were originally designed just to store files, and they are still most often used for that. The have a lot of access controls to make it possible to strictly limit who can read what files in the bucket, so that companies can store sensitive information there. You may have heard of a number of "data breaches" being caused by misconfigured permissions on S3 buckets, making them publicly accessible. As a result of that, Amazon has some extra settings on top of everything else to keep S3 buckets from being publicly accessible, which is usually a good thing.
However, at some point someone realized that since these files were in the cloud, and Amazon already had these web servers running to provide access to the files in the cloud, it was only a tiny leap to turn an S3 bucket into a web server. So now S3 buckets can be published as websites with a few configuration settings, including making the contents publicly accessible.
Web servers, files, and the different modes of S3 buckets
In the simplest websites, the URL "path" (the part after the site name) corresponds directly to the path (under
a special directory we will call /webroot
) and name
of a file on the web server. So if the web server gets a request for "http://example.com/foo/bar/baz.html" it will
look for a file /webroot/foo/bar/baz.html
. If it exists, the server will return its contents, and if it does not exist,
the server will return a Not Found
error. An S3 bucket, whether configured as a file store or a website, will
always do both of these things.
Web servers, however, do some helpful extra things. To name a few:
- If the URL ends with a
/
, as inhttp://example.com/foo/bar/
, the web server (depending on how it is configured) will either return a list of files in the directory or it will return the contents of a file in the directory with a special name (by default,index.html
) if it exists. - If the URL does not end with a
/
but the last part, instead of being a file name, is a directory name, the web server will redirect the user to the URL with the/
at the end instead of saying the file wasNot Found
. This redirect will get you to theindex.html
file we just talked about. Given the way people pass URLs around, this turns out to be quite helpful. - If the URL does not point to a directory or a file, instead of just sending back a cryptic
Not Found
error code, it can return the contents of a special file called an "error document".
Your Critical Decision: S3 bucket or website?
All of this background is to help you decide how to set website_enabled
and s3_website_password_enabled
.
The default for website_enabled
is false
which is the easiest to configure and the most secure, and with
this setting, s3_website_password_enabled
is ignored.
S3 buckets, in file storage mode (website_enabled = false
), do none of these extra things that web servers do.
If the URL points to a file, it will return the file, and if it does not exactly match a file, it will return
Not Found
. One big advantage, though, is that the S3 bucket can remain private (not publicly accessible). A second,
related advantage is that you can limit the website to a portion of the S3 bucket (everything under a certain prefix)
and keep the contents under the the other prefixes private.
S3 buckets configured as static websites (website_enabled = true
), however, have these extra web server features like redirects, index.html
,
and error documents. The disadvantage is that you have to make the entire bucket public (although you can still
restrict access to some portions of the bucket).
Another feature or drawback (depending on your point of view) of S3 buckets configured as static websites is that
they are directly accessible via their website endpoint
as well as through Cloudfront. This module has a feature, s3_website_password_enabled
, that requires a password
be passed in the HTTP request header and configures the CDN to do that, which will make it much harder to access
the S3 website directly. So set s3_website_password_enabled = true
to limit direct access to the S3 website
or set it to false if you want to be able to bypass Cloudfront when you want to.
In addition to setting website_enabled=true
, you must also:
- Specify at least one
aliases
, like["example.com"]
or["example.com", "www.example.com"]
- Specify an ACM certificate
Custom Domain Names and Generating a TLS Certificate with ACM
When you set up Cloudfront, Amazon will generate a domain name for your website. You amost certainly will not want to publish that. Instead, you will want to use a custom domain name. This module refers to them as "aliases".
To use the custom domain names, you need to
- Pass them in as
aliases
so that Cloudfront will respond to them with your content - Create CNAMEs for the aliases to point to the Cloudfront domain name. If your alias domains are hosted by
Route53 and you have IAM permissions to modify them, this module will set that up for you if you set
dns_alias_enabled = true
. - Generate a TLS Certificate via ACM that includes the all the aliases and pass the ARN for the
certificate in
acm_certificate_arn
. Note that for Cloudfront, the certificate has to be provisioned in theus-east-1
region regardless of where any other resources are.
# For cloudfront, the acm has to be created in us-east-1 or it will not work
provider "aws" {
region = "us-east-1"
alias = "aws.us-east-1"
}
# create acm and explicitly set it to us-east-1 provider
module "acm_request_certificate" {
source = "cloudposse/acm-request-certificate/aws"
providers = {
aws = aws.us-east-1
}
# Cloud Posse recommends pinning every module to a specific version
# version = "x.x.x"
domain_name = "example.com"
subject_alternative_names = ["a.example.com", "b.example.com", "*.c.example.com"]
process_domain_validation_options = true
ttl = "300"
}
module "cdn" {
source = "cloudposse/cloudfront-s3-cdn/aws"
# Cloud Posse recommends pinning every module to a specific version
# version = "x.x.x"
namespace = "eg"
stage = "prod"
name = "app"
aliases = ["assets.cloudposse.com"]
dns_alias_enabled = true
parent_zone_name = "cloudposse.com"
acm_certificate_arn = module.acm_request_certificate.arn
depends_on = [module.acm_request_certificate]
}
Or use the AWS cli to request new ACM certifiates (requires email validation)
aws acm request-certificate --domain-name example.com --subject-alternative-names a.example.com b.example.com *.c.example.com
NOTE:
Although AWS Certificate Manager is supported in many AWS regions, to use an SSL certificate with CloudFront, it should be requested only in US East (N. Virginia) region.
If you want to require HTTPS between viewers and CloudFront, you must change the AWS region to US East (N. Virginia) in the AWS Certificate Manager console before you request or import a certificate.
https://docs.aws.amazon.com/acm/latest/userguide/acm-regions.html
To use an ACM Certificate with Amazon CloudFront, you must request or import the certificate in the US East (N. Virginia) region. ACM Certificates in this region that are associated with a CloudFront distribution are distributed to all the geographic locations configured for that distribution.
This is a fundamental requirement of CloudFront, and you will need to request the certificate in us-east-1
region.
If there are warnings around the outputs when destroying using this module.
Then you can use this method for supressing the superfluous errors.
TF_WARN_OUTPUT_ERRORS=1 terraform destroy
Lambda@Edge
This module also features a Lambda@Edge submodule. Its lambda_function_association
output is meant to feed directly into the variable of the same name in the parent module.
provider "aws" {
region = var.region
}
provider "aws" {
region = "us-east-1"
alias = "us-east-1"
}
module "lambda_at_edge" {
source = "cloudposse/cloudfront-s3-cdn/aws//modules/lambda@edge"
# Cloud Posse recommends pinning every module to a specific version
# version = "x.x.x"
functions = {
origin_request = {
source = [{
content = <<-EOT
'use strict';
exports.handler = (event, context, callback) => {
//Get contents of response
const response = event.Records[0].cf.response;
const headers = response.headers;
//Set new headers
headers['strict-transport-security'] = [{key: 'Strict-Transport-Security', value: 'max-age=63072000; includeSubdomains; preload'}];
headers['content-security-policy'] = [{key: 'Content-Security-Policy', value: "default-src 'none'; img-src 'self'; script-src 'self'; style-src 'self'; object-src 'none'"}];
headers['x-content-type-options'] = [{key: 'X-Content-Type-Options', value: 'nosniff'}];
headers['x-frame-options'] = [{key: 'X-Frame-Options', value: 'DENY'}];
headers['x-xss-protection'] = [{key: 'X-XSS-Protection', value: '1; mode=block'}];
headers['referrer-policy'] = [{key: 'Referrer-Policy', value: 'same-origin'}];
//Return modified response
callback(null, response);
};
EOT
filename = "index.js"
}]
runtime = "nodejs16.x"
handler = "index.handler"
memory_size = 128
timeout = 3
event_type = "origin-response"
include_body = false
}
}
# An AWS Provider configured for us-east-1 must be passed to the module, as Lambda@Edge functions must exist in us-east-1
providers = {
aws = aws.us-east-1
}
context = module.this.context
}
module "cdn" {
source = "cloudposse/cloudfront-s3-cdn/aws"
# Cloud Posse recommends pinning every module to a specific version
# version = "x.x.x"
...
lambda_function_association = module.lambda_at_edge.lambda_function_association
}