S3 Security Hardening For Private VPCs Using Terraform
June 19, 2023
Sometimes you need to store sensitive data in an S3 bucket and must be certain it cannot be accessed from outside your VPC. AWS offers a variety of ways to protect your data from unauthorized access, but they can be challenging to set up correctly. In this article I'll go over my preferred way of hardening S3 bucket access and provide a ready-to-use Terraform configuration for you to adapt to your needs.
Use case
Let's say you are storing PII (Personally Identifiable Information) in an S3 bucket and need a Lambda function to process that data to generate reports. Additionally, you have the following requirements:
- You want to make sure the data in the bucket is encrypted.
- You want to enable your security team to manage the encryption keys.
- You want to make sure that only the Lambda function can access the data in the bucket.
- You want to make sure that any traffic to and from the S3 bucket stays within the VPC and doesn’t traverse the internet.
This can all be achieved with AWS native controls. Let’s have a look, and please follow along with the full Terraform example here: https://github.com/skripted-io/aws-s3-hardening-terraform
Principle of least privilege
The core concept we are going to apply is the principle of least privilege. This means that for every resource involved, only those who genuinely need access (people or services) actually get it. In this example, that means only a single, specific Lambda function may access our bucket.
The solution
The solution provisions an S3 Gateway Endpoint in a VPC to route traffic directly to S3 over AWS's internal network instead of over the internet. The S3 bucket gets a bucket policy that only allows traffic arriving via that endpoint, while the endpoint itself is configured to only allow requests that target this specific bucket and originate from a specific role, in our case a Lambda role. Finally, we encrypt the bucket with a customer-managed KMS key, with a key policy that determines who can use and who can manage the key. Let's have a look at the details.
Lambda function
For this example, we use a simple Lambda test function written in Python. It generates test data, writes it to a file, uploads the file to S3, and downloads it again to show its output. If it succeeds, we know our permissions are set up correctly.
IAM role
By design, Lambda functions need to be assigned an IAM role. You assign permissions to the role and Lambda acquires those permissions by “assuming” the role. This is a great strategy because it means you have to explicitly grant permissions. When done carefully, it fits in with our principle of least privilege.
The following code creates a role, assigns permissions to it, and tells the role it can be assumed by Lambda. In this case we attach two managed policies that allow Lambda to write logs and connect to our VPC. Next, we create a custom policy that gives Lambda access to our S3 bucket.
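Here's a minimal sketch of that setup. The resource names and the bucket reference (`aws_s3_bucket.pii`) are illustrative assumptions; see the linked repository for the complete version.

```hcl
# Trust policy: only the Lambda service may assume this role.
data "aws_iam_policy_document" "lambda_assume" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["lambda.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "lambda" {
  name               = "pii-report-lambda-role"
  assume_role_policy = data.aws_iam_policy_document.lambda_assume.json
}

# Managed policies: CloudWatch logging and VPC networking.
resource "aws_iam_role_policy_attachment" "logs" {
  role       = aws_iam_role.lambda.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
}

resource "aws_iam_role_policy_attachment" "vpc" {
  role       = aws_iam_role.lambda.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole"
}

# Custom policy: read/write access to our bucket only.
resource "aws_iam_role_policy" "s3_access" {
  name = "s3-bucket-access"
  role = aws_iam_role.lambda.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject", "s3:PutObject", "s3:ListBucket"]
      Resource = [aws_s3_bucket.pii.arn, "${aws_s3_bucket.pii.arn}/*"]
    }]
  })
}
```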
The IAM role is crucial to further restricting access to our resources, as you’ll see later.
Finally we create the Lambda function and assign the role to it.
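A minimal sketch of the function resource, assuming the Python handler is packaged as `lambda.zip` and that the private subnet and security group are defined elsewhere in the configuration:

```hcl
resource "aws_lambda_function" "report" {
  function_name = "pii-report-generator"
  role          = aws_iam_role.lambda.arn # the role we created above
  runtime       = "python3.10"
  handler       = "main.handler"
  filename      = "lambda.zip" # the zipped Python test function

  # Run inside the private subnet so traffic stays within the VPC.
  vpc_config {
    subnet_ids         = [aws_subnet.private.id]
    security_group_ids = [aws_security_group.lambda.id]
  }
}
```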
S3 Bucket creation
Of course we need to create the bucket as well. To start, we create an S3 bucket with object ownership set to "bucket owner enforced". This disables ACLs entirely, so access is governed exclusively by policies, which gives us the tightest control over the bucket's own access settings.
You might notice there's no explicit configuration to block public access. That's because blocking all public access is now the default for newly created buckets.
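A sketch of the bucket resources (the bucket name is an illustrative assumption and must be globally unique):

```hcl
resource "aws_s3_bucket" "pii" {
  bucket = "my-pii-reports-bucket" # illustrative name
  # No aws_s3_bucket_public_access_block needed: AWS blocks
  # public access by default on newly created buckets.
}

# Disable ACLs entirely; access is governed by policies alone.
resource "aws_s3_bucket_ownership_controls" "pii" {
  bucket = aws_s3_bucket.pii.id
  rule {
    object_ownership = "BucketOwnerEnforced"
  }
}
```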
KMS key
Next we create a KMS key that S3 can use for server-side encryption. Its key policy allows the AWS account root user (or a security team) to manage the key, and allows the Lambda role to use the key but not manage it. This ensures the Lambda function can encrypt and decrypt the data in the bucket, but cannot grant someone else access to the key or accidentally delete it. The latter would render the encrypted data permanently unrecoverable.
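A sketch of the key and its policy, assuming the current account's root user stands in for the security team:

```hcl
data "aws_caller_identity" "current" {}

resource "aws_kms_key" "s3" {
  description         = "Key for encrypting the PII bucket"
  enable_key_rotation = true

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        # The account root (and through it, the security team) manages the key.
        Sid       = "AllowKeyAdministration"
        Effect    = "Allow"
        Principal = { AWS = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root" }
        Action    = "kms:*"
        Resource  = "*"
      },
      {
        # The Lambda role may use the key, but not manage or delete it.
        Sid       = "AllowLambdaUse"
        Effect    = "Allow"
        Principal = { AWS = aws_iam_role.lambda.arn }
        Action    = ["kms:Encrypt", "kms:Decrypt", "kms:GenerateDataKey"]
        Resource  = "*"
      }
    ]
  })
}
```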
S3 Bucket encryption
Now we need to tell S3 to apply server-side encryption to our bucket using the KMS key we created earlier. Because every request to KMS with a customer-managed key incurs a charge, we also enable the S3 Bucket Key feature: instead of calling KMS for every object operation, S3 reuses a short-lived bucket-level key derived from our KMS key, which significantly reduces the number of KMS requests and therefore the cost.
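In Terraform that looks roughly like this:

```hcl
resource "aws_s3_bucket_server_side_encryption_configuration" "pii" {
  bucket = aws_s3_bucket.pii.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.s3.arn
    }
    # Reuse a bucket-level data key instead of calling KMS per request.
    bucket_key_enabled = true
  }
}
```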
S3 Gateway endpoint
In our setup, the Lambda function runs in a private subnet and doesn't have access to the internet. This is by design, because it allows for maximum security. However, since S3 is a "public service", it also means our Lambda can't reach our bucket. To overcome this, we create a Gateway endpoint.
A Gateway endpoint is a direct connection between a VPC and the S3 service over AWS's internal network. When we create the endpoint and associate it with the private route table, a route to S3 is added automatically, so Lambda knows how to reach S3. Additionally, we attach a policy to the endpoint saying it can only be used by our Lambda role.
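A sketch of the endpoint, assuming the VPC and private route table are defined elsewhere. Endpoint policies commonly match the caller with an `aws:PrincipalArn` condition rather than a named principal:

```hcl
data "aws_region" "current" {}

resource "aws_vpc_endpoint" "s3" {
  vpc_id            = aws_vpc.main.id # assumed VPC resource
  service_name      = "com.amazonaws.${data.aws_region.current.name}.s3"
  vpc_endpoint_type = "Gateway"

  # Associating the endpoint with the private route table adds the S3 route.
  route_table_ids = [aws_route_table.private.id]

  # Only our Lambda role may use this endpoint, and only against our bucket.
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = "*"
      Action    = "s3:*"
      Resource  = [aws_s3_bucket.pii.arn, "${aws_s3_bucket.pii.arn}/*"]
      Condition = {
        StringEquals = { "aws:PrincipalArn" = aws_iam_role.lambda.arn }
      }
    }]
  })
}
```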
S3 Bucket policy
Lastly, we create a bucket policy that tells the bucket to deny all traffic unless it arrives via the S3 Gateway endpoint we just created and is made by our Lambda role.
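A sketch of that policy, expressed as two explicit denies:

```hcl
resource "aws_s3_bucket_policy" "pii" {
  bucket = aws_s3_bucket.pii.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        # Deny any request that doesn't arrive through our Gateway endpoint.
        Sid       = "DenyOutsideVpcEndpoint"
        Effect    = "Deny"
        Principal = "*"
        Action    = "s3:*"
        Resource  = [aws_s3_bucket.pii.arn, "${aws_s3_bucket.pii.arn}/*"]
        Condition = {
          StringNotEquals = { "aws:sourceVpce" = aws_vpc_endpoint.s3.id }
        }
      },
      {
        # Deny any principal other than the Lambda role.
        Sid       = "DenyOtherPrincipals"
        Effect    = "Deny"
        Principal = "*"
        Action    = "s3:*"
        Resource  = [aws_s3_bucket.pii.arn, "${aws_s3_bucket.pii.arn}/*"]
        Condition = {
          StringNotEquals = { "aws:PrincipalArn" = aws_iam_role.lambda.arn }
        }
      }
    ]
  })
}
```

Be aware that a blanket deny like this also blocks the credentials Terraform itself runs under, so in practice you'd carve out an exception (for example, an additional `aws:PrincipalArn` value) for your deployment or admin role.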
Wrapping it up
There you have it: a solution that meets all of our requirements. Of course, in a real-world scenario you'll probably want a second service to upload data to the bucket for the Lambda to process. In that case you can simply create another role for that service and adapt the policies accordingly.
It took me a while to figure this out and I hope it serves you well. Contact me if you have questions.