Enabling Rapid Multi-Region Deployments With AWS CloudFormation and Lambda
A large aerospace contractor that works with scientists and engineers recently approached WWT to build an automated solution in two cloud environments: AWS and Azure. Their goal was to move the existing development and production environments for a multi-tier application from on-premises to Amazon Elastic Compute Cloud (EC2), Amazon Elastic Container Service (ECS) and an Amazon Relational Database Service (RDS) database. The customer wanted to use AWS to extend the application's global reach quickly whenever required.
WWT and the customer met to discuss the solution's requirements and quickly identified multiple challenges. First, the solution needed to work similarly in both AWS and Azure. While the two cloud platforms offer similar services, there are nuances between them that would require subtle adjustments. Second, automated multi-region deployment on demand was a requirement. When the customer no longer needed the solution, it would have to be removed entirely from the environment.
Third, because of the nature of the application and its dependencies, Infrastructure as Code (IaC) alone could not handle this task. It would need to be paired with custom automation, initiated by the IaC runtime, to ensure that the proper order of operations takes place for create, update and delete events and that the automation is truly end-to-end. Lastly, multiple system components running in the on-premises data center were to be shifted to managed services in AWS. WWT and the product team would need to coordinate efforts to meet the timelines outlined by the customer.
The customer requested that we develop the solution in AWS CloudFormation. The first decision was how to arrange the CloudFormation stacks. Although CloudFormation handles resource dependencies well on its own, deploying this multi-tier application involved many order-of-operations considerations. WWT also recommended limiting the application's exposure to the internet by following the Principle of Least Privilege. This means that all resources are private by default, reach the internet through a NAT Gateway, communicate only with the resources they are intended to reach by way of narrowly scoped Security Groups, and are exposed to end users only through front-facing AWS services such as Application Load Balancers. Doing so limits the attack vectors on the application and protects the data transported to and from it.
As described later, we also added a bastion host to the CloudFormation templates, controlled by a conditional parameter, so that application administrators and sysadmins can access the systems securely, and only when required.
The overall architecture consisted of four CloudFormation stacks, which used AWS Systems Manager Parameter Store to share data with one another. We opted not to use CloudFormation nested stacks for this deployment because of the complexity of the operations that had to be managed. In addition, some resources take much longer to create than others. With individual stacks, we could freely move resources around, fail fast and find the best resource groupings in the shortest development cycles possible. With Parameter Store, each stack could be worked on at a different time without interrupting the other stacks.
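The Parameter Store hand-off between stacks can be sketched in Python. This is a minimal illustration, not the project's actual code: the parameter name `/demo-app/network/VpcId` and the `InMemorySsm` stand-in are assumptions made so the sketch runs offline. In AWS, `boto3.client("ssm")` exposes the same `put_parameter` and `get_parameter` calls.

```python
def publish_output(ssm, name, value):
    """Store one stack output so other stacks can look it up by name."""
    ssm.put_parameter(Name=name, Value=value, Type="String", Overwrite=True)


def read_output(ssm, name):
    """Fetch a value that another stack published earlier."""
    return ssm.get_parameter(Name=name)["Parameter"]["Value"]


class InMemorySsm:
    """Hypothetical stand-in for boto3.client("ssm"); keeps values in a dict."""

    def __init__(self):
        self._store = {}

    def put_parameter(self, Name, Value, Type, Overwrite=False):
        if Name in self._store and not Overwrite:
            raise ValueError(f"{Name} already exists")
        self._store[Name] = Value

    def get_parameter(self, Name):
        return {"Parameter": {"Name": Name, "Value": self._store[Name]}}


ssm = InMemorySsm()  # in AWS: ssm = boto3.client("ssm")

# The networking stack publishes its VPC ID (hypothetical name and value)...
publish_output(ssm, "/demo-app/network/VpcId", "vpc-0123456789abcdef0")

# ...and a later stack reads it without referencing the first stack directly.
vpc_id = read_output(ssm, "/demo-app/network/VpcId")
```

Because the consumer only knows the parameter name, either stack can be redeployed on its own schedule as long as the name stays stable.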
In addition, explicit resource names were omitted from the CloudFormation templates so that the resources could be deployed and updated in any region. When no name is specified, CloudFormation generates a unique physical name for each resource it creates, based on the stack name and the resource's logical ID.
To ensure successful end-to-end automation, prerequisite artifacts needed to be in place before the CloudFormation stacks were deployed. These artifacts were files, stored in a secured S3 bucket, that the CloudFormation stacks (and the resources they create) use to deploy the application. In addition, a parameter holding the location of this S3 bucket was added to AWS Systems Manager Parameter Store; it serves as an input parameter to the deployment of the CloudFormation chain. The prerequisite steps (creating the S3 bucket and the parameter, then uploading the necessary files) were carried out by an automation orchestrator, which also orchestrated the stack deployments to ensure end-to-end automation.
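The case study does not say which orchestrator was used, so the following is only a sketch of the ordering logic such a tool needs: create the stacks one at a time, wait for each to finish, and tear them down in reverse. The stack parameter key `ArtifactBucketParameter` is hypothetical. The `cfn` argument is assumed to behave like `boto3.client("cloudformation")`, whose `create_stack`, `delete_stack` and `get_waiter` calls are used here.

```python
def deploy_in_order(cfn, stacks, bucket_parameter_name):
    """Create each (name, template_url) stack and wait before starting the next."""
    for name, template_url in stacks:
        cfn.create_stack(
            StackName=name,
            TemplateURL=template_url,
            # Hypothetical parameter key: passes the Parameter Store entry
            # that holds the prerequisite S3 bucket location.
            Parameters=[
                {
                    "ParameterKey": "ArtifactBucketParameter",
                    "ParameterValue": bucket_parameter_name,
                }
            ],
            Capabilities=["CAPABILITY_NAMED_IAM"],
        )
        # Block until this stack is complete so later stacks can rely on it.
        cfn.get_waiter("stack_create_complete").wait(StackName=name)


def teardown_in_reverse(cfn, stacks):
    """Delete the stacks last-to-first so dependents go before dependencies."""
    for name, _template_url in reversed(stacks):
        cfn.delete_stack(StackName=name)
        cfn.get_waiter("stack_delete_complete").wait(StackName=name)


# In AWS (assumed usage):
#   cfn = boto3.client("cloudformation", region_name="us-east-1")
#   deploy_in_order(cfn, [("network", url1), ("database", url2)], "/demo-app/bucket")
```

Running the same two functions against a client for a different region is what makes the deployment repeatable region by region.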
The first stack deployed the baseline requirements. This included a VPC, private and public subnets, route tables, NAT Gateways, Internet Gateways, Security Groups for the resources and Lambda-backed custom resources. Lambda-backed custom resources allow CloudFormation to use Lambda as part of its execution chain. When CloudFormation invokes a Lambda-backed custom resource, it must receive a response from the function to determine whether the operation succeeded and stack creation can continue, or whether it needs to roll back. This stack's custom resources downloaded a certificate file from the secured prerequisite S3 bucket and uploaded it for use on the front-end Application Load Balancer.
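The signal that CloudFormation waits for is a JSON document sent with an HTTP PUT to the pre-signed `ResponseURL` carried in the custom resource event. The sketch below shows that contract using only the Python standard library. The `work` callback and the fallback physical ID `custom-resource-sketch` are illustrative assumptions, not the project's actual handler.

```python
import json
import urllib.request


def build_response(event, status, physical_id, data=None, reason=""):
    """Assemble the response document CloudFormation expects."""
    return {
        "Status": status,  # "SUCCESS" or "FAILED"
        "Reason": reason or "See CloudWatch Logs for details",
        "PhysicalResourceId": physical_id,
        "StackId": event["StackId"],
        "RequestId": event["RequestId"],
        "LogicalResourceId": event["LogicalResourceId"],
        "Data": data or {},
    }


def send_response(event, body):
    """PUT the response document to the pre-signed URL from the event."""
    request = urllib.request.Request(
        event["ResponseURL"],
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": ""},
    )
    with urllib.request.urlopen(request) as reply:
        return reply.status


def handler(event, context, work=lambda event: {}, send=send_response):
    """Run the custom logic and always report the outcome to CloudFormation."""
    physical_id = event.get("PhysicalResourceId", "custom-resource-sketch")
    try:
        body = build_response(event, "SUCCESS", physical_id, work(event))
    except Exception as error:
        # Reporting FAILED lets CloudFormation roll back instead of timing out.
        body = build_response(event, "FAILED", physical_id, reason=str(error))
    send(event, body)
    return body
```

Catching every exception matters here: if the function exits without responding, CloudFormation keeps waiting until the custom resource times out.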
Stack two focused on the database. This stack used AWS Secrets Manager to generate a unique master password for the database and store it with rotation enabled, so the RDS database receives a new password every time the stack is launched. In addition, this stack creates a custom Lambda layer from a Python package stored as a prerequisite in the S3 bucket; the layer is used to work with RDS after it is online.
The Lambda-backed custom resource in this stack depends on the database, so it runs only after CloudFormation has been signaled that the database is available. It then uses the Lambda layer to configure the schema of the RDS database: it fetches a schema file from S3 and applies it inside the database, setting up multiple tables and views.
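The schema step can be approximated with any Python DB-API driver. The sketch below uses the standard library's `sqlite3` as a stand-in for the RDS connection, and the table and view names are invented for illustration. In the Lambda, the schema text would instead come from S3 (for example with `get_object` on a boto3 S3 client) and the connection from the driver packaged in the layer. The semicolon split is deliberately naive and would not cope with semicolons inside string literals or stored procedures.

```python
import sqlite3


def split_statements(sql_text):
    """Naively split a schema file on semicolons, dropping empty fragments."""
    return [part.strip() for part in sql_text.split(";") if part.strip()]


def apply_schema(connection, sql_text):
    """Execute every statement, committing only if all of them succeed."""
    cursor = connection.cursor()
    try:
        for statement in split_statements(sql_text):
            cursor.execute(statement)
        connection.commit()
    except Exception:
        connection.rollback()
        raise
    finally:
        cursor.close()


# Hypothetical schema file contents: two tables and one view.
SCHEMA = """
CREATE TABLE parts (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (id INTEGER PRIMARY KEY, part_id INTEGER REFERENCES parts(id));
CREATE VIEW ordered_parts AS
    SELECT o.id AS order_id, p.name
    FROM orders o JOIN parts p ON p.id = o.part_id;
"""

connection = sqlite3.connect(":memory:")  # stand-in for the RDS connection
apply_schema(connection, SCHEMA)

# List what was created so the result can be checked.
created = sorted(
    row[0] for row in connection.execute("SELECT name FROM sqlite_master")
)
```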
The third stack focuses on the automation of Amazon Elastic Container Registry (ECR) and ECS. An AWS CodePipeline is created to build the initial Docker image and place it in ECR. CodePipeline uses CodeBuild and a buildspec, stored as a prerequisite in S3 along with some other files required for the Docker build. Once the CodeBuild run is complete, it publishes a container image to ECR so that ECS can use it.
The application's front-end EC2 servers are deployed in this stack as well. They use different bootstrap scripts depending on the server and rely on an Auto Scaling group to balance the systems across multiple Availability Zones for high availability and resiliency. These EC2 instances use secrets generated by the bootstrap scripts for secure communication, and the scripts store those secrets in Secrets Manager as part of the bootstrap process.
Beyond these secrets, the EC2 instances also need a key pair for access. Because CloudFormation cannot "store" the private half of an EC2 Key Pair, a Lambda-backed custom resource was created to generate an EC2 Key Pair, read the private key from the API response and store it as another secret in Secrets Manager. Only individuals with the ability to decrypt these secrets in Secrets Manager can access the EC2 instances.
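A minimal sketch of that key pair logic follows. It assumes `ec2` and `secrets` behave like `boto3.client("ec2")` and `boto3.client("secretsmanager")`; the calls used (`create_key_pair`, `delete_key_pair`, `create_secret`, `delete_secret`) exist in boto3, while the secret naming scheme `ec2-keypair/<name>` is an assumption for illustration.

```python
def secret_name_for(key_name):
    """Hypothetical naming scheme for the secret that holds the private key."""
    return f"ec2-keypair/{key_name}"


def create_key_pair_secret(ec2, secrets, key_name):
    """Create the key pair and move the private key straight into a secret."""
    response = ec2.create_key_pair(KeyName=key_name)
    # KeyMaterial is returned only once, at creation time, so it is stored
    # immediately and never logged or returned to the caller.
    secrets.create_secret(
        Name=secret_name_for(key_name),
        SecretString=response["KeyMaterial"],
    )
    return response["KeyPairId"]


def delete_key_pair_secret(ec2, secrets, key_name):
    """Remove both the key pair and its secret when the stack is deleted."""
    ec2.delete_key_pair(KeyName=key_name)
    secrets.delete_secret(
        SecretId=secret_name_for(key_name),
        # Skip the recovery window so an unattended redeploy can reuse the name.
        ForceDeleteWithoutRecovery=True,
    )


# In a custom resource handler (assumed usage):
#   if event["RequestType"] == "Create":
#       create_key_pair_secret(ec2, secrets, "demo-frontend")
#   elif event["RequestType"] == "Delete":
#       delete_key_pair_secret(ec2, secrets, "demo-frontend")
```

Returning only the key pair ID keeps the private key out of the CloudFormation response data, which would otherwise be visible to anyone who can describe the stack.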
These EC2 instances are in private subnets, so this stack also creates a conditional bastion server, with an additional parameter specifying the CIDR range allowed to connect to it. The CIDR range is inserted into a Security Group that controls access to the bastion host. The bastion host is used only if there is an issue with the end-to-end automation, so that a systems administrator or developer can enter the environment to evaluate and troubleshoot.
In addition to the Lambda custom resource above that creates EC2 Key Pairs, another Lambda custom resource ensures that every instance launched by the stack uses a specific version of the official CentOS AMI. This Lambda searches the AWS Marketplace for the official CentOS image in whichever region the stack is deployed. Finally, a third Lambda custom resource empties the ECR repositories and all S3 buckets upon stack deletion. By default, CloudFormation will not delete an ECR repository that still contains images or an S3 bucket that still contains objects. Because we want this stack to spin up and tear down completely unattended, we need a custom resource to clean out these objects upon stack deletion.
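The cleanup resource can be sketched as follows. The `s3` and `ecr` arguments are assumed to behave like the corresponding boto3 clients (`list_objects_v2`, `delete_objects`, `list_images` and `batch_delete_image` are real boto3 calls), and the property names `Buckets` and `Repositories` are hypothetical. The sketch handles only unversioned buckets; a versioned bucket would also need its object versions and delete markers removed.

```python
def empty_bucket(s3, bucket):
    """Delete every object in an unversioned bucket, one listing page at a time."""
    deleted = 0
    kwargs = {"Bucket": bucket}
    while True:
        page = s3.list_objects_v2(**kwargs)
        keys = [{"Key": item["Key"]} for item in page.get("Contents", [])]
        if keys:
            s3.delete_objects(Bucket=bucket, Delete={"Objects": keys})
            deleted += len(keys)
        if not page.get("IsTruncated"):
            return deleted
        kwargs["ContinuationToken"] = page["NextContinuationToken"]


def empty_repository(ecr, repository):
    """Delete every image in an ECR repository, one listing page at a time."""
    deleted = 0
    kwargs = {"repositoryName": repository}
    while True:
        page = ecr.list_images(**kwargs)
        image_ids = page.get("imageIds", [])
        if image_ids:
            ecr.batch_delete_image(repositoryName=repository, imageIds=image_ids)
            deleted += len(image_ids)
        if not page.get("nextToken"):
            return deleted
        kwargs["nextToken"] = page["nextToken"]


def on_event(event, s3, ecr):
    """Do nothing on Create or Update; empty everything on Delete."""
    if event["RequestType"] != "Delete":
        return {"objects": 0, "images": 0}
    props = event["ResourceProperties"]  # hypothetical property names below
    return {
        "objects": sum(empty_bucket(s3, b) for b in props.get("Buckets", [])),
        "images": sum(
            empty_repository(ecr, r) for r in props.get("Repositories", [])
        ),
    }
```

For the cleanup to run before CloudFormation tries to delete the buckets and repositories, the custom resource has to depend on them in the template, since deletion happens in reverse dependency order.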
We have now established the networking baseline, the database layer and the front-end layer, and created a container image pipeline with an automated code build to set up the application layer. We have also used several Lambda-backed custom resources to facilitate end-to-end automation for the creation, deletion and security of the environment.
The last stack deploys the application layer. It creates an ECS cluster, task definitions and a service that uses the initial container build. It also creates a second AWS CodePipeline, with CodeBuild, so that changes to the container image are automatically pushed to ECR and picked up by ECS. We then set up the scaling configuration for ECS so that the application layer could scale to meet anticipated demand.
This combination of stacks can build an entire environment, and remove it again, completely unattended. Maintaining all of this can be difficult, so specific naming conventions were established to make it clear which of one stack's output variables are used as input variables to another stack. This enabled administrators of the stacks to easily identify which stack "owns" which outputs, so that any needed adjustments could be tracked down and made quickly.
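The case study does not spell out the convention itself, so the sketch below shows one possible scheme rather than the one actually used: every shared value is named `/<application>/<owning stack>/<output>`, which makes ownership recoverable from the name alone. All names here are hypothetical.

```python
def output_parameter_name(app, stack, output):
    """Build a name of the form /<app>/<stack>/<output>."""
    for part in (app, stack, output):
        if not part or "/" in part:
            raise ValueError(f"invalid name segment: {part!r}")
    return f"/{app}/{stack}/{output}"


def owning_stack(parameter_name):
    """Recover which stack owns a value from its name."""
    segments = parameter_name.split("/")
    # A valid name splits into ["", app, stack, output].
    if len(segments) != 4 or segments[0] != "" or not all(segments[1:]):
        raise ValueError(f"not a /<app>/<stack>/<output> name: {parameter_name!r}")
    return segments[2]


name = output_parameter_name("demo-app", "network", "VpcId")
owner = owning_stack(name)
```

With a scheme like this, an administrator who sees an unexpected value can tell from the name which stack's template to open.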
With AWS CloudFormation, you can simplify the deployment of your environments. Even when the order of operations is complex, Lambda-backed custom resources can coordinate stack events and extend CloudFormation's functionality according to the event type (create, update or delete). In addition, AWS security tools and automation integrate well with CloudFormation, CodeBuild and RDS, allowing for efficiency and reliability.
The customer achieved their goals: a faster time to implementation for this solution, a secure environment that can be deployed in one or many regions, and an environment that is completely isolated yet accessible in a highly secure fashion.