Scaling Options in AWS
Overview
Scaling resources helps an application cope with increases in demand. A scaling plan lets us quickly configure scaling actions for a group of AWS resources. The available scaling options are dynamic scaling and predictive scaling. Tagging AWS resources helps in applying scaling options to them quickly.
What is a Scaling Plan?
Scaling ensures that an increase in demand for resources can be met. Suppose we have an application server that experiences a tremendous increase in traffic overnight. To serve every user request without server crashes, we need to increase the number of application servers running.
If the server runs on AWS, we need to add more EC2 instances behind the load balancer. Not only does the EC2 instance count need to be increased, but other aspects, such as routing and database storage, also need to be taken care of.
A scaling plan helps us configure scaling actions for a group of AWS resources in little time. For example, we can tag our AWS resources and Auto Scaling groups by the environment they belong to, such as development, testing, or production. These tags are helpful when resources need to be increased only in the production environment.
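As a minimal sketch of the tagging idea above, the snippet below builds a tag filter that a scaling plan could use to pick up only production resources. The tag key `environment` and value `production` are hypothetical; the dictionary shape is assumed to follow the `ApplicationSource` structure of the AWS Auto Scaling Plans API.

```python
def tag_application_source(key, values):
    """Build an application source that selects resources by tag."""
    return {"TagFilters": [{"Key": key, "Values": list(values)}]}


# Hypothetical tag: only resources tagged environment=production are selected.
production_source = tag_application_source("environment", ["production"])
print(production_source)
```

Nothing is sent to AWS here; the dictionary would be passed as the application source when a scaling plan is created through the API.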
Types of Scaling Plans
There are two types of scaling options. Let's discuss each of them in brief.
Dynamic Scaling
- Under this scaling option, target tracking scaling policies are created for the AWS resources in the scaling plan.
- As resource utilization changes, these scaling policies adjust the resource capacity.
- The capacity is adjusted as needed to keep utilization at the target value.
- The adjustment is similar to a thermostat, which turns the heating up or down to maintain the room temperature.
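The thermostat behaviour described above can be sketched with a simple proportional rule. This is an illustration under stated assumptions, not the exact algorithm AWS uses; the function name and the numbers are hypothetical.

```python
import math


def desired_capacity(current_capacity, current_utilization, target_utilization,
                     min_capacity, max_capacity):
    """Return the capacity that would bring average utilization back to the target.

    Assumes the total load stays constant and is spread evenly over all units.
    """
    if current_capacity <= 0 or target_utilization <= 0:
        raise ValueError("capacity and target utilization must be positive")
    wanted = math.ceil(current_capacity * current_utilization / target_utilization)
    # Never go below the minimum or above the maximum configured capacity.
    return max(min_capacity, min(max_capacity, wanted))


# 4 instances at 75% CPU with a 50% target: scale out.
print(desired_capacity(4, 75.0, 50.0, min_capacity=1, max_capacity=10))
# 4 instances at 20% CPU with a 50% target: scale in.
print(desired_capacity(4, 20.0, 50.0, min_capacity=1, max_capacity=10))
```

Rounding up keeps utilization at or below the target, which favours availability over cost.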
The diagram below shows the difference in resource capacity with and without dynamic scaling.
Predictive Scaling
- Under this scaling option, machine learning models analyze historical data to forecast future resource utilization.
- Based on these forecasts, scaling actions are scheduled in advance, so that capacity is ready before an expected increase in demand.
- Predictive scaling works like a weather forecast: if rain is predicted, we adjust our plans ahead of time instead of reacting once it starts.
- Like dynamic scaling, predictive scaling aims to keep resource utilization at the target value.
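To make the forecast-then-schedule idea concrete, here is a toy sketch. It replaces the machine learning model with a plain per-hour average of previous days, which is an assumption made for illustration only; the load figures and the `load_per_instance` parameter are hypothetical.

```python
import math
from statistics import mean


def forecast_hourly_load(history):
    """Naive forecast: average the load seen at each hour over previous days.

    history is a list of days, each a list of 24 hourly load values.
    """
    return [mean(day[hour] for day in history) for hour in range(24)]


def schedule_capacity(forecast, load_per_instance, min_capacity=1):
    """Turn a load forecast into the capacity to schedule for each hour."""
    return [max(min_capacity, math.ceil(load / load_per_instance))
            for load in forecast]


# Two hypothetical days with a recurring spike at 09:00.
day_one = [100] * 24
day_two = [100] * 24
day_one[9], day_two[9] = 900, 1100

plan = schedule_capacity(forecast_hourly_load([day_one, day_two]),
                         load_per_instance=250)
print(plan[8], plan[9], plan[10])
```

Because the spike recurs at the same hour, the extra capacity can be scheduled before the traffic arrives rather than after utilization has already risen.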
The diagram below shows how predictive scaling schedules scaling actions ahead of time.
Supported Resources
Under AWS Auto Scaling, the following resources support scaling options:
- Amazon Aurora - An Aurora DB cluster can contain Aurora read replicas. When the need arises, the number of read replicas in the cluster can be increased or decreased.
- Amazon EC2 - An Amazon EC2 Auto Scaling group can launch new EC2 instances to increase the number of servers. Similarly, it can terminate EC2 instances to decrease the number of servers running in the group.
- Amazon Elastic Container Service - The task count of an Amazon ECS service can be increased or decreased.
- Amazon DynamoDB - The provisioned read and write capacity of a DynamoDB table can be increased or decreased.
- Spot Fleet - The target capacity of a Spot Fleet can be increased or decreased by launching or terminating EC2 instances.
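For readers who work with the API rather than the console, the list above can be summarized as a lookup table. The service namespaces and scalable dimension strings below are the identifiers used by the AWS scaling APIs as far as this sketch assumes; the dictionary name and layout are illustrative.

```python
# Resource type -> (service namespace, scalable dimension).
SCALABLE_TARGETS = {
    "Amazon Aurora": ("rds", "rds:cluster:ReadReplicaCount"),
    "Amazon EC2 Auto Scaling": ("autoscaling",
                                "autoscaling:autoScalingGroup:DesiredCapacity"),
    "Amazon ECS": ("ecs", "ecs:service:DesiredCount"),
    "Amazon DynamoDB (read)": ("dynamodb", "dynamodb:table:ReadCapacityUnits"),
    "Amazon DynamoDB (write)": ("dynamodb", "dynamodb:table:WriteCapacityUnits"),
    "Spot Fleet": ("ec2", "ec2:spot-fleet-request:TargetCapacity"),
}

for resource, (namespace, dimension) in SCALABLE_TARGETS.items():
    print(f"{resource}: namespace={namespace}, dimension={dimension}")
```

Each entry names the single quantity that is scaled for that resource, such as the replica count for Aurora or the task count for ECS.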
Scaling Plan Features and Benefits
The scaling plan offers the following features and benefits.
- Resource Discovery - AWS Auto Scaling provides automatic resource discovery, which helps find the resources in the application that can be scaled.
- Dynamic Scaling - Scaling plans use the Amazon EC2 Auto Scaling and Application Auto Scaling services to adjust the capacity of scalable resources in response to changes in traffic or workload. Dynamic scaling metrics can be standard utilization or throughput metrics, or custom metrics.
- Built-In Scaling Recommendations - AWS Auto Scaling offers scaling strategies with recommendations that we can use to optimize for performance, for cost, or for a balance between the two.
- Predictive Scaling - Scaling plans also support predictive scaling for Auto Scaling groups. This helps the Amazon EC2 capacity scale faster when there are regularly occurring spikes.
How Scaling Plans Work
Now, it's time to understand how scaling plans work. Using the AWS scaling plans feature, we can configure a set of instructions for scaling our resources. If we use AWS CloudFormation, we can create scaling plans for different sets of resources per application. The AWS Auto Scaling console suggests scaling strategies that are customized to each resource. The plan then combines dynamic scaling and predictive scaling to support the chosen scaling strategy.
What is a Scaling Strategy?
The scaling strategy tells AWS Auto Scaling how to optimize the utilization of the resources in the scaling plan. The built-in strategies optimize for availability, for cost, or for a balance of both.
We can also create a custom strategy based on the metrics and thresholds we define, and we can set different strategies for each resource or resource type.
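A strategy ultimately becomes a target value for a metric. The sketch below maps each built-in strategy to a target CPU utilization and builds a target tracking configuration from it. The percentages (40, 50, and 70) are the values the AWS Auto Scaling documentation lists for Auto Scaling groups to the best of this sketch's knowledge, so treat them as an assumption; the strategy keys and function name are hypothetical.

```python
# Assumed target utilization (percent) for each built-in strategy.
STRATEGY_TARGETS = {
    "optimize_for_availability": 40.0,
    "balance_availability_and_cost": 50.0,
    "optimize_for_cost": 70.0,
}


def target_tracking_config(strategy=None, custom_target=None):
    """Build a target tracking configuration for an Auto Scaling group.

    Pass a built-in strategy name, or custom_target for a custom strategy.
    """
    if custom_target is not None:
        target = float(custom_target)
    else:
        target = STRATEGY_TARGETS[strategy]
    return {
        "PredefinedScalingMetricSpecification": {
            "PredefinedScalingMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": target,
    }


print(target_tracking_config("optimize_for_availability"))
print(target_tracking_config(custom_target=60))
```

A lower target keeps more spare capacity available, while a higher target runs fewer resources closer to their limit.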
Best Practices for Scaling Plans
Now, we understand the concept of a scaling plan and its features. Let's learn about some of the best practices for scaling plans.
- When creating a launch template or launch configuration, enable detailed monitoring so that CloudWatch metric data for EC2 instances is available at a one-minute frequency, which ensures a quicker response to load changes. Scaling on metrics with a five-minute frequency results in a slower response. By default, EC2 instances have basic monitoring enabled, which means instance metrics are available every five minutes. For an additional charge, we can enable detailed monitoring to get metric data at a one-minute frequency.
- Enable Auto Scaling group metrics. Otherwise, the capacity forecast graphs shown at the end of the Create Scaling Plan process do not display actual capacity data.
- Check which instance type the Auto Scaling group uses. Burstable performance instances, such as T3 and T2 instances, provide a baseline level of CPU performance with the ability to burst above it when required. Depending on the target utilization set by the scaling plan, we run the risk of exceeding the baseline and then running out of CPU credits, which limits performance.
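The three practices above can be sketched as request payloads and a simple check. The field names are assumed to follow the EC2 `CreateLaunchTemplate` and Auto Scaling `EnableMetricsCollection` APIs; the AMI ID is a placeholder, and the baseline table holds only two illustrative entries that should be verified against the current EC2 documentation.

```python
def launch_template_data(image_id, instance_type, detailed_monitoring=True):
    """Launch template data with detailed (one-minute) monitoring switched on."""
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "Monitoring": {"Enabled": detailed_monitoring},
    }


def group_metrics_request(group_name):
    """Parameters for enabling Auto Scaling group metrics at one-minute granularity."""
    return {"AutoScalingGroupName": group_name, "Granularity": "1Minute"}


# Assumed baseline CPU utilization per vCPU, in percent.
BASELINE_CPU = {"t2.micro": 10.0, "t3.micro": 10.0}


def may_exhaust_cpu_credits(instance_type, target_utilization):
    """True if the target utilization sits above a known burstable baseline."""
    baseline = BASELINE_CPU.get(instance_type)
    return baseline is not None and target_utilization > baseline


print(launch_template_data("ami-EXAMPLE", "t2.micro"))
print(group_metrics_request("demo-asg"))
print(may_exhaust_cpu_credits("t2.micro", 40.0))
```

With a 40 percent target on a `t2.micro`, the check flags a risk: the group would sit well above the baseline and steadily spend CPU credits.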
Getting Started with Scaling Plans
Let's learn how to create a scaling plan for an application. Before that, we should thoroughly review the application as it runs on AWS:
- Check whether any scaling policies were created earlier.
- Determine the target utilization for each scalable resource in the application, based on the resource's overall performance.
- Find out how long a server takes to launch and be configured.
- Check whether the metric history is long enough to be used with predictive scaling.
Understanding these points about the application makes the scaling plan more effective.
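The last check in the list can be sketched as a small helper. The 24-hour minimum used here is an assumption based on the AWS documentation for predictive scaling in scaling plans; the function name and sample data are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed minimum span of metric history needed for a first forecast.
MIN_HISTORY = timedelta(hours=24)


def has_enough_history(timestamps, minimum=MIN_HISTORY):
    """True if the datapoints span at least the minimum period."""
    if not timestamps:
        return False
    return max(timestamps) - min(timestamps) >= minimum


start = datetime(2024, 1, 1)
# Hypothetical datapoints every five minutes.
one_day = [start + timedelta(minutes=5 * i) for i in range(289)]
few_hours = [start + timedelta(minutes=5 * i) for i in range(100)]

print(has_enough_history(one_day))
print(has_enough_history(few_hours))
```

In practice, the timestamps would come from the CloudWatch datapoints of the load metric, and a longer history generally gives a better forecast.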
Let's become more familiar with scaling plans with hands-on experience.
Prerequisite
Before creating a scaling plan, we need an Auto Scaling group. Let's create one to practice with.
Follow these steps:
- Log in to the AWS account. Open the Amazon EC2 console and select Auto Scaling Groups from the navigation pane. Click on Create Auto Scaling group.
- Enter the Auto Scaling group name and click on Create a launch template.
- The Create launch template page opens in a new tab. Enter the launch template name.
- Choose Amazon Linux 2 as the AMI and t2.micro as the instance type.
- Choose a key pair for login and the default security group. We can also create a new key pair and security group. Click on Create launch template.
- The launch template is created successfully.
- Go back to the Create Auto Scaling group page. Choose the created launch template and click on Next.
- On the next page, choose the VPC and select the Availability Zones and subnets.
- Leave all other fields as default and click on Skip to review. There are some optional pages as well, but for this demo, we leave them at their defaults.
- Check all the configurations on the review page and click on Create Auto Scaling group.
- The Auto Scaling group is created successfully.
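The console steps above have an API equivalent. The sketch below only builds the request payloads, so it runs without AWS access; the final helper shows how they might be sent with boto3, which requires the `boto3` package and configured credentials. All names, the AMI ID, and the subnet IDs are placeholders.

```python
def launch_template_request(name, image_id, instance_type="t2.micro", key_name=None):
    """Parameters for creating a launch template."""
    data = {"ImageId": image_id, "InstanceType": instance_type}
    if key_name:
        data["KeyName"] = key_name
    return {"LaunchTemplateName": name, "LaunchTemplateData": data}


def auto_scaling_group_request(name, template_name, subnet_ids,
                               min_size=1, max_size=1, desired=1):
    """Parameters for creating an Auto Scaling group from a launch template."""
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {"LaunchTemplateName": template_name,
                           "Version": "$Latest"},
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        # Subnets are passed as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }


def create_group(template_request, group_request):
    """Send both requests with boto3 (not called in this sketch)."""
    import boto3
    boto3.client("ec2").create_launch_template(**template_request)
    boto3.client("autoscaling").create_auto_scaling_group(**group_request)


template = launch_template_request("demo-template", "ami-EXAMPLE")
group = auto_scaling_group_request("demo-asg", "demo-template",
                                   ["subnet-EXAMPLE1", "subnet-EXAMPLE2"])
print(template)
print(group)
```

The subnets chosen here determine the Availability Zones the group launches instances in, which matches the VPC and subnet step in the console.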
Create Scaling Plan
Now, let's create the scaling plan. Follow these steps:
- Open the AWS Auto Scaling console and click on Get started.
- Select the Choose EC2 Auto Scaling groups method and choose the created Auto Scaling group. Click on Next.
- Under Scaling plan details, enter a name for the scaling plan.
- Under Auto Scaling groups, choose Optimize for availability as the scaling strategy. Click on Next.
- Click on Next again.
- Next, check all the configurations on the review page and click on Create scaling plan.
- The scaling plan is created successfully. Click on the scaling plan name.
- We will see the CPU utilization metric.
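A roughly equivalent API request can be sketched as follows. It builds the parameters for `create_scaling_plan` on the boto3 `autoscaling-plans` client without calling AWS. Unlike the console flow, this sketch selects resources through a tag filter; the tag, the names, the capacity limits, and the 40 percent target are assumptions made for illustration.

```python
def scaling_plan_request(plan_name, group_name, target_cpu=40.0,
                         min_capacity=1, max_capacity=4):
    """Parameters for a scaling plan with one dynamic scaling instruction."""
    return {
        "ScalingPlanName": plan_name,
        # Hypothetical tag used to identify the application's resources.
        "ApplicationSource": {
            "TagFilters": [{"Key": "environment", "Values": ["production"]}],
        },
        "ScalingInstructions": [{
            "ServiceNamespace": "autoscaling",
            "ResourceId": f"autoScalingGroup/{group_name}",
            "ScalableDimension": "autoscaling:autoScalingGroup:DesiredCapacity",
            "MinCapacity": min_capacity,
            "MaxCapacity": max_capacity,
            "TargetTrackingConfigurations": [{
                "PredefinedScalingMetricSpecification": {
                    "PredefinedScalingMetricType": "ASGAverageCPUUtilization",
                },
                "TargetValue": target_cpu,
            }],
        }],
    }


def create_plan(request):
    """Send the request with boto3 (not called in this sketch)."""
    import boto3
    return boto3.client("autoscaling-plans").create_scaling_plan(**request)


request = scaling_plan_request("demo-plan", "demo-asg")
print(request["ScalingInstructions"][0]["ResourceId"])
```

The single target tracking configuration corresponds to the CPU utilization metric shown in the console after the plan is created.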
The scaling plan has been created successfully. Now, let's clean it up.
- Select the scaling plan and click on Delete.
- Click on Delete again to confirm.
- The scaling plan is deleted successfully.
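The cleanup can also be sketched for the API. Deleting a plan through the boto3 `autoscaling-plans` client needs both the plan name and its version; the name `demo-plan` and version `1` below are assumptions for a plan that was created once and never updated.

```python
def delete_plan_request(plan_name, version=1):
    """Parameters for deleting a scaling plan."""
    return {"ScalingPlanName": plan_name, "ScalingPlanVersion": version}


def delete_plan(request):
    """Send the request with boto3 (not called in this sketch)."""
    import boto3
    boto3.client("autoscaling-plans").delete_scaling_plan(**request)


print(delete_plan_request("demo-plan"))
```

Deleting the plan does not remove the Auto Scaling group or the launch template created for the demo, so those need to be deleted separately to avoid charges.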
Conclusion
- A scaling plan helps configure scaling actions for a group of AWS resources in little time.
- The types of scaling options are:
  - Dynamic Scaling
  - Predictive Scaling
- The AWS resources that support scaling options are Amazon Aurora, Amazon EC2 Auto Scaling, Amazon Elastic Container Service, Amazon DynamoDB, and Spot Fleet.
- Resource discovery, built-in scaling recommendations, and dynamic and predictive scaling are the features and benefits of a scaling plan.
- The scaling strategy tells AWS Auto Scaling how to balance availability and cost for the resources in the scaling plan.