Auto Scaling Groups (ASG)

Overview

Auto Scaling Groups (ASGs) are used to handle increases and decreases in traffic. An ASG is a logical group of EC2 instances on which your application is hosted. The number of EC2 instances grows or shrinks with traffic demand. You can define maximum and minimum limits on how far your application servers scale, so that costs stay within your budget.

How Do Auto Scaling Groups Work?

Need for Auto Scaling Groups

Let's try to understand the need for auto scaling with the help of an example. Suppose we are a startup running our application on a single server. We post some videos on the application, and they go viral overnight.

The questions that arise are:

Are we prepared to serve millions, or even billions, of requests?
Can our single server handle millions of requests without crashing?
How can we avoid losing customers?

Auto scaling helps solve these problems, which are caused by unpredictable increases in traffic. With auto scaling, the number of application servers is increased automatically when traffic rises and decreased when traffic falls.
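To make the idea concrete, here is a minimal sketch of how a server count could follow traffic. The function name, the 500 requests per second per server, and the traffic samples are all illustrative assumptions, not AWS values.

```python
import math


def servers_needed(requests_per_second: float, capacity_per_server: float) -> int:
    """Return how many servers are needed to absorb the given request rate."""
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    # Keep at least one server running so the application stays reachable.
    return max(1, math.ceil(requests_per_second / capacity_per_server))


# Hypothetical traffic samples: before, during, and after a viral spike.
for rps in (200, 45_000, 900):
    count = servers_needed(rps, capacity_per_server=500)
    print(f"{rps} req/s -> {count} server(s)")
```

The count rises during the spike and falls again afterwards, which is the behavior auto scaling automates.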

In AWS, an Auto Scaling group is a logical group of Amazon EC2 instances that is managed as a unit to handle traffic.

Working of Auto Scaling Groups

Let's try to understand the working of auto-scaling groups by breaking down the process into small chunks.

When launching instances for your application, you define the desired capacity of your Auto Scaling group. The desired capacity is the number of instances your application needs to serve all incoming requests.
Periodic health checks are run on the application servers to ensure they are active. If an application server fails its health checks, it is terminated and replaced with a new one running the same application code.
A launch template defines the configuration used to launch a new instance whenever one is needed.
Auto Scaling groups can scale dynamically. You can define the upper and lower limits of scaling, i.e., the maximum number of instances the group can scale up to and the minimum number it can scale down to.
Defining the maximum limit helps you avoid unwanted costs beyond your budget, whereas defining the minimum limit keeps a minimum number of instances running at all times.
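The steps above can be sketched as a small in-memory model. This is an illustration only: the `ScalingGroup` and `Instance` classes and the generated instance IDs are hypothetical and do not call AWS.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Instance:
    instance_id: str
    healthy: bool = True


@dataclass
class ScalingGroup:
    min_size: int
    max_size: int
    desired_capacity: int
    instances: list = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1), repr=False)

    def _launch(self) -> Instance:
        # Stand-in for launching an instance from a template.
        return Instance(f"i-{next(self._ids):04d}")

    def set_desired(self, requested: int) -> int:
        # Clamp the requested capacity into [min_size, max_size].
        self.desired_capacity = max(self.min_size, min(self.max_size, requested))
        return self.desired_capacity

    def reconcile(self) -> None:
        # Terminate instances that failed their health checks.
        self.instances = [i for i in self.instances if i.healthy]
        # Launch replacements (or extra capacity) up to the desired count.
        while len(self.instances) < self.desired_capacity:
            self.instances.append(self._launch())
        # Scale in when there are more instances than desired.
        del self.instances[self.desired_capacity:]


group = ScalingGroup(min_size=2, max_size=5, desired_capacity=3)
group.reconcile()                    # launches three instances
group.instances[0].healthy = False   # simulate a failed health check
group.reconcile()                    # the failed instance is replaced
group.set_desired(12)                # clamped to max_size
group.reconcile()
print([i.instance_id for i in group.instances])
```

Asking for 12 instances yields only 5 because of the maximum limit, and asking for 0 would still leave 2 running because of the minimum limit.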

Auto Scaling Groups and Availability Zones

Traffic may come to your application servers from many locations. Launching application servers in multiple Availability Zones (AZs) is therefore helpful both for serving the traffic and for maintaining redundancy.

If the instances in a particular Availability Zone become unhealthy, the instances in the other AZs can serve the requests while the affected ones are under maintenance.
A load balancer distributes the traffic across all AZs. For traffic to be routed to an AZ, that AZ must be enabled in the load balancer.
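As a rough illustration of the routing behavior described above, the sketch below sends requests only to zones that are both enabled and healthy. The zone names, the dictionary layout, and the round-robin policy are assumptions made for the example; a real load balancer uses its own routing algorithm.

```python
from itertools import cycle


def routable_zones(zones: dict) -> list:
    """Return the names of zones that are enabled and healthy, sorted."""
    return sorted(
        name for name, state in zones.items()
        if state["enabled"] and state["healthy"]
    )


def route(requests: list, zones: dict) -> dict:
    """Assign requests round-robin across the routable zones."""
    targets = routable_zones(zones)
    if not targets:
        raise RuntimeError("no enabled, healthy Availability Zone")
    assignment = {name: [] for name in targets}
    for request, zone in zip(requests, cycle(targets)):
        assignment[zone].append(request)
    return assignment


# Hypothetical zone states: 1b is unhealthy, 1c is not enabled.
zones = {
    "us-east-1a": {"enabled": True, "healthy": True},
    "us-east-1b": {"enabled": True, "healthy": False},
    "us-east-1c": {"enabled": False, "healthy": True},
    "us-east-1d": {"enabled": True, "healthy": True},
}
print(route(list(range(4)), zones))
```

Only `us-east-1a` and `us-east-1d` receive traffic here, since the other two zones are either unhealthy or not enabled.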

There are some limitations when selecting Availability Zones for the load balancer attached to your Auto Scaling groups:

For an Application Load Balancer, at least two Availability Zones must be enabled for the load balancer to distribute traffic.
For a Network Load Balancer, once an Availability Zone has been enabled, it cannot be disabled. All we can do is add a new AZ.
When enabling an AZ, we also need to select a subnet in it. We can select one and only one subnet for each AZ.
Once an AZ or subnet has been added to a Gateway Load Balancer, it cannot be changed.
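The limitations above can be expressed as a small validation routine. This is a sketch under stated assumptions: the function name, the zone-to-subnets dictionary, and the subnet IDs are hypothetical. AWS applies some of these rules only to specific load balancer types, whereas this sketch applies them together for simplicity.

```python
def validate_zone_update(current: dict, proposed: dict, gateway: bool = False) -> list:
    """Check a proposed AZ configuration against the limitations listed above.

    Both arguments map an AZ name to a list of subnet IDs.
    Returns a list of human-readable problems (empty if none).
    """
    problems = []
    if len(proposed) < 2:
        problems.append("fewer than two Availability Zones enabled")
    for az, subnets in sorted(proposed.items()):
        if len(subnets) != 1:
            problems.append(f"{az}: exactly one subnet required, got {len(subnets)}")
    removed = sorted(set(current) - set(proposed))
    if removed:
        problems.append("cannot disable enabled zones: " + ", ".join(removed))
    if gateway and current and proposed != current:
        problems.append("gateway load balancer zones and subnets cannot be changed")
    return problems


current = {"us-east-1a": ["subnet-a1"], "us-east-1b": ["subnet-b1"]}
added = {**current, "us-east-1c": ["subnet-c1"]}
print(validate_zone_update(current, added))                # adding a zone is allowed
print(validate_zone_update(current, added, gateway=True))  # but not on a gateway load balancer
```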

Auto Scaling Groups with Multiple Instance Types

When creating an Auto Scaling group, we can choose the types of EC2 instances so that we save costs while still serving our needs.

We can use Spot Instances, which are available at up to 90% less than On-Demand prices. Spot Instances are spare EC2 instances whose price varies with demand.
We can also use AWS Savings Plans to reduce the cost of On-Demand Instances.
Allocation strategies can be used to reduce costs and increase availability.
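A short worked example shows how mixing instance types affects the bill. All numbers here are hypothetical: the On-Demand price of $0.10 per hour and the 70% Spot discount are assumptions chosen for easy arithmetic, not quoted AWS prices.

```python
def hourly_fleet_cost(on_demand_count: int, spot_count: int,
                      on_demand_price: float, spot_discount: float) -> float:
    """Hourly cost of a fleet that mixes On-Demand and Spot Instances."""
    if not 0 <= spot_discount <= 1:
        raise ValueError("spot_discount must be between 0 and 1")
    spot_price = on_demand_price * (1 - spot_discount)
    return on_demand_count * on_demand_price + spot_count * spot_price


ON_DEMAND_PRICE = 0.10  # hypothetical USD per instance-hour
SPOT_DISCOUNT = 0.70    # hypothetical discount relative to On-Demand

all_on_demand = hourly_fleet_cost(10, 0, ON_DEMAND_PRICE, SPOT_DISCOUNT)
mixed = hourly_fleet_cost(3, 7, ON_DEMAND_PRICE, SPOT_DISCOUNT)
print(f"all On-Demand: ${all_on_demand:.2f}/h, mixed fleet: ${mixed:.2f}/h")
```

Under these assumed prices, replacing seven of ten On-Demand Instances with Spot Instances roughly halves the hourly cost.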

Allocation Strategies

The Autoscaling group can make use of some allocation strategies to reduce costs. Let's discuss in brief some of the common ones.

On-Demand Instances
- Under this strategy, the EC2 instance types are listed in priority order, as defined in your template.
- It launches the first instance type if it is available, otherwise the second, third, fourth, and so on.
- On-Demand Instances are allocated to the Auto Scaling group according to availability.
Capacity-Optimized Spot Instances
- Under this strategy, instances are added to the Auto Scaling group from the Spot Instance pools.
- It provides optimal capacity whenever needed.
- It predicts which instance types are likely to remain available for a longer period and allocates those types to the Auto Scaling group accordingly.
Lowest-Price Spot Instances
- Under this strategy, instances are added to the Auto Scaling group from the Spot Instance pools defined by the user.
- If no Spot Instance pool is defined, it looks for the pools with the lowest price at the time of scaling.
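The three strategies can be compared with a simplified selection sketch. The `Pool` class, the prices, and the `spare_capacity` figures are invented for illustration; AWS does not expose spare capacity as a number, and its real selection logic is more involved than a single `min` or `max`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pool:
    instance_type: str
    spot_price: float      # hypothetical hourly Spot price in USD
    spare_capacity: int    # hypothetical measure of available capacity
    available: bool = True


def prioritized(types_in_order: list, pools: list):
    """On-Demand style: take the first listed type that is available."""
    by_type = {p.instance_type: p for p in pools}
    for instance_type in types_in_order:
        pool = by_type.get(instance_type)
        if pool is not None and pool.available:
            return pool.instance_type
    return None


def capacity_optimized(pools: list):
    """Pick the available pool with the most spare capacity."""
    candidates = [p for p in pools if p.available]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.spare_capacity).instance_type


def lowest_price(pools: list):
    """Pick the available pool with the lowest Spot price."""
    candidates = [p for p in pools if p.available]
    if not candidates:
        return None
    return min(candidates, key=lambda p: p.spot_price).instance_type


pools = [
    Pool("m5.large", spot_price=0.035, spare_capacity=40, available=False),
    Pool("c5.large", spot_price=0.031, spare_capacity=15),
    Pool("r5.large", spot_price=0.042, spare_capacity=90),
]
print(prioritized(["m5.large", "c5.large", "r5.large"], pools))
print(capacity_optimized(pools))
print(lowest_price(pools))
```

With this sample data the prioritized strategy falls through to the second type because the first is unavailable, while the capacity-optimized and lowest-price strategies pick different pools.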

Tagging Auto Scaling Groups and Instances

You can add tags to Auto Scaling groups, which makes it easier to identify a particular group and make changes to it.

Tags can be based on the environment in which the Auto scaling groups will work. We usually have development, testing, and production environments, and based on that, we can add tags so that making changes to the groups becomes easier.
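As an illustration of environment-based tagging, the sketch below builds tag entries and then selects groups by tag. The group names and helper functions are hypothetical. The dictionary keys follow the shape that, to my understanding, the Auto Scaling tagging API accepts, but nothing here calls AWS, so treat the shape as an assumption to verify.

```python
def build_tags(group_name: str, tags: dict, propagate: bool = True) -> list:
    """Build tag entries for one Auto Scaling group (assumed API shape)."""
    return [
        {
            "ResourceId": group_name,
            "ResourceType": "auto-scaling-group",
            "Key": key,
            "Value": value,
            # Whether instances launched by the group inherit the tag.
            "PropagateAtLaunch": propagate,
        }
        for key, value in tags.items()
    ]


def groups_with_tag(groups: dict, key: str, value: str) -> list:
    """Return the names of groups whose tags contain key=value."""
    return sorted(name for name, tags in groups.items() if tags.get(key) == value)


# Hypothetical groups, tagged by environment.
groups = {
    "web-dev": {"environment": "development"},
    "web-test": {"environment": "testing"},
    "web-prod": {"environment": "production"},
    "api-prod": {"environment": "production"},
}
print(groups_with_tag(groups, "environment", "production"))
print(build_tags("web-prod", groups["web-prod"]))
```

Filtering by the `environment` tag returns only the production groups, so a change can be applied to them without touching development or testing.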

Elastic Load Balancing and Auto Scaling Groups

Amazon Elastic Load Balancing (ELB) helps distribute incoming traffic/requests across the application servers. It can be integrated with Auto Scaling groups to distribute traffic evenly among all the instances of the group.

The load balancer and the Auto Scaling group must be in the same region to work as a single unit distributing the incoming traffic. The target type for the load balancer needs to be instance, not IP, if we want to integrate it with the Auto Scaling group.

The following types of load balancers can be attached to Auto scaling groups:

Application Load Balancer: For HTTP/HTTPS traffic, an Application Load Balancer helps with routing and load balancing. Application Load Balancers run inside a virtual private cloud (VPC).
Classic Load Balancer: For TCP/SSL or HTTP/HTTPS traffic, a Classic Load Balancer helps with routing and load balancing. It works with both VPC and EC2-Classic networks.
Network Load Balancer: For TCP/UDP (layer 4) traffic, a Network Load Balancer helps distribute the traffic.
Gateway Load Balancer: It sits in front of virtual appliances such as firewalls, which analyze the incoming traffic, and distributes that traffic across the EC2 instances of the Auto Scaling group.
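The list above can be condensed into a simple lookup. This is a deliberately simplified sketch: the function name and the `inspect_traffic` flag are hypothetical, and real selection depends on more factors than the protocol alone.

```python
def suggest_load_balancer(protocol: str, inspect_traffic: bool = False) -> str:
    """Suggest a load balancer type based on the list above (simplified)."""
    protocol = protocol.upper()
    if inspect_traffic:
        # Traffic needs to be analyzed firewall-style before reaching instances.
        return "gateway"
    if protocol in {"HTTP", "HTTPS"}:
        return "application"
    if protocol in {"TCP", "UDP"}:
        return "network"
    if protocol == "SSL":
        # In the list above, only the Classic Load Balancer mentions SSL.
        return "classic"
    raise ValueError(f"unsupported protocol: {protocol}")


for proto in ("https", "udp", "ssl"):
    print(proto, "->", suggest_load_balancer(proto))
```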

Autoscaling with Spot by NetApp Elastigroup

Elastigroup uses artificial intelligence to predict which Spot Instance pool would be optimal based on capacity and cost. This helps drive better decisions when choosing the Spot Instance pool.

Some of the features that make Elastigroup a valuable addition when used with Auto Scaling groups include:

Rebalancing: It uses artificial intelligence to predict future interruptions of Spot Instances and rebalances the workload by launching new Spot Instances.
Advanced Autoscaling: The scaling policies are updated automatically based on current demand.
Optimized Capacity and Cost: Elastigroup uses AI predictions to suggest the most suitable instances. These instances save costs while still providing suitable capacity.
Visibility: The data related to resource utilization, Spot availability, health checks, etc., is transparent and easy to access, even for a beginner.

Conclusion

When the number of requests made to the application increases manifold, the number of application servers needs to be increased to avoid errors.
The scaling of the servers can be taken care of automatically using Auto Scaling groups (ASGs).
An Auto Scaling group is a logical group of EC2 instances that scales up and down automatically with an increase or decrease in incoming requests.
Spot Instances, which are spare EC2 instances, can be used to decrease the cost involved while working with Auto Scaling groups.
Elastic load balancers are used to distribute traffic across the application servers.
Auto Scaling groups can be integrated with elastic load balancers to distribute traffic among the instances of the group.
Application Load Balancer, Classic Load Balancer, Network Load Balancer, and Gateway Load Balancer can be used with Auto Scaling groups to distribute traffic.
Elastigroup uses artificial intelligence to predict which Spot Instance pool would be optimal based on capacity and cost.