One of the most frequently cited advantages of the cloud is that you pay only for what you use. AWS lets us deploy resources for exactly as long as we need them and pay just for that period. But in any given project, manually scaling resources up and down based on demand can become difficult. Why not let AWS manage the autoscaling for you? In this post, we discuss AWS EC2 Auto Scaling.
AWS EC2 Auto Scaling manages the scaling out and scaling in of EC2 instances based on triggers we define. For example, we may want more EC2 instances created during periods of high demand, and those extra instances destroyed when demand drops. To work with Auto Scaling, we need to create launch templates and Auto Scaling groups, and define scaling options.
A launch template defines a blueprint for the EC2 instances that will be created when the need arises. It takes parameters like the AMI ID, instance type, key pair, and security groups so that instances can be launched automatically. Most of these options are the same information you would otherwise provide when launching an instance manually. When scaling out, we typically need several instances configured identically.
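As a sketch, the launch template parameters described above can be captured in a boto3-style request. The template name, AMI ID, key pair, and security group ID below are placeholders for illustration, not values from this post:

```python
# Hypothetical launch template parameters (all IDs are placeholders).
launch_template = {
    "LaunchTemplateName": "web-server-template",
    "LaunchTemplateData": {
        "ImageId": "ami-0123456789abcdef0",            # AMI to launch from
        "InstanceType": "t3.micro",                    # instance type
        "KeyName": "my-key-pair",                      # SSH key pair
        "SecurityGroupIds": ["sg-0123456789abcdef0"],  # security groups
    },
}

# With AWS credentials configured, this dict could be passed to boto3:
#   boto3.client("ec2").create_launch_template(**launch_template)
```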
Earlier, launch configurations were used for this purpose. They can still be used today, but AWS strongly recommends launch templates instead. Launch templates are more versatile than launch configurations: they let us use the latest EC2 features, which is not possible with launch configurations.
To use launch templates or configurations, we create Auto Scaling groups. An Auto Scaling group is a container that holds all the EC2 instances launched from the launch template, and it manages the instance lifecycle in terms of scaling out and scaling in. The number of instances to be created is determined by setting:
- Minimum number of instances
- Desired number of instances
- Maximum number of instances
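The relationship between these three values can be sketched as a simple clamp: whatever capacity the group aims for always stays within the minimum and maximum bounds. This helper is purely illustrative, not part of any AWS API:

```python
def effective_capacity(desired: int, minimum: int, maximum: int) -> int:
    """Clamp the desired capacity into the [minimum, maximum] range,
    mirroring how an Auto Scaling group bounds its size."""
    return max(minimum, min(desired, maximum))

effective_capacity(3, 1, 5)   # within bounds -> 3
effective_capacity(10, 1, 5)  # capped at the maximum -> 5
effective_capacity(0, 1, 5)   # raised to the minimum -> 1
```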
At any given moment, the Auto Scaling group strives to maintain the desired number of instances. If some instances become unhealthy or are terminated for some reason, the group launches replacements from the launch template to restore the desired capacity. The minimum, desired, and maximum values can be changed at any time.
An Auto Scaling group (ASG) can be configured to use various purchase options for creating EC2 instances: Spot and On-Demand instances. Spot instances come from spare EC2 capacity and cost significantly less than On-Demand instances. If the ASG cannot obtain enough Spot instances, it creates additional On-Demand instances to maintain capacity. There are further options to manage costs, such as defining multiple launch templates, maintaining a ratio of instance types, and using Reserved Instances.
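A sketch of how such a mixed purchase strategy might look as a boto3-style `MixedInstancesPolicy` fragment; the base capacity, percentage, and allocation strategy below are assumptions chosen for illustration:

```python
# Hypothetical instances-distribution settings for an ASG.
mixed_instances_policy = {
    "InstancesDistribution": {
        "OnDemandBaseCapacity": 2,                  # always keep 2 On-Demand
        "OnDemandPercentageAboveBaseCapacity": 25,  # 25% On-Demand beyond that
        "SpotAllocationStrategy": "capacity-optimized",
    },
}
```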
An Auto Scaling group can span multiple Availability Zones, and this ability can be used to balance capacity. The group distributes instances evenly across the configured AZs. If one of the AZs goes down, the Auto Scaling group creates additional instances in the other AZs to maintain the desired capacity. When the affected AZ becomes available again, new instances are launched there and the extra instances created elsewhere are terminated. This is called rebalancing.
To balance load among the instances in an ASG, load balancers are used. There are mainly three types: the Application Load Balancer, which works at the application layer; the Network Load Balancer, which works at the transport layer (TCP/UDP); and the Classic Load Balancer. By default, an ASG determines instance health using EC2 status checks. Load balancers provide additional health checks that the ASG can use to make more reliable decisions about replacing instances.
The lifecycle of an instance is managed by the ASG based on its configuration and triggers. EC2 instances created manually with the same configuration can also be added to or removed from the ASG by hand; this process is called attaching and detaching instances, and this type of scaling out and scaling in is called manual scaling. If we have an instance running, we can associate it with an existing ASG or create a new one from it.
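Manual attaching and detaching can be sketched as boto3-style requests; the group name and instance ID below are placeholders:

```python
# Hypothetical attach/detach request (placeholder IDs).
attach_request = {
    "AutoScalingGroupName": "web-asg",
    "InstanceIds": ["i-0123456789abcdef0"],
}

# With credentials configured:
#   asg = boto3.client("autoscaling")
#   asg.attach_instances(**attach_request)
#   asg.detach_instances(**attach_request, ShouldDecrementDesiredCapacity=True)
```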
Most of the time, however, dynamic scaling is used, which can be based on several factors. ASGs can be scaled dynamically using target tracking scaling policies. These policies are defined by metrics that monitor factors like CPU utilization or Application Load Balancer request counts. Thresholds defined on these metrics, when breached, trigger scale-out or scale-in actions on the ASG.
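A target tracking policy can be sketched as a boto3-style `put_scaling_policy` request; the group name, policy name, and target value below are assumptions for illustration:

```python
# Hypothetical target tracking policy: keep average CPU near 50%.
target_tracking_policy = {
    "AutoScalingGroupName": "web-asg",
    "PolicyName": "cpu-target-tracking",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": 50.0,  # target average CPU utilization (%)
    },
}

# With credentials configured:
#   boto3.client("autoscaling").put_scaling_policy(**target_tracking_policy)
```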
Besides target tracking, dynamic scaling can be performed in two further ways: simple and step scaling. In simple scaling, when a threshold is breached, a fixed number of instances is added or removed. However, once a scaling activity has started, simple scaling waits for it and its cooldown to complete before responding to another breach, which can be undesirable at times. Step scaling, in contrast, defines a set of step adjustments: the size of the scaling action depends on how far the metric has breached the threshold, as reported by CloudWatch alarms, and the group keeps responding to new alarms even while a scaling activity is in progress.
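The way step scaling sizes its response can be sketched with a toy function. The step boundaries below are made up; in a real ASG they come from the policy's step adjustments:

```python
def step_adjustment(metric: float, threshold: float, steps) -> int:
    """Return how many instances to add based on how far `metric`
    exceeds `threshold`. `steps` is a list of
    (breach_lower_bound, instances_to_add) pairs, sorted ascending."""
    breach = metric - threshold
    if breach < 0:
        return 0  # no alarm, no scaling
    adjustment = 0
    for lower_bound, instances in steps:
        if breach >= lower_bound:
            adjustment = instances
    return adjustment

# Example: alarm threshold at 60% CPU.
steps = [(0, 1), (20, 2), (40, 4)]
step_adjustment(65, 60, steps)   # small breach -> add 1 instance
step_adjustment(95, 60, steps)   # larger breach -> add 2 instances
```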
ASGs implement cooldown timers for scaling actions. A cooldown period ensures the effects of the previous scaling activity have settled before the next one starts. By default, manual scaling does not honor the cooldown period.
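The cooldown behavior can be sketched as a small gate: a new scaling action is allowed only after the cooldown since the last activity has elapsed, unless (as with manual scaling) the cooldown is not honored. This is an illustrative model, not AWS code:

```python
from datetime import datetime, timedelta

class CooldownGate:
    """Toy model of an ASG cooldown timer."""

    def __init__(self, cooldown_seconds: int):
        self.cooldown = timedelta(seconds=cooldown_seconds)
        self.last_activity = None

    def can_scale(self, now: datetime, honor_cooldown: bool = True) -> bool:
        # Manual scaling passes honor_cooldown=False and skips the wait.
        if not honor_cooldown or self.last_activity is None:
            return True
        return now - self.last_activity >= self.cooldown

    def record(self, now: datetime) -> None:
        self.last_activity = now

gate = CooldownGate(cooldown_seconds=300)
t0 = datetime(2022, 1, 1, 12, 0, 0)
gate.record(t0)
gate.can_scale(t0 + timedelta(seconds=60))                        # still cooling down
gate.can_scale(t0 + timedelta(seconds=60), honor_cooldown=False)  # manual scaling proceeds
gate.can_scale(t0 + timedelta(seconds=300))                       # cooldown elapsed
```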
Auto Scaling also provides lifecycle hooks. Lifecycle hooks let us perform custom actions on instances as they are added to or removed from the ASG during scale-out or scale-in. While an instance is being launched or terminated, a lifecycle hook puts it into a wait state, giving us time to complete our actions before the instance enters service or is terminated.
Instances do not remain in the wait state forever. By default, the heartbeat timeout is 60 minutes. The cooldown period is not part of this pause; it begins only when the timeout expires or the wait is ended early via a CLI/API call to complete the lifecycle action.
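A lifecycle hook with the default one-hour timeout can be sketched as a boto3-style `put_lifecycle_hook` request; the group and hook names below are placeholders:

```python
# Hypothetical lifecycle hook: pause launching instances for custom setup.
lifecycle_hook = {
    "AutoScalingGroupName": "web-asg",
    "LifecycleHookName": "configure-on-launch",
    "LifecycleTransition": "autoscaling:EC2_INSTANCE_LAUNCHING",
    "HeartbeatTimeout": 3600,      # default pause: 60 minutes
    "DefaultResult": "CONTINUE",   # action if the timeout expires
}

# With credentials configured:
#   boto3.client("autoscaling").put_lifecycle_hook(**lifecycle_hook)
# Calling complete_lifecycle_action ends the pause early.
```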
That was a brief introduction to AWS EC2 Auto Scaling. Of course, the posts in this blog series are not meant to recreate the AWS documentation; they intend to provide a bird's-eye overview of what each service is about. If you like the content, consider subscribing, following, and sharing this blog post!