In the first half of this article, we set up an EC2 instance on Amazon AWS, deployed our LAMP-based micro-site on it, tested it, and created an AMI image of the web application. If you’re following along and have an EC2/AMI ready, continue below to configure auto-scaling, otherwise review Part 1.

Part 2: Setting Up Auto Scaling

Within the overall umbrella of Amazon Web Services are dozens of individual technologies that you can use together to provision, launch, monitor and manage scalable web applications. Setting up intelligent auto scaling (AS) on AWS requires several of them, including:

  • Amazon Machine Images (AMIs) – snapshot templates defining a launchable EC2 server instance
  • Elastic Load Balancer (ELB) – a virtual load balancer platform with configurable events
  • CloudWatch (CW) – tools to monitor and check your EC2 instances
  • Command line tools – simple Java programs that call out to the AWS API using your credentials. Unfortunately, Amazon hasn’t added all of the auto scaling configuration options to the online AWS Console yet, so until further notice you’ll have to use a few command-line scripts to finish the auto scaling configuration. Download the Auto Scaling command line tools from the AWS developer portal and run them to configure your setup.

Why Auto Scaling?

Online marketers spend hours, days, weeks, and even months planning marketing campaigns, both online and offline, to drive traffic to websites, and IT provisioning is difficult even when you know in advance when the traffic is coming. But what if you don’t know when a huge traffic spike will hit your server? The better a social or viral marketing campaign is, the more likely it is to produce irregular traffic patterns or server load spikes at unexpected times. The flexibility of AWS auto scaling frees you from having to accurately predict and provision servers in advance of huge traffic spikes.

How it works

In general, Auto scaling with Amazon Web Services works like this:

  • You define an AMI instance and create an Auto Scaling Group to launch instances into.
  • You use CloudWatch to monitor your server instance(s), and when certain configurable events happen, you can launch more instances based on the AMI template you define.
  • EC2 instances launch behind the Elastic Load Balancer (ELB) you define.
  • The ELB sends traffic in a round-robin pattern to all the instances assigned to it, and you can control in real time how many instances you want running to cover sporadic bursts of high-volume traffic, keeping just one or two running during traffic lulls. If any of your EC2 instances fails its health check, the auto scaling group detects it and launches a replacement. When web traffic dies down, you can terminate instances automatically, too.
  • CloudWatch lets you configure alarms that trigger auto scaling policies to launch additional EC2 instances into your auto scaling group when network traffic, server load, or other measurable statistic, gets too high—say, 80% usage. The number of servers you add is based on whatever your policy states—1, 3, 10 more servers—it’s up to you. Each server is a duplicate instance of the AMI you define in your auto scaling config. You can even use Amazon Simple Notification Service (SNS) to send yourself an email or text message when an auto scaling event occurs.
  • Your ELB automatically spreads incoming visitors across all the servers in your auto scaling group. You can set a minimum and maximum number of instances in your group, offering peace of mind that your site will not crash under an influx of visitors while also limiting the impact on your billing statement. You can also tell AWS to decrease the number of instances when network traffic drops below, say, 20% usage for a measurable amount of time, scaling back the number of servers in your web server farm.
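The scale-up and scale-down behavior described above boils down to a simple capacity calculation. Here is a minimal sketch (a toy model for illustration, not actual AWS code; the function name and numbers are our own):

```python
def next_capacity(current, adjustment, min_size, max_size):
    """Apply a ChangeInCapacity adjustment, clamped to the group's bounds."""
    return max(min_size, min(max_size, current + adjustment))

# A scale-up policy adds instances, but never beyond max_size...
print(next_capacity(current=9, adjustment=1, min_size=2, max_size=10))   # 10
print(next_capacity(current=10, adjustment=1, min_size=2, max_size=10))  # 10
# ...and a scale-down policy removes them, but never below min_size.
print(next_capacity(current=2, adjustment=-1, min_size=2, max_size=10))  # 2
```

This clamping is why setting sensible minimum and maximum sizes protects both your uptime and your bill.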

How many servers will I need?

That’s the toughest question to answer—a lot of variables are involved: the volume of traffic you receive, the type of EC2 instances you use, and the complexity of your application. For our simple PHP application, we estimated that a single t1.micro instance, Amazon’s smallest and least expensive EC2 option, should easily handle 50 to 75 simultaneous users. We determined this by comparing the amount of RAM available in a t1.micro instance to the average amount of memory consumed by a typical PHP request in our application. We then did some actual load testing and benchmarking with the command-line tool siege; we’ll get into the details of that later. Ultimately, we decided we wanted no fewer than 2 servers and no more than 100 (support for up to 7,500 simultaneous users), based on using t1.micros in our auto scaling configuration.
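For the curious, here is roughly how that back-of-the-envelope estimate works. The overhead and per-request figures below are illustrative assumptions, not measured values from our application:

```python
def users_per_instance(total_ram_mb, system_overhead_mb, mb_per_request):
    """Rough ceiling on simultaneous PHP requests that fit in available RAM."""
    return (total_ram_mb - system_overhead_mb) // mb_per_request

# A t1.micro has roughly 613 MB of RAM; assume ~100 MB for the OS and
# Apache, and ~7 MB per PHP request (hypothetical numbers).
print(users_per_instance(613, 100, 7))  # 73, within the 50-75 range
```

Treat a number like this only as a starting point; real load testing with a tool like siege is what validates the estimate.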

What to monitor

AWS CloudWatch lets you monitor several different EC2 server performance metrics in real time, including…

  • CPU Utilization (%)
  • Memory Utilization (%)
  • Network Out Utilization (MB)
  • Memory Used (MB)
  • Memory Available (MB)
  • Swap Utilization (%)
  • Swap Used (MB)
  • Disk Space Utilization (%)
  • Disk Space Used (GB)
  • Disk Space Available (GB)

…and many more. It’s up to you what to monitor, but the metrics most useful for knowing when you should scale up and add another server or scale down by terminating a server are probably CPU utilization, memory utilization or network utilization.

It should also be noted that Amazon provides plenty of basic monitoring metrics for free. Basic monitoring has a 5 minute refresh interval. If monitoring every 5 minutes isn’t fast enough for your application, you can also look at the detailed monitoring option, which costs only fifty cents per metric per month. Detailed monitoring fires events at 1-minute intervals. Here’s a list of the EC2 metrics you can monitor using CloudWatch. If you don’t find a metric that will suit your application, you can even submit (via the Amazon AWS API) a custom metric from your app that CloudWatch should monitor.
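As a rough rule of thumb, the soonest an alarm can react to a sustained breach is about one datapoint interval plus the period times the number of evaluation periods. Here is a toy calculation (our own simplification, not CloudWatch’s exact algorithm):

```python
def worst_case_detection_seconds(period_s, evaluation_periods):
    """Rough worst case: the breach starts just after a datapoint is taken,
    so you wait up to one full period for the next datapoint, then need
    evaluation_periods consecutive breaching datapoints."""
    return period_s + period_s * evaluation_periods

# Basic monitoring (5-minute datapoints), 1 evaluation period:
print(worst_case_detection_seconds(300, 1) / 60)  # 10.0 minutes
# Detailed monitoring (1-minute datapoints), 3 evaluation periods:
print(worst_case_detection_seconds(60, 3) / 60)   # 4.0 minutes
```

This is why detailed monitoring matters if your traffic spikes develop in minutes rather than hours.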

How to configure auto scaling

Before we get started, let’s look at the two prerequisites you need to have in place before creating an auto scaling configuration.

Prerequisite 1: Choose an AMI to use. If you haven’t created an AMI from one of your running EC2 instances, go back to Part 1 and create an AMI now, or click over to your AMIs page on the AWS Console to retrieve the AMI ID to be used as a template, and write it down. You’ll need an AMI ID in Step 1.

Prerequisite 2: Fire up an ELB. The ELB name that is displayed on the AWS Console will also be passed to the command we run in Step 2. We used the AWS Console to create an ELB, and simply accepted the defaults on each of the Elastic Load Balancer setup screens. Once your ELB is up, you will most likely create a CNAME record at your DNS provider pointing your landing page or vanity domain to the DNS name given in the AWS Console. Visit the Elastic Load Balancing at Amazon AWS page for additional information.

Okay, here we go! As we mentioned above, not all of the functions needed to implement auto scaling are available in the AWS Management Console yet. So roll up your sleeves and fire up Terminal (Mac) or CMD (Windows). We’ll be using a few different command line tools to finish our auto scaling configuration.

Step 1: Create a launch config. The first step in setting up auto scaling is the as-create-launch-config command. Using this command, you tell AWS:

  • a unique name for the configuration,
  • which AMI ID you want to use as your template for creating more EC2 instances,
  • the EC2 instance type (the size and power of the server) to launch using your AMI,
  • your EC2 key pair name,
  • and a security group to deploy the instances into.

The API replies with: “OK-Created launch config.”

$PROMPT> as-create-launch-config {your_launch_config_name} --image-id {your_ami_id} --instance-type t1.micro --key {your_key_pair_name} --group {your_group_name}
Return message: OK-Created launch config

Step 2: Create an auto scaling group.  Use the as-create-auto-scaling-group command to define the properties for your group of servers. Auto scaling groups are the core component of an auto-scaling configuration. This command takes the launch_config_name you defined from the step before as a parameter, the name of the ELB you want to use, and most importantly, lets you define the minimum and maximum number of servers you want to have in your cluster. In the example below, we define a group with a minimum of 2 servers and a maximum of 10.

$PROMPT> as-create-auto-scaling-group {your_scaling_group_name} --launch-configuration {your_launch_config_name} --availability-zones us-east-1d --min-size 2 --max-size 10 --load-balancers {your_load_balancer_name} --health-check-type ELB --grace-period 300
Return message: OK-Created AutoScalingGroup

The grace period is the number of seconds AWS waits after launching a new instance before it begins health checks against it, giving the server time to boot up. Together with the cooldown on scaling policies (see Step 3), this prevents AWS from adding too many servers too quickly. AWS responds with “OK-Created AutoScalingGroup.”
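The cooldown behaves like a simple time gate on scaling activity. A minimal sketch (illustrative only, not AWS code; times are in seconds):

```python
def may_scale(now_s, last_event_s, cooldown_s):
    """A new scaling activity is allowed only once the cooldown since the
    previous activity has fully elapsed (or if none has happened yet)."""
    return last_event_s is None or (now_s - last_event_s) >= cooldown_s

print(may_scale(now_s=100, last_event_s=None, cooldown_s=300))  # True
print(may_scale(now_s=350, last_event_s=100, cooldown_s=300))   # False (only 250s elapsed)
print(may_scale(now_s=400, last_event_s=100, cooldown_s=300))   # True
```

Without this gate, a sustained spike could fire the same alarm repeatedly and launch far more servers than the load actually requires.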

Step 3: Create auto scaling policies. Once we have our EC2 AMI, an AS launch config, and an AS group defined to deploy our instances into, we’re ready to define the auto scaling policies that will actually cause more (or fewer) EC2 instances to be launched and attached behind the ELB.

The command used to change the number of servers in the group is as-put-scaling-policy. With auto scaling, you use EC2 monitoring within CloudWatch to trigger a certain policy, but before we can do that, we need to define the actual policies that will be triggered. You can also trigger these policies manually (using the as-execute-policy command) to test before your traffic burst arrives. In doing so, you can not only see the effect of scaling up and down, but also watch AWS work its magic by refreshing your Instances view—new server instances appear in the AWS Management Console as your traffic increases beyond the thresholds you set.

The as-put-scaling-policy command takes the auto scaling group name we defined in Step 2, a name for the policy, such as “scale-up” or “scale-down,” the type of scaling change the policy defines, and a cooldown period. Again, the cooldown period is used to prevent AWS from executing multiple policies within a very short time.

$PROMPT> as-put-scaling-policy --auto-scaling-group {your_scaling_group_name} --name scale-up --adjustment 1 --type ChangeInCapacity --cooldown 300
Return message: arn:aws:autoscaling:us-east-1:751374139099:scalingPolicy:e31ae79c-4210-42ad-8d86-60210aaf7a20:autoScalingGroupName/sg-breezes-gma:policyName/scale-up

Above you can see the basic scale-up policy we defined, named “scale-up”: a ChangeInCapacity policy that adds 1 server and then waits 5 minutes (the 300-second cooldown) before another policy can be triggered. Below is the reverse operation, a “scale-down” policy to remove 1 server from our group.

$PROMPT> as-put-scaling-policy --auto-scaling-group {your_scaling_group_name} --name scale-dn "--adjustment=-1" --type ChangeInCapacity --cooldown 300
Return message: arn:aws:autoscaling:us-east-1:751374139099:scalingPolicy:07a0f71c-d214-4497-973f-c4cdcb15851f:autoScalingGroupName/sg-breezes-gma:policyName/scale-dn

In both cases, AWS replies with a return message including the unique auto-generated name of our two new auto scaling policies. We’ll use those unique policy identifiers to connect to our CloudWatch events in the final step.

Step 4: Link a CloudWatch event to an auto scaling policy. At the moment we have everything we need for an intelligent autoscaling configuration except one thing—the intelligence! The smarts come from choosing a CloudWatch event, such as 80% CPU utilization of an EC2 instance in our group, and wiring up that condition to automatically trigger the scale-up policy we defined. We’re also going to want to do the same in reverse for scaling back down at 20% CPU utilization.

The command to do this comes from the CloudWatch command line tools, and is called mon-put-metric-alarm. This command takes several parameters:

  • a name for the alarm that you choose,
  • a description of what the alarm is monitoring,
  • the namespace of the metric (in this case, AWS/EC2),
  • the name of the metric within that namespace that you want to monitor,
  • the statistic to apply to the metric, such as Average or Maximum,
  • a period or time interval,
  • a threshold for the statistic you choose,
  • a comparison operator, such as greater than or less than,
  • a dimension, which here is the ID of an EC2 instance to monitor,
  • and the number of evaluation periods during which the metric has to consistently stay above or below the threshold you define.

As you can see, there’s a lot to this command, but each parameter is needed to control auto scaling changes with enough granularity. The name and description are shown back to you later when you use the mon-describe-alarms command. The statistic you’re watching, along with the thresholds and time intervals, is important to test for your particular application. For example, we chose to monitor average CPU utilization over a period of 60 seconds, with an evaluation period of 3 intervals (or 3 minutes), triggering at a level of 80% or greater. Here’s the command to achieve this.

$PROMPT> mon-put-metric-alarm --alarm-name sample-scale-up --alarm-description "Scale up at 80% load" --metric-name CPUUtilization --namespace AWS/EC2 --statistic Average  --period 60 --threshold 80 --comparison-operator GreaterThanThreshold --dimensions InstanceId=i-37b12752 --evaluation-periods 3  --unit Percent --alarm-actions arn:aws:autoscaling:us-east-1:751374139099:scalingPolicy:78d05062-0eda-436c-864e-d93776461eba:autoScalingGroupName/sg-sample-group:policyName/scale-up
OK-Created Alarm

In English, the above command says, “If the average CPU utilization of instance i-37b12752 is measured at 80% or greater 3 times over 3 minutes, then trigger our scale-up policy.”
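The evaluation logic behind an alarm like this can be modeled in a few lines. This is our own simplified model, not CloudWatch’s implementation; note that the GreaterThanThreshold operator is strictly greater-than:

```python
def alarm_state(datapoints, threshold, evaluation_periods):
    """ALARM when the most recent `evaluation_periods` datapoints all
    exceed the threshold (GreaterThanThreshold comparison)."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    return "ALARM" if all(d > threshold for d in recent) else "OK"

# Average CPU %, one datapoint per 60-second period:
print(alarm_state([40, 85, 90], threshold=80, evaluation_periods=3))  # OK
print(alarm_state([85, 90, 95], threshold=80, evaluation_periods=3))  # ALARM
```

A single brief spike above 80% will not fire the alarm; the breach has to persist across all three consecutive periods.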

Here is the reverse mon-put-metric-alarm command we used to terminate one of the servers if the CPU utilization drops below an average of 20% over 3 minutes.

$PROMPT> mon-put-metric-alarm --alarm-name sample-scale-dn --alarm-description "Scale down at 20% load" --metric-name CPUUtilization --namespace AWS/EC2 --statistic Average --period 60 --threshold 20 --comparison-operator LessThanThreshold --dimensions InstanceId=i-37b12752 --evaluation-periods 3 --unit Percent --alarm-actions arn:aws:autoscaling:us-east-1:751374139099:scalingPolicy:78d05062-0eda-436c-864e-d93776461eba:autoScalingGroupName/sg-sample-group:policyName/scale-dn

For more information and examples, refer to the Auto Scaling section on the Amazon developer documentation.

Testing with siege

As mentioned above, we used the command line tool siege to work through the configuration setup and to verify whether our policies were working as we wanted. Using siege on a different server or EC2 instance, you can easily simulate tons of website traffic for a short period of time. Siege does this by creating dozens or even hundreds of concurrent HTTP requests to your URL for the duration you specify. This gives you a chance to see what will happen to your auto scaling policies when real users flood your web server with traffic and CloudWatch alarms start triggering.

Siege can be installed with the package manager on your system. We simply ran

sudo apt-get install siege

on our Ubuntu/Debian system, and that was it.

Siege is simple to use—just give it the number of concurrent connections you want to create (-c), the length of time (-t) to run the test, and your URL, as shown:

siege -c25 -t10M http://{your_elb_dns_name}/

One thing to note here is that CloudWatch basic monitoring refreshes every 5 minutes, and our auto scaling policies above require a metric to be met for 3 consecutive minutes, so we had to run siege tests for at least 6 to 10 minutes to ensure that our policies had enough time to trigger at least twice. While siege was running, we refreshed the CloudWatch tab in AWS Management Console to verify that more servers were indeed getting launched.

Time-lapse showing the effect of Siege testing on auto scale


  1. Two instances running prior to launching Siege
  2. Siege test starts; new instances automatically launched as the scale-up policy is triggered
  3. Additional instances launched to handle continued Siege test
  4. No more instances needed, load is handled and stable
  5. Siege test is completed
  6. Shutting down an instance as bottom threshold is met
  7. Two instances terminated after the scale-down policy, returning the system to its initial state

Making changes

It should have come as no surprise that we would need to make changes to our landing page and micro-site at the last minute, right before our client’s scheduled appearance on a national TV show. To make changes to the landing page, we need to upload some new files to our EC2 instance. No big deal, right?

Under a normal web hosting scenario, this is no problem. But when you have an AMI defined as the source template for an auto scaling configuration, and the entire micro-site content is baked into the AMI, it’s an issue: the moment one of our alarms fires and CloudWatch triggers our auto scaling policy to launch a new instance, that instance will be copied from our now-stale AMI with the outdated content.

It’d be nice to simply create a new AMI with the changes and re-run the as-create-launch-config script where we define the AMI to use; however, trying that gives the following error:

configuration already exists with the name sample-launch-config-name

So that won’t work. Next, we thought maybe we could just delete the auto scaling group using the as-delete-auto-scaling-group command. AWS asks, “Are you sure you want to delete this AutoScalingGroup?” When we replied “Y,” we got this error:

as-delete-auto-scaling-group:  Malformed input-You cannot delete an AutoScalingGroup while there are instances still in the group.

Well that’s good to know—you cannot inadvertently delete an auto scaling group while instances are running inside it. What instances are running inside it? You can use the as-describe-auto-scaling-instances command for that. This command is nearly identical to ec2-describe-instances, but instead of showing all your instances, it lists the ones running inside each autoscaling group you have configured.

In order to actually terminate the instances, though, we have to change the minimum number of instances allowed. Remember, when we ran as-create-auto-scaling-group earlier, we set the minimum number of instances to two. If we terminate the instances in the group, AWS will just launch replacements to meet the minimum. So next we had to change the minimum number of servers in our auto scaling group to zero. This is most easily done with the as-update-auto-scaling-group command, as shown:

$PROMPT> as-update-auto-scaling-group sample-sg-name --min-size 0
OK-Updated AutoScalingGroup

Once we did that, we could terminate our instances using the as-terminate-instance-in-auto-scaling-group command. Finally, we could run as-delete-auto-scaling-group followed by as-delete-launch-config.

Granted, this manual work to build up and tear down an auto scaling configuration is kind of a pain, but in our case we didn’t have time to set up a proper deployment script, as you normally would for cloud deployments. On the other hand, it did force us to learn all the command line tools needed to reverse, undo, tear down, and delete an auto scaling configuration, and to document them here.

Auto scaling command summary

To build up an auto scaling group

  1. as-create-launch-config
  2. as-create-auto-scaling-group
  3. as-put-scaling-policy (for scaling up)
  4. as-put-scaling-policy (for scaling down)
  5. as-execute-policy (for testing the policies)
  6. as-update-auto-scaling-group (for changing policies)
  7. mon-put-metric-alarm (for triggering policies)

To tear down an auto scaling group

  1. as-update-auto-scaling-group (to set a minimum of 0 instances)
  2. as-describe-auto-scaling-instances (to see the instance IDs)
  3. as-terminate-instance-in-auto-scaling-group (terminate each ID)
  4. as-delete-auto-scaling-group
  5. as-delete-launch-config


  • Cloudy

    How do I make this autoscaling solution work on a private cloud or on another public cloud?

    • Geoff Hoffman

      All of the auto scaling features should work the same regardless of the instance(s) you are monitoring. In the case of an Amazon VPC, or Virtual Private Cloud, you’ll have internal IPs assigned to your VPC instances instead of public EIPs. Here’s the VPC documentation.

    • The theory is similar, but the practice will be different. Not every cloud provider offers something like Amazon’s Auto Scaling features, or CloudWatch. For the most part you’ll have to do it yourself (you may find that necessary with Amazon anyway; it depends on your traffic style). Every serious cloud provider presents an API, and it should be relatively straightforward to build tools to monitor and expand your infrastructure as you need.

    • Boyan

      This is trickier for sure. Here is a quick rundown … even though my reply to you is 11 months later 🙂

      – You have your VPC ready.

      – When setting up the launch config, make sure you set a security group from your VPC
      – When setting up the auto scaling group make sure you set the above launch config
      – When setting the availability zones of the auto scaling group make sure you select the availability zones of the subnets of your VPC
      – When setting the auto scaling group, you need to say which subnets you want it for.
      – When setting the load balancer in the auto scaling group make sure that the load balancer was created in your VPC.

      Hope this helps!

  • VanB

    Great article! At the end, you mentioned “a proper deployment script, as is normally the case for cloud deployments.” Can you recommend some additional resources that describe more proper deployment methods to an autoscaling group? Thanks!

    • Geoff Hoffman

      Thanks for the question, VanB. 

      One common way to manage cloud deployments a little more elegantly under an autoscale scenario is by configuring a bash script to load at system startup that checks out the deploy tag from your source code repository. Instead of actually baking the content of your site into your AMI image, you instead create an intelligent AMI that automatically loads the content from your repository once it boots up. In this way, as new servers come online, they automatically pick up the latest changes from your source code repository. Here is a discussion on how to do this on various systems. 

      Depending on the programming language you’re most comfortable with and what you’re trying to achieve, you might find some of the following tools have examples you can tweak to suit your needs. Jenkins, Chef, Puppet and Fabric are all used for various deployment automation and management tasks.

  • Did you find CloudWatch’s “greater than x for more than 1 minute” fast enough for your needs? I have a site running at the moment that regularly gets Facebook links to pages with hundreds of thousands of fans.

    This often leads to ~20 concurrent users normally, then ~2,000 concurrent users for 1 to 2 minutes.

    This usually crashes any instances that are set up, killing all CloudWatch metrics, and the whole site is down for 10 minutes. So far I’ve yet to find a configuration that works for this scenario.

    • Geoff Hoffman


      Your particular scenario reveals one of the limitations with CloudWatch basic monitoring (5 minute intervals) and even Cloudwatch Advanced monitoring (1 minute intervals). As you point out, there are cases when you need to ramp up in seconds instead of minutes. 

      One thing you could try would be to experiment with your own monitoring solution. Use a t1.micro instance with a cron job that checks the instances in your auto scaling group at 10-second intervals using your own custom bash script. Detect a load spike? Fire off your own as-execute-policy call as appropriate. 
      Unfortunately, though, it takes instances a minute or two to spin up, and I don’t think there’s anything you can do about the boot duration of a VM. It would obviously help if you knew in advance when you’re going to be sharing a link; even 3 or 4 minutes of warning would be enough to spin up a few instances behind your ELB to handle the spike. But if you’re talking about social sharing that you cannot predict, I would just recommend using larger instance sizes or experimenting with caching.

      Be sure to serve all your static images, CSS, and JS files from CloudFront. This keeps your app server from having to handle those requests. Then use a front-end caching strategy for the HTML page(s) your fans are going to hammer. For example, if you can cache the page that your 2,000 users want immediate access to, an in-memory front-end cache such as Memcached or Redis could save your web server entirely by not having to serve those requests at all, or at least remove the vast majority of the processing that has to happen per request. Depending on your application, this could boost your app server’s effective throughput dramatically.

  • James Willmott

    How do you go about scaling wordpress multisite with an RDS? I’ve never had more than one server connecting to RDS, so not sure on the process to make wordpress write to the RDS and scale. Can you have multiple servers reading and writing to a single RDS database at once? I would imagine that would cause a data integrity nightmare or is this not a problem?

    Thanks for the post…great article, about to setup this solution tonight.

    • Geoff Hoffman

      Hey James,
      You should be fine connecting multiple web servers to a single RDS instance. That’s one of the nice things about Amazon’s RDS service – you don’t need to set up your own MySQL cluster. It’s the fastest and easiest way to scale any web application that uses a database, whether it’s WordPress or not. Let us know how it goes! Good luck –

  • Mark Estocapio

    May I know how much you are currently paying with your current setup? And you are using 100 micro instance and 1 ELB?

    • Geoff Hoffman


      It wouldn’t be fair to quote prices because costs vary a lot, the rates change frequently, and your application usage might be wildly different from ours. For our particular case, we used t1.micro instances and tested our policies up to a maximum of 10 or 12 actual running instances, and then let them auto scale back down to 1 by default. We never got anywhere near the 100 servers I had set for our auto scaling group. This worked fine for our purposes because the landing page we created was very simple, didn’t use a framework like WordPress or Magento, and had very basic database connectivity.

      What I can tell you is that for the five or six days we had multiple t1.micros running to set up, test and serve our client’s landing page to coincide with their appearance on TV, the total bill for all of that activity was $1.07! In short, we ended up using about 760 instance-hours of t1.micro usage, and 99% of that usage fell under Amazon’s free usage tier.

      If you want to estimate AWS pricing for your application, you should definitely use the Simple Monthly Calculator provided by Amazon.

  • Asal

    Great article, but I have one question, please. I have some changes that have to be applied to the newly created instances, for example the hostname. How can we solve this?

  • chand

    Nice post. I have one problem: I have configured the auto scaling correctly, and it’s creating and deleting the instances correctly, but all the servers are out of service. Can you tell me whether the code transfers to the new servers or not? Please help me figure out whether it’s working.

    Here is my configuration

    [root@ip-10-130-18-230 html]# as-create-launch-config best-buy-config --image-id ami-02fbbd50 --instance-type t1.micro --region ap-southeast-1 --group an_securitygroup --key chand_key --monitoring-disabled

    OK-Created launch config

    [root@ip-10-130-18-230 html]#

    [root@ip-10-130-18-230 html]# as-create-auto-scaling-group best-buy-group --launch-configuration best-buy-config --region ap-southeast-1 --availability-zones ap-southeast-1a,ap-southeast-1b --min-size 0 --max-size 10 --load-balancers test-bestbuy --health-check-type ELB --grace-period 120

    OK-Created AutoScalingGroup

    [root@ip-10-130-18-230 html]# as-put-scaling-policy ScaleUp --auto-scaling-group best-buy-group --adjustment=1 --type ChangeInCapacity --region ap-southeast-1 --cooldown 180


    [root@ip-10-130-18-230 html]# as-put-scaling-policy ScaleDown --auto-scaling-group best-buy-group --adjustment=-1 --type ChangeInCapacity --region ap-southeast-1 --cooldown 180


    [root@ip-10-130-18-230 html]# as-execute-policy ScaleUp --auto-scaling-group best-buy-group --region ap-southeast-1

    OK-Executed Policy

    [root@ip-10-130-18-230 html]#

    but the newly created server is out of service and the code also didn’t come over. I am using an EBS instance.

    Please can u help….

    • Geoff Hoffman

      I’m not sure what you mean by “out of service” — but whether you create a server instance from the AWS Management Console (web interface), from a command line API call, or from an auto scaling policy, you should be able to connect to the instance using SSH like any other. A couple of things I noticed in your sample code above:

      1) You have --min-size 0, which implies that if you have no traffic, Amazon will shut down all your instances!

      2) You have --monitoring-disabled, which may mean that your auto scaling policy will never fire, since monitoring is disabled.

      3) Check into the AMI you’re creating multiple instances from (ami-02fbbd50). If you log into your AWS Management Console, can you create a single instance manually from this AMI and connect to it via SSH?

      4) Check into your security group and ensure that ports 80 and 22 are open.

      5) One final thing to try is to fire your auto scaling policy manually and check, within the timeout duration you specified, that the instance is actually created. It can seem confusing and counterintuitive, but it’s possible to configure the scale-up and scale-down policies with conflicting grace periods. Make sure your grace period is long enough to give the instance time to start up, and not so short that the moment an instance finally starts, the group tries to terminate it before you have a chance to verify that it’s working. You might want to increase the 120 and 180 values you have now for testing purposes.

      There are lots more things to check on the AWS Autoscaling Developer Documentation.

    • Boyan

      All of Geoff’s comments are good. I was also thinking one more thing… “Out of Service” sometimes appears on the load balancer. Meaning… you have successfully created an EC2 instance from this auto scaling group and it has attached itself to the load balancer, but the load balancer says “Out of Service” on it, and thus no traffic goes to it. Two things that come to mind:

      1. The load balancer sets the status “Out of service” according to its health check. Make sure that the URL that the health check is checking works.

      2. This might not be related (I don’t fully remember) but make sure the load balancer is set up in the same availability zones as the auto scaling group (ap-southeast-1a,ap-southeast-1b). I just remember having issues with this.
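      As a sketch (the load balancer name and ping path here are placeholders), the health check that drives the In Service / Out of Service status can be adjusted and inspected with the ELB command line tools:

```shell
# Point the health check at a URL you know returns 200 OK
elb-configure-healthcheck autoscalelb \
  --target "HTTP:80/index.html" \
  --interval 30 --timeout 5 \
  --healthy-threshold 2 --unhealthy-threshold 4

# See which attached instances the ELB currently considers healthy
elb-describe-instance-health autoscalelb
```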

  • Paul

    Just curious – what type of instance would you go with if you were running WordPress – still the micro? And also, why do you limit it to 100 instances? Why have a limit at all? For example, what if you got over 7,500 users at once? Thanks a lot!!

    • Geoff Hoffman

      As you might expect, Paul, it depends. If your WordPress site has a ton of plugins on it, then I would probably opt for at least m1.small instances, or even m1.medium. As you are probably aware, WordPress takes more resources than a straight PHP site due to a lot of code being used for plugin loading, template tags, and backward compatibility.

      You should experiment with the different metrics you can monitor, too. For example, you may want to monitor network out or memory versus CPU usage.

      The other thing I would strongly consider is using Amazon RDS instead of MySQL on each instance. In other words, connect all your instances to a single RDS database and then each instance only has to process PHP code… all the database queries will be better handled by Amazon’s gigantic MySQL cluster.

  • Thank you very much for this. I have two questions:

    1) I set my min-size to 1, as I want low load situations to only have the original instance (from which I created my AMI) running. I’m just guessing that this min-size is referring to the total number of instances, not the number of clones running, or do I have that wrong? If I do have that wrong, that makes sense, because as soon as I set my alarms, it immediately started a cloned instance which hasn’t terminated since it started, and the cpu load on my original instance is definitely under 20%.

    2) The Alarm Status (the column, in my instances list) shows “ALARM” for my original instance, since the < 20% cpu utilization alarm ran. Is there some way for this to clear out, since this shouldn't really be something to be alarmed about? 😉 I wasn't able to figure anything out.

    • After some more tinkering around, it seems like min-size refers to how many cloned instances of your AMI are running, so my original assumption was wrong.

      Also, I can never seem to get the system to automatically launch more instances, beyond the first it launches when I set min-size to 1. I used siege to pound on the load balanced group (and verified this by seeing the requests in the access_logs on both instances), and the cpu usage spiked to 100% on both and stayed there until siege stopped running (for 10 minutes), but new instances were never automatically started. I did test both policies and they launch and terminate extra instances just fine.

      One thing I was a little unclear on was the InstanceId for the --dimensions parameter when defining the alarms – is this supposed to be the instance id for your original instance (from which I created my AMI)?

      • Boyan

        Hi Chris,

        min-size should refer to the minimum number of instances that will ever exist. So, for example, if you have policies and alarms that decrease the instance count, it will never go below min-size. Desired Capacity sets how many cloned instances of the AMI you have running.

        How did you set up the scaling up policy? Is it by CPU usage? I’ve gotten this to work before, maybe I can help you.

        Also, to one of your previous questions, yes setting “AutoScalingGroupName={your_scaling_group_name}” checks the CPU for the entire group and not just one instance.
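        For reference, a hedged sketch of that group-wide alarm with the CloudWatch command line tools — the alarm name, group name, and policy ARN below are placeholders:

```shell
# Alarm on the AVERAGE CPU across every instance in the scaling group,
# rather than a single InstanceId dimension
mon-put-metric-alarm MyHighCPUAlarm \
  --metric-name CPUUtilization --namespace "AWS/EC2" \
  --statistic Average --period 300 \
  --threshold 80 --comparison-operator GreaterThanThreshold \
  --evaluation-periods 2 \
  --dimensions "AutoScalingGroupName=my-scaling-group" \
  --alarm-actions arn:aws:autoscaling:...
```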

        • Iñigo

          One thing I was a little unclear on was the InstanceId for the --dimensions parameter when defining the alarms – is this supposed to be the instance id for your original instance (from which I created my AMI)?

          Can someone reply to this question?

          • If you set the InstanceID then you will need to specify the ID for one particular EC2 instance. It can be any EC2 ID. Of course, to make sense it would probably have to be an EC2 that is being hit with load so that auto scaling can respond to it. If you specify the ID of your original instance from which you created the AMI… does that EC2 instance have any load on it? What I usually do with this dimensions parameter is I specify the ID of my first auto scaled EC2 instance, since that instance is in my auto scaling group and in my load balancer and will be hit with load. (Though generally I don’t use this parameter, I use AutoScalingGroupName).

          • Iñigo


    • Ok, I got it working! Turns out I had forgotten to specify the --alarm-actions argument, so once the alarm went off it was never triggering my policies. Once I did that it autolaunched and autoterminated the instances as it should have. Also, setting min-size to 0 just means there won’t be any clones launched until the scale-up policy gets triggered on the original instance.

      Also, after doing some googling I found other people are using --dimensions “AutoScalingGroupName={your_scaling_group_name}”, instead of using InstanceId. I wonder if this averages out the policies upon all the instances in the group, as opposed to just the original? Was the InstanceId you were using for the load balancing group, or your original instance?

      • Ok, I was wrong about min-size=0. It doesn’t seem to launch any new instances when it’s set to 0. Looks like you have to always have at least one cloned instance running.

    • I’m having the same issue … after reading the comments here, I think it’s because I had an EC2 instance already running while I was creating the policies and didn’t add/register it to the ELB using ‘elb-register-instances-with-lb’, so as far as the auto-scaling was concerned it had no instance running, and it always started an instance in addition to the one it couldn’t see.

      Just woke up and got this idea but haven’t actually tried it. Makes sense to me though. The command is something like this …

      $ elb-register-instances-with-lb autoscalelb --instances i-xxxxxxxx

  • Chencho

    Great post.
    I have a few questions; maybe you can answer them.
    If I have one instance with PHP+MySQL, auto scaling kicks in, and I then have 2 instances behind the load balancer, and someone registers on instance 2 — can I still see that data from instance 1 after instance 2 is terminated?
    Or do I have to put MySQL into RDS and store files in S3 or something like that, since EBS can only be attached to one instance, right?
    I’m thinking about a user registering and his/her avatar, for example.

    • Geoff Hoffman

      Thanks Chencho! Yes, to migrate from a single server running your full LAMP stack, you will probably want to run only Apache & PHP on the web servers that are part of the autoscaling group, and move the database to RDS, configuring your app server instances to connect to RDS instead of a local MySQL database. For avatar storage, you can change from filesystem images to a blob field in MySQL, so avatars can be shared across all web server instances. (You can use CloudFront to cache the images too, fronting a PHP file that generates an avatar from your database, to avoid unnecessary database calls.) The same goes for sessions: if you’re using the default PHP session file storage, you’ll want to switch to database session storage as well. That way it won’t matter which web server instance handles a request; requests can round-robin between all your autoscale instances and still get access to the same session data.

  • Ben

    Awesome post – really helped me get up and running. Note that instead of steps 1 through 3 in your tear-down, you can do “as-delete-auto-scaling-group $scaleGroup --force--delete” to have it delete all instances and the group. Much easier.

    • Ben

      Oops. --force-delete (not --force--delete)

    • Geoff Hoffman

      Thanks for the kudos, Ben! We worked pretty hard on making this post contain everything you need to try AWS Autoscaling on your own. Glad you found it useful — and you’re right, the --force-delete option is much easier than a manual teardown! For the purposes of illustration, though, I still think it’s good to know that a teardown consists of all the buildup steps in reverse. Thanks for the tip!

  • s3v3n

    Thanks a lot for this tutorial. But I have a problem: my instance will not shut down. I get an error icon in CloudWatch. A new instance launches, but the shutdown does not work. Please help me solve this issue.

    State Details: State changed to ‘ALARM’ at 2013/02/01 15:20 UTC. Reason: Threshold Crossed: 1 datapoint (4.452) was less than the threshold (35.0).

  • Bassam Gamal

    Thanks very much; this is very descriptive and very well explained. But I wonder what connects the ELB with Auto Scaling: when I implemented this, the auto scaler always started new instances that weren’t related to the instances I assigned to the ELB.
    What am I missing? Should I take the instances on the ELB offline?

    • Geoff Hoffman

      You specify the ELB in as-create-auto-scaling-group with the --load-balancers $ELB_NAME argument. If you have instances already running that are not linked to the ELB, you can use the elb-register-instances-with-lb command to attach them. I don’t remember doing anything special: it just worked. If I recall correctly, we created our ELB, launched an EC2 instance attached to the ELB, and then created the AMI; when our auto-scaling policy fired, it automatically created a new instance based on the AMI, and it was automatically attached behind the ELB. Check that your AMI and ELB are in the same Availability Zone, and also that you don’t have the maximum number of instances attached already. I believe it’s 25.
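      In outline — the group, launch config, zones, and ELB names here are placeholders — the group-to-ELB link is made at group creation time:

```shell
# Instances launched by this group register with the ELB automatically
as-create-auto-scaling-group my-scaling-group \
  --launch-configuration my-launch-config \
  --availability-zones ap-southeast-1a,ap-southeast-1b \
  --min-size 1 --max-size 4 \
  --load-balancers autoscalelb
```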

      • Bassam Gamal

        Yeah, that’s working now 🙂
        But what if I have some important data on the instance and I don’t want it to be terminated immediately? Can I do the following:
        – Alarm fires 3 policies
        – One sends an SNS notification
        – One cools down for 30 minutes
        – The other terminates the machine. (Is it possible to just stop the instance rather than terminate it?)

        • Geoff Hoffman

          You definitely need to design your application so that every instance you launch to handle requests is disposable. By that I mean you should be storing important data on RDS, ElastiCache, DynamoDB, or a separate EC2 instance running Redis or MongoDB that is not included in your scale-up/scale-down policy configuration. A common example of something in a typical one-server configuration that needs to be addressed in an autoscale environment is session storage. If you are using PHP’s native session driver, for example, it stores a session token file on server instance A for request 1. The next request might be served by server B behind the ELB! Now suddenly your user is logged out, because there is no session token stored on server B. Important data, uploaded images, sessions, etc., should all be handled accordingly. An easy fix for the session storage problem is to use the database session driver and hook it up to RDS. Hope that helps!

      • Boyan

        I just want to confirm one thing mentioned above: You need to set the auto scaling group’s availability zones to match the load balancer’s availability zones. And if you don’t do this right no errors appear anywhere… just… nothing works 🙂 It can be frustrating.

  • Robert Podosek

    Instead of tearing down all your policies and auto-scaling groups, it’s much easier to define a brand new launch config that contains your new AMI. Then you can simply run as-update-auto-scaling-group my-asg --launch-configuration my-new-launch-config. That way you don’t have to set up your scaling group, policies, or monitors again.
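    In outline — the AMI id, names, instance type, key pair, and security group below are hypothetical placeholders — the swap looks like:

```shell
# Create a launch config pointing at the new AMI, then swap it
# into the existing group without touching policies or alarms
as-create-launch-config my-new-launch-config \
  --image-id ami-xxxxxxxx --instance-type t1.micro \
  --key my-keypair --group my-security-group

as-update-auto-scaling-group my-asg \
  --launch-configuration my-new-launch-config
```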

    • Geoff Hoffman

      Thanks for the tip, Robert! That seems like a time saver for sure. My only question about the as-update-auto-scaling-group command is whether your instances are slowly replaced, or how the update is actually performed, since there are bound to be outdated AMIs already in the autoscaling group. Are all the outdated instances instantly terminated?

  • Smith

    I can’t create launch config because

    as-create-launch-config: Malformed input-Request timestamp is too skewed. Timestamp must be within 900 seconds of server time.

    Help me pls…

  • Let’s say, to get updated content, I create an AMI and launch a new instance from that AMI. But if User A is directed to the first instance and User B is directed to the second instance, how does the data sync? My database and everything is on that instance; it’s not on a separate instance.

    • Geoff Hoffman

      Hey Jonathan,

      Migrating a single-server LAMP app to an autoscaling deployment often takes some planning and a few configuration changes. For example, you’ll need to store shared data, particularly sessions, on a separate instance, or better yet, use Amazon’s managed MySQL service, called RDS.

      By default, PHP stores sessions on disk as files, but that won’t work in your example, because User A’s session token got written to the first server, while User B’s is on server 2. But on their second or third request, the user’s request can be served by a different server instance, so they’ll instantly be logged out. That’s why you’ll want to use the database session driver, and use the same connection info from all your instances to a single database server or RDS.

      Your AMI should only have Apache & PHP on it, and each instance will have the same, shared MySQL server or RDS connection info, baked into the AMI. If users can upload images, you’ll need to move them from the instance to a shared server (rsync) or S3 bucket (cron script), and map that origin as your CloudFront endpoint for caching.
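      As one hedged example of the cron approach — the bucket name, upload path, cron user, and schedule below are placeholders, and it assumes the s3cmd tool is installed and configured on the AMI:

```shell
# /etc/cron.d/sync-uploads — push new uploads to a shared S3 bucket
# every 5 minutes so every instance (and CloudFront) can serve them
*/5 * * * * www-data s3cmd sync --no-delete-removed /var/www/uploads/ s3://my-shared-uploads-bucket/
```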

      There are, of course, other ways of achieving the same end result, but the above suggestions are the most common, easiest techniques within Amazon Web Services.

      Hope that helps!

      • Hey Geoff,

        What about sticky sessions? Check here …

        If you enable sticky sessions in your ELB the ELB will use a cookie to send the user to the same instance that served them when they return.

        • Jon

          but what if the instance has gone down in that time… (I guess the sticky session would know that and they’d be sent to a new one…)

      • Jon

        Thanks Geoff… that was the answer I was looking for… now off to read up about RDS…

  • Daniel del Castillo

    First of all, thanks for such a great article.

    I am trying to follow it step by step on an Amazon VPC, so I have created an internal ELB and added just one EC2 instance. The problem is that as soon as I apply step 2, creating the autoscaling group (--min-size 1 --max-size 4), new instances are created and terminated every 3 minutes or less, so I have to update to --min-size 0 and delete the group to stop it. The first time, I thought it was because the index.html ping path was not reachable and the EC2 health check on the ELB was triggering the creation of new instances. But after modifying the security policy for the instance to accept http:80 and checking that the instance status on the ELB was OK, I tried again and got the same result. Moreover, the ELB was not showing the new instances on the console.

    • Geoff Hoffman

      Hmm, if you’ve created the auto scaling policy correctly, then under very little load you should have just one instance in the group. What metric are you watching with your scale-up policy? Try monitoring network out or CPU usage, for example.

      • Daniel del Castillo

        That is the problem: I have no auto scale policy in place, just one internal ELB, one EC2 instance, one AMI, one “as launch config” and one “as group”. Just after creating the group, the instances begin to auto-create and auto-shutdown. How can I check exactly what created the instance, or why?

        • Daniel del Castillo

          Ok. I think it was all my misunderstanding. I was assuming that the auto-scaling group applied to my running EC2 instance, so I was expecting new instances to be added alongside it, but now I realize that the group has nothing to do with the running instance; it keeps creating the minimum number of instances because the ELB was linked to it.

          • Boyan

            Hi Daniel, yes the auto scaling is not applied to running EC2 instances, it generates new ones from an AMI. I was getting the same error as you where instances would auto create and auto shutdown every few minutes. In my case my load balancer was not created in my VPC. As soon as I created my load balancer in my VPC, then set that new load balancer to my auto scaling group, everything worked great.

  • panchicore

    siege is cool

  • Thanks for the article, I’ve put together a list of all the common Auto Scaling Group commands that we use all the time.

    Let me know what you think

  • Boyan

    Hi Geoff, Steve,

    I wanted to send you a personal message but I couldn’t figure out how.

    I just want to say that this is a great guide. This guide is how I learnt to use auto scaling on my projects last year, and from then I just got better and better at it. I went as far as creating a GUI to handle all this (instead of using command line tools) and earlier this year I even made it publicly available. All thanks to this guide!

    Thank you very much!

  • Great article – but how do you handle shared dynamic resources, i.e. uploaded content/files handled through a CMS? Would you use a content server sharing content over SSHFS or Samba that all instances link to, or would the load on that server become excessive? Come to think of it, could a content server store your PHP files as a single shared source, so the instances all use the same files but handle processing themselves? Just wondering if there are best practices or if I need to experiment.

    • Cameron McKay

      Put them on S3?

  • Anant


    I am stuck with one issue while auto scaling: my EC2 instance runs XAMPP and Tomcat. On scaling up, the newly created instance does not automatically start Tomcat, and I even have to manually do the port mapping under the lampp folder using:

    [root@domu-xx-xx-xx lampp]# /sbin/iptables -t nat -I PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 8080

    [root@domu-xx-xx-xx lampp]# /sbin/iptables -t nat -I PREROUTING -p tcp --dport 8085 -j REDIRECT --to-port 91

    [root@domu-xx-xx-xx lampp]# /sbin/iptables-save

    How can I automate (a) the Tomcat startup and (b) the above port mapping, so that the auto scaled instance is up & running without any manual intervention?

    Any help is appreciated.

    Thanks !

  • Niall

    Hi Geoff, this guide has been really helpful with a project at work. Thank you.

    One question that I can’t find an answer to: how do I prevent user sessions from being disconnected during a scale down policy?

    For example, if a customer logs into one of your many web instances via the ELB and sends a query to your backend DB/worker instances that takes a while to produce a result, what happens if the down scale policy kicks in and starts to remove the web instance the customer is on due to inactivity?

    Is there a metric that can help keep that instance alive until it has received the result, and also stop that instance from receiving any more customer requests in the meantime?

    • aater

      Yes, let’s talk about it. Many of my students and professionals come to me for tips on this, so I have set up an office-hour discussion to talk about anything related to the cloud and autoscaling EC2 instances. I am faculty at UT Austin, Texas; join me in office-hour discussions over Skype on Tuesday. We have only 6 seats.

      • Niall

        Thanks for the invite, Aater. Your meeting is 5 am my time, so I won’t be able to make it, unfortunately. I haven’t been able to find a solution to my problem above yet; from the application side a poison pill may be the best bet, but there should be something within AWS to address this from the infrastructure side?

        • aater

          There is a possible route out of this; let’s discuss it more on Skype. Just register for this meeting, and I will get it rescheduled once the exchange connects.

  • Alex

    Many thanks for your article.
    I have a few queries about it: with the command “mon-put-metric-alarm”, you are adding monitoring to one running instance (--dimensions InstanceId=…).
    – Does this mean that autoscaling reacts to the behavior of this instance only? If there are 2 servers running, and the first (monitored) server is running fine (<80% CPU) but the second server is running at 95% CPU, will autoscaling start a third server automatically or not?
    – What would happen if an incident occurs on the first server? I guess that the autoscaling group will start another one to replace it (if below the minimum number of servers), but which instance will be monitored then?

    Thank you.

  • Gabriel Holder

    For some reason after I create a policy using this tutorial, AWS will start launching servers even AFTER I specified the conditions in step 4.
    Any reason why this is happening?

  • Fabrizio

    During the step of group creation, I get this error:

    as-create-launch-config: Malformed input-File not found: /home/myhome/.aws/credentialsfilepath.txt

    What does this file need to contain?
    Thank you.

  • Hi, not sure if you will be able to help but it’s worth asking 🙂

    Using either of the following commands –

    as-create-auto-scaling-group MyGroup --launch-configuration MyLC --availability-zones eu-west-1 --min-size 1 --max-size 1

    as-create-auto-scaling-group MyGroup --launch-configuration MyLC --availability-zones eu-west-1a --min-size 1 --max-size 1

    I get the same error –

    [Fatal Error] :-1:-1: Premature end of file.

    as-create-auto-scaling-group: The service couldn’t process your request at this time. Please try again later.
    Reason: Premature end of file. AWSRequestId:No request id received

    No idea why; I can’t seem to make the tools work for me at all. Just trying to create the most basic of basic scaling groups with 1 instance (i.e. no scaling) to cater for single-instance resilience.

  • Esteban

    Hi, great post! I’m new to Amazon’s EC2 platform and I find it really interesting. I’m currently deploying a web application that frequently inserts data into MySQL. My question is: if I launch multiple instances with auto scaling, do I need to set up the MySQL server first? I mean, to replicate the data across the instances that are currently running (like a master–slave cluster)? Thanks!

  • Darren

    Why so complicated with AWS. Take a look at HybridCluster

  • sunny


    Can you please suggest a way to set my auto-scaling policy to scale up whenever HTTP requests exceed a threshold value (say 10,000 HTTP requests)?
    I do not find any such option in the console to set an alarm on this. I set an alarm to scale up an instance when CPU load exceeds 70%, but in my case the CPU load is not high; my website is facing problems due to many concurrent HTTP requests. Please suggest a way I can solve this problem.

    Thanks in advance.