This is part two of a series designed to get your auto scaling environment running.
In the last part of this series, we did a bunch of manual key mashing to take our first snapshot. This gives us the foundation we need to automate the process. In this part, we will review the scripts required to make auto scaling work as expected. At the end of this post, I’ll also share the Chef recipe used to install all the scripts described here.
The files we’ll cover:
- snapshot.py – a Python script to snapshot a volume on deploy.
- deploy:snapshot – a Capistrano task used to call snapshot.py at deploy time.
- prep_instance.py – a Python script to mount a volume from the most recent snapshot and tag the instance.
- utils.rb – a Ruby script used during cap deploy to get instance DNS names by tags.
- production.rb – the file used by Capistrano multistage to get a list of servers.
- chef_userdata.sh – a userdata script for bootstrapping Chef.
- userdata.sh – a userdata script that does not include Chef bootstrapping.
- autoscaling – a Chef recipe used to set up all the scripts above.
Before you start
Let’s review the tool set we’ll be working with. So far, we’ve used Bash and the AWS Command Line Tools and we’ve done just fine. We’ll still use bash to stitch together our scripts, but in the next few steps we’ll be using both Python and Ruby to accomplish our goals. I find Python to be more expressive and capable than Bash when dealing with lots of variables that need to be type checked and have default values. Plus Ruby is a good fit for Capistrano. So, we’ll be using the boto libraries on the server side, and the AWS SDK for Ruby on the client (Capistrano) side.
I use Chef to manage my dependencies, but if you’re doing this by hand, the AMI on which these scripts will run must have the boto libraries pre-installed (for example, via pip).
Also, boto expects a .boto file to exist in the home directory of the user who executes these scripts. We’ll set the BOTO_CONFIG variable in our driver script later on in this post.
The rest of this part describes the files you need and what they do.
The Python script we review here will:
- Given an instance ID, look up the volume attached to a device and take a snapshot of it.
- Tag the snapshot, so that future scripts can query the tags.
This script is called at deploy time so that the most recent code is always ready to mount on an auto scaling instance.
The parsed_args method at the top of the script does a decent job of describing its default values. You’ll probably want to change the --tag argument to match your organization’s needs.
In the main method we do all our work: we search for instances whose ID matches that of the calling box.
Then we iterate over the volumes, searching for the mount point (device) we set up earlier. Once found, we tell the script to create the snapshot and add the tag.
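The flow described above can be sketched like this. The helper and argument names here are illustrative, not the script’s own, and the boto calls (which need real AWS credentials) are shown only in comments:

```python
# A sketch of snapshot.py's volume lookup (boto-era EC2 API; names illustrative).

def find_attached_volume(volumes, instance_id, device):
    """Return the volume attached to instance_id at the given device, if any."""
    for vol in volumes:
        att = getattr(vol, "attach_data", None)
        if att and att.instance_id == instance_id and att.device == device:
            return vol
    return None

# With boto and real credentials, the rest of the flow looks roughly like:
#   conn = boto.ec2.connect_to_region(region)
#   vol = find_attached_volume(conn.get_all_volumes(), instance_id, "/dev/sdf")
#   snap = vol.create_snapshot("deploy snapshot")
#   snap.add_tag("deploy", tag_value)
```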
And that’s it. We’ll use this script later on in our automation.
The Capistrano task below calls snapshot.py on deployment.
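The task body might look roughly like this (a Capistrano 2-style sketch; the path, role name, and description are placeholders, not the post’s exact code):

```ruby
namespace :deploy do
  desc "Snapshot the code volume after each deploy"
  task :snapshot, :roles => :app do
    # BOTO_CONFIG points boto at the credentials file on the server
    run "BOTO_CONFIG=/root/.boto python /usr/local/bin/snapshot.py"
  end
end
```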
Notice the presence of the BOTO_CONFIG environment variable. The boto library provides documentation for the appropriate keys to add to this INI-style file.
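For reference, a minimal credentials file in boto’s INI format looks like this (placeholder values):

```ini
[Credentials]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
```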
Finally, remember to add the snapshot task to your after_deploy hooks in your Capistrano configuration.
The script we’ll review here will, given a tag, search for the most recent snapshot, create a volume and mount it. Furthermore, the script will apply tags to the instance itself. We’ll use these tags in our Capistrano ruby script.
As with the other Python script, there is a parsed_args method that defines the default values we’ll need. The help section describes each default. The pair that needs a bit more explaining is the device arguments, device_value among them. If you recall from Step 4 of part one of this series, device names can differ between AWS and your OS; these two arguments compensate for that fact.
Some interesting parts of the code are the wait helpers, such as wait_volume. These deal with the fact that calls to create volumes and snapshots, and to attach devices, are asynchronous, so we must poll the API waiting for the status we expect. The script sleeps for up to 60 seconds until the status we want appears; if it never does, it throws an exception.
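The polling pattern can be sketched as follows. The function name and timings are illustrative; the real wait_volume differs, but boto volume and snapshot objects really do expose an update() method that refreshes and returns the current status:

```python
import time

def wait_for_status(resource, wanted, timeout=60, interval=5):
    """Poll resource.update() until it returns `wanted`, or raise on timeout."""
    waited = 0
    while waited < timeout:
        if resource.update() == wanted:
            return
        time.sleep(interval)
        waited += interval
    raise RuntimeError("timed out waiting for status %r" % wanted)
```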
This tool grabs all instance DNS names from AWS. We use it in the Capistrano multistage production.rb to get an array of DNS names. It is pretty self-explanatory. Since this script will be distributed to your developers, it would probably be a good idea to lock the credentials down to read-only. You will have to require it in your deploy.rb.
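That require might look like this (the path is illustrative):

```ruby
require './config/utils'
```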
Here’s the file itself. This makes deployment nice because it dynamically grabs the EC2 instances tagged with the Role and Environment you specify, along with an instance-state-name of running. This guarantees that you’re pushing out to all the running servers.
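The heart of utils.rb can be sketched like this. The real file differs; the pure filtering logic is pulled out here so it can be shown without AWS access, and the aws-sdk (v1-era) query appears only in comments:

```ruby
# Select the DNS names of running instances matching the given tags.
def select_dns_names(instances, role, environment)
  instances.select do |i|
    i[:state] == 'running' &&
      i[:tags]['Role'] == role &&
      i[:tags]['Environment'] == environment
  end.map { |i| i[:dns_name] }
end

# With the AWS SDK for Ruby (v1), the same query can be pushed server-side:
#   AWS::EC2.new.instances
#     .filter('tag:Role', role)
#     .filter('tag:Environment', environment)
#     .filter('instance-state-name', 'running')
#     .map(&:dns_name)
```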
The Capistrano multistage extension allows you to specify a file for each deployment target. This script replaces production.rb and calls out to utils.rb to get DNS names.
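production.rb can then be quite small, along these lines (a sketch; the role names and the instance_dns_names helper are placeholders for whatever your utils.rb exposes):

```ruby
require './config/utils'

# Ask AWS for the DNS names of every running production app server.
servers = instance_dns_names('app', 'production')

role :app, *servers
role :web, *servers
```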
This file will be passed to an autoscale launch config.
The shebang line uses the -ex flags to instruct bash to exit on the first error (-e) and to print each command as it runs (-x). This is super-handy for debugging your userdata script. The exec call redirects standard out and standard error to three different places.
We slightly shorten the DNS name and assign it to the EC2_HOST variable.
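The top of the script might look like this (a sketch; the log destination is illustrative, and the original also pipes output through logger(1) so it reaches syslog as the third destination):

```shell
#!/bin/bash -ex
# -e: stop at the first failing command; -x: echo each command as it runs.

# Mirror stdout and stderr to a log file as well as the console.
exec > >(tee /tmp/userdata.log) 2>&1

# Shorten the DNS name, e.g. ec2-1-2-3-4.compute-1.amazonaws.com -> ec2-1-2-3-4
EC2_HOST=$(hostname | cut -d. -f1)
echo "EC2_HOST=$EC2_HOST"
```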
If you’re not using Chef, you can skip the following bits. If you are using Chef, you can bootstrap the node this way: delete the .pem file, set up a first-boot.json file, and pass the EC2_HOST variable to the client.rb file so your Chef node name is useful.
This script also assumes that the Chef libraries are already installed and have been bootstrapped once before.
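Those steps can be sketched as below. CHEF_DIR defaults to a scratch path here purely for illustration; on a real instance it would be /etc/chef, the run list role is a placeholder, and the final chef-client run is shown only as a comment:

```shell
#!/bin/bash -ex
CHEF_DIR="${CHEF_DIR:-/tmp/chef-demo}"
mkdir -p "$CHEF_DIR"
EC2_HOST="${EC2_HOST:-$(hostname | cut -d. -f1)}"

# Remove the stale client key so this fresh instance registers as a new node.
rm -f "$CHEF_DIR/client.pem"

# First-boot run list for chef-client (role name is a placeholder).
cat > "$CHEF_DIR/first-boot.json" <<EOF
{ "run_list": ["role[app]"] }
EOF

# Name the node after the shortened DNS name so it is easy to find.
echo "node_name \"$EC2_HOST\"" >> "$CHEF_DIR/client.rb"

# On a real instance, the bootstrap would finish with something like:
#   chef-client -j "$CHEF_DIR/first-boot.json"
```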
And finally we call the script that does all the work: it mounts drives as described in Step 4 of part one and then calls prep_instance.py from above.
Although this script is mighty important, we’ve covered all the details elsewhere. Look it over and you’ll recognize parts.
autoscaling chef recipe
We’ve reviewed a lot of scripts here in this document. You may be wondering where to put them all. Chef to the rescue! Even if you’re not using Chef, the default recipe from my cookbook serves as a great guide for placing these files where they belong.
Here’s an example from default.rb. In this case, /root/.boto is where we’re going to place the rendered boto.cfg.read_only.erb template. The remaining resource attributes, such as mode, should make sense to you.
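The resource might look like this (standard Chef template DSL; the owner, group, and mode values are a plausible sketch rather than the cookbook’s exact settings):

```ruby
# Render the read-only boto credentials template into /root/.boto.
template '/root/.boto' do
  source 'boto.cfg.read_only.erb'
  owner 'root'
  group 'root'
  mode '0600'
end
```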
This pattern is repeated throughout the recipe.
You’ve reached the end of this part. So far, you’ve reviewed all the scripts you’ll need to auto scale your environment. In part 3 we’ll look at some bash scripts for setting up your autoscaling rules, and review where all these scripts go.