The year is 2021 and you’re still building virtual machines. Yes, you read that right, and it’s not always a bad thing. There are a few scenarios where you might consider using a virtual machine over a container (or function). Maybe AWS doesn’t provide a hosted solution for the technology you want to use, or maybe you want to go multi-cloud and the cloud providers’ offerings are too different to manage. Packer can help you build and maintain custom virtual machine images.
Lesson
What is Packer?
Why Should I Care About Packer?
Security Hurdles
Packer Files And Components
Packer Commands
Packer Builds
What is Packer?
Packer is brought to you by HashiCorp, the same people who brought you Terraform. The link between these two products might be a little loose, but they can become a superpower when combined. Packer also uses HashiCorp’s HCL2 (HashiCorp Configuration Language v2), so writing Packer code should feel similar to writing Terraform code. Packer allows you to build configurations on top of existing virtual machine images; in our case, we’re talking about adding additional configuration to Amazon Linux 2 AMIs. Packer builds AMIs by provisioning an instance on your behalf, using SSH to remote into the instance and configure it based on your specifications. When the configuration completes, Packer shuts down the instance, turns it into an AMI, and then does a bit of cleanup.
Why Should I Care About Packer?
Investing in Packer will bring you closer to immutable infrastructure which is a fancy way of saying you’ll have the ability to destroy an instance and not freak out about it. Rebuilding a broken/missing/deleted instance is fast and easy because you baked most of your configuration into a custom AMI. I say most of your configuration because you should not store sensitive information like secrets in your custom AMI.
Security Hurdles
The default behavior for Packer is to provision a keypair (for SSH access), an instance, and a security group on your behalf during the build process. This all looks great on the surface, but take a closer look at the security group and you’ll notice that it opens the instance up to the public Internet, which doesn’t feel very secure. The Packer build process can take anywhere from 5 to 30 minutes depending on the amount of custom configuration you put into your build. A more secure approach is to tunnel through a bastion instance to reach the private instance for configuration. The cost of this method is additional configuration and a bastion instance to maintain. An even more secure way to accomplish this is by leveraging AWS Systems Manager Session Manager to connect to the instance for configuration. Session Manager is an amazing and underutilized tool for managing EC2 instances in multiple ways.
Packer Files And Components
Packer uses HCL2 just like Terraform. If you’ve written Terraform code, you know the files end in .tf, so Packer files must end in .pkr, right? LOL, that’s not the case. Packer files end with .pkr.hcl. I recommend taking some time to think about how you want to organize your Packer files, rather than throwing everything into something like main.pkr.hcl. The two major components of Packer are builds and sources, so that might be a good line in the sand for file organization.
Source: a code block that tells Packer where to start (most likely a cloud provider) and how to connect to the instance/droplet/VM for additional configuration
Build: a code block that tells Packer to invoke a defined source block and run additional configuration on that instance/droplet/VM via provisioners. You can use bash as a provisioner, but Packer also supports configuration management tools like Ansible
Note: It’s a good practice to include the timestamp in the Packer AMI name to establish a unique naming convention. This will prevent naming collisions between builds and help you diagnose issues when things go wrong.
The build block invokes a source block (or multiple source blocks) and then runs additional configuration based on the defined provisioner sub-blocks
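These two blocks can be sketched in HCL2 like this (a minimal sketch; the region, instance type, and base AMI filter are illustrative assumptions, not values from this post):

```hcl
# keep AMI names unique by baking in a timestamp
locals {
  timestamp = regex_replace(timestamp(), "[- TZ:]", "")
}

# source: where to start and how to connect
source "amazon-ebs" "example" {
  region        = "us-east-1"                  # assumption
  instance_type = "t3.micro"                   # assumption
  ami_name      = "burrito-${local.timestamp}"
  ssh_username  = "ec2-user"

  source_ami_filter {
    filters = {
      name = "amzn2-ami-hvm-*-x86_64-gp2"      # Amazon Linux 2 base (assumption)
    }
    most_recent = true
    owners      = ["amazon"]
  }
}

# build: invoke the source and run provisioners on the instance
build {
  sources = ["source.amazon-ebs.example"]

  provisioner "shell" {
    inline = ["sudo yum install -y htop"]      # example configuration step
  }
}
```

Splitting the source block into sources.pkr.hcl and the build block into builds.pkr.hcl is one reasonable way to draw that line in the sand.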
Packer Commands
Packer is similar to Terraform in that executed commands search the current working directory for configuration files and it uses the command plus subcommand format to run. Here are some of the basic commands to get going:
packer init – initializes Packer plugins. This is similar to how Terraform initializes the configured providers.
packer validate – validates Packer configuration files. This is similar to Terraform’s validate subcommand and checks for syntax/configuration issues.
packer build – kicks off the Packer build process. The build command is like running terraform apply with the -auto-approve flag to bypass user-provided input.
Packer Builds
Now that we’ve gone over the basics, it’s time to get our hands dirty and start building some AMIs. We’ll be using the same configuration used for awscli to build a Packer image with AWS Systems Manager Session Manager (yes, it’s a long and ridiculous name for an otherwise cool tool).
Pre-Flight
Create a new IAM user with “Access key – Programmatic access”, then press Next
Select “Attach existing policies directly”, select “AdministratorAccess”, and then press Next
NOTE: This is for demonstration purposes only; an account with a locked-down policy should be created for production applications.
Press Next at the tags screen
Press the Create user button, then store the access and secret keys in your password manager
You thought there would be no Terraform in this post; I did too. It turns out that you need a few things for this to work: a VPC, an IAM policy to allow access to Session Manager on an instance, and a role/profile to attach it to said instance. For a closer look at what is being provisioned, check out packer.tf.
Run terraform init
Run terraform apply -auto-approve
Set the VPC ID from the terraform output as an environment variable that packer can read: export PKR_VAR_vpc_id=$(terraform output -raw vpc_id)
Set the Subnet ID from the terraform output as an environment variable that packer can read: export PKR_VAR_subnet_id=$(terraform output -raw private_subnet_id)
Verify that both variables were set:
echo $PKR_VAR_vpc_id
echo $PKR_VAR_subnet_id
This Terraform will create a small increase on your AWS bill, so be sure to remove these resources after you’re done testing.
Packer
Now on to the main course: you’re ready to build your first Amazon AMI with Packer. From the basic directory, run the following:
Initialize the AWS Packer plugin: packer init plugins.pkr.hcl
Kick off image creation: packer build .
During the build process, the output will show you what Packer is doing. Here is a general overview:
Validating input parameters
Generating keypair for build
Launching an instance
Waiting for SSH to respond over AWS Session Manager
Running any defined provisioners
Stopping instance
Creating AMI from configured instance
Deleting instance
You might get an error like the one below, but the build will complete successfully and produce an AMI:
==> amazon-ebs.linux: Bad exit status: -1
==> amazon-ebs.linux: Cleaning up any extra volumes...
==> amazon-ebs.linux: No volumes to clean up, skipping
==> amazon-ebs.linux: Deleting temporary security group...
==> amazon-ebs.linux: Deleting temporary keypair...
Build 'amazon-ebs.linux' finished after 6 minutes 25 seconds.
==> Wait completed after 6 minutes 25 seconds
==> Builds finished. The artifacts of successful builds are:
--> amazon-ebs.linux: AMIs were created:
us-east-1: ami-0e1fa6b889312345
Now you can see the AMI in the AWS console, or you can check it out via awscli: aws ec2 describe-images --filters "Name=tag:Service,Values=burrito"
If you want to reference this AMI in Terraform, you can use a data source lookup to fetch the AMI ID and pass it to an AWS instance resource:
data "aws_caller_identity" "current" {}

data "aws_ami" "burrito" {
  most_recent = true

  filter {
    name   = "tag:Service"
    values = ["burrito"]
  }

  owners = [data.aws_caller_identity.current.account_id] # your account id
}

resource "aws_instance" "burrito" {
  ami           = data.aws_ami.burrito.id
  instance_type = "t3.micro"

  tags = {
    Name = "web0-burrito-prod"
  }
}
After you’re done testing/building, don’t forget to run terraform destroy in the workspace to remove the VPC and IAM resources.
In Review
Packer is a great tool for pre-baking images so they can be provisioned more quickly and easily replaced. Using Packer with AWS Session Manager feels like a welcome cheat code and I hope this tutorial helps you on your cloud journey.
In this article we are going to talk about two open-source infrastructure-as-code tools that we use at Flugel. These tools are Packer, to build machine images for different platforms, and Terraform, to manage infrastructure resources.
By using the two in combination it’s possible to create infrastructure-as-code solutions that automatically build and run custom machine images, provisioning an EC2 instance on AWS using a custom AMI, for example.
For this article we’ll examine one particular use case: we’ll provision an EC2 instance that allows us to benchmark an HTTP endpoint. To accomplish this, we’ll first use Packer to create an AMI with the HTTP test and benchmarking tool siege installed. Then, using Terraform, we’ll provision an EC2 instance using this AMI.
Packer is an open-source tool by HashiCorp that automates the creation of machine images for different platforms. Developers specify the machine configuration using a JSON file called a template, and then run Packer to build the image.
One key feature of Packer is its capability to create images targeted to different platforms, all from the same specification. This is a nice feature that allows you to create machine images of different types without repetitive coding.
There are various options for installing Packer depending on your platform and preferences. Go to Packer’s Getting Started page for detailed instructions. Keep in mind that using the precompiled binary is the simplest option.
The Template File
As we’ve said, templates are JSON files that define the steps required to build a machine image. Packer uses the information specified in the template to create the images.
The components most often configured through the template files are the builders and the provisioners.
Builders are components that are able to create a machine image for a single platform. Each template can define multiple builders to target different platforms. There are plenty of builders from which to choose. In this article we use the AWS AMI builder to create an AWS AMI.
Provisioners are components of Packer that install and configure software within a running machine prior to creating a static image from that machine. Packer has many provisioners you can use, such as file, to copy files to the machine, shell, to execute commands in the machine, and many more.
Let’s introduce the following template file and then analyse it. We are going to use this template to build an AWS AMI that contains the siege HTTP benchmarking tool, using Ubuntu 18.04 as the base image.
{
  "variables": {
    "ami_name_prefix": "{{env `AMI_NAME_PREFIX`}}"
  },
  "builders": [{
    "ami_description": "An AMI with HTTP benchmarking tools, based on Ubuntu.",
    "ami_name": "{{user `ami_name_prefix`}}-{{isotime | clean_resource_name}}",
    "instance_type": "t2.micro",
    "region": "us-west-2",
    "source_ami_filter": {
      "filters": {
        "architecture": "x86_64",
        "block-device-mapping.volume-type": "gp2",
        "name": "*ubuntu-bionic-18.04-amd64-server-*",
        "root-device-type": "ebs",
        "virtualization-type": "hvm"
      },
      "most_recent": true,
      "owners": [
        "099720109477"
      ]
    },
    "ssh_username": "ubuntu",
    "type": "amazon-ebs"
  }],
  "provisioners": [
    {
      "inline": [
        "echo 'Sleeping for 30 seconds to give Ubuntu enough time to initialize (otherwise, packages may fail to install).'",
        "sleep 30",
        "sudo apt-get update",
        "sudo apt-get dist-upgrade -y"
      ],
      "type": "shell"
    },
    {
      "type": "file",
      "source": "{{template_dir}}/urls.txt",
      "destination": "/home/ubuntu/urls.txt"
    },
    {
      "scripts": [
        "{{template_dir}}/install-tools.sh"
      ],
      "type": "shell"
    }
  ]
}
The above template contains three main sections: variables, builders and provisioners.
Variables Section
In order to avoid hardcoded values, in the variables section you can define variables that are to be used in other template sections. Users can override variables’ values, passing them as options to Packer when building the images. This allows users to customize the building process without changing the template file.
In our template we defined just one variable, ami_name_prefix, which will be the prefix of the AMI name. We used the env function in the variable definition to set its default value from the environment variable AMI_NAME_PREFIX.
Builders Section
In the builders section we specified one builder of type amazon-ebs. This builder creates an AMI backed by an EBS volume. These are the steps run by this builder:
Create a new EC2 instance from the base AMI.
Run the provisioners on the EC2 instance.
Create the new AMI from the running EC2 instance.
Destroy the EC2 instance.
Different builders have different parameters. The amazon-ebs builder requires that we specify:
ami_name: The name of the resulting AMI.
ami_description: A description for the AMI.
source_ami_filter: How to find the base AMI.
instance_type: Instance type to use in the building process; it can be different from the type you plan to use when running the AMI.
region: Name of the AWS region in which the AMI must be generated.
ssh_username: SSH username to connect to the building instance.
Provisioners Section
Finally, the provisioners section contains the steps needed to install the required packages and files in the base AMI to transform it into our custom one.
The above template specifies these steps:
Upgrade Ubuntu packages.
Copy urls.txt, which contains a list of URLs we can feed to siege.
Run the install-tools.sh script to install siege and other packages.
Both urls.txt and install-tools.sh are inside the same directory as the template, so we can use the template_dir Packer function when passing their paths to the provisioners.
The urls.txt file simply lists the URLs to be fed to siege for benchmarking.
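A hypothetical urls.txt (these example.com URLs are placeholders, not the contents of the original file):

```
https://example.com/
https://example.com/index.html
https://example.com/api/health
```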
install-tools.sh installs the bash-completion and siege packages using APT:
#!/bin/bash
set -e
sudo apt-get install -y bash-completion siege
Building the AMI
Prior to building the AMI, it’s a good idea to validate the syntactical correctness of the template file. We can do this using Packer’s validate command, which verifies that the template is a valid JSON file and satisfies Packer’s template schema.
With your AWS credentials configured and the template validated, you are ready to use Packer to build the AMI. Run the following command to start building the AMI.
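Assuming the template is saved as template.json (the filename is an assumption), validating and then building — passing the prefix through the AMI_NAME_PREFIX environment variable — might look like:

```
$ packer validate template.json
Template validated successfully.

$ AMI_NAME_PREFIX=http-benchmarking packer build template.json
```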
Note that we passed http-benchmarking as the AMI prefix name. The full AMI name, as defined in the template, will be built using this prefix and the current date.
The process can take several minutes. After building the AMI, Packer shows a message containing the AMI id:
==> Builds finished. The artifacts of successful builds are:
--> amazon-ebs: AMIs were created:
us-west-2: ami-04236c7ea2f337c88
You can locate and run the AMI from the AWS console under the EC2 service. In the next section we’re going to use Terraform to provision an EC2 instance using the AMI we’ve just created.
Provisioning an Instance using Terraform
Terraform is another open-source tool by Hashicorp. It provides a domain-specific language that enables developers to manage infrastructure resources using declarative configuration files.
Developers use the Terraform language to configure the infrastructure by specifying resources and their desired state. Then Terraform uses this information to make the API calls to create or update the infrastructure. To make the API calls Terraform uses components called providers that are responsible for the interaction with external APIs. Terraform already includes providers for most cloud platforms and IaaS platforms.
The main constructs of the Terraform language used by developers are:
resource: To specify an infrastructure element and its configuration, an EC2 instance, for example.
provider: To configure provider parameters, such as version, region and credentials.
data: To fetch some information from a provider, for example to find an AMI id by its name.
variables: Like Packer variables, used to avoid repeating code or hardcoding values. Users can override variables’ values by passing them as options to Terraform.
output: Used to expose information about the resources managed by Terraform to external systems, for example to expose the IP address of an EC2 instance managed by Terraform.
After each run Terraform saves the current infrastructure state in a state file. This state allows Terraform to map real-world resources to the resources specified in the configuration. This information is used in successive Terraform runs to detect drifts between the actual infrastructure and the configuration.
By default, the Terraform state is stored locally, but Terraform supports different options to store it remotely, which is the best approach when working in a team.
Configuration to provision an instance using our custom AMI
In this section we are going to review a Terraform configuration that provisions an EC2 instance using the AMI we built using Packer. This configuration also allows connections to port 22 to allow SSH connections.
Terraform uses all .tf files in the directory as the infrastructure configuration. This convenience allows you to organize the configuration code into multiple files, depending on your needs.
The variables.tf file
This file defines infrastructure parameters that should be easy for the user to override: how to find the AMI by its name and owner, what instance type to use, the keypair, the CIDRs from which to allow SSH connections, and a tag to be applied to all the resources to make them easy to spot in the AWS console.
The following code defines the variables:
variable "ami_name_filter" {
  description = "Filter to use to find the AMI by name"
  default     = "http-benchmarking-*"
}

variable "ami_owner" {
  description = "Filter for the AMI owner"
  default     = "self"
}

variable "instance_type" {
  description = "Type of EC2 instance"
  default     = "t2.micro"
}

variable "keypair" {
  description = "Key pair to access the EC2 instance"
  default     = "default"
}

variable "allow_ssh_from_cidrs" {
  description = "List of CIDRs allowed to connect to SSH"
  default     = ["0.0.0.0/0"]
}

variable "tag_name" {
  description = "Value of the tag Name to apply to all resources"
  default     = "http-benchmarking"
}
Here we have specified default values; later we’ll see how to override them.
The main.tf file
Despite the name main, there is nothing special about this file. We use it to initialize the AWS provider and declare a local value, common_tags, to avoid repeating the expression.
This is main.tf code:
provider "aws" {
  version = "~> 2.0"
}

locals {
  common_tags = {
    Name = "${var.tag_name}"
  }
}
The ami.tf file
This file uses a Terraform data element to fetch AMI information from the AWS provider. We use it since we need the AMI id to create the instance, but we have only the AMI name and owner.
In the following code we created an aws_ami data element and named it ami.
data "aws_ami" "ami" {
  most_recent = true
  owners      = ["${var.ami_owner}"]

  filter {
    name   = "name"
    values = ["${var.ami_name_filter}*"]
  }

  filter {
    name   = "root-device-type"
    values = ["ebs"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}
Here, the AWS provider fetches the information for the AMI matching the specified filters.
The security.tf file
In this file we defined a Security Group resource in AWS that allows incoming connections to port 22 from the CIDR blocks specified in the variable allow_ssh_from_cidrs.
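A sketch of what security.tf might contain (the resource name allow_ssh is an assumption):

```hcl
resource "aws_security_group" "allow_ssh" {
  description = "Allow SSH from the configured CIDR blocks"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = "${var.allow_ssh_from_cidrs}"
  }

  # allow all outbound traffic
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = "${local.common_tags}"
}
```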
Note that we specified the AMI id fetched in ami.tf using the data element. We also used the autogenerated security group name for the security group defined in security.tf.
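A sketch of what that instance resource might look like, assuming the security group resource is named allow_ssh (the resource name aws_instance.instance comes from the outputs below):

```hcl
resource "aws_instance" "instance" {
  ami             = "${data.aws_ami.ami.id}"                 # AMI id fetched in ami.tf
  instance_type   = "${var.instance_type}"
  key_name        = "${var.keypair}"
  security_groups = ["${aws_security_group.allow_ssh.name}"] # autogenerated SG name

  tags = "${local.common_tags}"
}
```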
The outputs.tf file
Finally, we declared two outputs to make the instance IP address and ID available outside Terraform.
output "ip" {
  value = "${aws_instance.instance.public_ip}"
}

output "ec2instance" {
  value = "${aws_instance.instance.id}"
}
Provisioning the infrastructure
In this section we are going to see how to override variables’ values, use Terraform to validate the configuration, check differences between actual resources and the configuration, and apply the configuration to provision the infrastructure.
First, we are going to use a .tfvars file to specify the value of each variable that we want to override. Naming this file terraform.tfvars causes Terraform to load it automatically.
We have set the filter for the AMI name, instance type, keypair and CIDR to allow SSH connections only from our computer, and tag_name to identify our resources.
Keep in mind that the keypair has to exist in the same AWS region where you are creating the instance.
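Hypothetical terraform.tfvars contents matching that description (the CIDR is a placeholder for your own IP address, and mykeypair is an assumed keypair name):

```hcl
ami_name_filter      = "http-benchmarking-"
instance_type        = "t2.micro"
keypair              = "mykeypair"
allow_ssh_from_cidrs = ["203.0.113.10/32"]
tag_name             = "http-benchmarking"
```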
Now we can use Terraform’s validate command to verify that the configuration is valid:
$ terraform validate
Success! The configuration is valid.
Next, using Terraform’s plan command we can look at the changes that Terraform is going to make to the infrastructure. The output is like a “diff” between the configuration and the actual state.
To actually make the changes to the infrastructure, you need to run Terraform’s apply command. It will show you the changes again and ask for confirmation.
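A typical command sequence for this workflow looks like:

```
$ terraform init    # download the configured providers
$ terraform plan    # preview the changes
$ terraform apply   # make the changes (asks for confirmation)
```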
When Terraform finishes making changes it will show the values of the outputs that we defined in the configuration. It will look like this:
Outputs:
ec2instance = i-00ba4d7f559e0cdfc
ip = 35.166.54.104
Now the instance is up and running; we can connect to it using SSH.
Connecting to the instance
With both the IP address displayed in the Terraform outputs and your key pair, you can connect to the instance using SSH.
$ ssh -i mykeypair.pem ubuntu@35.166.54.104
Here we have connected with key pair mykeypair.pem and instance IP address 35.166.54.104.
Once connected to the instance you can execute any command on it. Running ls we can verify that the urls.txt file was copied during the AMI build using Packer.
ubuntu@ip-172-31-43-205:~$ ls
README urls.txt
Finally we can use the instance to run the HTTP benchmarks using siege and the urls.txt file:
ubuntu@ip-172-31-43-205:~$ siege -f urls.txt
We have built the infrastructure that allows us to run these benchmarks independently of our office/home connection speed.
One great benefit of doing it using infrastructure-as-code tools is that we can automate the process completely and repeat it if we need to run the benchmarks again.
Freeing up resources
When not using the instance, you can use Terraform’s destroy command to free all the resources it has created. This is useful to avoid wasting money on AWS.
$ terraform destroy
Terraform will show you the changes it’s going to apply and ask for confirmation.
Since Terraform did not create the AMI, it won’t delete it when you run the destroy command. So, to delete all the resources, you also need to deregister the AMI from the AWS console.
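If you prefer the CLI, the AMI from this build can be deregistered with awscli (its backing EBS snapshot must be deleted separately); the AMI id below is the one produced earlier:

```
$ aws ec2 deregister-image --image-id ami-04236c7ea2f337c88
```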
Wrapping up
We used HTTP benchmarking as our example for this article, but most of the steps we’ve gone through can be applied to other use cases.
We’ve seen how, by combining Packer and Terraform, it’s possible to create infrastructure-as-code solutions to build and provision custom machines. These tools provide a layer of abstraction over cloud platforms and IaaS APIs, making developers’ work more efficient.
Having the infrastructure defined as code has many advantages; you can:
Reuse code
Automate provision of your infrastructure in your CI/CD pipelines.
Track changes using common versioning tools, such as git.
Destroy and re-create your infrastructure with confidence.
Deploy multiple times from the same definitions.
On a final note, this article describes just some of Packer’s and Terraform’s features. Check their official documentation to learn more about them and how they can be handy when implementing infrastructure-as-code solutions.
This tutorial covers how to run Mondoo security scans during HashiCorp Packer builds of Amazon EC2 AMIs.
CAUTION
This tutorial will provision resources that qualify under the AWS Free Tier. If your account doesn’t qualify under the AWS Free Tier, Mondoo is not responsible for any charges that you may incur.
The Packer Plugin Mondoo makes it easy to integrate Mondoo security scanning with HashiCorp Packer builds. This integration lets you find and fix vulnerabilities and misconfigurations before provisioning them in your environment.
The plugin calls Mondoo Client after packer build runs, authenticating with Mondoo Platform to run any policies enabled in the account to which Mondoo Client is registered.
A Packer template is a configuration file that defines the image you want to build and how to build it. Packer templates use the HashiCorp Configuration Language (HCL).
Create a new directory named mondoo_packer. This directory will contain your Packer template for this tutorial.
$ mkdir mondoo_packer
Navigate into the directory.
$ cd mondoo_packer
Create a file aws-amazon2.pkr.hcl, add the following HCL block to it, and save the file.
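A sketch of what such a template might contain, pieced together from the variables and plugin options discussed later in this tutorial (the source AMI filter, the score threshold value, the provisioner steps, and the annotation keys are assumptions):

```hcl
packer {
  required_plugins {
    amazon = {
      version = ">= 1.1.0"
      source  = "github.com/hashicorp/amazon"
    }
    mondoo = {
      version = ">= 0.3.0"
      source  = "github.com/mondoohq/mondoo"
    }
  }
}

variable "aws_profile" {
  type        = string
  description = "AWS profile to use. Typically found in ~/.aws/credentials"
  default     = "default"
}

variable "image_prefix" {
  type        = string
  description = "Prefix to be applied to image name"
  default     = "mondoo-amazon-linux-2-secure-base"
}

source "amazon-ebs" "amazon2" {
  profile       = var.aws_profile
  region        = "us-east-1"
  instance_type = "t2.micro"
  ssh_username  = "ec2-user"
  ami_name      = "${var.image_prefix}-${formatdate("YYYYMMDDhhmm", timestamp())}"

  source_ami_filter {
    filters = {
      name                = "amzn2-ami-hvm-*-x86_64-gp2" # Amazon Linux 2 base (assumption)
      root-device-type    = "ebs"
      virtualization-type = "hvm"
    }
    most_recent = true
    owners      = ["amazon"]
  }
}

build {
  sources = ["source.amazon-ebs.amazon2"]

  provisioner "shell" {
    inline = ["sudo yum update -y"] # example hardening step (assumption)
  }

  provisioner "mondoo" {
    score_threshold = 80            # fail below this aggregated score (value is an assumption)
    on_failure      = "continue"    # don't fail the build on a low score

    annotations = {
      Name = "${var.image_prefix}"  # example custom metadata (assumption)
    }
  }
}
```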
This is a complete Packer template that you will use to build an AWS Amazon 2 Linux AMI in the us-east-1 region. In the following sections, you will review each block of this template in more detail.
The template provides a few variables that can be configured to the template before building an AMI. For more information on how to override Packer template variables see Template User Variables in the Packer documentation.
Before you can build the AMI, you need to provide your AWS credentials to Packer.
The template above has an aws_profile variable that lets you configure a profile from your AWS CLI credentials file, which is usually found in ~/.aws/credentials.
variable "aws_profile" {
  type        = string
  description = "AWS profile to use. Typically found in ~/.aws/credentials"
  default     = "default"
}
These credentials must have permissions to create, modify, and delete EC2 instances. Refer to the Packer documentation for the full list of IAM permissions required to run the amazon-ebs builder.
TIP
If you don’t have access to IAM user credentials, use another authentication method described in the Amazon AMI Builder documentation.
By default the template will create the AMI using a default naming prefix of mondoo-amazon-linux-2-secure-base. You can override this with the image_prefix variable:
variable "image_prefix" {
  type        = string
  description = "Prefix to be applied to image name"
  default     = "mondoo-amazon-linux-2-secure-base"
}
score_threshold – This configuration sets an integer score threshold for security scans. If the scan produces a score below the threshold, the build fails. If your build runs multiple Mondoo policies, the aggregated score is evaluated. For more information, see Policy Scoring in the Mondoo documentation.
on_failure = "continue" – This configuration ensures that the Packer build will not fail even if the scan produces a score that falls below the score_threshold. Mondoo will send the scan report to your account in Mondoo Platform for your reference, and the build will complete.
annotations – This configuration lets you create custom metadata for builds so that you can track your assets.
➜ mondoo_packer packer init aws-amazon2.pkr.hcl
Installed plugin github.com/hashicorp/amazon v1.1.0 in "/Users/youruser/.packer.d/plugins/hashicorp/amazon/packer-plugin-amazon_v1.1.0_x5.0_darwin_arm64"
Installed plugin github.com/mondoohq/mondoo v0.3.0 in "/Users/youruser/.packer.d/plugins/mondoohq/mondoo/packer-plugin-mondoo_v0.3.0_x5.0_darwin_arm64"
Packer downloads the plugins you defined above. In this case, Packer downloads the Packer Amazon plugin at version 1.1.0 or greater and Packer Plugin Mondoo at version 0.3.0 or greater.
You can run packer init as many times as you’d like. If you already have the plugins you need, Packer exits without any output.
Packer has now downloaded and installed the Amazon plugin and the Mondoo plugin. It is ready to build the AMI!
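Kicking off the build looks the same as for any other template:

```
$ packer build aws-amazon2.pkr.hcl
```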
In this tutorial, you built an Amazon AMI and scanned it for vulnerabilities with Mondoo. Although we ran only one policy, you should now have a general idea of how Packer Plugin Mondoo works, and you should be ready to add additional policies to your builds.
The GitHub repository for Packer Plugin Mondoo contains additional templates for building Ubuntu and Windows images.
Refer to the following resources for additional details on the concepts covered in this tutorial: