Terraform + Chef Quick Start

April 14, 2016

I wrote an example of how to use Terraform with Chef (specifically chef-solo) to provision an environment in AWS EC2. While Terraform does include a built-in Chef provisioner, it requires running a Chef Server. This example instead ships cookbooks to the nodes with Terraform's file provisioning functionality, and uses a simple bash script to install and run chef-solo.
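To give an idea of what such a bootstrap script can look like, here is a minimal sketch. It is not the script from the repository: the function names, the /tmp paths and the use of the Omnitruck install URL are my assumptions for illustration. The only parts that come from Chef itself are the `cookbook_path` setting, the `run_list` attribute and the `-c`/`-j` flags of chef-solo.

```shell
# Render a minimal solo.rb that points chef-solo at the uploaded cookbooks.
render_solo_rb() {
  printf 'cookbook_path "%s"\n' "$1"
}

# Render the JSON attributes file that carries the run list.
render_node_json() {
  printf '{"run_list": ["recipe[%s]"]}\n' "$1"
}

# Hypothetical entry point: install Chef if needed, then converge the node.
# $1 = directory the cookbooks were copied to, $2 = recipe to run.
bootstrap_node() {
  cookbook_dir="$1"; recipe="$2"
  # Assumption: Chef is installed with the Omnitruck install script.
  command -v chef-solo >/dev/null 2>&1 ||
    curl -L https://omnitruck.chef.io/install.sh | sudo bash
  render_solo_rb "$cookbook_dir" > /tmp/solo.rb
  render_node_json "$recipe" > /tmp/node.json
  sudo chef-solo -c /tmp/solo.rb -j /tmp/node.json
}
```

On a node this would be called as something like `bootstrap_node /tmp/cookbooks app`, where both arguments are placeholders.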

To run the example, you will need an AWS account with API credentials (Access Key ID and Secret Access Key).

git clone git@github.com:mjuuso/provisioning_example.git
cd provisioning_example
./run.sh <aws_access_key> <aws_secret_key>

The script will make sure you have all the prerequisites for running the example (terraform, curl, ssh and ssh-keygen installed), generate SSH keys, run terraform and then verify that the load balancer works as expected.
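The prerequisite check can be sketched roughly as below. This is an illustration under my own naming (`missing_tools` is hypothetical), not the code from run.sh.

```shell
# Print the names of the given commands that cannot be found on PATH.
missing_tools() {
  missing=""
  for tool in "$@"; do
    # `command -v` succeeds only if the shell can resolve the name.
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  # Strip the leading space before printing (empty output means all found).
  echo "${missing# }"
}

# Usage: abort early if any prerequisite is absent.
# missing=$(missing_tools terraform curl ssh ssh-keygen)
# [ -z "$missing" ] || { echo "Missing: $missing" >&2; exit 1; }
```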

This will create the following resources in AWS:

two application instances (t2.micro) running a sample Golang application server
a load balancer instance (t2.micro) with nginx proxying requests to the application nodes in round-robin fashion
an SSH key pair (generated by the wrapper script)
two Security Groups -- one for the load balancer, one for the application nodes
a Virtual Private Cloud (VPC)
two subnets within the VPC, in availability zones eu-west-1a and eu-west-1b
an Internet gateway, a routing table and routing table associations
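One way to check that the load balancer really alternates between the application nodes is to request it a few times and count the distinct answers. The sketch below assumes that each application node returns a body that identifies it (for example its hostname); that is an assumption on my part, and the function names are hypothetical.

```shell
# Count distinct lines read from stdin.
count_distinct() {
  sort -u | wc -l | tr -d ' '
}

# Request the URL n times and print how many distinct first lines came back.
# $1 = load balancer URL, $2 = number of requests.
probe_lb() {
  url="$1"; n="$2"; i=0
  while [ "$i" -lt "$n" ]; do
    # Command substitution trims trailing newlines; printf adds exactly one back.
    printf '%s\n' "$(curl -s "$url" | head -n 1)"
    i=$((i + 1))
  done | count_distinct
}
```

With two application nodes behind a round-robin proxy, `probe_lb` with four requests should report 2 under that assumption.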

The example could be trivially extended to include a tier of database servers, or even an autoscaling group for the application servers.

For the code, check out https://github.com/mjuuso/provisioning_example.

Find something to improve? Pull requests are very welcome!

Rethinking infrastructure

February 22, 2016

When I joined Artirix, the company had just started the journey towards Continuous Delivery. A large part of it is the cultural transformation to DevOps: improving workflows and collaboration between Dev, Ops and QA. But in addition to having the best rowers beautifully in sync, we also need a streamlined and leakless boat. Therefore an equally large part of our journey is optimising and modernising our technical environment - patching the holes that slow our boat down.

Identifying problems

One of the largest challenges we have is configuration drift, i.e. the gap between the actual live production configuration state and the state our configuration management tool depicts. Ideally those states are identical. But in the real world, the two configuration sets are often wildly different. In our case this is certainly true, and can be attributed to the way we have historically done configuration management. We run Puppet in a masterless configuration with Capistrano, which means that configuration changes are applied manually on demand. This is both slow (Cap rsyncs the full Puppet repository to every node and applies it locally) and non-enforcing. It's too tempting to do a quick change manually, and update configuration management to reflect that later (or never). While a masterless setup might scale better in some circumstances, the benefits just don't outweigh the disadvantages for us.
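At its simplest, drift on a single file can be detected by comparing the copy that configuration management would write with the copy that is actually live. The sketch below is a toy illustration of that idea with a hypothetical `drift_status` helper; it is not how Puppet works internally, and a real setup would rather rely on something like a `puppet apply --noop` run.

```shell
# Print "drifted" when the live file differs from the managed copy, else "in-sync".
# $1 = file as configuration management would render it, $2 = file on the node.
drift_status() {
  managed="$1"; live="$2"
  # cmp -s compares byte by byte and stays silent; only the exit code matters.
  if cmp -s "$managed" "$live"; then
    echo "in-sync"
  else
    echo "drifted"
  fi
}
```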

“At some point it became easier to just SSH into production, change a setting, and go on with your life.”

Creating environments is a particularly painstaking chore for us. It takes days - sometimes even weeks - to create a new project environment. We need to provision and bootstrap new instances, modify and rewrite Puppet manifests to fit this special case, set up monitoring and backups, and add separate environment configuration in about a dozen components. It's tedious, error-prone and repetitive work - in other words, a prime candidate to be automated.

Another problem with our current environments is that by design, each one is a unique snowflake. For our internal and testing environments, we tend to pack every component neatly in one box. This lowers hosting costs, but makes the environment very different from a production environment, which might span dozens of separate instances. This poses a clear problem with a Continuous Delivery pipeline: even if code functions as expected on every environment prior to production, the final step (deployment to production) is still a risky leap into the unknown.

Setting targets

As we embarked on the journey to transform our infrastructure, the first target we set was simple:

“We want to be able to run Chaos Monkey and feel good about it.”

Simply put, Chaos Monkey, developed by Netflix, is a tool that causes random failures in groups of systems. For an organisation that's used to constant firefighting, the change of mindset required for deliberately starting fires is dramatic (even though it's not unheard of for firefighters to moonlight as arsonists). The reasoning for the change, however, is simple: complex systems fail inevitably. If our infrastructure is resilient enough to withstand Chaos Monkey without catastrophic failure, it implies we have reached a certain milestone in the maturity of infrastructural design and configuration management.
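The core idea is small enough to sketch in a few lines: pick a random member of a group and take it down. This is not Netflix's tool, just an illustration; `pick_victim` and the instance IDs in the usage comment are hypothetical.

```shell
# Pick one argument uniformly at random and print it.
pick_victim() {
  [ "$#" -gt 0 ] || return 1
  # awk's rand() returns a value in [0, 1), so idx falls in 1..$#.
  idx=$(awk -v n="$#" 'BEGIN { srand(); print int(rand() * n) + 1 }')
  shift $((idx - 1))
  echo "$1"
}

# Usage with the AWS CLI (placeholder instance IDs):
# victim=$(pick_victim i-aaa i-bbb i-ccc)
# aws ec2 terminate-instances --instance-ids "$victim"
```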

We also set some other goals:

We want to kill configuration drift
We want our infrastructure to be versioned, and thus represented as code
We want our environments to be identical across stages
We want Dev and QA to be able to create and destroy environments on demand
We want to save our clients money by using resources more efficiently

Finding solutions

A lot of our problems will be solved with immutable infrastructure. Instead of managing long-lived servers, resources should be considered disposable. Instead of reconfiguring running servers and deploying new versions of software, we should start from scratch whenever possible. To achieve this, our weapon of choice is Docker with Kubernetes. Obviously, this requires refactoring all of our components to work inside containers - a task that is definitely as interesting as it sounds.

For tackling the snowflake-environment problem, we're relying heavily on autoscaling. The fundamental idea is that every environment should start the same, while production environments will automatically scale to match the load imposed on them. This will come with two benefits: our environments are very much alike across stages, and we can introduce cost savings and flexibility for our clients by automatically scaling the environments up or down depending on the load.
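To make the scaling idea concrete, here is a sketch of a proportional scaling rule of the kind the Kubernetes Horizontal Pod Autoscaler documentation describes: the desired replica count is the current count multiplied by the ratio of observed load to target load, rounded up and clamped between a minimum and a maximum. The function name and the example numbers are assumptions for illustration, not our actual configuration.

```shell
# desired = ceil(current * observed / target), clamped to [min, max].
# All arguments are positive integers.
desired_replicas() {
  current="$1"; observed="$2"; target="$3"; min="$4"; max="$5"
  # Integer ceiling division: (a + b - 1) / b.
  desired=$(( (current * observed + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  echo "$desired"
}
```

For example, two replicas running at 90% CPU against a 50% target give ceil(3.6) = 4 replicas, while an idle environment falls back to its minimum. Every environment can start from the same minimum, and only production ever climbs far above it.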

For creating environments on demand, we're building a simple tool with five basic functions:

Wrap Terraform to create Kubernetes clusters on AWS EC2
Create environments (namespaces) with specific component (Docker image) versions and services in the Kubernetes clusters
Present stdout from containers, and logs gathered by fals
Automate data migration between environments (MySQL/MariaDB, Redis, S3)
Manage hostnames for environments with AWS Route 53

Effectively a hybrid between an internal PaaS and CaaS, this will allow Dev and QA to create disposable environments on their own. Among other cool stuff, it will ultimately allow us to do things like blue/green deployments to production - increasing our confidence in deployment even further.
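The decision logic behind a blue/green deployment is simple: the new version goes to whichever colour is idle, and traffic is switched only if that colour passes its health checks. The sketch below shows just that decision; the function names are hypothetical, and the actual traffic switch (a load balancer or service update) is left out.

```shell
# Print the colour that is not currently serving traffic.
idle_color() {
  case "$1" in
    blue) echo green ;;
    green) echo blue ;;
    *) echo "unknown color: $1" >&2; return 1 ;;
  esac
}

# Decide which colour should serve traffic after a release attempt.
# $1 = colour currently live, $2 = "ok" if health checks on the idle colour passed.
release() {
  live="$1"; health="$2"
  target=$(idle_color "$live") || return 1
  if [ "$health" = "ok" ]; then
    echo "$target"
  else
    # Failed checks: keep serving from the colour that is already live.
    echo "$live"
  fi
}
```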

What will our future look like?

While our transformation is still very much in progress, I see these improvements as a big win for the business as a whole: they will increase our confidence in deployment, bringing us a step closer to the benefits of true Continuous Delivery. They will eliminate error-prone manual work and decrease time to market for new features and projects - a real competitive advantage.

We're not only patching the holes that slow our boat down. We're adding motors.

Coming your way: Unikernels

February 17, 2016

At Artirix, our DevOps team is constantly working on improving our infrastructure. I admit, we are selfish human beings - we want to do less firefighting, never wake up to 4am PagerDuty incidents again, and stop doing repetitive work. Luckily, though, all this comes with the added benefit of increased customer value.

We are currently running a somewhat traditional setup in AWS (pictured above). While we agree that containers are the next logical step to take, we’ve had increasing interest in looking one step ahead: unikernels. We see this as a way to further simplify our infrastructure while making it more resilient.

Unikernels are operating systems that are custom-built to run a single application directly on the hypervisor.

So, what are unikernels anyway? Start by thinking about your typical server in the cloud. It's running your application, along with a plethora of supporting services enabled by default. Everything is stacked on top of an operating system that is designed to support a wide variety of platforms, devices and software. Therefore, by design, the operating system contains a lot of components that make absolutely no sense in the cloud. When was the last time you needed a mouse (or a floppy) driver module loaded into your kernel?

In a sense, the rise of containerisation only makes this worse. Consider the following stack for running a containerised application in AWS:

It may seem overly complex - probably because it is. When deploying a carelessly built container, you are effectively installing another operating system on top of an operating system. Now, consider the same application running as a unikernel:

The beauty of unikernels is simplicity. Only the things your application needs are baked into the operating system. There are no floppy or mouse drivers, unless you explicitly decided to build them in. Your application has only the necessary components to fulfil its function and run as a virtual machine directly on the hypervisor.

A unikernel contains the fewest moving parts needed to do the job.

Needless to say, this has serious security implications as well. The attack surface of a unikernel is typically only a fraction of what a containerised application running on, say, Ubuntu Linux would have. By requiring you to explicitly build in functionality, unikernels provide exceptional visibility into what your stack is actually running.

There are several promising projects around unikernels. We’ve experimented mainly with OSv and rumprun. Take a look at MirageOS, LING, HaLVM and Clive as well.

While unikernels are certainly interesting, there are many obstacles to work around before they reach the maturity required for widespread usage. However, with the recent purchase of Unikernel Systems by Docker, we might see quick developments in unikernel monitoring, debugging and deployment - which are all needed to efficiently utilise the technology.

We’ve started a Meetup group in London for people interested in Unikernels. Become a member and check out the next meetup at http://www.meetup.com/London-Unikernels-Meetup/

p.s. Just kidding, the cover picture is not really of our AWS setup. We are a bit more modern than that. The picture is of ENIAC, built in the 1940s, courtesy of the U.S. Army.
