When Ada Lovelace and Charles Babbage began their work on the first programmable computer, the Countess would spend long hours translating mathematical algorithms onto punchcards to be interpreted by Mr. Babbage’s new “difference engine.” Cards and notes would flow from Ada’s desk onto the top of her lap as she began committing code well before Chuck’s computer had even been constructed. One hundred and sixty-two years later, when Babbage’s Engine finally shipped, Ada’s code was at last deployed onto the difference machine. The code executed flawlessly until it hit a fatal NullPointerException, at which point Ada’s descendants could only shrug and insist that the museum curators “must have deployed it incorrectly; it ran just fine on her lap top.”
Ever since the advent of the computer, developers have been trying to find more effective ways of packaging software for delivery and avoiding the “but it works fine on my laptop” problem. A myriad of differences between a developer’s machine and the server the code runs on (alternate operating systems, libraries, even time zones) can render otherwise perfectly running code useless. While Java and other software platforms attempted to let developers “write once, run anywhere,” large differences in platform versions still made this a difficult feat.
Packaging Apps in Containers
The teams working here at High Alpha use a wide variety of platforms, tools, and frameworks to build products, anything from R scripts to Spring Boot stacks. To allow a variety of applications to run across multiple cloud providers, we often leverage Docker containers to package up code that has been built and thoroughly tested in a continuous integration environment. This container will then execute on any server (real or virtual) that is running as a Docker host.
To scale your application and run containers on multiple servers, you will need a container orchestration tool. A favorite (and emerging standard) orchestration platform is Kubernetes, which can run on Google Cloud, Azure, Amazon Web Services, and even physical servers. An application can be tested and packaged into a container, shipped as a Kubernetes Deployment, and then scaled across any number of servers without subtle differences between development laptops and the production servers causing issues.
If you have a single application that doesn’t rely on other services, deploying containers directly into Kubernetes works fantastically well. Once you begin embracing microservices, or if you want to place your infrastructure (such as a database or cache server) in a container, you need to find a way to not only scale your application but tell Kubernetes how these containers depend on one another.
Admitting Your Dependencies
Resolving dependencies between applications seems easy at first… your services depend on your database, and your web stack depends on your services. As you start to partition your database and separate your services, however, you quickly descend into “dependency hell.” Services may exhibit problems with circular dependencies, you may need some containers to start before others (e.g. the database needs to start before services are ready), and web applications may need to be exposed only behind a proxy or firewall.
Dependency managers have resolved these conflicts for a long time within operating systems (such as Debian’s package manager) and build environments (such as Apache Maven). Why not do something similar for containers? Enter Helm, a dependency and package manager for Kubernetes. Now a sponsored project under the Kubernetes umbrella, Helm can simplify packaging and resolve dependencies between containers. You can also more easily tweak configuration parameters at deploy time, so your memory limits for developers can be different from the limits in production without altering the container itself.
Helm creates and manages three big pieces of your deployment:
- Charts, which define how an application is run, how the container is managed, and what other containers it might depend on.
- One or more repositories: static web sites that store and organize your charts and retain previous versions.
- Releases, which are managed by a service running within your Kubernetes cluster and determine what version of a chart you are currently running.
Developers are expected to create the charts, repositories will store the charts, and deploying a release will cause the chart to be executed. If you have managed dependencies with CocoaPods or deployed packages with apt, this flow should look familiar; instead of deploying pod libraries or Debian packages, we are deploying packaged containers.
Charts — Building Package Configuration
A chart is actually a set of files and directories that describe your application. Helm’s Charts Guide provides exhaustive instructions for building charts, but the simplest application has the following files:
- Chart.yaml: The human-readable description of your application, including version numbers that you increment for each build.
- values.yaml: Variable names and default values that can be used throughout your chart (nifty!)
- requirements.yaml: A file that lists what your application depends on, so Helm can reach out and install any other containers you may need to use.
- templates/deployment.yaml: The templates directory contains all of your actual Kubernetes configurations. They can pull in variables from values.yaml, or you can specify values at the command line. In this instance, we are creating a Kubernetes Deployment file that can be managed by Helm.
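As a rough illustration of that layout, the following shell sketch scaffolds a minimal chart directory. Everything inside the files is an assumption made for illustration: the mydatabase dependency, the myorg/myapp image, and the replicaCount, image, and memoryLimit keys are hypothetical names, not taken from a real chart.

```shell
# Scaffold a minimal, hypothetical chart named "myapp" in the current directory.
mkdir -p myapp/templates

# Chart.yaml: human-readable description and version numbers.
cat > myapp/Chart.yaml <<'EOF'
apiVersion: v1
name: myapp
version: 0.4.15
appVersion: 0.4.6
description: File crawler that feeds the full-text index
EOF

# values.yaml: defaults that templates can reference as .Values.<key>.
# All keys here are hypothetical.
cat > myapp/values.yaml <<'EOF'
replicaCount: 1
image:
  repository: myorg/myapp
  tag: "0.4.6"
memoryLimit: 256Mi
EOF

# requirements.yaml: other charts this one depends on (hypothetical entry).
cat > myapp/requirements.yaml <<'EOF'
dependencies:
  - name: mydatabase
    version: 1.0.0
    repository: http://helm.myamazingstartup.egg
EOF

# templates/deployment.yaml: a Kubernetes Deployment with values templated in.
cat > myapp/templates/deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-myapp
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ .Release.Name }}-myapp
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}-myapp
    spec:
      containers:
        - name: myapp
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          resources:
            limits:
              memory: {{ .Values.memoryLimit }}
EOF
```

With a layout like this, a default such as memoryLimit could be overridden at deploy time (for example with --set memoryLimit=1Gi) rather than by editing the chart or the container.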
Repositories — Publishing Packages to a Static Site
Maven and Apt repositories aren’t anything magical — they are standard HTTP servers hosting static content. The magic comes from the way in which the static content is packaged; Maven and Apt format files and directories in a very particular way so it is easy to find and download packages to resolve dependencies.
Helm repositories work in very much the same way: you can host them using any technology that serves as a standard web server. One popular (and low-cost) way of hosting public repositories is GitHub’s Pages service, which lets you host static content with version control. Paired with a custom domain and user access controls, serving Helm charts from GitHub can be a very easy task for public repos. You can also leverage a number of cloud storage or private hosting options.
Once you are ready to begin publishing your charts, they need to be packaged and added to a Helm index file. A repository’s index.yaml file is a simple list of each application and each available version of that application, as in:
apiVersion: v1
entries:
  myapp:
  - apiVersion: v1
    appVersion: 0.4.6
    created: 2018-04-11T20:53:21.001270672-04:00
    description: File crawler that feeds the full-text index
    digest: cda95229b8794286440d2aa51
    home: https://github.com/myorg/myapp/wiki
    name: myapp
    urls:
    - myapp-0.4.15.tgz
    version: 0.4.15
Helm can automagically generate both the index and the packaged charts for you once you create your charts and download any necessary dependencies. On a command line, update your dependencies with:
$ helm repo update
$ helm dependency update
Once this is complete, create your packaged files from inside your chart’s directory (the trailing dot is the path to the chart):
$ helm package -d ~/your_website_directory .
…and update your index:
$ helm repo index ~/your_website_directory
Upload the generated packages and updated index file to your web hosting provider — then the charts are ready for use!
Releases — Installing Helm
Helm uses a helper service called “Tiller,” which runs inside Kubernetes’ kube-system namespace and manages versions, rollbacks, and package metadata. The Helm command line interface talks to Tiller to determine whether you are deploying an obsolete version to Kubernetes, or which containers may need to be re-installed in case of a rollback. Installing Tiller into your cluster is fairly straightforward. After installing Helm on your development machine or management console, push Tiller to the current cluster with:
$ helm init
After this initialization is complete, you should be able to see Tiller running in your cluster’s kube-system namespace (for example, by running kubectl get pods --namespace kube-system).
Helm is now up and running! Next we need to deploy your own application into the cluster.
So that Helm can see the repository hosting your application charts, you need to add your own repo to Helm’s repository list. This requires a label and a URL, as in:
$ helm repo add myrepo http://helm.myamazingstartup.egg
Finally, you are ready to have Helm manage your deployments and rollbacks. Using your personal Helm repository named “myrepo,” deploy your application named “myapp” as a release named “myrelease”:
$ helm install myrepo/myapp --name=myrelease
If you do not specify the release name, Helm will create a random name for you, which generally is something ridiculous such as “floppy-fish” or “eager-badger.” To ensure you can more easily track releases and to sound a bit less absurd in meetings, specify your own release name for deployments.
While the repository and application names are fairly self-evident, the “release” concept of Helm can trip up a lot of new users. Versioning, application names, and dependencies must remain consistent within a given release. Think of a “release” as a segmented portion of your cluster that ensures you don’t accidentally deploy an old version of your application over the top of a new version, and that a rollback in your development environment doesn’t end up rolling back production. Releases are a method of enforcement by Tiller that ensures you can operate multiple versions safely and concurrently in a single cluster.
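To make the separation between releases a little more concrete, here is a minimal sketch of two shell helpers that treat each environment as its own release. The myapp-ENV naming scheme and the memoryLimit value key are assumptions for illustration only; the chart would need to define that key in its values.yaml for the override to have any effect.

```shell
# deploy_env ENV MEMORY
# Install the chart as release "myapp-ENV", or upgrade that release if it
# already exists. "memoryLimit" is a hypothetical key from the chart's values.
deploy_env() {
  env_name="$1"
  memory="$2"
  helm upgrade --install "myapp-$env_name" myrepo/myapp \
    --set "memoryLimit=$memory"
}

# rollback_env ENV REVISION
# Roll back only the release belonging to ENV to an earlier revision number.
rollback_env() {
  helm rollback "myapp-$1" "$2"
}
```

Under these assumptions, deploy_env dev 256Mi and deploy_env prod 1Gi would create two independent releases of the same chart, helm history myapp-dev would list the revisions of the development release, and rollback_env dev 1 would roll back development without touching myapp-prod.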
Best Practices for Business Apps
Once Helm is managing your deployments, there are a few gotchas that apply to managing containers for your in-house development:
- Keep subordinate charts to a minimum. You can nest charts within charts to create dependencies — but it’s best to avoid this if you can. Maintain your applications and charts separately.
- Leverage the “release” functionality of Helm — create a naming convention that helps teams build their own releases and spin up test environments quickly.
- Prune your repository often. If you publish your builds with each commit or merge as part of continuous integration (and you definitely should), your repository index will get very large, very quickly.
- Don’t delete a release just to get around upgrade issues. Sometimes developers will get frustrated with Tiller maintaining a consistent version history with deployments, and delete entire releases in the Kubernetes cluster rather than fix version numbers or rollbacks. When it comes time to release to production, Tiller may catch these errors and refuse to apply the deployment.
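Two of the practices above, a release naming convention and regular pruning, can be sketched as small shell helpers. Both are illustrations rather than anything Helm provides: the team-app-environment naming scheme is just one possible convention, the function names are made up, and prune_charts assumes GNU coreutils (sort -V and head -n with a negative count).

```shell
# release_name TEAM APP ENV
# Build a lowercase, dash-separated release name such as "search-myapp-dev".
# Any character other than a-z, 0-9, or "-" is replaced by a dash.
release_name() {
  printf '%s-%s-%s' "$1" "$2" "$3" |
    tr '[:upper:]' '[:lower:]' |
    tr -cs 'a-z0-9-' '-'
}

# prune_charts DIR NAME KEEP
# Delete all but the KEEP newest NAME-<version>.tgz archives in DIR.
prune_charts() {
  dir="$1"
  name="$2"
  keep="$3"
  for archive in "$dir"/"$name"-[0-9]*.tgz; do
    # Skip the unexpanded pattern when no archive matches.
    if [ -e "$archive" ]; then
      printf '%s\n' "$archive"
    fi
  done |
    sort -V |             # version order, oldest first
    head -n "-$keep" |    # everything except the last KEEP entries
    while IFS= read -r old; do
      rm -- "$old"
    done
}
```

For example, release_name Search myapp feature/login yields search-myapp-feature-login. After something like prune_charts ~/your_website_directory myapp 10, re-running helm repo index ~/your_website_directory should regenerate the index from the packages that remain.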
Don’t Deploy Angry
If your business has already made the leap to Kubernetes, Helm is a great next step. By creating your own repository, managing environment-specific values through charts, and managing proper version control through releases, you can deploy and roll back deployments with ease. Managing dependencies manually isn’t just maddening; it creates a huge amount of risk through human error. Just as package management has protected sysadmins for decades, Helm can protect the rollout of your own in-house apps.