Embracing remote Helm charts and Library charts

Jun 20, 2023

We've got new Helm charts! In this article, Maurits van Mastrigt, DevOps Engineer at funda, will share how we are improving our use of Helm charts for Kubernetes deployments. Bye bye wild growth of copy/paste charts, and hello new and shiny versioned charts!

A few years ago, most of funda’s services were migrated to Kubernetes. Think of the services behind accounts and logins, search listings and map markers. We built a Helm chart for these services, which allowed us to migrate quickly while giving us the flexibility to deal with the 60+ services and all their small variations.

We simply copied the entire chart to another service, made the necessary changes, deployed it to Kubernetes and, if everything worked, we went on to the next one. The migration went smoothly and we successfully completed it within a three-month timeframe.

Read also: How we use modular pipelines to empower our development teams

However, over time we realized that this approach has a few downsides:

  • Hard to maintain. Found a bug and need to apply a small fix in every chart? Good luck opening 60+ pull requests (I’ve been there). The decentralization of the charts increased the time required for regular maintenance.
  • Wild growth of variations. With no clear baseline chart, new services copied the chart from whichever service had been migrated most recently. Those services sometimes implemented things that were needed specifically for that service and not for the new one. This resulted in many small variations of the same chart. An example of this is what we call the “multi service” chart, which allowed multiple Kubernetes Deployments to be rolled out as a single Helm release. This was later used as the default, even if it only rolled out a single Kubernetes Deployment.
  • Lack of change management. The pragmatic approach and wild growth of variations made it nearly impossible to apply versioning and keep documentation and changelogs up to date. When developers ran into issues, we always had to look at that specific chart to see what was going wrong, which meant troubleshooting took longer.

With these lessons learned, we decided to look into improving our Helm charts.

Remote Helm charts

Luckily, Helm already provides a built-in solution for most of these problems: remote Helm charts. This means hosting a packaged and versioned Helm chart on a remote server (called a repository) and letting Helm download the chart based on a version constraint, for example: ^1.0.0.

If you’ve worked with Helm before, you’ve probably already used one of these remote Helm charts, e.g. Bitnami’s WordPress chart. Running the following commands will download the WordPress chart and output the Kubernetes manifest YAML:

helm repo add azure-marketplace https://marketplace.azurecr.io/helm/v1/repo
helm repo update
helm template my-release azure-marketplace/wordpress --version '^15.2.13'

The output can then be configured by setting values (--set foo=bar) or by providing values files (--values values.yaml). The chart templates determine how these values are used. Now that the chart is remote and the templates are no longer located next to the values files, documentation becomes more important for developers. For example, this list of WordPress chart parameters shows exactly which values can be used.
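For example, a minimal values file for the WordPress chart could look like this (the parameter names below come from the chart’s documented parameters):

# values.yaml -- overrides for the WordPress chart
wordpressUsername: admin
wordpressBlogName: My Blog
replicaCount: 2
service:
  type: ClusterIP

helm template my-release azure-marketplace/wordpress \
  --version '^15.2.13' \
  --values values.yaml \
  --set wordpressEmail=user@example.com

Values passed via --set take precedence over those in the values file, which in turn override the chart’s defaults.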

As you can probably tell, this tackles most of the downsides of our initial approach: charts are centralized and versioned. Centralizing the charts makes them easier to document and maintain. So no more small changes to a lot of charts; instead, we add features to a single chart and make them configurable. Remote Helm charts also allow the SRE team to take back control over chart development. Combined with semantic versioning and changelogs, this helps us apply better change management.

Library charts

Along the way we also found out about another great Helm concept: library charts. These are Helm charts that do not render output YAML by themselves, but only provide named templates (we just call them functions). Library charts can be used as chart dependencies and allow you to split up a chart into multiple parts. We used this to get rid of the duplicate code in our three remote charts.
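As a minimal sketch of the concept (the chart and template names here are hypothetical, not our actual functions), a library chart declares type: library in its Chart.yaml and only defines named templates, which a consuming chart can then include:

# Chart.yaml of the library chart -- type: library means it renders nothing by itself
apiVersion: v2
name: mylib
type: library
version: 0.1.0

# templates/_names.tpl in the library chart -- a named template (a “function”)
{{- define "mylib.fullname" -}}
{{- printf "%s-%s" .Release.Name .Chart.Name | trunc 63 | trimSuffix "-" -}}
{{- end -}}

# In a consuming chart's templates, the function is called with include:
metadata:
  name: {{ include "mylib.fullname" . }}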

We found an awesome “common” chart that helps us render most of the native Kubernetes resources. It saved us quite a bit of time, which we have used to develop our own library charts:

  • funda-library: for funda-specific functions, like the ones that output standardized resource names or help generate domain names according to our internal convention.
  • azure-library: for Azure resources that are provided by Azure Kubernetes Services (AKS).
  • traefik-library: for Traefik resources, which allow us to configure request routing.

All in all, these library charts separate the concerns quite nicely. One downside of library charts is the added overhead of having to individually release them.
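A chart pulls these library charts in as regular dependencies in its Chart.yaml. A sketch (the repository URLs and version constraints are illustrative):

# Chart.yaml of a consuming chart -- library charts are regular dependencies
apiVersion: v2
name: service
version: 1.0.0
dependencies:
  - name: common
    repository: https://bjw-s.github.io/helm-charts
    version: 1.x.x
  - name: funda-library
    repository: https://charts.example.internal   # hypothetical internal repository
    version: ^1.0.0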

Our approach

It has been a couple of years since our migration to Kubernetes. During this time we've learnt a lot about the different use cases for the charts, and also about the issues developers run into with them. With these things in mind, we have developed three remote charts:

  • service: this generic chart allows us to deploy most services. We use the term "service" for applications that continuously run. It uses sensible defaults, so that the developers have as little to configure as possible.
  • cronjob: the name speaks for itself. This chart creates a Kubernetes CronJob that runs code on a fixed schedule. We've moved this to a separate chart, because some services need CronJobs, but not all CronJobs need services.
  • traefik-external-route: for routing to something outside of the Kubernetes cluster, like a virtual machine or storage account for example.
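Deploying a service with one of these charts then only takes a repository reference, a version constraint and a values file. A sketch (the repository URL and chart version are hypothetical):

# Add the internal chart repository and deploy a service
helm repo add funda https://charts.example.internal
helm repo update
helm upgrade --install my-service funda/service \
  --version '^1.0.0' \
  --values values.yaml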

These charts are all structured in the same way:

  • Default values file with sensible defaults, like standardized port numbers and autoscaling settings.
  • Values templates that allow us to apply logic to given values and provide new values/configuration.
  • Render templates that use those calculated values to render the classes.
  • Class templates that output the Kubernetes resources.
  • And lastly, the “init” template to tie everything together.

This is what the init template looks like for the service chart:

---
{{- /* Merge the local chart values and the library chart defaults */ -}}
{{- include "bjw-s.common.loader.init" . }}

{{- /* Update the values */}}
{{- $_ := mergeOverwrite .Values (include "service.values.init" . | fromYaml) -}}
{{- $_ := mergeOverwrite .Values (include "service.values.common" . | fromYaml) -}}
{{- $_ := mergeOverwrite .Values (include "service.values.datadog" . | fromYaml) -}}
{{- $_ := mergeOverwrite .Values (include "service.values.traefik" . | fromYaml) -}}
{{- $_ := mergeOverwrite .Values (include "service.values.azure" . | fromYaml) -}}
 
{{- /* Render the templates */}}
{{ include "bjw-s.common.loader.generate" . }}
{{ include "service.render.native" . }}
{{ include "service.render.azure" . }}
{{ include "service.render.traefik" . }}
{{ include "service.render.datadog" . }}

The approach is similar to the open source “common” chart.
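To make the layering concrete, here is a hedged sketch of what a values template and a render template could look like (the real templates are internal; the keys and the class template name below are hypothetical):

{{- /* A values template: computes new values from the ones provided */ -}}
{{- define "service.values.datadog" -}}
datadog:
  enabled: true
  service: {{ .Release.Name }}
  env: {{ .Values.environment | default "production" }}
{{- end -}}

{{- /* A render template: uses the calculated values to render the classes */ -}}
{{- define "service.render.datadog" -}}
{{- if .Values.datadog.enabled }}
{{- include "service.class.datadogMonitor" . }}
{{- end }}
{{- end -}}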

Something we kept in mind during chart development is to use defaults where possible, while still making everything configurable. Developers can use the charts with minimal configuration, yet customize where needed.
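In practice that means a typical service values file can stay very small. A hypothetical example (the registry, hostname and keys are illustrative):

# values.yaml for a typical service -- everything else falls back to the chart defaults
image:
  repository: funda.azurecr.io/my-service   # hypothetical registry
  tag: "1.4.2"
ingress:
  host: my-service.example.com              # hypothetical hostname
resources:
  requests:
    cpu: 100m
    memory: 128Mi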

Key takeaways

These are the five main takeaways of our Helm chart improvement project:

  1. Using remote charts allows us to apply chart versioning. Combining this with version constraints means we can easily roll out patches. We no longer have to create pull requests in each of the service repositories to get those patches applied, so bug fixes are automatically included in subsequent deployments. To make sure we don’t introduce regressions during releases, we started using Helm unit testing (see the test sketch after this list).
  2. Splitting up the templates into values, render, class and init templates is an effective way of adding structure to the chart. It improves understandability and maintainability.
  3. Library charts have helped clean up the chart code and have also made the charts easier to understand and maintain. As a result, we have been able to keep our chart templates DRY (Don't repeat yourself).
  4. Chart documentation is key. We use Helm docs to generate the README with a list of chart parameters. By using FAQs with code snippets, developers can easily find out how things can be configured. We've been pairing up with developers to go through our migration guide and migrate their first service; after that, they are quickly able to do others by themselves.
  5. We’ve collaborated with developers to determine sensible defaults, and fine-tuned the charts based on their feedback. The average service’s values file went from about 200 lines to around 20 lines.
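As an illustration of point 1, a unit test for the service chart could look like this with the helm-unittest plugin (a sketch; the template path and asserted values are hypothetical):

# tests/deployment_test.yaml -- helm-unittest style test
suite: service deployment
templates:
  - templates/init.yaml
tests:
  - it: renders a Deployment with the default replica count
    asserts:
      - isKind:
          of: Deployment
        documentIndex: 0
      - equal:
          path: spec.replicas
          value: 2
        documentIndex: 0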

We are really happy with the success of our new Helm charts and look forward to exploring what else they make possible. Stay tuned as we continue to build on this foundation.

Read also: How does funda actually work technically?

If you have any questions, thoughts or suggestions about this blogpost, please let Maurits know by emailing maurits@funda.nl.
