How Kubernetes Initializers work

If I were to point out one reason why Kubernetes is taking off, I would probably say it is its awesome community. The second reason would be the flexibility of the Kubernetes API and how easy it is to write custom extensions or plugins on top of it. In this article, I’ll dig deep into a new concept: Initializers, which are a dynamic and pluggable way of modifying Kubernetes resources before they are actually created.

Initializers are already here as an alpha feature in Kubernetes 1.7. For example, we use Initializers at Google Container Engine to extend the Kubernetes feature base and you can do the same by implementing new initializers suiting your needs, too.

Until now, Kubernetes only had Admission Controller plug-ins to intercept resources before they are created. For example, you can have an admission plug-in that requires all container images to come from a particular registry and prevents other images from being deployed in pods. There are quite a few admission controllers providing functionality such as enforcing limits, applying pre-create checks, and setting up default values for missing fields.

The problems with admission controllers are:

They’re compiled into Kubernetes: If what you’re looking for is missing, you need to fork Kubernetes, write the admission plugin and keep maintaining a fork yourself.
You need to enable each admission plugin by passing its name to the --admission-control flag of kube-apiserver. In many cases, this means redeploying a cluster.
Some managed cluster providers may not let you customize API server flags, therefore you may not be able to enable all the admission controllers available in the source code.

Dynamic/external Admission Controllers are proposed to address these problems. Currently there are two types of such plugins: Initializers and webhooks. Initializers are similar to admission controller plug-ins because they let you intercept the resource before it is created. They’re different from admission controller plug-ins because they are not part of the Kubernetes source tree or compiled into it; you need to write a controller yourself.

# What can you do with Initializers?

When you intercept Kubernetes objects before they are created, the possibilities are endless: You can mutate the objects in any way you like, or prevent the objects from being created.

Here are some ideas for initializers, each of which enforces a particular policy in your cluster:

Inject a proxy sidecar container into the pod if it has port 80, or has a particular annotation.
Inject a volume with test certificates into all pods in the test namespace automatically.
If a Secret is shorter than 20 characters (probably a password), prevent its creation.

If you’re not planning to modify the object and are intercepting it only to read it, webhooks might be a faster and leaner alternative for getting notified about the objects. Make sure to check out this example of a webhook-based admission controller.

Some of the functionality I listed above, such as injecting a sidecar container or a volume, can be achieved using Pod Presets, with less flexibility. If Pod Presets work for you, you probably should not bother with developing an Initializer.

# Anatomy of Initialization

Configure which resource types need initialization: InitializerConfiguration objects let you configure which initializers should be assigned to which type of resources.

For example, you can create one that says: add the “myproxy” initializer to objects of type apps/v1beta1.Deployment and v1.DaemonSet. You can create as many InitializerConfigurations as you want and they apply to all namespaces.
API server will assign initializers to the new resources: When you submit a Deployment object to the apiserver, it will update the Deployment’s metadata.initializers.pending and add the “myproxy” value there. This field shows the list of initializers currently assigned to the resource.

To be accurate, it’s not the apiserver adding the initializers. There’s an admission controller plugin called “Initializers”, which makes this whole initialization story possible. It’s enabled by adding Initializers to the --admission-control flag of kube-apiserver.
You will write a controller to watch for the resources: This custom controller you developed and deployed to the cluster uses the Watch API to listen for new resources, capture them and make the modifications you need.
Wait for your turn to modify the resource: Once your controller intercepts an object through the Watch API, it should only modify the object if it sees its name on the first element of the initializer list (metadata.initializers.pending[0]). Otherwise, it means it is some other initializer’s turn to modify the resource and it should skip modifying for now.
Finish modifying, yield to the next initializer: Once you finish modifying the resource, your controller should remove its name from the metadata.initializers.pending list of the object, and save the object back to the API server.
No more initializers, resource ready to be realized: When the Kubernetes API server sees that the object has no more pending initializers, it considers the object “initialized”. Now the Kubernetes scheduler and other controllers can see the fully initialized object and make use of it.

You can have multiple initializers running on your cluster simultaneously. Each of these custom controllers will get notified about modifications to the resources (e.g. Pods) they subscribed to watch, but they wait for their turn to modify the object until they see their name at the top of the list.
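The turn-taking rule from the steps above can be sketched in a few lines of Go. This is a minimal illustration only: the `Initializers` struct and the `isMyTurn` and `removeSelf` helpers are hypothetical, simplified stand-ins written for this article, not the real Kubernetes API types or client calls.

```go
package main

import "fmt"

// Initializers is a simplified, hypothetical stand-in for the
// metadata.initializers field described above. It is NOT the real API type.
type Initializers struct {
	Pending []string
}

// isMyTurn reports whether the named initializer is first in the pending list.
func isMyTurn(inits *Initializers, name string) bool {
	return inits != nil && len(inits.Pending) > 0 && inits.Pending[0] == name
}

// removeSelf returns the state after the first pending entry has been removed.
// It returns nil when nothing is left, which stands for "fully initialized".
// The input is not modified.
func removeSelf(inits *Initializers) *Initializers {
	if inits == nil || len(inits.Pending) <= 1 {
		return nil
	}
	rest := make([]string, len(inits.Pending)-1)
	copy(rest, inits.Pending[1:])
	return &Initializers{Pending: rest}
}

func main() {
	obj := &Initializers{Pending: []string{"myproxy", "certinjector"}}

	fmt.Println(isMyTurn(obj, "certinjector")) // false: not first in line yet
	fmt.Println(isMyTurn(obj, "myproxy"))      // true

	next := removeSelf(obj)
	fmt.Println(next.Pending)            // [certinjector]
	fmt.Println(removeSelf(next) == nil) // true: no initializers left
}
```

In a real controller, the result of `removeSelf` would be written back to the API server together with whatever modifications the initializer made to the object.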

# Initializers: Under the covers

You can develop and deploy initializers without knowing how they are actually implemented under the covers in the Kubernetes API server. Let me discuss in a bit more detail how the feature is implemented in the Kubernetes API:

The Kubernetes community started looking at providing such extension points in the form of hooks early on. First proposed here, the feature looked somewhat similar to how it is actually implemented. The later proposal clarifies and explains how the machinery for this feature is expected to work. Definitely read the linked proposal and the pull request if you’re trying to understand how it works under the covers.

In a nutshell, when a Pod resource is submitted to the API and is assigned a list of pending initializers, it will not be scheduled until the initialization is complete. The Kubernetes scheduler is yet another controller: it watches for Pods to show up in the API server and assigns each of them to a Node (learn more about scheduling).

So, how come the Scheduler and other controllers can’t see this object before it’s initialized, even though the object is saved on the API server (and in the etcd database) and is visible to some other controllers (i.e. your initializers)?

The answer is a request parameter called includeUninitialized. This parameter defaults to false and therefore the API hides the uninitialized objects from the default clients (e.g. kubectl) and controllers (e.g. the scheduler) in requests like WATCH or LIST. The initializers you develop must set the ?includeUninitialized=true query parameter to observe these objects.
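As a rough illustration of what such a request looks like on the wire, here is a sketch that builds a watch URL carrying the includeUninitialized parameter using only Go’s standard library. The `watchURL` helper and the `kubernetes.example.com` address are assumptions made for this example; a real initializer would normally set the equivalent option through a Kubernetes client library rather than assemble URLs by hand.

```go
package main

import (
	"fmt"
	"net/url"
)

// watchURL is a hypothetical helper that builds a watch request URL for a
// resource collection. The base address and path are supplied by the caller.
func watchURL(base, path string, includeUninitialized bool) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = path

	q := url.Values{}
	q.Set("watch", "true")
	if includeUninitialized {
		// Without this parameter, uninitialized objects stay hidden.
		q.Set("includeUninitialized", "true")
	}
	u.RawQuery = q.Encode() // Encode sorts parameters by key

	return u.String(), nil
}

func main() {
	// kubernetes.example.com is a placeholder, not a real API server address.
	s, err := watchURL("https://kubernetes.example.com", "/apis/apps/v1beta1/deployments", true)
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
	// https://kubernetes.example.com/apis/apps/v1beta1/deployments?includeUninitialized=true&watch=true
}
```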

Initializers block Create requests. When a request to CREATE an object is submitted to the API server, it does not return right away; the request blocks until the initialization is complete. If you are using kubectl and an object gets stuck in the uninitialized state, you’ll notice kubectl times out after 30 seconds.

# Advantages and disadvantages

Admission Control plug-ins are compiled into the Kubernetes API server. To add a new plug-in, you need to fork the Kubernetes source tree, develop your plugin on top of it, and keep maintaining the fork. Initializers, on the other hand, are developed outside the Kubernetes source tree. You can easily develop one using the Kubernetes API clients. They run on the cluster just like any other workload, and you have fewer things to worry about.

Initializers are extremely flexible: Once you get a hold of an object before it is actually created, the sky’s the limit. Note that this flexibility also makes it easy to shoot yourself in the foot. You should limit each initializer to one task, and make sure your initializers don’t step on each other’s toes.

Writing an Initializer is easy: Check out this example by Kelsey Hightower, which adds a sidecar container to a Pod based on an annotation. It’s about 200 lines of Go code and yet it automates a non-trivial task very well. You can go ahead and develop one yourself right now. But what about developing a production-grade initializer and running it in production? That can be a little more challenging.

Uptime of Initializers is a big deal: When an Initializer goes offline, it will still get assigned to initialize new resources. These resources will get stuck in “uninitialized” state indefinitely unless the initializer comes back. This can have real live-site implications: If you have an initializer for pods, and if your initializer goes offline during a scale-up event, the new pods will not be created, which may cause auto-scale operations to fail and lead to outages.

It’s still early: Initializers are at alpha in Kubernetes v1.7 at the time of writing, and they are targeting beta in v1.8. You need to enable alpha flags in your cluster to start using this feature today. Also note that many of the things I explain in this article may not apply to the stable version of the feature.

# Developing your own Initializer

The easiest way to get started is to fork this initializer example by Kelsey Hightower, which is written in Go and adds a sidecar container to Deployment objects based on the presence of an annotation.

In the source code you should note a few things: the List/Watch functions specifying IncludeUninitialized=true and targeting all namespaces, the informer and its resync period, how the API object is cloned and mutated, and how the update is performed using a PATCH.
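To make the clone-then-mutate idea concrete, here is a minimal sketch of the annotation check and sidecar injection. It is not the code from the linked example: the `Deployment` and `Container` structs are heavily simplified stand-ins for the real API types, and the `injectSidecar` function and the `example.com/inject-proxy` annotation are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// Container and Deployment are simplified, hypothetical stand-ins for the
// real Kubernetes API types, kept small for illustration.
type Container struct {
	Name  string
	Image string
}

type Deployment struct {
	Annotations map[string]string
	Containers  []Container
}

// injectSidecar returns a copy of d with the sidecar appended when the given
// annotation is present, and reports whether anything changed. When the
// annotation is absent it returns d as-is. The original is never mutated.
func injectSidecar(d Deployment, annotation string, sidecar Container) (Deployment, bool) {
	if _, ok := d.Annotations[annotation]; !ok {
		return d, false
	}

	// Clone the map and slice so the caller's object stays untouched.
	out := Deployment{
		Annotations: make(map[string]string, len(d.Annotations)),
		Containers:  make([]Container, 0, len(d.Containers)+1),
	}
	for k, v := range d.Annotations {
		out.Annotations[k] = v
	}
	out.Containers = append(out.Containers, d.Containers...)
	out.Containers = append(out.Containers, sidecar)
	return out, true
}

func main() {
	d := Deployment{
		Annotations: map[string]string{"example.com/inject-proxy": "true"},
		Containers:  []Container{{Name: "app", Image: "example/app:1.0"}},
	}
	sidecar := Container{Name: "proxy", Image: "example/proxy:1.0"}

	out, changed := injectSidecar(d, "example.com/inject-proxy", sidecar)
	fmt.Println(changed, len(d.Containers), len(out.Containers)) // true 1 2
}
```

Keeping the original object untouched means the before and after versions can be compared to produce the PATCH. Whether or not the annotation is present, the initializer still has to remove its own name from the pending list so the object is not left waiting.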

Things to watch out for while developing an initializer:

Make sure your initializer doesn’t go down. As I mentioned earlier, the best you can do is to make sure you have a liveness probe, plus monitoring/alerting on it. This is especially important if you have initializer(s) for all pods.
Chicken-and-egg problem: If you have an initializer for pods/deployments, deploying the initializer may block because it can’t initialize itself. You need to manually set the pending initializers list to an empty array ([]) in the pod manifest.
Initializers are normal workloads, but you should deploy them to a different namespace from your normal workloads. The built-in kube-system namespace might be a good place to host your initializers.
Initializers may receive incomplete objects: For example, a Pod that is not yet scheduled may have some fields missing, such as “nodeName” or “status”. You should test your initializer with this assumption.
Initializers may be applied in a different order: The list of pending initializers may be ordered differently every time, so your implementation should work fine with any order.
Make sure your initializer handles the object quickly. The initialization blocks the request creating the resource.

# Get an Initializers-enabled cluster today

Until the Initializers feature becomes stable (which makes them enabled by default), you can get an alpha cluster from Google Container Engine by running:

gcloud container clusters create my-cluster \
    --enable-kubernetes-alpha \
    --cluster-version 1.7.2

You can use this cluster to deploy your first initializer. Note that alpha clusters will delete themselves after 30 days.

# Further reading

If Initializers are an intriguing topic for you, check out the resources linked below. Initializers are probably the most practical and easiest way to extend the Kubernetes API. They are quite flexible and I am interested in hearing what sort of ideas you will come up with.

Thanks to Aparna Sinha and Chao Xu for reviewing drafts of this article.