Securing Kubernetes Cluster Networking

Network Policies are a new Kubernetes feature for configuring how groups of pods are allowed to communicate with each other and with other network endpoints. In other words, they create firewalls between pods running on a Kubernetes cluster. This guide is meant to explain the unwritten parts of Kubernetes Network Policies.

This feature became stable in the Kubernetes 1.7 release. In this guide, I will explain how Network Policies work in theory and in practice. You can jump directly to the kubernetes-networkpolicy-tutorial repository for examples of Network Policies, or read the documentation.

# What can you do with Network Policies

By default, Kubernetes does not restrict traffic between pods running inside the cluster. This means any pod can connect to any other pod as there are no firewalls controlling the intra-cluster traffic.

Network Policies give you a way to declaratively configure which pods are allowed to connect to each other. These policies can be detailed: you can specify which namespaces are allowed to communicate or, more specifically, choose which port numbers each policy applies to.
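As a rough sketch of the port-level case, a manifest along the following lines would admit connections to the selected pods only on one TCP port, and only from pods carrying a given label. The policy name, the labels (app: apiserver, role: monitoring) and port 5000 are made up for illustration.

```yaml
# Illustrative only: the name, labels and port number are assumptions.
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: api-allow-5000
spec:
  # Target pods: the policy applies to pods labeled app=apiserver.
  podSelector:
    matchLabels:
      app: apiserver
  ingress:
  # A single rule: the source must match "from" AND the port must match "ports".
  - ports:
    - port: 5000
      protocol: TCP
    from:
    - podSelector:
        matchLabels:
          role: monitoring
```

Because the ports and from fields sit in the same rule, both conditions have to hold for a connection to be allowed.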

You cannot enforce policies for outgoing (egress) traffic from pods using this feature today. It’s on the roadmap for Kubernetes 1.8.

In the meantime, the Istio open source project is an alternative that supports egress policies and much more, with native Kubernetes support.

# Why are Network Policies cool

Network Policies are a fancy way of saying ACLs (access control lists), which have been used in computing for many decades. This is Kubernetes’ way of doing ACLs between pods. Just like any other Kubernetes resource, Network Policies are configured via declarative manifests. They are part of your application: you can version them in your source repository and deploy them to Kubernetes along with your applications.

Network Policies are applied in near real-time. If you have open connections between pods, applying a Network Policy that forbids those connections will cause them to be terminated immediately. This near real-time enforcement comes with a small performance penalty on the networking; read this benchmark to learn more.

# Example Use Cases

Below is a brief list of common use cases for Network Policies. You can find more use case examples with sample manifests at the kubernetes-networkpolicy-tutorial on GitHub.

# How is Network Policy enforced

The implementation of Network Policy is not part of core Kubernetes. Although you can submit a NetworkPolicy object to the Kubernetes master, it will not be enforced unless your network plugin implements network policy.

Please see this page for examples of network plugins that support network policy. Some examples of network plugins supporting policies are Calico and Weave Net.

Google Container Engine (GKE) provides alpha support for Network Policies by pre-installing the Calico network plugin in the cluster for you.

Network Policies apply to connections, not network packets. Note that connections allow bi-directional transfer of network packets. For example, if Pod A can connect to Pod B, Pod B can reply to Pod A on the same connection. This doesn’t mean Pod B can initiate connections to Pod A.

# Anatomy of a NetworkPolicy

NetworkPolicy is just another object in the Kubernetes API. You can create many policies for a cluster. A NetworkPolicy has two main parts:

Target pods: Which pods should have their ingress (incoming) network connections enforced by the policy? These pods are selected by their labels.
Ingress rules: Which pods can connect to the target pods? These pods are also selected by their labels, or by their namespace.

Here is a more concrete example of a NetworkPolicy manifest:

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: api-allow
spec:
  podSelector:
    matchLabels:
      app: bookstore
      role: api
  ingress:
  - from:
      - podSelector:
          matchLabels:
            app: bookstore
  - from:
      - podSelector:
          matchLabels:
            app: inventory

This sample policy allows pods with app=bookstore or app=inventory labels to connect to the pods with labels app=bookstore and role=api. You can read this as “give microservices of bookstore application access to the bookstore API”.
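The “or” in that reading comes from the fact that entries in the ingress list are additive: a connection is allowed if it matches any one of them. The same holds for the entries of a single from list, so, as far as I can tell, the policy above could equally be sketched with one rule and two sources. The name api-allow-single-rule is made up to distinguish it from the original.

```yaml
# Alternative form of the sample policy above; the policy name is illustrative.
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: api-allow-single-rule
spec:
  podSelector:
    matchLabels:
      app: bookstore
      role: api
  ingress:
  # One rule whose "from" list has two sources; matching either one is enough.
  - from:
    - podSelector:
        matchLabels:
          app: bookstore
    - podSelector:
        matchLabels:
          app: inventory
```

Note the contrast with the target podSelector, where the two labels under one matchLabels must both be present on a pod for it to be selected.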

# How are Network Policies evaluated

Although the design document and the API reference for Network Policies may seem complicated, I managed to break it down to a couple of simple rules:

If a NetworkPolicy selects a pod, traffic destined to that pod will be restricted.
If there is no NetworkPolicy defined for a pod, all pods in all namespaces can connect to that pod. In other words, a pod that no Network Policy selects has an implicit “allow all”.
If traffic to Pod A is restricted and Pod B needs to connect to Pod A, there should be at least one NetworkPolicy selecting Pod A that has an ingress rule selecting Pod B.
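The first rule is easiest to see with a policy that selects pods but lists no ingress rules at all. A minimal sketch, assuming a hypothetical app: web label and policy name:

```yaml
# Illustrative only: the name and the app=web label are assumptions.
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: web-deny-all
spec:
  # Selecting the pods is what switches them from "allow all" to restricted.
  podSelector:
    matchLabels:
      app: web
  # No ingress rules: nothing is allowed to connect to the selected pods.
  ingress: []
```

Once a policy like this is in place, every connection you want to permit has to be granted by another policy that selects the same pods, as described in the third rule. Using an empty podSelector ({}) instead would select every pod in the namespace.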

Things get a bit complicated when cross-namespace networking gets involved. In a nutshell, here is how it works:

Network Policies can enforce rules only for connections to Pods that are in the same namespace as the NetworkPolicy is deployed in.
podSelector of an ingress rule can only select pods in the same namespace the NetworkPolicy is deployed in.
If Pod A needs to connect to Pod B in another namespace and traffic to Pod B is restricted, there needs to be a policy in Pod B’s namespace that selects Pod B and has a namespaceSelector matching the namespace Pod A runs in.
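Here is a sketch of the cross-namespace case. The namespace name data, the app: db label and the purpose: production namespace label are all hypothetical; the policy lives in the namespace of the pods it protects and admits traffic from namespaces carrying the chosen label.

```yaml
# Illustrative only: namespace, names and labels are assumptions.
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: db-allow-from-production
  namespace: data            # must be the namespace of the target pods
spec:
  podSelector:
    matchLabels:
      app: db
  ingress:
  - from:
    # Matches namespaces by their labels, not pods by theirs.
    - namespaceSelector:
        matchLabels:
          purpose: production
```

Namespaces are matched by label, so the source namespace has to be labeled first, for example with kubectl label namespace <name> purpose=production. A namespaceSelector used on its own like this admits every pod in the matching namespaces.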

# Are Network Policies Real Security?

Network Policies restrict pod-to-pod networking, which is part of securing your cluster traffic and applications. They are not firewalls that perform deep packet inspection.

You should not rely solely on Network Policies for securing traffic between pods in your cluster. Methods such as TLS (transport layer security) with mutual authentication give you the ability to encrypt the traffic and authenticate microservices to each other.

Take a look at the Google Cloud Security Whitepaper (emphasis mine):

Defense in depth describes the multiple layers of defense that protect Google’s network from external attacks. Only authorized services and protocols that meet our security requirements are allowed to traverse it; anything else is automatically dropped. Industry-standard firewalls and access control lists (ACLs) are used to enforce network segregation. All traffic is routed through custom GFE (Google Front End) servers to detect and stop malicious requests and Distributed Denial of Service (DDoS) attacks. Additionally, GFE servers are only allowed to communicate with a controlled list of servers internally; this “default deny” configuration prevents GFE servers from accessing unintended resources. […]

Data is vulnerable to unauthorized access as it travels across the Internet or within networks. […] The Google Front End (GFE) servers mentioned previously support strong encryption protocols such as TLS to secure the connections between customer devices and Google’s web services and APIs.

As I said earlier, service mesh projects like Istio and linkerd offer promising advancements in this area. For example, Istio can encrypt traffic between your microservices using TLS and enforce network policies transparently without changing your application code.

# Learn more

If you are interested in trying out Network Policies, the easiest way to get started is to create a GKE cluster. You can also read:

Network Policy documentation
Network Policy Design document: This captures the intent of the feature, but it does not fully reflect how the feature was implemented.
Network Policy API reference
NetworkPolicy benchmark from the Romana network plugin.

Thanks to Matthew DeLio and Daniel Nardo for reviewing drafts of this article.