There are two ways to run Kubernetes in the Amazon cloud: Amazon provides a managed Kubernetes service (EKS), or you can launch instances and set up your own unmanaged cluster.

The advantages of EKS are fairly obvious:

  • Tight integration with Amazon services.
  • Amazon manages your Kubernetes cluster control plane.
  • Amazon manages control plane upgrades.
  • Integrated Amazon VPC Container Network Interface (CNI) networking.

On the other hand there are advantages to building your own cluster:

  • Ability to move to newer versions of Kubernetes more quickly than Amazon.
  • Ability to use spot and reserved instances for nodes in your cluster.
  • Flexibility of cluster setup and configuration.
  • Ability to maintain similar cluster configurations in multiple clouds and bare metal.

See Kubernetes on AWS: Tutorial and Best Practices for Deployment for more information.

AWS VPC Native Networking Pros and Cons

EKS makes getting started with Kubernetes on Amazon easy. It also comes with built-in VPC CNI integration, though without a network policy engine. Furthermore, EKS has now implemented hooks into VPC CNI that allow users to install the Calico policy engine on top of the pre-installed VPC CNI plugin.

Advantages of VPC CNI

Amazon native networking provides a number of significant advantages over some of the more traditional overlay network solutions for Kubernetes:

  • Raw AWS network performance.
  • Integration of tools familiar to AWS developers and admins, like AWS VPC flow logs and security groups, allowing users with existing VPC networks and networking best practices to carry those over directly to Kubernetes.
  • The ability to enforce network policy decisions at the Kubernetes layer if you install Calico.

If your team has significant experience with AWS networking, and/or your application is sensitive to network performance, all of this makes VPC CNI very attractive.

Disadvantages of VPC CNI

On the other hand, there are a few limitations that may be significant to you. There are three primary reasons why you might instead choose an overlay network:

  1. Pod density limitations.
  2. Need for encryption on the network.
  3. Multicast requirements.

VPC CNI Pod Density Limitations

First, as we mentioned briefly in part 2, the VPC CNI plugin is designed to use Elastic Network Interfaces (ENIs) to get each pod in your cluster its own IP address directly from Amazon.

This means that you will be network limited in the number of pods that you can run on any given worker node in the cluster.

In order to understand these limits, we need to dig into how VPC CNI works.

The primary IP for each ENI is used for cluster communication purposes. New pods are assigned one of the secondary IPs for that ENI. VPC CNI has a custom daemonset that manages the assignment of IPs to pods. Because ENI and IP allocation requests can take time, this l-ipam daemon maintains a warm pool of ENIs and IPs on each node and hands one of the available IPs to each new pod as it is assigned. This yields the following formula for the maximum pod density on any given instance:

ENIs * (IPs_per_ENI - 1)

Each instance type in Amazon has unique limits on the number of ENIs and IPs per ENI allowed. For example, an m5.large worker node allows 25 pod IP addresses per node, at an approximate cost of $2.66/month per pod.

Stepping up to an m5.xlarge allows a theoretical maximum of 55 pods at a monthly cost of $2.62 per pod, making the m5.xlarge the slightly more cost-effective node choice for clusters bound by IP address limitations.
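As a rough sketch of that arithmetic, the formula can be expressed in a few lines of Python. The ENI and IP-per-ENI limits below are illustrative placeholders, not real instance limits; the actual values are published per instance type in the AWS EC2 documentation:

```python
# Theoretical maximum pod density under the VPC CNI formula:
#   max_pods = ENIs * (IPs_per_ENI - 1)
# The ENI/IP limits used below are illustrative placeholders --
# look up the real limits for your instance type in the AWS docs.

def max_pods(enis: int, ips_per_eni: int) -> int:
    """One IP on each ENI is the primary (cluster) address, so only
    the secondary IPs are available for pods."""
    return enis * (ips_per_eni - 1)

def monthly_cost_per_pod(hourly_price: float, enis: int, ips_per_eni: int) -> float:
    """Approximate per-pod cost, assuming ~730 hours in a month."""
    return hourly_price * 730 / max_pods(enis, ips_per_eni)

# Example with made-up limits (3 ENIs, 10 IPs each):
print(max_pods(3, 10))                              # 27
print(round(monthly_cost_per_pod(0.096, 3, 10), 2)) # roughly 2.6
```

Plug in the published ENI limits and on-demand (or spot) prices for the instance types you are considering to reproduce comparisons like the m5.large vs. m5.xlarge one above.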

Additional Overhead Considerations

And if that set of calculations were not enough, there are a few other factors to consider. Kubernetes clusters generally run a set of services on each node. VPC CNI itself uses an l-ipam daemonset, and if you want Kubernetes network policies, Calico requires another. Furthermore, production clusters generally also run daemonsets for metrics collection, log aggregation, and other cluster-wide services. Each of these uses an IP address per node.

So now the formula is:

ENIs * (IPs_per_ENI - 1) - DaemonSets

This makes some of the cheaper instances on Amazon completely unusable because there are no IP addresses left for application pods.

On the other hand, Kubernetes itself has a supported limit of 100 pods per node, making some of the larger instances with lots of available addresses less attractive. However, the pod-per-node limit is configurable, and in my experience it can be increased without much increase in Kubernetes overhead.

EKS Cost Calculator

I’ve created a quick Google Sheet with each of the Amazon instance types and the maximum pod densities for each, based on the VPC CNI network plugin restrictions:

  • a 100 pod/node limit setting in Kubernetes,
  • a default of 4 daemonsets (2 for AWS networking, 1 for log aggregation, and 1 for metrics collection),
  • and a simple cost calculation for per-pod pricing.

Feel free to copy this sheet, adjust the parameters at the top, and figure out which instance types you should consider based on your networking requirements.
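A minimal sketch of the calculation the sheet performs, under the parameters above. The pod limit, daemonset count, instance limits, and hourly price are all inputs you would substitute with real values:

```python
# Sketch of the sheet's calculation: theoretical VPC CNI density,
# capped at a configurable pods-per-node limit, minus daemonset
# overhead. ENI limits and prices below are placeholders.

POD_LIMIT = 100   # the kubelet max-pods setting the sheet assumes
DAEMONSETS = 4    # 2 for AWS networking, 1 logging, 1 metrics

def app_pods(enis: int, ips_per_eni: int,
             pod_limit: int = POD_LIMIT,
             daemonsets: int = DAEMONSETS) -> int:
    """Application pods that fit after the cap and daemonset overhead."""
    theoretical = enis * (ips_per_eni - 1)
    return max(0, min(theoretical, pod_limit) - daemonsets)

def cost_per_app_pod(hourly_price: float, enis: int, ips_per_eni: int) -> float:
    """Approximate monthly cost per application pod (~730 hours/month)."""
    pods = app_pods(enis, ips_per_eni)
    if pods == 0:
        return float("inf")  # no room left for application pods at all
    return hourly_price * 730 / pods

# A small instance with only 2 ENIs x 4 IPs leaves very little room:
print(app_pods(2, 4))  # 2
```

The `float("inf")` case is exactly the situation described above: on the cheapest instance types, daemonset overhead consumes every available IP and there are no addresses left for application pods.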

This sheet is not intended to provide a definitive answer on pod economics for AWS VPC based Kubernetes clusters. There are a number of important caveats to the use of this sheet:

  • CPU and memory requirements will often dictate lower pod density than the theoretical maximum here.
  • Beyond daemonsets, Kubernetes system pods and other “system level” operational tools used in your cluster will consume pod IPs and limit the number of application pods you can run.
  • Each instance type also has network performance limitations, which may impact performance well before theoretical pod limits are reached.

This sheet does not factor in EKS costs, storage costs, bandwidth utilization costs, etc., but it does generate some useful insights into EKS node selection.

There are several node types which will severely limit your pod density if you are running lightweight microservices. In particular, the T2 line of instance types should be avoided because of the low pod density limits.

Furthermore, because node IP addresses have a cool-down period with VPC CNI, because VPC subnets have maximum IP address limits, and because there is the possibility of IP address fragmentation, real-world pod densities can be lower than those listed in the sheet.

The good news is that, in spite of the many warnings about pod density limits with VPC CNI, there are several solid options for node types that will provide cost effective density using AWS-VPC for many use cases.

If you are interested in a more detailed node selection model that takes into account memory, CPU, network throughput, and IP fragmentation, and other factors that might limit density in your cluster, please feel free to reach out to us via e-mail (support@weave.works) or on the Weaveworks community slack.

Encryption and Overlay Networks

If you have a requirement for encrypted transport within the network, this can be provided by a CNI overlay networking solution. VPC provides network-level isolation via multiple subnets and custom routing rules. Traffic can be further isolated using security groups, but given the current VPC CNI limitation that all ENIs on a node be launched with the same subnet and security group, the need to encrypt traffic at the pod level is still a real concern for many.

Weave Net can be run on EKS using these instructions, and it is the only overlay network plugin available to you if you need pod network encryption.

Of course, encryption comes at some cost in terms of CPU utilization, network throughput, and latency. A recent blog post discussing the performance of various overlay networking solutions (on a bare metal 10G network, not AWS) pointed out that only Weave Net provides encryption. However, it fails to account for that fact when testing performance, comparing Weave Net’s encrypted communication against other overlay networks running unencrypted.

Multicast Networks on AWS

Another requirement that can simplify your networking choices is multicast support. AWS VPC does not support multicast, and again Weave Net is the only overlay network that provides multicast support.

Cloud Portability

The fourth advantage often cited for using an overlay network on Amazon is cloud portability.

Hybrid cloud, disaster recovery, and other requirements often push users away from custom cloud vendor solutions and towards open solutions.

However, in this case the level of lock-in is quite low. The CNI layer provides a level of abstraction on top of the underlying network, and it is possible to deploy the same workloads using the same deployment configurations with different CNI backends. However, as you can see from the above discussion, the VPC CNI plugin on Amazon does impose some complications and restrictions and requires a good bit of Amazon VPC knowledge, which is not portable.

Overlay Networking Options

There are a number of different overlay networking possibilities on Amazon, and Weaveworks is the sponsor of Weave Net, one of the more popular options. As I mentioned earlier, Weave Net is the only option if you need either encryption or multicast support, and it is a solid, community-supported overlay networking option.

However, we are not tied to a single CNI provider. We recently released a popular new product, Weave Kubernetes Support, which provides enterprise-grade support for Kubernetes users. Rather than try to provide a rich evaluation of all the overlay networking options available to you in this post, let me give you some of the reasoning behind our decision about which CNI plugins to support.

For WKS we support Amazon VPC CNI, including the Calico policy engine, because for many customers (particularly those with network latency- or throughput-sensitive workloads, and those with workloads heavyweight enough to be memory or CPU bound) that is the best solution.

Of course we also support Weave Net for overlay networking on all the clouds as well as bare metal, not just because we created it, but because:

  • it provides encryption and multicast support (even where the base networking layer does not), and
  • it works everywhere, providing a solid platform for hybrid and multi-cloud solutions.

And finally, if there is a need to avoid the overhead of an L2 solution (like Weave Net, flannel, etc.), we also support Calico networking, which uses BGP to program routing tables and avoids packet encapsulation. This allows for lower overhead, but at some cost in terms of IP addresses, since Calico assigns a block of addresses to each node in the cluster. It has algorithms for reducing the address fragmentation this causes, but it nonetheless uses more IPs than traditional overlay approaches, which generally only consume the addresses they actually use.
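As a rough illustration of that trade-off, here is a back-of-the-envelope comparison. It assumes Calico’s default /26 IPAM block size (which is configurable) and purely illustrative cluster numbers:

```python
# Rough comparison of pod-network address consumption:
# Calico IPAM reserves whole blocks of addresses per node up front
# (a /26, i.e. 64 addresses, by default -- configurable), while
# traditional overlays only consume an address per running pod.

import math

CALICO_BLOCK = 64  # addresses in a /26 block: 2 ** (32 - 26)

def calico_ips_reserved(nodes: int, pods_per_node: int) -> int:
    # Each node reserves whole blocks, even for a handful of pods.
    blocks_per_node = math.ceil(pods_per_node / CALICO_BLOCK)
    return nodes * blocks_per_node * CALICO_BLOCK

def overlay_ips_used(nodes: int, pods_per_node: int) -> int:
    # A typical overlay consumes only the addresses pods actually use.
    return nodes * pods_per_node

# 50 nodes at 20 pods each:
print(calico_ips_reserved(50, 20))  # 3200
print(overlay_ips_used(50, 20))     # 1000
```

The gap matters mostly when your pod network CIDR is tight; with a generously sized pod CIDR, the block reservation overhead is rarely a practical problem.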

This provides us with what we believe are the best performing, most feature-rich solutions for each segment, allowing us to provide the optimal networking experience for all our customers.

Conclusions

So, what does all this mean for you? What is the right solution for your AWS cluster? We’ve covered a lot of ground, but in the end, asking yourself a few simple questions will get you to the right solution fairly quickly.

Do you have a large investment in AWS VPC already?

If yes, the VPC CNI plugin will allow you to connect your cluster to other AWS VPC workloads quickly and easily. You will already be familiar with flow logs, security groups, and the rest of the AWS networking stack, and will be able to use the same tools in managing your Kubernetes cluster.

Do you have workloads that are particularly sensitive to network performance?

If yes, again, the VPC CNI plugin will provide the best networking performance.

Do you have lightweight workloads that will push the pod density limits of VPC CNI?

If yes, you should look at an overlay network provider.

Are you going to push up against the IP address limits of VPC?

If yes, an overlay network makes sense.

Do you need multicast or encryption?

If yes, Weave Net is likely your best choice.

Hopefully you found this deep dive into networking on AWS to be helpful and informative. If you have any questions, please feel free to reach out to us.