Kubernetes allows you to specify which pods can communicate with which other pods. Policy is defined in configuration files and passed by Kubernetes to a “network policy controller”. If you’re using Weave Net as the pod network layer, it comes bundled with a network policy controller that manages and enforces the rules you set up in Kubernetes.
And with Weave Cloud you can visualize your container networks and services from a single dashboard. In addition, Prometheus monitoring can be configured to alert you about any suspicious network activity.
The diagram below is representative of a typical computer network. On the right there is the network, and on the left, the threat. In the past, this problem was solved by putting a firewall in between these two things.
But what happens if an attacker gets past that single firewall? It doesn’t even have to be an outside hacker: the thing attacking your system could be your own code. You may have accidentally released a development version into production, or inadvertently introduced a bug, and that counts as a threat as well.
In a static environment, one way around this is to add more firewalls to your network. But what if the nodes in this diagram are not hosts, but containers?
Containers start and stop constantly, auto-scale, and can be running anywhere, often across multiple environments: dev, test, and production. They’re in the cloud, so you don’t know exactly where your machines are; what you do know is that they are all connected to each other.
The traditional setup of statically configured firewalls just doesn’t cut it anymore. But with Weave Net, firewall rules can be applied to every link into every container.
To illustrate what to block and what to allow, consider this typical legacy app that is split into three traditional tiers: Presentation, Middle and Data.
For this app, you want to set up a rule that says pods in the presentation tier can only be accessed on port 80.
With a rule like this, every pod in the presentation tier allows ingress over TCP on port 80. By omitting any other settings, everything else is denied by default.
A YAML file describes this rule. It defines a policy called presentation-policy, which says that pods labelled tier: presentation can only be reached over TCP on port 80:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: presentation-policy
spec:
  podSelector:
    matchLabels:
      tier: presentation
  ingress:
  - ports:
    - protocol: TCP
      port: 80
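As an aside, the deny-by-default behaviour can also be stated explicitly with an empty policy. This is a minimal sketch using the standard NetworkPolicy API; the policy name default-deny-ingress is hypothetical:

```yaml
# Selects every pod in the namespace and allows no ingress,
# so all inbound traffic is denied unless another policy allows it.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress   # hypothetical name
spec:
  podSelector: {}              # empty selector = all pods in the namespace
  ingress: []                  # no allow rules = deny all ingress
```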
But what if you want to create a rule that says: only allow traffic from the presentation tier into the middle tier?
In this case, nothing about ports is defined, which means the pods are effectively free to talk on any port. Instead, a pod selector rule indicates that traffic into the middle tier may only come from the presentation tier. This prevents anyone outside the system from reaching the middle tier directly.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: middle-tier-policy
spec:
  podSelector:
    matchLabels:
      tier: middle
  ingress:
  - from:
    - podSelector:
        matchLabels:
          tier: presentation
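These selectors match pod labels, so the policies only take effect if the pods actually carry a tier label. A sketch of what that looks like on a pod; the pod name and image are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: middle-1            # hypothetical pod name
  labels:
    tier: middle            # matched by the podSelector in middle-tier-policy
spec:
  containers:
  - name: app
    image: example/middle:latest   # hypothetical image
```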
In Kubernetes, the master node runs the API server. The master has all of the information about what’s running: all the nodes, all the pods, and all the policies.
Weave Net provides a program called the Weave NPC (network policy controller). The NPC runs once on every host and filters traffic according to the rules set up in the YAML files. Weave Net does this by talking to iptables, the packet-filtering framework built into Linux.
At the top-level FORWARD chain, a rule is injected that sends traffic to the WEAVE-NPC chain; if no policy accepts a packet, it is dropped. If it does pass, or it belongs to an already established connection, the packet is accepted.
Only packets opening new connections are checked against the policy chains. A packet on an already established connection is accepted immediately; a new connection is checked against the default and ingress chains.
FORWARD chain:
-o weave -j WEAVE-NPC
-o weave -j DROP

WEAVE-NPC chain:
-m state --state RELATED,ESTABLISHED -j ACCEPT
-m state --state NEW -j WEAVE-NPC-DEFAULT
-m state --state NEW -j WEAVE-NPC-INGRESS
The policy’s ipsets, and with them the iptables rules, are updated as pods come and go.
Weave begins with the source address on the network, whose traffic goes over a Linux bridge. In the course of traversing that bridge, the connection is checked against the iptables rules.
ipset is an add-on module for iptables that lets you check sets of IP addresses all at once. This is particularly useful for large systems like Kubernetes, where there might be thousands of pods, and thousands of combinations of source and destination IP addresses that could be either accepted or rejected.
Instead of writing a rule for each of those addresses, which would get slower and slower since matching is a linear search, ipset puts the same information (source, destination, and port) into a hash table and matches against it in an approximately constant-time operation. If the packet matches, it is sent on to its destination.
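To make the mechanics concrete, here is a rough sketch of what an ipset-backed rule looks like, separate from anything Weave NPC generates automatically. The set name allowed-ingress is hypothetical, and the commands require root:

```shell
# Create a hash-keyed set of (destination IP, destination port) pairs
# (hypothetical set name; requires root)
ipset create allowed-ingress hash:ip,port
ipset add allowed-ingress 10.32.0.5,tcp:80

# A single iptables rule then checks the whole set at once,
# instead of one linear-search rule per address
iptables -A WEAVE-NPC-INGRESS \
  -m set --match-set allowed-ingress dst,dst -j ACCEPT
```

Growing the set to thousands of entries adds no further iptables rules; the lookup cost stays roughly constant.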
There are two kinds of matches: one that matches only on the destination, used by the default set, and another, in the ingress chain, where both the source and the destination are matched.
Returning to the policy definition: whether those matches are enforced depends on what you specified about sources and destinations in the YAML file.
If you specify the wrong rules, the NPC will drop your traffic.
There is an additional top-level rule that logs dropped packets via ulogd (the netfilter logging daemon), which the policy controller instruments.
ulogd subscribes to the events for connections that failed a rule and were dropped. Those dropped-connection events are then exported as Prometheus metrics, where they can be viewed within Weave Cloud.
If you suddenly see a lot of rejected connections, it means you either have some sort of attacker in your system or you misconfigured it. Both of those events are worth monitoring for.