Kubernetes - Inter-Pod Affinity And Anti-Affinity

Luc Cao

Inter-pod affinity

Inter-pod affinity means that a pod wants to be scheduled in the same topology domain (for example, the same node or availability zone) as pods that match a given label selector.

spec:
  affinity:
    podAffinity:
      # hard rule: only schedule in a zone that already runs a pod labeled app=web
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - web
        topologyKey: topology.kubernetes.io/zone
      # soft rule: prefer a zone that runs a pod labeled app=backend
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - backend
          topologyKey: topology.kubernetes.io/zone

requiredDuringSchedulingIgnoredDuringExecution – the conditions must be satisfied for the pod to be scheduled. This is also called a hard requirement.

preferredDuringSchedulingIgnoredDuringExecution – if a condition can be satisfied, it will be; if not, it is ignored. This is also called a soft requirement.

podAffinityTerm – the pod affinity term defines which pods we select with a label selector and which node topology key we target.

Inter-pod anti-affinity

Inter-pod anti-affinity is the opposite of affinity: pods don’t want to be in the same topology domain as their matching pods.

In the example below, the pod would not be placed on a node that already runs a pod with the label app=web, because the hostname topology key makes each node its own topology domain.

spec:
  affinity:
    podAntiAffinity:
      # hard rule: never schedule on a node that already runs a pod labeled app=web
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - web
        topologyKey: kubernetes.io/hostname
      # soft rule: avoid nodes that run a pod labeled app=backend, if possible
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - backend
          topologyKey: kubernetes.io/hostname


  • A soft requirement nests the affinity term under a separate podAffinityTerm property and adds a weight parameter that defines which term is more important (see the sketch after this list).

  • A hard requirement has the affinity term directly as a root list item. For a hard rule, all affinity terms and all of their expressions must be satisfied for the pod to be scheduled.
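Since weight decides which soft term matters more, a sketch with two weighted terms may help. For each candidate node, the scheduler adds the weight of every satisfied preferred term to that node’s score, so in the sketch below co-location with the cache counts five times more than co-location with the logger (both labels are made up for illustration):

spec:
  affinity:
    podAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      # weight must be between 1 and 100; each satisfied term adds its weight to the node's score
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: cache
          topologyKey: topology.kubernetes.io/zone
      - weight: 20
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: logger
          topologyKey: topology.kubernetes.io/zone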

Degraded performance problem – and how to solve it with an anti-affinity rule

Workloads can use the network or attached disks heavily in ways you can’t control directly. If you don’t add restrictions for the Kubernetes scheduler, in an unfortunate event several disk- or network-heavy workloads might land on the same node and overload its network or disk.

As a result, you’ll see degraded performance on disk- or network-dependent workloads when the node’s bandwidth limit is reached. You can control this, or at least lower the risk of hitting these limits, by labeling pods and adding pod anti-affinity.

Let’s say we have several workloads that use the network heavily. We can label all those workloads with a custom label like network-usage: high and define a pod anti-affinity rule on them:

apiVersion: v1
kind: Pod
metadata:
  name: network-heavy-app
  labels:
    network-usage: high
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: network-usage
            operator: In
            values:
            - high
        topologyKey: kubernetes.io/hostname
  containers:
  - name: network-heavy-app
    image: registry.k8s.io/pause:2.0

The pod anti-affinity rule here prevents pods with the label network-usage: high from being scheduled on the same node (topologyKey: kubernetes.io/hostname), isolating them from each other on different nodes.
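Once a few of these pods are running, you can confirm the spread with kubectl get pods -l network-usage=high -o wide, which also lists the node each pod landed on.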

High availability problem – and how to solve it with anti-affinity

Sometimes, the Kubernetes scheduler might place several replicas of the same workload on the same node.

That creates a high-availability problem: if that node goes down, all or a portion of the workload’s replicas go down with it, causing partial or full downtime of the application.

You can solve this with pod anti-affinity by targeting the application’s own label and using the hostname topology key, so each replica is scheduled on a node with a different hostname.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: highly-available-app
  labels:
    app: highly-available-app
spec:
  replicas: 10
  selector:
    matchLabels:
      app: highly-available-app
  template:
    metadata:
      labels:
        app: highly-available-app
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - highly-available-app
              topologyKey: kubernetes.io/hostname
      containers:
        - name: highly-available-app
          image: registry.k8s.io/pause:2.0

The example above defines a Deployment where each highly-available-app replica can only be scheduled on a separate node. Note that with a required rule, the cluster needs at least as many schedulable nodes as replicas; otherwise the extra replicas stay Pending.
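If the cluster may end up with fewer schedulable nodes than replicas, a soft rule spreads the replicas where possible without blocking scheduling. A minimal sketch of the same pod template with a preferred rule instead:

spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      # the scheduler tries to spread replicas, but still schedules them when it can't
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: highly-available-app
          topologyKey: kubernetes.io/hostname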

Cost problem – reduce your network costs using an affinity rule

A Kubernetes cluster’s cost consists of the VM (CPU and RAM) price, the storage price, the network price, and the Kubernetes-as-a-service price.

Cloud providers usually charge for network traffic that leaves an availability zone. This means that network traffic between pods running in different availability zones is billed!

You can reduce these costs significantly by placing pods that communicate heavily in the same availability zone. Use inter-pod affinity with the zone topology key to achieve that:

apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - backend
        topologyKey: topology.kubernetes.io/zone
  containers:
  - name: web
    image: registry.k8s.io/pause:2.0

The example above defines a pod that can only be scheduled in the same zone as a pod matching app=backend. This affinity requirement can reduce cross-zone network costs between the web and backend pods.
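Note that because this is a required rule, the web pod stays Pending until at least one pod labeled app=backend is already running in some zone. A minimal sketch of such a backend pod (the name is illustrative; the image follows the article’s examples):

apiVersion: v1
kind: Pod
metadata:
  name: backend
  labels:
    app: backend
spec:
  containers:
  - name: backend
    image: registry.k8s.io/pause:2.0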
