Cilium mutual auth … DIY

Philippe Bogaerts
Aug 4, 2023
Cilium mTLS SPIRE authentication

Introduction

The idea of this short tutorial is to see if we can get Cilium mutual auth working on a self-managed cluster.

I used a 3-node cluster on AWS based on
- Ubuntu 20.04
- Containerd 1.6.21
- Kubernetes v1.27.4

Install instructions are based on https://github.com/xxradar/k8s-calico-oss-install-containerd, but do not install any CNI at this point.
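
Since no CNI is installed yet, the nodes will typically report NotReady until Cilium is deployed. A quick sanity check (node names will differ in your cluster):

kubectl get nodes -o wide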

Install Cilium components

This is just a quick install; check out https://docs.cilium.io/en/v1.14/ for up-to-date install instructions.

Cilium CLI

Install the Cilium CLI

CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
CLI_ARCH=amd64
if [ "$(uname -m)" = "aarch64" ]; then CLI_ARCH=arm64; fi
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-${CLI_ARCH}.tar.gz.sha256sum
sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin
rm cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}

Check for version 0.15 or higher

cilium version --client
cilium-cli: v0.15.3 compiled with go1.20.4 on linux/amd64

Cilium CNI

Install the Cilium CNI

sudo snap install helm --classic

helm repo add cilium https://helm.cilium.io/

helm install cilium cilium/cilium --version 1.14.0 \
--namespace kube-system \
--set authentication.mutual.spire.enabled=true \
--set authentication.mutual.spire.install.enabled=true \
--set hubble.relay.enabled=true \
--set hubble.ui.enabled=true
cilium status --wait
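
You can verify that the mutual authentication settings made it into the Cilium configuration:

cilium config view | grep mesh-auth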

Verify SPIRE components

Mutual authentication in Cilium is based on SPIFFE (Secure Production Identity Framework for Everyone); see https://spiffe.io/ for the full details.

SPIRE is a production-ready implementation of the SPIFFE APIs. It performs node and workload attestation in order to securely issue SVIDs to workloads and to verify the SVIDs of other workloads, based on a predefined set of conditions. An SVID is the document with which a workload proves its identity to a resource, for example an X.509 certificate.

Cilium agents request SVIDs on behalf of the pods via the SPIRE agents the moment the pods / Cilium endpoints are created.

Check whether the spire-server and spire-agents are up and running (on a self-managed cluster they probably are not yet):

ubuntu@ip-10-1-2-162:~$ kubectl get all -n cilium-spire
NAME READY STATUS RESTARTS AGE
pod/spire-agent-fkck9 0/1 Init:0/1 0 37s
pod/spire-agent-ltlnc 0/1 Init:0/1 0 41s
pod/spire-server-0 0/2 Pending 0 75s

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/spire-server ClusterIP 10.104.3.112 <none> 8081/TCP 75s
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/spire-agent 2 2 0 2 0 <none> 75s
NAME READY AGE
statefulset.apps/spire-server 0/1 75s

The spire-server might remain in a Pending state, because it requires a PersistentVolume to boot correctly. If it does, create a PV:

kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: spire-pv
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  hostPath:
    path: /mnt/data # Replace this with the desired path on the host filesystem
EOF
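
Once the PV exists, the spire-server PVC should bind to it. The exact PVC name depends on the chart version, so simply list them:

kubectl get pvc -n cilium-spire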
kubectl -n kube-system rollout restart deployment/cilium-operator
kubectl -n kube-system rollout restart ds/cilium
ubuntu@ip-10-1-2-162:~$ kubectl get all -n cilium-spire
NAME READY STATUS RESTARTS AGE
pod/spire-agent-fkck9 1/1 Running 0 2m47s
pod/spire-agent-ltlnc 1/1 Running 0 2m51s
pod/spire-server-0 2/2 Running 0 3m25s

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/spire-server ClusterIP 10.104.3.112 <none> 8081/TCP 3m25s
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/spire-agent 1 1 1 1 1 <none> 3m25s
NAME READY AGE
statefulset.apps/spire-server 1/1 3m25s
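
Once the SPIRE components are healthy, you can optionally confirm that the spire-agent on each node has attested to the server:

kubectl exec -n cilium-spire spire-server-0 -c spire-server -- \
  /opt/spire/bin/spire-server agent list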

Install Hubble CLI

export HUBBLE_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/hubble/master/stable.txt)
HUBBLE_ARCH=amd64
if [ "$(uname -m)" = "aarch64" ]; then HUBBLE_ARCH=arm64; fi
curl -L --fail --remote-name-all https://github.com/cilium/hubble/releases/download/$HUBBLE_VERSION/hubble-linux-${HUBBLE_ARCH}.tar.gz{,.sha256sum}
sha256sum --check hubble-linux-${HUBBLE_ARCH}.tar.gz.sha256sum
sudo tar xzvfC hubble-linux-${HUBBLE_ARCH}.tar.gz /usr/local/bin
rm hubble-linux-${HUBBLE_ARCH}.tar.gz{,.sha256sum}
hubble version
cilium hubble port-forward &
hubble status
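
As a quick smoke test, ask Hubble for a handful of recent flows:

hubble observe --last 5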

Install a demo application

Install a demo app. I typically use app-routable-demo, a configurable mesh of proxies that simulates a microservices app: https://github.com/xxradar/app_routable_demo

git clone https://github.com/xxradar/app_routable_demo.git
cd ./app_routable_demo
./setup.sh
watch kubectl get po -n app-routable-demo

The moment the pods are running, the Cilium agent requests an SVID for each (group of) pods. You can find the IDENTITY ID for a specific pod:

kubectl get cep -n app-routable-demo

NAME ENDPOINT ID IDENTITY ID INGRESS ENFORCEMENT EGRESS ENFORCEMENT VISIBILITY POLICY ENDPOINT STATE IPV4 IPV6
echoserver-1-deployment-77c7b97758-844ct 1624 9164 <status disabled> <status disabled> <status disabled> ready 10.0.1.29
echoserver-1-deployment-77c7b97758-9kz8l 139 9164 <status disabled> <status disabled> <status disabled> ready 10.0.2.3
echoserver-1-deployment-77c7b97758-xwk5z 992 9164 <status disabled> <status disabled> <status disabled> ready 10.0.2.107
echoserver-2-deployment-74658fd96d-gdkhk 668 10485 <status disabled> <status disabled> <status disabled> ready 10.0.1.22
echoserver-2-deployment-74658fd96d-jjnpj 1071 10485 <status disabled> <status disabled> <status disabled> ready 10.0.2.208
echoserver-2-deployment-74658fd96d-w96cb 40 10485 <status disabled> <status disabled> <status disabled> ready 10.0.2.176
mycurler 635 33780 <status disabled> <status disabled> <status disabled> ready 10.0.2.100
nginx-zone1-6d57c556f8-rbf9w 720 13742 <status disabled> <status disabled> <status disabled> ready 10.0.1.175
nginx-zone2-fcf79f559-2jqm7 3409 5799 <status disabled> <status disabled> <status disabled> ready 10.0.2.225
nginx-zone3-8c78d5dbd-pvc8f 202 2857 <status disabled> <status disabled> <status disabled> ready 10.0.1.163
nginx-zone4-747cd49bfc-9ft9x 1848 7678 <status disabled> <status disabled> <status disabled> ready 10.0.2.125
nginx-zone5-7987976dc8-m87h9 1418 9866 <status disabled> <status disabled> <status disabled> ready 10.0.2.232
siege-deployment-6f8567f7fc-9qhcc 729 3694 <status disabled> <status disabled> <status disabled> ready 10.0.2.187
siege-deployment-6f8567f7fc-pt48n 1849 3694 <status disabled> <status disabled> <status disabled> ready 10.0.1.139
siege-deployment-6f8567f7fc-vtgxg 393 3694 <status disabled> <status disabled> <status disabled> ready 10.0.1.50

and find the corresponding SPIFFE ID and entry (e.g. nginx-zone1-6d57c556f8-rbf9w 720 13742):

kubectl exec -n cilium-spire spire-server-0 -c spire-server --   /opt/spire/bin/spire-server entry show -selector cilium:mutual-auth

Found 15 entries
...

Entry ID : 95321d21-c6ca-40a8-b5ad-87c99c6f09bf
SPIFFE ID : spiffe://spiffe.cilium/identity/13742
Parent ID : spiffe://spiffe.cilium/cilium-operator
Revision : 0
X509-SVID TTL : default
JWT-SVID TTL : default
Selector : cilium:mutual-auth

Entry ID : 84a90f8b-5bfe-4b52-85ef-48a93edb71c2
SPIFFE ID : spiffe://spiffe.cilium/identity/15527
Parent ID : spiffe://spiffe.cilium/cilium-operator
Revision : 0
X509-SVID TTL : default
JWT-SVID TTL : default
Selector : cilium:mutual-auth
...
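
If you only care about a single identity (for example the nginx-zone1 pod with identity 13742 above), you can filter the entries by SPIFFE ID instead of listing them all:

kubectl exec -n cilium-spire spire-server-0 -c spire-server -- \
  /opt/spire/bin/spire-server entry show -spiffeID spiffe://spiffe.cilium/identity/13742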

Apply an L7 Cilium Network Policy w/o mutual auth

kubectl apply -f - <<EOF
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: no-mutual-auth-echo-app-routeble-demo
  namespace: app-routable-demo
spec:
  endpointSelector:
    matchLabels:
      app: nginx-zone1
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: siege
    toPorts:
    - ports:
      - port: "80"
        protocol: TCP
      rules:
        http:
        - method: "GET"
          path: "/app1"
EOF
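
The siege pods generate traffic on their own, but you can also send a request manually. The sketch below assumes curl is available in the mycurler pod and that the nginx-zone1 service is reachable as zone1 (the hostname visible in the Hubble output further down); adjust as needed:

kubectl exec -n app-routable-demo mycurler -- \
  curl -s -o /dev/null -w "%{http_code}\n" http://zone1/app1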

We can observe the traffic:

hubble observe -n app-routable-demo -l app=siege -f
...

Traffic to /app1 is allowed in this example; traffic to /app2, /app3 and /app4 is dropped, which shows the power of this L7 policy functionality. You can change the policy if you prefer, for example:

path: "/app.*"

Apply mutual auth …

The only change compared to the previous policy is the added authentication block with mode: "required" in the ingress rule; since the policy keeps the same name, re-applying it simply updates the existing one.

kubectl apply -f - <<EOF
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: no-mutual-auth-echo-app-routeble-demo
  namespace: app-routable-demo
spec:
  endpointSelector:
    matchLabels:
      app: nginx-zone1
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: siege
    authentication:
      mode: "required"
    toPorts:
    - ports:
      - port: "80"
        protocol: TCP
      rules:
        http:
        - method: "GET"
          path: "/app1"
EOF
hubble observe -n app-routable-demo -l app=siege -f
Aug  4 08:01:56.733: app-routable-demo/siege-deployment-6f8567f7fc-9qhcc:38942 (ID:3694) -> kube-system/coredns-5d78c9869d-szjq5:53 (ID:7956) to-endpoint FORWARDED (UDP)
Aug 4 08:01:56.733: app-routable-demo/siege-deployment-6f8567f7fc-9qhcc:38942 (ID:3694) <- kube-system/coredns-5d78c9869d-szjq5:53 (ID:7956) to-endpoint FORWARDED (UDP)
Aug 4 08:01:56.733: app-routable-demo/siege-deployment-6f8567f7fc-9qhcc:53060 (ID:3694) -> app-routable-demo/nginx-zone1-6d57c556f8-rbf9w:80 (ID:13742) to-overlay FORWARDED (TCP Flags: SYN)
Aug 4 08:01:56.733: app-routable-demo/siege-deployment-6f8567f7fc-9qhcc:53060 (ID:3694) -> app-routable-demo/nginx-zone1-6d57c556f8-rbf9w:80 (ID:13742) policy-verdict:L3-L4 INGRESS ALLOWED (TCP Flags: SYN; Auth: SPIRE)
Aug 4 08:01:56.733: app-routable-demo/nginx-zone1-6d57c556f8-rbf9w:80 (ID:13742) <> app-routable-demo/siege-deployment-6f8567f7fc-9qhcc:53060 (ID:3694) to-overlay FORWARDED (TCP Flags: SYN, ACK)
Aug 4 08:01:56.734: app-routable-demo/siege-deployment-6f8567f7fc-9qhcc:53060 (ID:3694) <- app-routable-demo/nginx-zone1-6d57c556f8-rbf9w:80 (ID:13742) to-endpoint FORWARDED (TCP Flags: SYN, ACK)
Aug 4 08:01:56.734: app-routable-demo/siege-deployment-6f8567f7fc-9qhcc:53060 (ID:3694) -> app-routable-demo/nginx-zone1-6d57c556f8-rbf9w:80 (ID:13742) to-overlay FORWARDED (TCP Flags: ACK, PSH)
Aug 4 08:01:56.734: app-routable-demo/siege-deployment-6f8567f7fc-9qhcc:53060 (ID:3694) -> app-routable-demo/nginx-zone1-6d57c556f8-rbf9w:80 (ID:13742) to-overlay FORWARDED (TCP Flags: ACK)
Aug 4 08:01:56.734: app-routable-demo/nginx-zone1-6d57c556f8-rbf9w:80 (ID:13742) <> app-routable-demo/siege-deployment-6f8567f7fc-9qhcc:53060 (ID:3694) to-overlay FORWARDED (TCP Flags: ACK)
Aug 4 08:01:56.734: app-routable-demo/siege-deployment-6f8567f7fc-9qhcc:53060 (ID:3694) -> app-routable-demo/nginx-zone1-6d57c556f8-rbf9w:80 (ID:13742) http-request FORWARDED (HTTP/1.1 GET http://zone1/app2)
Aug 4 08:01:56.738: app-routable-demo/nginx-zone1-6d57c556f8-rbf9w:80 (ID:13742) <> app-routable-demo/siege-deployment-6f8567f7fc-9qhcc:53060 (ID:3694) to-overlay FORWARDED (TCP Flags: ACK, PSH)
Aug 4 08:01:56.738: app-routable-demo/nginx-zone1-6d57c556f8-rbf9w:80 (ID:13742) <> app-routable-demo/siege-deployment-6f8567f7fc-9qhcc:53060 (ID:3694) to-overlay FORWARDED (TCP Flags: ACK, FIN)
Aug 4 08:01:56.738: app-routable-demo/siege-deployment-6f8567f7fc-9qhcc:53060 (ID:3694) <- app-routable-demo/nginx-zone1-6d57c556f8-rbf9w:80 (ID:13742) http-response FORWARDED (HTTP/1.1 200 4ms (GET http://zone1/app2))
Aug 4 08:01:56.739: app-routable-demo/siege-deployment-6f8567f7fc-9qhcc:53060 (ID:3694) <- app-routable-demo/nginx-zone1-6d57c556f8-rbf9w:80 (ID:13742) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Aug 4 08:01:56.739: app-routable-demo/siege-deployment-6f8567f7fc-9qhcc:53060 (ID:3694) <- app-routable-demo/nginx-zone1-6d57c556f8-rbf9w:80 (ID:13742) to-endpoint FORWARDED (TCP Flags: ACK, FIN)

Note the line:

Aug 4 08:01:56.733: app-routable-demo/siege-deployment-6f8567f7fc-9qhcc:53060 (ID:3694) -> app-routable-demo/nginx-zone1-6d57c556f8-rbf9w:80 (ID:13742) policy-verdict:L3-L4 INGRESS ALLOWED (TCP Flags: SYN; Auth: SPIRE)

The Auth: SPIRE annotation on the policy verdict means this flow matched a rule that requires authentication, and that the SPIRE-backed mutual authentication succeeded before the traffic was allowed.
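
To see which identity pairs have actually completed mutual authentication on a node, you can inspect the datapath auth map from inside a Cilium agent pod (any pod of the DaemonSet will do):

kubectl -n kube-system exec -ti ds/cilium -- cilium bpf auth list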

Conclusion

This was a very quick run-through. I hope the tutorial sheds some additional light on this complex topic. In some shape or form, this will become a very important concept in future networking.
