Week 2, Day 2: K8s security (NetworkPolicy + Secrets)

Hands-on with the network and secret-data layers of Kubernetes security. NetworkPolicy: why the default flat network is a lateral-movement risk, the deeply counterintuitive “selective activation” model (one allow rule silently flips a pod to default-deny), why Ingress and Egress are independent dimensions (and why locking egress breaks DNS), the YAML trap where a single - flips AND into OR, and why NetworkPolicy needs a CNI that actually enforces it (kindnet doesn’t, which is another fail-open). Secrets: they’re base64-encoded, not encrypted, so RBAC is what protects them; plus the atomic-symlink-swap trick that powers volume-mounted secret rotation.

0. Where this fits

Day 1 was PSA (pod spec constraints) + RBAC (API call permissions). Day 2 moves to the network layer + sensitive data:

NetworkPolicy: pod-level firewall, tightening “default fully-connected” into default-deny + explicit allow
Secrets: sensitive data storage — and one counterintuitive truth: they aren’t encrypted by default

The through-line is still Day 1’s default-deny philosophy + the fail-open / fail-closed lens.

1. NetworkPolicy

1.1 Starting point: K8s is fully connected by default

K8s flat network: every pod gets an IP, and any pod can reach any pod by default (across namespaces, across nodes).

The problem: lateral movement. If a web pod is compromised, the attacker can scan and attack every other pod in the cluster from there — including the database. Default full connectivity = no internal network segmentation = one compromised pod = the whole cluster exposed.

NetworkPolicy is a pod-level firewall whose target state is default-deny + explicit allow. The philosophy is the same as RBAC's; the difference is scope: RBAC governs API calls, NetworkPolicy governs network traffic.

1.2 The most counterintuitive part: selective activation

A pod selected by NO NetworkPolicy   →  fully open (default unchanged)
A pod selected by AT LEAST ONE policy →  immediately switches to default-deny,
                                          only the traffic that policy allows passes

The key side effect: you write a single “allow web→api” rule, but the actual effect is “switch api to default-deny + incidentally allow web.” Every source not listed gets cut. This is the #1 beginner trap — you think “I just added an allow rule,” but you actually “flipped the pod into default-deny mode.”
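The selective-activation rule can be sketched as a tiny evaluator. This is a simplified model, not how any CNI implements it: it handles ingress only, same-namespace label selectors only, and the `Policy` class and `ingress_allowed` function are names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Simplified ingress-only policy (hypothetical model, same-namespace peers only)."""
    pod_selector: dict                                  # {} selects every pod
    allowed_from: list = field(default_factory=list)    # list of source label selectors


def matches(selector: dict, labels: dict) -> bool:
    # An empty selector matches everything, mirroring podSelector: {}.
    return all(labels.get(k) == v for k, v in selector.items())


def ingress_allowed(src: dict, dst: dict, policies: list) -> bool:
    selecting = [p for p in policies if matches(p.pod_selector, dst)]
    if not selecting:
        return True  # selected by no policy: still fully open
    # Selected by at least one policy: default-deny, then the union of all allows.
    return any(matches(sel, src) for p in selecting for sel in p.allowed_from)


allow_web_to_api = Policy(pod_selector={"app": "api"}, allowed_from=[{"app": "web"}])
policies = [allow_web_to_api]

web, api, db = {"app": "web"}, {"app": "api"}, {"app": "db"}
print(ingress_allowed(web, api, policies))  # True: explicitly allowed
print(ingress_allowed(db, api, policies))   # False: api is now default-deny
print(ingress_allowed(web, db, policies))   # True: db is selected by no policy
```

The second line is the trap in miniature: nothing in the policy mentions db, yet db loses access to api the moment the allow rule exists.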

To make an entire namespace default-deny:

spec:
  podSelector: {}          # {} = select all pods
  policyTypes:
  - Ingress
  # no ingress rules at all = deny all inbound

This is the production opener for namespace security: default-deny everything first, then add allows one by one.

1.3 Ingress / Egress are two independent dimensions

Ingress = traffic into the pod   (others → me)
Egress  = traffic out of the pod (me → others)

policyTypes controls which direction(s) a policy governs. A direction not listed is unaffected and stays fully open:

policyTypes: [Ingress]          → deny inbound only, egress untouched
policyTypes: [Egress]           → deny outbound only, ingress untouched
policyTypes: [Ingress, Egress]  → both directions denied (full isolation)

This distinction matters: “default-deny ingress” (inbound only) is the baseline for most namespaces; “full isolation” (both directions locked) is for highly sensitive workloads guarding against data exfiltration. In production, ~90% of NetworkPolicies write only Ingress — services are mostly passive request-receivers; outbound connections are normal behavior.

The #1 egress trap: locking egress also locks DNS (CoreDNS in kube-system, UDP 53, with TCP 53 as the fallback for large responses). A pod resolving a domain name needs DNS first; if your egress allows the target but not DNS, name resolution fails and the connection never starts. Egress policies almost always need an additional rule allowing DNS to kube-system.
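A sketch of an egress policy that keeps DNS working, built as a Python dict and printed as JSON (kubectl accepts JSON manifests as well as YAML). Several things here are assumptions rather than facts from these notes: the policy name `api-egress`, the api→db target rule, the `k8s-app: kube-dns` label on the CoreDNS pods, and the automatic `kubernetes.io/metadata.name` namespace label. Check both labels on your own cluster before relying on them.

```python
import json

# Hypothetical policy: name and target rule are placeholders for illustration.
egress_policy = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "api-egress", "namespace": "netpol"},
    "spec": {
        "podSelector": {"matchLabels": {"app": "api"}},
        "policyTypes": ["Egress"],
        "egress": [
            # Rule 1: the destination the pod is actually supposed to reach.
            {"to": [{"podSelector": {"matchLabels": {"app": "db"}}}]},
            # Rule 2: DNS. Both selectors sit in ONE peer object, so they are ANDed:
            # only the DNS pods inside kube-system (labels assumed, verify locally).
            {
                "to": [{
                    "namespaceSelector": {
                        "matchLabels": {"kubernetes.io/metadata.name": "kube-system"}
                    },
                    "podSelector": {"matchLabels": {"k8s-app": "kube-dns"}},
                }],
                "ports": [
                    {"protocol": "UDP", "port": 53},
                    {"protocol": "TCP", "port": 53},
                ],
            },
        ],
    },
}

print(json.dumps(egress_policy, indent=2))
```

Piping the printed JSON into `kubectl apply -f -` would be one way to try it; without rule 2, the api pod could reach db by IP but not by service name.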

1.4 Three selector sources

ingress:
- from:
  - podSelector:           # source 1: pods with a label in the SAME ns
      matchLabels: {app: web}
  - namespaceSelector:     # source 2: all pods in namespaces with a label (cross-ns)
      matchLabels: {team: frontend}
  - ipBlock:               # source 3: an IP CIDR (usually outside the cluster)
      cidr: 10.0.0.0/8

| Source | Selects | Use |
|---|---|---|
| `podSelector` | Pods with a label in the same ns | Same-ns service-to-service (web→api) |
| `namespaceSelector` | All pods in namespaces with a label | Cross-ns (monitoring scraping all ns) |
| `ipBlock` | An IP range | Fixed IPs outside the cluster |

Trap: a podSelector used alone only matches the namespace the policy lives in — it doesn’t cross namespaces. To allow pods from another namespace you must use a namespaceSelector.

Inside the cluster, pod-to-pod always uses podSelector/namespaceSelector (label-based, follows the pod), never ipBlock (pod IPs are dynamic — they change on recreate).

1.5 The deadly syntax: `-` is OR, fields in one item are AND

In YAML, - determines list-item boundaries, indentation determines nesting — two independent dimensions.

Form 1: two independent - (OR)

from:
- namespaceSelector: {matchLabels: {env: prod}}   # item 1
- podSelector: {matchLabels: {app: web}}          # item 2 (its own -)

= (all pods in prod ns) OR (web pods in this ns)

Form 2: two fields in one item (AND)

from:
- namespaceSelector: {matchLabels: {env: prod}}   # same item
  podSelector: {matchLabels: {app: web}}          # no -, belongs to the item above

= (pods that are in prod ns AND app=web) — the intersection

Translate to JSON and it’s obvious: Form 1 = two objects in an array [{...}, {...}]; Form 2 = one object with two fields [{..., ...}]. The - is just the array-element separator.

K8s’s semantic convention: multiple fields in one object = AND (multiple constraints on one source); multiple objects in an array = OR (multiple independent sources). Miss a single - / one indent level and the opening size differs by an order of magnitude — and apply doesn’t error. This is NetworkPolicy’s #2 source of incidents (the #1 being missing DNS in egress).
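The AND/OR convention is easy to check mechanically once the two forms are written as the arrays they really are. The matcher below is a simplified sketch (matchLabels only, no ipBlock, no matchExpressions), and all function names are made up for illustration.

```python
def selector_matches(selector: dict, labels: dict) -> bool:
    # matchLabels only; an empty selector ({}) matches everything.
    wanted = selector.get("matchLabels", {})
    return all(labels.get(k) == v for k, v in wanted.items())


def peer_matches(peer: dict, src: dict, policy_ns: str) -> bool:
    """Fields inside ONE peer object are ANDed."""
    if "namespaceSelector" in peer:
        ns_ok = selector_matches(peer["namespaceSelector"], src["ns_labels"])
    else:
        # A podSelector on its own only reaches pods in the policy's namespace.
        ns_ok = src["ns"] == policy_ns
    pod_ok = selector_matches(peer.get("podSelector", {}), src["pod_labels"])
    return ns_ok and pod_ok


def from_allows(peers: list, src: dict, policy_ns: str) -> bool:
    """Peer objects in the `from` array are ORed."""
    return any(peer_matches(p, src, policy_ns) for p in peers)


# Form 1: two array elements (two dashes in YAML).
form1 = [
    {"namespaceSelector": {"matchLabels": {"env": "prod"}}},
    {"podSelector": {"matchLabels": {"app": "web"}}},
]
# Form 2: one array element with two fields (one dash in YAML).
form2 = [
    {
        "namespaceSelector": {"matchLabels": {"env": "prod"}},
        "podSelector": {"matchLabels": {"app": "web"}},
    }
]

db_in_prod = {"ns": "prod", "ns_labels": {"env": "prod"}, "pod_labels": {"app": "db"}}
web_in_prod = {"ns": "prod", "ns_labels": {"env": "prod"}, "pod_labels": {"app": "web"}}
web_local = {"ns": "netpol", "ns_labels": {}, "pod_labels": {"app": "web"}}

for name, src in [("db_in_prod", db_in_prod), ("web_in_prod", web_in_prod), ("web_local", web_local)]:
    print(name, from_allows(form1, src, "netpol"), from_allows(form2, src, "netpol"))
# db_in_prod True False
# web_in_prod True True
# web_local True False
```

Form 1 admits every pod in prod, including db; Form 2 admits only web pods that are also in prod. The two YAML files differ by a single dash.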

Survival habit: always verify with kubectl describe networkpolicy after writing — it renders the rules in plain English (AND lists fields under one From block, OR uses a separator line), far more reliable than reading raw YAML.

An empty selector means “select all,” not “select none”: podSelector: {} = all pods in this ns, namespaceSelector: {} = all namespaces.

1.6 NetworkPolicy needs a CNI that enforces it (another fail-open)

NetworkPolicy is a K8s API object, but K8s itself doesn’t enforce it — enforcement is delegated to the CNI:

You apply a NetworkPolicy → API server stores it in etcd (always succeeds)
                                  ↓
                           Does the CNI enforce it?
                            ├ Calico/Cilium: ✅ programs iptables/eBPF to block traffic
                            └ kindnet (default): ❌ doesn't read it at all, policy is inert

kindnet doesn’t support NetworkPolicy: apply succeeds, get shows it, but traffic that should be blocked still flows. More insidious than a misspelled PSA label — here the object was created correctly; there’s just nobody enforcing it.

The only reliable verification: actually send a connection that should be blocked and see if it really times out. Don’t trust “apply succeeded” or “get shows it.”

The CNI is installed at cluster creation and can’t be hot-swapped. So before NetworkPolicy experiments you must rebuild the cluster with kindnet disabled and Calico installed.

1.7 Experiment log

Experiment 0: rebuild cluster + Calico

# kind-calico.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  disableDefaultCNI: true        # disable kindnet
  podSubnet: "192.168.0.0/16"    # Calico's expected range
nodes:
- role: control-plane
- role: worker

Observed an anomaly: with no CNI, nodes are NotReady and CoreDNS is Pending, but etcd/apiserver/kube-proxy are Running. Reason: the core components run on host network (they don’t need a pod IP — breaking the “network needs CNI, CNI is a pod, pods need network” chicken-and-egg deadlock); CoreDNS is a regular pod needing a pod IP, so it’s stuck without a CNI.

After installing Calico it auto-unblocks: calico-node (a DaemonSet, one per node, running on host network) goes Running → network ready → the CoreDNS that was stuck Pending for 20 minutes goes Running with zero manual intervention. That’s declarative reconciliation — fix the precondition and the blocked things move forward on their own.

Experiment 1: default full connectivity

Created three nginx pods web/api/db (in the netpol ns, deliberately without a PSA label — change one variable at a time). web → api and web → db both return 200. Confirms the flat network is fully connected.

Experiment 2: default-deny

spec:
  podSelector: {}
  policyTypes: [Ingress]

After apply, web→api and web→db both become 000 + exit code 28.

%{http_code} = 000: no HTTP connection established, the packets were dropped
exit 28: curl timeout (--max-time 5)
drop, not reject: reject would instantly return an RST (fast fail, exit 7); drop sends packets into a black hole and the client just waits until timeout. Security firewalls use drop — it gives the attacker no information (they can’t even tell whether a service exists there).

000 + timeout is the proof NetworkPolicy is truly enforced (kindnet would return 200).
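The same drop-versus-reject check can be scripted without curl. The helper below is a sketch using only the standard library; `probe` is a made-up name, and it only classifies the three outcomes discussed above (any other network error, such as a DNS failure, is left to propagate).

```python
import socket


def probe(host: str, port: int, timeout: float = 5.0) -> str:
    """Classify one TCP connection attempt as 'open', 'rejected', or 'dropped'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"        # handshake completed: traffic is NOT blocked
    except ConnectionRefusedError:
        return "rejected"        # an RST came back at once (curl would exit 7)
    except socket.timeout:
        return "dropped"         # silence until the deadline (curl would exit 28)
```

Run from inside the web pod, something like `probe("<api pod IP>", 80)` should come back as "dropped" when the policy is really enforced and "open" when it is not. Note that "rejected" also appears when nothing is listening on the port, so it says less about the policy than the other two outcomes.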

Experiment 3: precise allow

spec:
  podSelector: {matchLabels: {app: api}}
  policyTypes: [Ingress]
  ingress:
  - from:
    - podSelector: {matchLabels: {app: web}}

Three checks:

| Connection | Result | Why |
|---|---|---|
| web → api | 200 | allow rule permits it (union of the two policies) |
| web → db | 000 | db only selected by default-deny, no allow |
| db → api | 000 | api only allows `app=web`; db doesn’t match |

db → api being blocked is the key verification: allowing one source ≠ opening to everyone. This is exactly the lateral-movement defense from Day 5 — even if db is compromised it can’t reach api.

NetworkPolicy is also a union: default-deny + allow act together, any rule that permits lets traffic through.

How Calico blocks by label: it maintains a pod-IP ↔ label mapping on each node, and on receiving a packet it reverse-looks-up the label by source IP to match the ingress from. That’s why NetworkPolicy can control by pod identity (label) rather than just IP, and why it requires a CNI — K8s itself doesn’t maintain that mapping.
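A toy version of that reverse lookup, to show why the label (not the IP) is the stable identity. The table contents and function names are invented for illustration; a real CNI compiles the equivalent into kernel data structures rather than a Python dict.

```python
# Hypothetical per-node endpoint table: pod IP -> labels.
ip_to_labels = {
    "192.168.1.10": {"app": "web"},
    "192.168.1.11": {"app": "api"},
    "192.168.1.12": {"app": "db"},
}


def source_allowed(src_ip: str, allowed_selector: dict) -> bool:
    """Reverse-look-up the sender's labels by IP, then match the ingress selector."""
    labels = ip_to_labels.get(src_ip)
    if labels is None:
        return False  # unknown source IP: no identity to match against
    return all(labels.get(k) == v for k, v in allowed_selector.items())


def on_pod_recreated(old_ip: str, new_ip: str) -> None:
    """The pod comes back with a new IP; its labels move with it."""
    ip_to_labels[new_ip] = ip_to_labels.pop(old_ip)


print(source_allowed("192.168.1.10", {"app": "web"}))  # True
print(source_allowed("192.168.1.12", {"app": "web"}))  # False

on_pod_recreated("192.168.1.10", "192.168.1.99")
print(source_allowed("192.168.1.99", {"app": "web"}))  # True: same label, new IP
print(source_allowed("192.168.1.10", {"app": "web"}))  # False: stale IP
```

A rule written against the old IP would have silently stopped matching after the recreate, which is the reason in-cluster rules use selectors instead of ipBlock.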

2. Secrets

2.1 Secret vs ConfigMap

| | ConfigMap | Secret |
|---|---|---|
| Stores | Non-sensitive config | Sensitive data |
| Encoding | Plaintext | base64 |
| etcd encryption | Off by default | Also off by default |
| RBAC convention | Open | Tightly restricted |

Structurally near-identical; the main difference is semantics. K8s knows a Secret is sensitive: the default kubectl get table omits the values, kubectl describe shows only byte counts, and logs are redacted.

2.2 The counterintuitive truth: base64 is not encryption

echo -n "supersecret123" | base64         # c3VwZXJzZWNyZXQxMjM=  (-n: without it, echo's trailing newline gets encoded too)
echo "c3VwZXJzZWNyZXQxMjM=" | base64 -d   # supersecret123 (no key required)

base64 is just encoding (representing binary as text); it provides no confidentiality. Anyone who can kubectl get secret -o yaml decodes the plaintext in one line.
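The same point in a few lines of Python: given the `data` map of a Secret as the API returns it, recovering the plaintext needs no key at all. The key name `password` is illustrative.

```python
import base64

# Shape of a Secret's `data` field as returned by the API (key name is illustrative).
secret_data = {"password": "c3VwZXJzZWNyZXQxMjM="}

decoded = {k: base64.b64decode(v).decode("utf-8") for k, v in secret_data.items()}
print(decoded)  # {'password': 'supersecret123'}

# Encoding is deterministic and keyless: anyone can reproduce it in either direction.
reencoded = base64.b64encode(b"supersecret123").decode("ascii")
print(reencoded == secret_data["password"])  # True
```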

What actually protects a Secret is RBAC (who can get secrets) + etcd access control — not any “encryption” of the Secret object itself (there is none).

Real encryption requires explicitly configuring encryption at rest (API server EncryptionConfiguration + AES-GCM/KMS); off by default. Note: even with etcd encryption on, kubectl get -o yaml still shows base64 — encryption happens between apiserver↔etcd, transparent to kubectl. So you can’t tell from -o yaml whether etcd encryption is enabled.

type: Opaque — “Opaque” means K8s doesn’t interpret the data’s meaning (a black box to K8s), not that it’s hidden from people. The data is fully visible to anyone with read access.

2.3 env vs volume mounting

Experiment verified: inside the pod both methods give plaintext. base64 is only how the API serializes the data field (which is why -o yaml shows it); the kubelet hands the container the decoded bytes.

| | env | volume |
|---|---|---|
| In-pod access | `printenv DB_PASSWORD` | `cat /etc/secret/password` |
| Leak surface | Large (core dumps / debug endpoints / child-process inheritance / `/proc/self/environ` / logs dumping env) | Small (must specifically read the file path) |
| Secret update | Not automatic (injected at start; change requires pod restart) | Automatic (~1min delay, kubelet syncs) |

Production prefers volume, especially for rotating secrets.

How volume auto-rotation works — atomic symlink swap:

/etc/secret/password  → symlink → ..data/password
/etc/secret/..data    → symlink → ..2026_05_21_xx/  (a timestamped real directory)

On Secret update, kubelet: creates a new timestamped directory → writes the new value → atomically swaps the ..data symlink. A single symlink swap is atomic, so the application never reads a half-written value.
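The swap itself can be reproduced in a temp directory. This is a sketch of the technique, not the kubelet's code: the version directory names (`..v1`, `..v2`) and the temporary link name `..data_tmp` are assumptions for illustration, and cleanup of the old version directory is left out.

```python
import os
import tempfile


def publish(mount: str, version: str, value: str) -> None:
    """Write `value` into a fresh version directory, then atomically repoint ..data."""
    os.makedirs(os.path.join(mount, version))
    with open(os.path.join(mount, version, "password"), "w") as f:
        f.write(value)
    tmp_link = os.path.join(mount, "..data_tmp")
    os.symlink(version, tmp_link)                        # relative target, like the real layout
    os.replace(tmp_link, os.path.join(mount, "..data"))  # rename(2) replaces the link atomically


def read_secret(mount: str) -> str:
    with open(os.path.join(mount, "password")) as f:
        return f.read()


mount = tempfile.mkdtemp()
publish(mount, "..v1", "old-password")
# Stable, user-facing path: password -> ..data/password (created once, never touched again)
os.symlink(os.path.join("..data", "password"), os.path.join(mount, "password"))

print(read_secret(mount))  # old-password
publish(mount, "..v2", "new-password")
print(read_secret(mount))  # new-password
```

The new value is fully written before `..data` moves, so a reader resolves either the old directory or the new one, never a partially written file.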

Trap: a Secret mounted via subPath does not auto-update — subPath mounts the real file directly, bypassing the ..data symlink layer, so there’s no atomic swap. For auto-rotation you must mount the whole directory, not a single file via subPath.

The env method has no symlink mechanism: env is injected into the process environment at container start and is thereafter a static value in process memory; changing the Secret can’t affect already-injected env (you can’t externally change a running process’s environment variables). So env requires a pod restart to pick up Secret changes.
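The snapshot behavior of environment variables is ordinary process semantics and can be seen without Kubernetes. In this sketch a child process stands in for the container and the parent stands in for whoever updates the Secret; the variable name matches the table above, the values are made up.

```python
import os
import subprocess
import sys

# The child waits for a line on stdin, then prints the value it was started with.
child_code = "import os, sys; sys.stdin.readline(); print(os.environ['DB_PASSWORD'])"

env = dict(os.environ, DB_PASSWORD="old-password")
child = subprocess.Popen(
    [sys.executable, "-c", child_code],
    env=env,
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)

# "Rotate" the value after the child has started.
env["DB_PASSWORD"] = "new-password"
os.environ["DB_PASSWORD"] = "new-password"

out, _ = child.communicate("go\n")
print(out.strip())  # old-password
```

The child only reads its environment after the rotation, and still sees the old value: the copy was made at process start.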

2.4 Least privilege: resourceNames

To restrict reading to a specific Secret:

rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get"]
  resourceNames: ["db-secret"]    # can only read this one, not all Secrets

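resourceNames limitation: the authorizer matches on the name carried by the request, which by-name verbs (get/update/delete/patch) always have. A plain list/watch carries no name, so a rule with resourceNames never authorizes it; the one exception is a list or watch that sends a metadata.name field selector equal to the allowed name (for example kubectl get secrets --field-selector=metadata.name=db-secret). create and deletecollection cannot be restricted by name at all. In practice: you can “only get a specific Secret,” but an unfiltered “list Secrets” is all-or-nothing.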
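A minimal model of that authorization check, assuming a single rule and ignoring apiGroups and namespaces. `rule_allows` is a made-up helper, not the real authorizer.

```python
from typing import Optional

rule = {
    "resources": ["secrets"],
    "verbs": ["get", "list"],
    "resourceNames": ["db-secret"],
}


def rule_allows(rule: dict, verb: str, resource: str, name: Optional[str]) -> bool:
    """`name` is the object name carried by the request, or None if it carries none."""
    if verb not in rule["verbs"] or resource not in rule["resources"]:
        return False
    names = rule.get("resourceNames")
    if not names:
        return True        # no resourceNames: applies to every object of the resource
    return name in names   # a request without a name can never match


print(rule_allows(rule, "get", "secrets", "db-secret"))     # True
print(rule_allows(rule, "get", "secrets", "other-secret"))  # False
print(rule_allows(rule, "list", "secrets", None))           # False: plain list has no name
```

Even though `list` is in the rule's verbs, the unfiltered list is denied, because there is no name for the resourceNames check to match.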

Secret security = RBAC configuration. A key eval point: did you narrow with resourceNames, or crudely grant read on the whole secrets resource?

3. Takeaways: seeds for the Stripe project

3.1 default-deny is the through-line

NetworkPolicy default-deny, RBAC explicit allow, PSA enforce — all “block anything not explicitly allowed.” The additive model (close everything first, then open what’s needed) is safer than the subtractive model (open everything, then close the dangerous bits): in the additive model “forgot to configure” = can’t connect (caught immediately); in the subtractive model “forgot to configure” = wide open (silently insecure).

3.2 fail-open vs fail-closed, again

kindnet not enforcing NetworkPolicy → fail-open (dangerous)
strict decoding rejecting a misspelled field (policyTYpes) → fail-closed (safe)
“API object exists ≠ has effect”: NetworkPolicy without a CNI, Ingress without a controller, PVC without a provisioner are the same class of trap

3.3 layered defense, network dimension

NetworkPolicy (network-layer segmentation, L3/L4) + seccomp restricting connect/sendto (kernel layer) = two layers against the same data-exfiltration threat. Lateral-movement defense relies on network segmentation: even a compromised pod can’t freely scan other services.

3.4 LLM eval dimensions

NetworkPolicy: default-deny floor + precise allow; don’t miss DNS (egress); get the -/indentation AND/OR right
Secret: narrow with resourceNames; use volume not env for sensitive secrets; don’t assume Secret = encrypted

4. Takeaways

K8s flat network is fully connected by default — a dangerous default, needs NetworkPolicy for segmentation
NetworkPolicy selective activation: being selected flips a pod to default-deny; allowing one source ≠ opening to all
Ingress/Egress independent: policyTypes governs the directions you list; locking egress must allow DNS
Three selectors: podSelector (same ns) / namespaceSelector (cross-ns) / ipBlock (outside cluster)
`-` is OR, fields in one item are AND: one missing `-` changes the allowed set by an order of magnitude; verify with describe
NetworkPolicy needs a CNI: kindnet doesn’t enforce (fail-open), Calico does
drop, not reject: blocked = timeout (exit 28), giving the attacker no information
Secret base64 ≠ encryption — protected by RBAC; real encryption needs encryption at rest
env vs volume: both plaintext in-pod; volume uses atomic symlink swap for auto-rotation, env is immutable after injection; subPath doesn’t auto-update
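resourceNames least privilege: only applies to requests that carry an object name (get/update/delete/patch); an unfiltered list/watch carries none, so it can’t be narrowed this way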

5. Tomorrow’s preview (Day 3)

seccomp in K8s. Continuing on this Calico-equipped stripe-day2 cluster:

Run the official seccomp tutorial, deploy a pod with RuntimeDefault
Write a custom Localhost profile blocking a syscall (like mkdir), trigger it from inside the pod and watch the EPERM
Map back to Week 1 Day 3-4’s BPF filter mental model — K8s’s seccompProfile is just that wrapped into a pod spec field