Week 2 Day 1: K8s security (Pod Security Standards + RBAC)

A first hands-on pass at Kubernetes-native security. On the PSA side: how Pod Security Admission attaches to namespaces via labels (and silently fails open when the label is misspelled), why runAsNonRoot: true is necessary but not sufficient (the image’s USER still has to be non-root), and the four-field minimum a pod needs to clear the restricted profile. On the RBAC side: Role/ClusterRole × RoleBinding/ClusterRoleBinding as two independent dimensions that produce four combinations (one of which K8s rejects outright), why RBAC evaluation is union, not override, and the asymmetry that makes RBAC fail closed while PSA fails open.

0. Where this fits

Week 1 was Linux kernel-level security (syscalls, seccomp, AppArmor). Week 2 moves up to Kubernetes, which largely exposes those same mechanisms through a declarative API. Today’s two pillars are K8s’s own API-server-side mechanisms (RBAC is authorization, PSA is admission):

PSA (Pod Security Admission): constraints on the pod spec itself (the “what can run” layer)
RBAC: constraints on who can call which K8s APIs (the “who is root” layer)

These two combined with the seccomp/AppArmor profiles (Day 3-4 of this week) are the full K8s security stack. Today is just the K8s-native half.

1. Prerequisite mental model: workload / network / security

1.1 Workload: declarative reconciliation

K8s’s essence is declarative + reconciliation loop: you declare “I want N pods like this,” and a controller keeps pulling reality toward that desired state.

Resource	“I want…”	Typical use
Pod	One specific container group	Rarely directly
ReplicaSet	N replicas	Internal to Deployment
Deployment	N replicas + rolling update	~90% of stateless services
StatefulSet	N ordered pods with stable identity	DB / Kafka
DaemonSet	One pod per node	Logging/monitoring agents
Job / CronJob	One-shot / scheduled tasks	Batch

Key insight: a bare Pod has no controller watching it, so once it is deleted it is gone. Deployment → ReplicaSet → Pod is a three-layer ownership chain; during a rolling update the old and new ReplicaSets coexist, and rollback is just scaling the old ReplicaSet back up (no new ReplicaSet is created, though its pods are started fresh).
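The reconciliation idea can be sketched as a toy loop. This is only an illustration of "pull reality toward the desired count", not how the real controllers are implemented; the function name `reconcile` and the pod-naming scheme are made up for the example.

```python
import itertools

_ids = itertools.count(1)  # hypothetical pod-name suffix generator


def reconcile(desired: int, actual: list[str]) -> list[str]:
    """One pass of a toy ReplicaSet-style loop: return the pod list after
    creating or deleting pods so that len(result) == desired."""
    pods = list(actual)
    while len(pods) < desired:       # too few: create
        pods.append(f"pod-{next(_ids)}")
    while len(pods) > desired:       # too many: delete the newest
        pods.pop()
    return pods


pods = reconcile(3, [])              # initial rollout
pods.remove(pods[0])                 # someone deletes a pod by hand
pods = reconcile(3, pods)            # the next pass restores the count
print(len(pods))                     # 3
```

A bare Pod corresponds to never calling `reconcile` again: nothing notices the deletion.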

1.2 Network: flat network + Service abstraction

K8s mandates three network rules:

Every pod gets an IP
All pods can talk to each other without NAT
Nodes can also talk to pods directly

Implementation is delegated to CNI plugins (kind uses kindnet, production typically Calico/Cilium).

A Service is a stable virtual IP + load balancing in front of a set of pods. It doesn’t bind to a Deployment — it binds to a label. That’s why canary deployments are trivial: two Deployments share an app=backend label, the Service selects both, traffic flows in proportion to replica counts.
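A small sketch of why label binding makes canaries easy, assuming equality-based selectors only. The pod labels, the `track` key, and the 9:1 split are hypothetical values chosen for the example; real traffic share also depends on pod readiness and the proxy's balancing mode.

```python
def selects(selector: dict[str, str], labels: dict[str, str]) -> bool:
    """A Service selector matches a pod when every selector key/value
    appears in the pod's labels (extra pod labels are ignored)."""
    return all(labels.get(k) == v for k, v in selector.items())


# Hypothetical canary setup: 9 stable pods and 1 canary pod share app=backend.
pods = [{"app": "backend", "track": "stable"}] * 9 + [{"app": "backend", "track": "canary"}]
service_selector = {"app": "backend"}

endpoints = [p for p in pods if selects(service_selector, p)]
canary_share = sum(p["track"] == "canary" for p in endpoints) / len(endpoints)
print(len(endpoints), canary_share)  # 10 0.1
```

The Service never mentions either Deployment; changing the replica counts changes the split.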

The three main Service types are layered, not alternatives (the fourth, ExternalName, is just a DNS alias):

ClusterIP (default, 90% of cases) — reachable inside the cluster only
NodePort — opens an extra port (30000-32767) on every node (rare in prod)
LoadBalancer — stacks a cloud LB on top of NodePort

Production is overwhelmingly ClusterIP, with 1-2 LoadBalancer Services exposing an Ingress Controller, which then does L7 routing back to ClusterIP Services.

1.3 Security: multiple independent admission dimensions

K8s security isn’t a single mechanism — it’s multiple independent dimensions in the request pipeline:

┌──────────────────────────┐
│ kubectl apply Pod YAML   │
└──────────┬───────────────┘
           ▼
┌──────────────────────────┐
│ Authentication           │  "who are you" (cert/token)
└──────────┬───────────────┘
           ▼
┌──────────────────────────┐
│ RBAC                     │  "are you allowed to create Pods?"
└──────────┬───────────────┘
           ▼
┌──────────────────────────┐
│ Admission (PSA + webhook)│  "is this Pod spec compliant?"
└──────────┬───────────────┘
           ▼
       persist to etcd → scheduling → runtime

PSA and RBAC are different dimensions:

PSA looks at the pod spec itself (“what is this pod going to run?”)
RBAC looks at the caller (“does this caller have permission to submit that spec?”)

Same complementary structure as Day 5’s five-layer sandbox.

2. Pod Security Standards (PSS) and Pod Security Admission (PSA)

2.1 Three preset profiles

Pod Security Admission (PSA) is the built-in admission controller that enforces the Pod Security Standards (GA in K8s 1.25, replacing the legacy PodSecurityPolicy). The standards define three profiles:

Profile	What it blocks	Typical use
`privileged`	Nothing	System DaemonSets (kube-proxy, etc.)
`baseline`	Most common misconfigs (hostNetwork/hostPath/CAP_SYS_ADMIN, etc.)	A reasonable floor for general apps
`restricted`	Production-hardened (non-root, seccomp, drop all caps required)	Strict production workloads

2.2 Activation: namespace labels

PSA isn’t a cluster-wide switch — it’s activated per namespace via labels:

kubectl label namespace tight pod-security.kubernetes.io/enforce=restricted

Three modes that stack:

enforce — reject violations at admission time
audit — log to audit log, allow
warn — send a kubectl warning, allow

Production rollout pattern: audit to observe → warn to nudge users → enforce to block. Same pattern as Day 5’s AppArmor complain → enforce workflow.

There’s also a *-version label that pins to a specific K8s version (pod-security.kubernetes.io/enforce-version=v1.30) so upstream tightening of the standard doesn’t break already-compliant pods.
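How the three modes stack can be modeled in a few lines. This is a toy model, not the admission controller's code: `pod_level` stands for the strictest profile a pod spec satisfies, and a missing mode label is treated as `privileged`, which is the default unless the cluster admin configured something else.

```python
LEVELS = {"privileged": 0, "baseline": 1, "restricted": 2}
PREFIX = "pod-security.kubernetes.io/"


def evaluate(ns_labels: dict[str, str], pod_level: str) -> dict[str, bool]:
    """pod_level is the strictest profile the pod spec satisfies.
    A missing mode label is treated as 'privileged' (nothing is checked)."""
    def violates(mode: str) -> bool:
        required = ns_labels.get(PREFIX + mode, "privileged")
        return LEVELS[pod_level] < LEVELS[required]

    return {
        "admitted": not violates("enforce"),
        "warned": violates("warn"),
        "audited": violates("audit"),
    }


# Mid-rollout namespace: enforce baseline, but already warn/audit against restricted.
labels = {
    PREFIX + "enforce": "baseline",
    PREFIX + "warn": "restricted",
    PREFIX + "audit": "restricted",
}
print(evaluate(labels, "baseline"))
# {'admitted': True, 'warned': True, 'audited': True}
```

A baseline-only pod is still admitted here, but both the user and the audit log are told it would fail restricted.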

2.3 Experiment 1: restricted rejects a violating pod

apiVersion: v1
kind: Pod
metadata:
  name: bad
spec:
  containers:
  - name: c
    image: busybox
    command: ["sleep", "3600"]

Apply to the tight namespace → rejected by restricted, with the error listing every violation:

violates PodSecurity "restricted:latest":
  - allowPrivilegeEscalation != false
  - unrestricted capabilities (must drop=["ALL"])
  - runAsNonRoot != true
  - seccompProfile not set (must be RuntimeDefault or Localhost)

This error message is the restricted compliance checklist — clearer than the docs table.

2.4 Experiment 2: edit the spec to pass

The four required fields:

apiVersion: v1
kind: Pod
metadata:
  name: good
spec:
  securityContext:                    # pod-level
    runAsNonRoot: true
    runAsUser: 1000                   # busybox image USER=root, must override explicitly
  containers:
  - name: c
    image: busybox
    command: ["sleep", "3600"]
    securityContext:                  # container-level
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
      seccompProfile:
        type: RuntimeDefault
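The four checks can be restated as a small function over a pod spec given as a plain dict. This is a sketch covering only the four fields from the two experiments, not the full restricted standard, and the message strings are shortened paraphrases of the admission error.

```python
def restricted_violations(pod_spec: dict) -> list[str]:
    """Return the subset of the four checks that the pod fails.
    Covers only the four fields discussed here, not the full standard."""
    pod_sc = pod_spec.get("securityContext", {})
    problems = []
    for c in pod_spec.get("containers", []):
        sc = c.get("securityContext", {})
        if sc.get("allowPrivilegeEscalation") is not False:
            problems.append("allowPrivilegeEscalation != false")
        if "ALL" not in sc.get("capabilities", {}).get("drop", []):
            problems.append("unrestricted capabilities")
        # These two may be set at pod level and inherited by the container.
        if sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot")) is not True:
            problems.append("runAsNonRoot != true")
        seccomp = sc.get("seccompProfile", pod_sc.get("seccompProfile", {}))
        if seccomp.get("type") not in ("RuntimeDefault", "Localhost"):
            problems.append("seccompProfile not set")
    return problems


bad = {"containers": [{"name": "c", "image": "busybox"}]}
good = {
    "securityContext": {"runAsNonRoot": True, "runAsUser": 1000},
    "containers": [{
        "name": "c",
        "image": "busybox",
        "securityContext": {
            "allowPrivilegeEscalation": False,
            "capabilities": {"drop": ["ALL"]},
            "seccompProfile": {"type": "RuntimeDefault"},
        },
    }],
}
print(len(restricted_violations(bad)), restricted_violations(good))  # 4 []
```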

2.5 Two-tier securityContext

The securityContext field placement follows a rule:

Field	pod-level	container-level	Why
`runAsUser` / `runAsNonRoot` / `seccompProfile`	✅	✅	Pod-level value is the default for every container; a container can override it
`fsGroup`	✅	❌	Applies to the pod’s volumes, which all containers share
`allowPrivilegeEscalation` / `capabilities` / `readOnlyRootFilesystem`	❌	✅	Per-process / per-container semantics

Where a field exists at both levels, the container-level value overrides the pod-level one.
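The override rule amounts to a dictionary merge. A minimal sketch, assuming the contexts are plain dicts; `SHARED_FIELDS` lists only a subset of the fields that can be set at both levels, and `effective_context` is a made-up helper name.

```python
# Subset of fields that may appear at both pod and container level.
SHARED_FIELDS = ("runAsUser", "runAsNonRoot", "seccompProfile")


def effective_context(pod_sc: dict, container_sc: dict) -> dict:
    """Start from the pod-level defaults for shared fields, then let the
    container-level values win wherever both are set."""
    merged = {k: pod_sc[k] for k in SHARED_FIELDS if k in pod_sc}
    merged.update(container_sc)
    return merged


pod_sc = {"runAsUser": 1000, "runAsNonRoot": True}
container_sc = {"runAsUser": 2000, "allowPrivilegeEscalation": False}
print(effective_context(pod_sc, container_sc))
# {'runAsUser': 2000, 'runAsNonRoot': True, 'allowPrivilegeEscalation': False}
```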

2.6 Trap: passing admission ≠ actually starting

PSA only inspects the pod spec; it never inspects the image. runAsNonRoot: true still requires the image itself to have a non-root USER — otherwise kubelet refuses to start the container (not PSA — kubelet), and the pod enters CreateContainerConfigError. Two fixes:

Add runAsUser: 1000 to override the image’s USER
Switch to a non-root image (e.g. nginxinc/nginx-unprivileged)

Two layered failure modes in production:

Admission rejects the spec (PSA error) — explicit
Runtime rejects starting (image incompatible with non-root) — more subtle
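The second failure mode can be modeled roughly as follows. This is a simplified reading of the start-time check that applies when runAsNonRoot is true, not kubelet source: `image_user` stands for the image's USER value, and the function only distinguishes "starts" from "CreateContainerConfigError".

```python
from typing import Optional


def non_root_check(run_as_user: Optional[int], image_user: str) -> str:
    """Toy version of the start-time check applied when runAsNonRoot is true.
    image_user is the image's USER value ('' means unset, i.e. root)."""
    if run_as_user is not None:
        # An explicit runAsUser overrides whatever the image declares.
        return "start" if run_as_user != 0 else "CreateContainerConfigError"
    if image_user.isdigit() and int(image_user) != 0:
        return "start"
    # Unset/root USER, or a user name that cannot be proven non-root.
    return "CreateContainerConfigError"


print(non_root_check(None, ""))      # CreateContainerConfigError (busybox-style image)
print(non_root_check(1000, ""))      # start (fix 1: override with runAsUser)
print(non_root_check(None, "101"))   # start (fix 2: image with a numeric non-root USER)
```

Both outcomes happen after PSA has already admitted the pod, which is why the failure only shows up in the pod status.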

2.7 PSA label typo = fail-open

kubectl label namespace tight pod-security.kubernetes.io/enforece=restricted   # note the extra 'e'

PSA only recognizes the exact label key — typo = silently inactive = violating pods pass. No warning at all.

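This is K8s’s fail-open trade-off: a namespace without a recognized PSA label falls back to the cluster default, which is privileged unless the admin configured otherwise, so existing clusters kept working when PSA arrived. Cost: typos pass silently.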

Diagnostic trick: kubectl get ns -L pod-security.kubernetes.io/enforce promotes the label into a column — typoed namespaces have an empty cell.

Production answer: write a cluster policy with Kyverno or OPA Gatekeeper requiring “every namespace must have a valid PSA label.” That’s Week 2 Day 5.
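Short of a full policy engine, a near-miss check on label keys catches this class of typo. A minimal sketch using the standard library's `difflib`; the 0.9 similarity cutoff and the helper name `suspicious_labels` are assumptions made for the example, and the input is whatever label map you read from the namespace.

```python
import difflib

VALID_KEYS = [
    f"pod-security.kubernetes.io/{mode}{suffix}"
    for mode in ("enforce", "audit", "warn")
    for suffix in ("", "-version")
]


def suspicious_labels(ns_labels: dict[str, str]) -> dict[str, str]:
    """Map each label key that looks like a misspelled PSA key to its
    closest valid key. Exact matches are not reported."""
    found = {}
    for key in ns_labels:
        if key in VALID_KEYS:
            continue
        close = difflib.get_close_matches(key, VALID_KEYS, n=1, cutoff=0.9)
        if close:
            found[key] = close[0]
    return found


labels = {
    "kubernetes.io/metadata.name": "tight",
    "pod-security.kubernetes.io/enforece": "restricted",
}
print(suspicious_labels(labels))
# {'pod-security.kubernetes.io/enforece': 'pod-security.kubernetes.io/enforce'}
```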

3. RBAC

3.1 Four core objects

Object	Scope	Purpose
`Role`	namespaced	“In this ns, you can do these API verbs on these resources”
`ClusterRole`	cluster-wide	“Same kind of rules, not tied to a namespace; can also cover cluster-scoped resources (nodes, namespaces)”
`RoleBinding`	namespaced	“In this ns, give Role/ClusterRole to this subject”
`ClusterRoleBinding`	cluster-wide	“Across the whole cluster, give ClusterRole to this subject”

Subjects come in three kinds: User (a person, usually cert/OIDC), Group, ServiceAccount (an in-pod identity).

3.2 ServiceAccount = a pod’s identity

When a pod starts, kubelet auto-mounts an SA token at /var/run/secrets/kubernetes.io/serviceaccount/token. Code inside the pod uses this token when calling the K8s API; the API server identifies the SA and runs it through RBAC.

Every namespace has a default SA; pods that don’t set serviceAccountName use it.

You don’t need to actually run a pod to debug RBAC — kubectl auth can-i --as=system:serviceaccount:<ns>:<name> impersonates the SA and asks the API server directly. Production SRE’s first RBAC tool.

3.3 verb + resource + apiGroup triple

The core of a Role rule:

rules:
- apiGroups: [""]            # "" = core API group (pod/svc/cm/secret/ns/sa all here)
  resources: ["pods"]
  verbs: ["get", "list", "watch"]

Trap 1: apiGroups: [""] — empty string, not ["core"]. Pods and other core resources are in the unnamed group. Newer resources have group names (apps / networking.k8s.io / rbac.authorization.k8s.io / etc.).

Trap 2: the standard read-only trio is get / list / watch. get reads one object by name, list is a one-shot fetch of a collection, and watch is a long-lived subscription to changes (every K8s controller uses watches). Granting only list forces a controller to poll. All three together is “complete read-only.”
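Both traps are easy to see in a toy matcher for a single rule. This sketch is simplified: it ignores resourceNames, subresources and nonResourceURLs, and `rule_allows` is a made-up name rather than anything from a Kubernetes client library.

```python
def rule_allows(rule: dict, api_group: str, resource: str, verb: str) -> bool:
    """True when one RBAC rule covers the request. '*' is a wildcard.
    Simplified: ignores resourceNames, subresources and nonResourceURLs."""
    def covers(allowed: list[str], value: str) -> bool:
        return "*" in allowed or value in allowed

    return (covers(rule["apiGroups"], api_group)
            and covers(rule["resources"], resource)
            and covers(rule["verbs"], verb))


pod_reader = {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "watch"]}
wrong_group = {"apiGroups": ["core"], "resources": ["pods"], "verbs": ["get", "list", "watch"]}
list_only = {"apiGroups": [""], "resources": ["pods"], "verbs": ["list"]}

print(rule_allows(pod_reader, "", "pods", "list"))    # True
print(rule_allows(wrong_group, "", "pods", "list"))   # False: pods live in the "" group
print(rule_allows(list_only, "", "pods", "watch"))    # False: the controller would have to poll
```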

3.4 Experiment 3: SA + Role + RoleBinding

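---
# The ServiceAccount that the RoleBinding below refers to
apiVersion: v1
kind: ServiceAccount
metadata:
  name: reader
  namespace: tight
---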
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: tight
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-reader-binding
  namespace: tight
subjects:
- kind: ServiceAccount
  name: reader
  namespace: tight
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-reader

Verify:

kubectl auth can-i list pods    --as=system:serviceaccount:tight:reader -n tight     # yes
kubectl auth can-i delete pods  --as=system:serviceaccount:tight:reader -n tight     # no   (no delete verb)
kubectl auth can-i list pods    --as=system:serviceaccount:tight:reader -n default   # no   (Role is namespaced)
kubectl auth can-i list secrets --as=system:serviceaccount:tight:reader -n tight     # no   (only granted pods)

3.5 RBAC typo = fail-closed (vs PSA’s fail-open)

kubectl auth can-i list secretes ...    # typoed resource (extra 'e')
Warning: the server doesn't have a resource type 'secretes'
no                                       # still returns no

K8s warns but returns no. RBAC evaluation is “explicit allow, default deny” — no rule covers the typoed resource, so deny.

Side by side:

PSA label typo → fail-open (policy is inactive, pod passes) ❌ dangerous
RBAC resource typo → fail-closed (permission denied) ✅ safe

Why the asymmetry: labels are free-form key/value pairs, so K8s cannot tell a misspelled PSA label from an unrelated label; RBAC is an allow-list, so any request that no rule matches is denied. The default behavior determines what a typo costs you.

3.6 Four Role/Binding combinations

Two independent dimensions:

Where the permission is defined (Role / ClusterRole)
Where the binding has effect (RoleBinding / ClusterRoleBinding)

2 × 2 = 4 combinations:

Definition	Binding	Effective scope	Use case
Role	RoleBinding	This ns	Plain namespace permissions
ClusterRole	ClusterRoleBinding	Whole cluster	cluster admin / monitoring agent
ClusterRole	RoleBinding	This ns (scope down)	Reuse a built-in ClusterRole (view/edit), but constrain to one ns
Role	ClusterRoleBinding	❌ K8s rejects	A namespaced definition “bound cluster-wide” has no meaning

The most counterintuitive is combo 3: ClusterRole + RoleBinding. The ClusterRole looks cluster-scoped, but the RoleBinding scopes it down to one namespace. Why does this exist: so you can reuse K8s’s built-in commonly-needed ClusterRoles (view / edit / admin) without rewriting a Role for every namespace.
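The 2 × 2 table collapses into one rule: the binding decides the scope, and a ClusterRoleBinding only accepts a ClusterRole. A minimal sketch; `effective_scope` and its return strings are invented for illustration.

```python
from typing import Optional


def effective_scope(role_kind: str, binding_kind: str,
                    binding_ns: Optional[str] = None) -> str:
    """Where a (role, binding) pair takes effect, per the table above."""
    if binding_kind == "ClusterRoleBinding":
        if role_kind != "ClusterRole":
            # The API server refuses this combination.
            raise ValueError("ClusterRoleBinding can only reference a ClusterRole")
        return "cluster-wide"
    if binding_kind == "RoleBinding":
        if binding_ns is None:
            raise ValueError("RoleBinding needs a namespace")
        # Role or ClusterRole: either way the binding's namespace is the limit.
        return f"namespace {binding_ns}"
    raise ValueError(f"unknown binding kind: {binding_kind}")


print(effective_scope("Role", "RoleBinding", "tight"))         # namespace tight
print(effective_scope("ClusterRole", "ClusterRoleBinding"))    # cluster-wide
print(effective_scope("ClusterRole", "RoleBinding", "tight"))  # namespace tight (scope down)
```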

K8s built-in ClusterRoles, quick reference:

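view: read-only on most namespaced resources (no Secrets, no Roles or RoleBindings)
edit: read + write most resources, including Secrets (no access to Roles or RoleBindings)
admin: everything in edit plus managing Roles and RoleBindings inside the namespace (cannot write ResourceQuota or the namespace itself)
cluster-admin: truly everything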

3.7 Experiment 4: scope expansion

On top of the existing Role+RoleBinding, add a ClusterRole + ClusterRoleBinding (binding the same SA reader):

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-reader               # cluster scoped, no namespace field
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: pod-reader-cluster-binding
subjects:
- kind: ServiceAccount
  name: reader
  namespace: tight
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole              # ClusterRoleBinding must reference a ClusterRole
  name: pod-reader

Re-run the same four checks:

Check	Before CRB	After CRB
list pods in tight	yes	yes
delete pods in tight	no	no
list pods in default	no	yes ← changed
list secrets in tight	no	no

The third row flipping to yes proves the ClusterRoleBinding extends the permission to any namespace.

3.8 RBAC evaluation = union, not override

The single most important design point: when multiple bindings reference the same SA, permissions are a union.

Evaluating "can reader list pods in default?"
  ├ RoleBinding tight/pod-reader-binding: binding is in tight, asking about default → skip
  └ ClusterRoleBinding pod-reader-cluster-binding: applies cluster-wide → allow → YES

Any rule that allows → yes. No deny rules. No override semantics.

Why the design:

Multi-team cluster ownership — union lets each team add its own permissions without trampling others
The SA name isn’t a unique key — one SA being referenced by many bindings is normal

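The only way to take permissions away is to delete a binding or remove the rule from the role it references; RBAC itself has no deny. An explicit deny has to live outside RBAC: an admission webhook (Kyverno / OPA Gatekeeper) can reject write requests, but admission never sees reads (get / list / watch).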
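Union semantics fit in a single `any(...)`. This is a toy authorizer, not the API server's implementation: each `Binding` flattens a role's pod verbs into the binding itself, and `namespace=None` stands for a ClusterRoleBinding.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Binding:
    subject: str                 # e.g. "system:serviceaccount:tight:reader"
    namespace: Optional[str]     # None models a ClusterRoleBinding
    verbs: frozenset             # verbs granted on pods (the role's rules, flattened)


def can_i(bindings: list[Binding], subject: str, verb: str, namespace: str) -> bool:
    """Allow if ANY applicable binding grants the verb; there is no deny."""
    return any(
        b.subject == subject
        and b.namespace in (None, namespace)
        and verb in b.verbs
        for b in bindings
    )


READER = "system:serviceaccount:tight:reader"
READ = frozenset({"get", "list", "watch"})
rb = Binding(READER, "tight", READ)   # Role + RoleBinding from Experiment 3
crb = Binding(READER, None, READ)     # ClusterRole + ClusterRoleBinding from Experiment 4

print(can_i([rb], READER, "list", "default"))        # False
print(can_i([rb, crb], READER, "list", "default"))   # True: the second binding adds, nothing overrides
print(can_i([rb, crb], READER, "delete", "tight"))   # False: no binding grants delete
```

Dropping `crb` from the list is the model's version of deleting the ClusterRoleBinding.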

3.9 Experiment 5: deleting a binding to verify “only adds, never overrides”

kubectl delete clusterrolebinding pod-reader-cluster-binding
kubectl auth can-i list pods --as=system:serviceaccount:tight:reader -n default     # no   (rolled back)
kubectl auth can-i list pods --as=system:serviceaccount:tight:reader -n tight       # yes  (local RoleBinding still in place)

The pods [list get watch] line in --list output for the default namespace disappears, confirming that deleting the ClusterRoleBinding actually rescinds the permission.

3.10 RBAC audit cheat sheet

# What can this identity do right now?
kubectl auth can-i --list --as=system:serviceaccount:<ns>:<name> -n <ns>

# Which bindings reference this SA?
kubectl get rolebinding,clusterrolebinding -A -o json \
  | jq '.items[] | select(any(.subjects[]?; .kind=="ServiceAccount" and .name=="<sa>" and .namespace=="<ns>")) | {kind, name: .metadata.name, ns: .metadata.namespace, roleRef}'

# Reverse: who can do X? (needs the who-can plugin, installable via krew)
kubectl who-can list pods -n default

In can-i --list output the cloud of selfsubject* / /healthz / etc. entries is the baseline given to every authenticated user (via the system:authenticated group). Filter those out when auditing.
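The "which bindings reference this SA" lookup can also be done in Python over the JSON that `kubectl get rolebinding,clusterrolebinding -A -o json` returns. A sketch under assumptions: the `SAMPLE` document is hand-written for illustration, and the function only finds bindings that name the ServiceAccount directly, not grants that arrive through a Group subject.

```python
import json

# Hypothetical sample shaped like `kubectl get rolebinding,clusterrolebinding -A -o json`.
SAMPLE = json.loads("""
{"items": [
  {"kind": "RoleBinding",
   "metadata": {"name": "pod-reader-binding", "namespace": "tight"},
   "subjects": [{"kind": "ServiceAccount", "name": "reader", "namespace": "tight"}],
   "roleRef": {"kind": "Role", "name": "pod-reader"}},
  {"kind": "RoleBinding",
   "metadata": {"name": "mixed", "namespace": "default"},
   "subjects": [{"kind": "ServiceAccount", "name": "reader", "namespace": "other"},
                {"kind": "ServiceAccount", "name": "writer", "namespace": "tight"}],
   "roleRef": {"kind": "ClusterRole", "name": "view"}},
  {"kind": "ClusterRoleBinding",
   "metadata": {"name": "no-subjects"},
   "roleRef": {"kind": "ClusterRole", "name": "view"}}
]}
""")


def bindings_for_sa(doc: dict, sa: str, ns: str) -> list[dict]:
    """Bindings that name the ServiceAccount ns/sa directly. Each subject is
    checked as a whole, so name and namespace must match on the same entry."""
    hits = []
    for item in doc.get("items", []):
        subjects = item.get("subjects") or []
        if any(s.get("kind") == "ServiceAccount" and s.get("name") == sa
               and s.get("namespace") == ns for s in subjects):
            hits.append({
                "kind": item["kind"],
                "name": item["metadata"]["name"],
                "ns": item["metadata"].get("namespace"),
                "roleRef": item["roleRef"],
            })
    return hits


print([h["name"] for h in bindings_for_sa(SAMPLE, "reader", "tight")])
# ['pod-reader-binding']
```

The `mixed` binding is deliberately not a hit for tight/reader: one of its subjects has the right name and another the right namespace, but no single subject has both.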

4. Takeaways: seeds for the Stripe project

4.1 K8s security = “Linux mechanisms wrapped in declarative API”

K8s didn’t invent new security primitives. It packaged the existing Linux ones (capabilities, seccomp, AppArmor, namespaces) into pod spec fields:

securityContext.capabilities.drop  →  Day 5 cap layer
securityContext.seccompProfile     →  Day 3-4 seccomp filter
securityContext.appArmorProfile    →  Day 5 AppArmor profile
pod.spec.hostNetwork / hostPID     →  namespace isolation

If you understand Week 1’s kernel mechanisms, you can read a K8s pod spec and reason about exactly which kernel-level state is being requested.

4.2 fail-open vs fail-closed is a core design choice

PSA label typo fails open vs RBAC resource typo fails closed — the same K8s, two opposite trade-offs. Production hygiene: identify which points are fail-open, then put a stricter admission layer (webhook) on top of them.

4.3 Layered admission as defense in depth

PSA gates the spec; RBAC gates the caller; admission webhooks gate business policies that neither can express. This is Day 5’s five-layer sandbox idea translated to the K8s admission layer.

4.4 Extending LLM eval dimensions

When the Stripe project asks an LLM to generate K8s security configs, the evaluation axes generalize to:

PSA: pick the right profile (baseline vs restricted; don’t blindly apply restricted to everything and break images)
RBAC: least privilege; no verbs: ["*"], no resources: ["*"]
Cumulative binding effect: each binding looks small individually, but does their union add up to admin?
Fail-open traps: does a typoed label / nonexistent resource make the policy silently inactive?
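The least-privilege axis is the easiest one to automate. A minimal sketch of such a check over a list of RBAC rules given as dicts; `lint_rules` and its message format are hypothetical, and it covers only the two wildcards named above.

```python
def lint_rules(rules: list[dict]) -> list[str]:
    """Flag wildcard verbs or resources in a list of RBAC rules."""
    findings = []
    for i, rule in enumerate(rules):
        for field in ("verbs", "resources"):
            if "*" in rule.get(field, []):
                findings.append(f"rule {i}: wildcard in {field}")
    return findings


generated = [
    {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "watch"]},
    {"apiGroups": [""], "resources": ["*"], "verbs": ["*"]},
]
print(lint_rules(generated))
# ['rule 1: wildcard in verbs', 'rule 1: wildcard in resources']
```

The cumulative-binding axis needs more than a per-rule lint, since each rule can look harmless on its own.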

5. Takeaways

K8s = Linux security mechanisms wrapped in declarative spec — no new primitives
PSA: 3 profiles (privileged/baseline/restricted) × 3 modes (enforce/audit/warn), activated by namespace label
restricted requires 4 fields: runAsNonRoot / allowPrivilegeEscalation: false / drop ALL caps / seccompProfile
securityContext has two tiers: pod-level (shared) + container-level (per-process)
Passing admission ≠ actually starting — the image’s USER must also be non-root, or set runAsUser explicitly
PSA label typo fails open (vs RBAC fails closed), so back it with a Kyverno or OPA Gatekeeper policy
RBAC four objects: Role/ClusterRole + RoleBinding/ClusterRoleBinding
Of the four combinations, Role+CRB is rejected outright; ClusterRole+RoleBinding is the “scope-down” pattern for built-in ClusterRoles
RBAC evaluation is union, not override: permissions are only removed by deleting a binding or narrowing the role it references
kubectl auth can-i [--list] --as=... is the first-line RBAC debugging tool — no pod deployment required

6. Tomorrow’s preview (Day 2)

NetworkPolicy + Secrets. Continuing on today’s cluster + tight namespace:

Cluster is flat / fully connected by default; once a NetworkPolicy selects a pod, that pod becomes default-deny + explicit allow in the direction(s) the policy covers
Secret vs ConfigMap; env-mount vs volume-mount and the security difference
Same fail-open / fail-closed lens: what does a NetworkPolicy with no ingress rule actually do?