Week 2 Day 1: K8s security (Pod Security Standards + RBAC)
A first hands-on pass at Kubernetes-native security. On the PSA side: how Pod Security Admission (PSA) attaches to namespaces via labels (and silently fails open when the label is misspelled), why runAsNonRoot: true is necessary but not sufficient (the image’s USER still has to be non-root), and the four-field minimum a pod needs to clear the restricted profile. On the RBAC side: the full mental model of Role/ClusterRole × RoleBinding/ClusterRoleBinding as two independent dimensions that produce four combinations (one of which K8s rejects outright), why RBAC evaluation is union, not override, and the asymmetry that makes RBAC fail closed while PSA fails open.
0. Where this fits
Week 1 was Linux kernel-level security (syscalls, seccomp, AppArmor). Week 2 moves up to Kubernetes, which is essentially those same mechanisms wrapped in a declarative API. Today’s two pillars are K8s’s own API-server-side mechanisms (RBAC runs at the authorization step, PSA at the admission step):
- PSA (Pod Security Admission): constraints on the pod spec itself (the “what can run” layer)
- RBAC: constraints on who can call which K8s APIs (the “who is root” layer)
These two combined with the seccomp/AppArmor profiles (Day 3-4 of this week) are the full K8s security stack. Today is just the K8s-native half.
1. Prerequisite mental model: workload / network / security
1.1 Workload: declarative reconciliation
K8s’s essence is declarative + reconciliation loop: you declare “I want N pods like this,” and a controller keeps pulling reality toward that desired state.
| Resource | “I want…” | Typical use |
|---|---|---|
| Pod | One specific container group | Rarely directly |
| ReplicaSet | N replicas | Internal to Deployment |
| Deployment | N replicas + rolling update | ~90% of stateless services |
| StatefulSet | N ordered pods with stable identity | DB / Kafka |
| DaemonSet | One pod per node | Logging/monitoring agents |
| Job / CronJob | One-shot / scheduled tasks | Batch |
Key insight: a bare Pod has no controller watching it; delete it and it’s gone. Deployment → ReplicaSet → Pod is a three-layer ownership chain. During a rolling update the old and new ReplicaSets coexist, and rollback is just scaling the old ReplicaSet back up: no new ReplicaSet is created, so the switch is triggered immediately, though the replacement pods still need time to start.
1.2 Network: flat network + Service abstraction
K8s mandates three network rules:
- Every pod gets an IP
- All pods can talk to each other without NAT
- Nodes can also talk to pods directly
Implementation is delegated to CNI plugins (kind uses kindnet, production typically Calico/Cilium).
A Service is a stable virtual IP + load balancing in front of a set of pods. It doesn’t bind to a Deployment — it binds to a label. That’s why canary deployments are trivial: two Deployments share an app=backend label, the Service selects both, traffic flows in proportion to replica counts.
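A minimal sketch of the label-based binding described above. The names (backend, backend-canary, the track label, the image, and the port numbers) are illustrative assumptions, not taken from the original setup.

```yaml
# The Service selects on app=backend only, so it matches pods from BOTH Deployments.
apiVersion: v1
kind: Service
metadata:
  name: backend              # hypothetical name
spec:
  type: ClusterIP            # the default; shown for clarity
  selector:
    app: backend             # no "track" key here, so stable and canary pods both match
  ports:
  - port: 80                 # port exposed on the Service's virtual IP
    targetPort: 8080         # assumed container port
---
# Canary Deployment: shares app=backend, adds its own track label.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend-canary       # hypothetical name
spec:
  replicas: 1                # 1 canary pod next to 9 stable pods gets roughly 10% of traffic
  selector:
    matchLabels:
      app: backend
      track: canary          # keeps this Deployment from adopting the stable pods
  template:
    metadata:
      labels:
        app: backend
        track: canary
    spec:
      containers:
      - name: app
        image: example.com/backend:v2   # placeholder image
        ports:
        - containerPort: 8080
```

The stable Deployment would look the same with track: stable, the older image, and a higher replica count. Shifting traffic is then a matter of changing the two replica counts.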
The three Service types are layered, not alternatives:
- ClusterIP (default, 90% of cases) — reachable inside the cluster only
- NodePort — opens an extra port (30000-32767) on every node (rare in prod)
- LoadBalancer — stacks a cloud LB on top of NodePort
Production is overwhelmingly ClusterIP, with 1-2 LoadBalancer Services exposing an Ingress Controller, which then does L7 routing back to ClusterIP Services.
1.3 Security: multiple independent admission dimensions
K8s security isn’t a single mechanism — it’s multiple independent dimensions in the request pipeline:
┌──────────────────────────┐
│ kubectl apply Pod YAML   │
└──────────┬───────────────┘
           ▼
┌──────────────────────────┐
│ Authentication           │  "who are you" (cert/token)
└──────────┬───────────────┘
           ▼
┌──────────────────────────┐
│ RBAC                     │  "are you allowed to create Pods?"
└──────────┬───────────────┘
           ▼
┌──────────────────────────┐
│ Admission (PSA + webhook)│  "is this Pod spec compliant?"
└──────────┬───────────────┘
           ▼
  persist to etcd → scheduling → runtime
PSA and RBAC are different dimensions:
- PSA looks at the pod spec itself (“what is this pod going to run?”)
- RBAC looks at the caller (“does this caller have permission to submit that spec?”)
Same complementary structure as Day 5’s five-layer sandbox.
2. Pod Security Standards and Pod Security Admission (PSA)
2.1 Three preset profiles
PSA is the built-in admission controller (GA in K8s 1.25, the release that removed the legacy PodSecurityPolicy). It enforces the three profiles defined by the Pod Security Standards:
| Profile | What it blocks | Typical use |
|---|---|---|
| privileged | Nothing | System DaemonSets (kube-proxy, etc.) |
| baseline | Most common misconfigs (hostNetwork/hostPath/CAP_SYS_ADMIN, etc.) | A reasonable floor for general apps |
| restricted | Production-hardened (non-root, seccomp, drop all caps required) | Strict production workloads |
2.2 Activation: namespace labels
PSA isn’t a cluster-wide switch — it’s activated per namespace via labels:
kubectl label namespace tight pod-security.kubernetes.io/enforce=restricted
Three modes that stack:
- enforce: reject violations at admission time
- audit: record the violation in the audit log, allow the pod
- warn: send a kubectl warning, allow the pod
Production rollout pattern: audit to observe → warn to nudge users → enforce to enforce. Same pattern as Day 5’s AppArmor complain → enforce workflow.
There’s also a *-version label that pins to a specific K8s version (pod-security.kubernetes.io/enforce-version=v1.30) so upstream tightening of the standard doesn’t break already-compliant pods.
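The stacking of modes and the version pin can be written declaratively instead of with kubectl label. The manifest below is a sketch of a namespace that is partway through the rollout pattern; the namespace name staging-apps and the choice of baseline for enforce are assumptions made for illustration.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: staging-apps         # hypothetical namespace
  labels:
    # Reject anything below baseline right away.
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/enforce-version: v1.30
    # Observe and nudge toward restricted before enforcing it.
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: v1.30
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: v1.30
```

Once the audit log and the warnings are quiet, changing the enforce value to restricted completes the rollout.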
2.3 Experiment 1: restricted rejects a violating pod
apiVersion: v1
kind: Pod
metadata:
  name: bad
spec:
  containers:
  - name: c
    image: busybox
    command: ["sleep", "3600"]
Apply to the tight namespace → rejected by restricted, with the error listing every violation:
violates PodSecurity "restricted:latest":
- allowPrivilegeEscalation != false
- unrestricted capabilities (must drop=["ALL"])
- runAsNonRoot != true
- seccompProfile not set (must be RuntimeDefault or Localhost)
This error message is the restricted compliance checklist — clearer than the docs table.
2.4 Experiment 2: edit the spec to pass
The four required fields:
apiVersion: v1
kind: Pod
metadata:
  name: good
spec:
  securityContext:            # pod-level
    runAsNonRoot: true
    runAsUser: 1000           # busybox image USER=root, must override explicitly
  containers:
  - name: c
    image: busybox
    command: ["sleep", "3600"]
    securityContext:          # container-level
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
      seccompProfile:
        type: RuntimeDefault
2.5 Two-tier securityContext
The securityContext field placement follows a rule:
| Field | pod-level | container-level | Why |
|---|---|---|---|
| runAsUser / runAsNonRoot / seccompProfile | ✅ | ✅ | The pod-level value is the default for every container; a container can set its own |
| fsGroup | ✅ | ❌ | Applies to the pod’s volumes, which all containers share |
| allowPrivilegeEscalation / capabilities / readOnlyRootFilesystem | ❌ | ✅ | Per-process / per-container semantics |
Container-level overrides pod-level.
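A small sketch of that override rule, assuming a hypothetical pod named two-tier with two busybox containers. The UIDs 1000 and 2000 are arbitrary.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: two-tier             # hypothetical name
spec:
  securityContext:           # pod-level defaults, inherited by every container
    runAsNonRoot: true
    runAsUser: 1000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: inherits
    image: busybox
    command: ["sh", "-c", "id -u; sleep 3600"]   # prints the UID it runs as (pod-level 1000)
    securityContext:         # container-only fields still have to be set here
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
  - name: overrides
    image: busybox
    command: ["sh", "-c", "id -u; sleep 3600"]   # prints the UID it runs as (container-level 2000)
    securityContext:
      runAsUser: 2000        # overrides the pod-level runAsUser for this container only
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
```

Comparing kubectl logs two-tier -c inherits with kubectl logs two-tier -c overrides shows which tier won for each container.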
2.6 Trap: passing admission ≠ actually starting
PSA only inspects the pod spec; it never inspects the image. runAsNonRoot: true still requires the image itself to have a non-root USER — otherwise kubelet refuses to start the container (not PSA — kubelet), and the pod enters CreateContainerConfigError. Two fixes:
- Add runAsUser: 1000 to override the image’s USER
- Switch to a non-root image (e.g. nginxinc/nginx-unprivileged)
Two layered failure modes in production:
- Admission rejects the spec (PSA error) — explicit
- Runtime rejects starting (image incompatible with non-root) — more subtle
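The second fix can be sketched as follows. This assumes the nginxinc/nginx-unprivileged image declares a numeric non-root USER and listens on port 8080; both are worth checking against the image you actually pull. The pod name is hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nonroot-image        # hypothetical name
spec:
  securityContext:
    runAsNonRoot: true       # no runAsUser: rely on the image's own non-root USER
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: web
    image: nginxinc/nginx-unprivileged   # assumed: numeric non-root USER, listens on 8080
    ports:
    - containerPort: 8080
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
```

One caveat: kubelet can only verify a numeric UID. If an image declares its USER by name, runAsNonRoot cannot be checked and the container is refused, so setting runAsUser explicitly is the safer choice for such images.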
2.7 PSA label typo = fail-open
kubectl label namespace tight pod-security.kubernetes.io/enforece=restricted # note the extra 'e'
PSA only recognizes the exact label key — typo = silently inactive = violating pods pass. No warning at all.
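This is K8s’s fail-open trade-off: a namespace without a recognized PSA label falls back to the cluster default, which is privileged unless the admission configuration says otherwise, so existing clusters didn’t break when 1.25 shipped. Cost: typos pass silently.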
Diagnostic trick: kubectl get ns -L pod-security.kubernetes.io/enforce promotes the label into a column — typoed namespaces have an empty cell.
Production answer: write a cluster policy with Kyverno or OPA Gatekeeper requiring “every namespace must have a valid PSA label.” That’s Week 2 Day 5.
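As a preview of what such a policy could look like, here is a hedged Kyverno sketch. The policy and rule names are hypothetical, the field layout follows the kyverno.io/v1 ClusterPolicy schema as commonly documented, and details such as where the failure action is set differ between Kyverno releases, so treat it as a starting point and not a drop-in policy.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-psa-enforce-label      # hypothetical name
spec:
  validationFailureAction: Enforce     # newer Kyverno releases prefer a per-rule failure action
  background: true                     # also report on namespaces that already exist
  rules:
  - name: psa-enforce-label-present    # hypothetical rule name
    match:
      any:
      - resources:
          kinds:
          - Namespace
    validate:
      message: "Every namespace must set pod-security.kubernetes.io/enforce to a valid profile."
      pattern:
        metadata:
          labels:
            # A misspelled key leaves this exact key missing, so the pattern fails.
            pod-security.kubernetes.io/enforce: "privileged | baseline | restricted"
```

With a policy like this, the typo case stops being silent: the namespace create or update is rejected because the correctly spelled key is absent. System namespaces would probably need an exclusion in a real cluster.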
3. RBAC
3.1 Four core objects
| Object | Scope | Purpose |
|---|---|---|
| Role | namespaced | “In this ns, you can do these API verbs on these resources” |
| ClusterRole | cluster-wide | Same kind of rule set, but reusable from any ns and able to cover cluster-scoped resources (nodes, namespaces, ...) |
| RoleBinding | namespaced | “In this ns, give Role/ClusterRole to this subject” |
| ClusterRoleBinding | cluster-wide | “Across the whole cluster, give ClusterRole to this subject” |
Subjects come in three kinds: User (a person, usually cert/OIDC), Group, ServiceAccount (an in-pod identity).
3.2 ServiceAccount = a pod’s identity
When a pod starts, kubelet auto-mounts an SA token at /var/run/secrets/kubernetes.io/serviceaccount/token. Code inside the pod uses this token when calling the K8s API; the API server identifies the SA and runs it through RBAC.
Every namespace has a default SA; pods that don’t set serviceAccountName use it.
You don’t need to actually run a pod to debug RBAC — kubectl auth can-i --as=system:serviceaccount:<ns>:<name> impersonates the SA and asks the API server directly. Production SRE’s first RBAC tool.
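For completeness, a sketch of creating a ServiceAccount and attaching it to a pod. The SA name reader and the namespace tight match the identity impersonated in the experiments; the pod name api-client is hypothetical, and the securityContext fields are there only so the pod would pass a restricted namespace.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: reader
  namespace: tight
---
apiVersion: v1
kind: Pod
metadata:
  name: api-client           # hypothetical name
  namespace: tight
spec:
  serviceAccountName: reader            # omit this and the pod runs as tight/default
  automountServiceAccountToken: true    # the default; set false for pods that never call the API
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: c
    image: busybox
    command: ["sleep", "3600"]
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
```

Setting automountServiceAccountToken: false on workloads that never talk to the API server removes the token from the container entirely, which is a cheap way to shrink what a compromised pod can do.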
3.3 verb + resource + apiGroup triple
The core of a Role rule:
rules:
- apiGroups: [""]            # "" = core API group (pod/svc/cm/secret/ns/sa all here)
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
Trap 1: apiGroups: [""] — empty string, not ["core"]. Pods and other core resources are in the unnamed group. Newer resources have group names (apps / networking.k8s.io / rbac.authorization.k8s.io / etc.).
Trap 2: the standard read-only trio is get / list / watch. list is a one-shot fetch, watch is a long-lived subscription to changes (every K8s controller uses watches). Only granting list forces a controller to poll. All three together is “complete read-only.”
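Both traps show up as soon as a Role spans more than one API group. The sketch below uses a hypothetical Role named workload-reader to show a core-group rule, a named-group rule, and a subresource rule side by side.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: workload-reader      # hypothetical name
  namespace: tight
rules:
- apiGroups: [""]                      # core group: pods, services, configmaps, ...
  resources: ["pods", "services"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]                  # named group: deployments, statefulsets, ...
  resources: ["deployments"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["pods/log"]              # subresources are granted separately from "pods"
  verbs: ["get"]
```

Listing deployments under apiGroups: [""] would be accepted by the API server but would never match a request, so the mistake shows up as a permission that is simply missing.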
3.4 Experiment 3: SA + Role + RoleBinding
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: tight
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-reader-binding
  namespace: tight
subjects:
- kind: ServiceAccount
  name: reader
  namespace: tight
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-reader
Verify:
kubectl auth can-i list pods --as=system:serviceaccount:tight:reader -n tight # yes
kubectl auth can-i delete pods --as=system:serviceaccount:tight:reader -n tight # no (no delete verb)
kubectl auth can-i list pods --as=system:serviceaccount:tight:reader -n default # no (Role is namespaced)
kubectl auth can-i list secrets --as=system:serviceaccount:tight:reader -n tight # no (only granted pods)
3.5 RBAC typo = fail-closed (vs PSA’s fail-open)
kubectl auth can-i list secretes ... # typoed resource (extra 'e')
Warning: the server doesn't have a resource type 'secretes'
no # still returns no
K8s warns but returns no. RBAC evaluation is “explicit allow, default deny” — no rule covers the typoed resource, so deny.
Side by side:
- PSA label typo → fail-open (policy is inactive, pod passes) ❌ dangerous
- RBAC resource typo → fail-closed (permission denied) ✅ safe
Why the asymmetry: labels are free text, so K8s can’t know that an unrecognized key was meant to be a PSA label; RBAC is an allow-list, so a request that matches no rule is denied. The default behavior determines what a typo costs you.
3.6 Four Role/Binding combinations
Two independent dimensions:
- Where the permission is defined (Role / ClusterRole)
- Where the binding has effect (RoleBinding / ClusterRoleBinding)
2 × 2 = 4 combinations:
| Definition | Binding | Effective scope | Use case |
|---|---|---|---|
| Role | RoleBinding | This ns | Plain namespace permissions |
| ClusterRole | ClusterRoleBinding | Whole cluster | cluster admin / monitoring agent |
| ClusterRole | RoleBinding | This ns (scope down) | Reuse a built-in ClusterRole (view/edit), but constrain to one ns |
| Role | ClusterRoleBinding | ❌ K8s rejects | A namespaced definition “bound cluster-wide” has no meaning |
The most counterintuitive is combo 3: ClusterRole + RoleBinding. The ClusterRole looks cluster-scoped, but the RoleBinding scopes it down to one namespace. Why does this exist: so you can reuse K8s’s built-in commonly-needed ClusterRoles (view / edit / admin) without rewriting a Role for every namespace.
K8s built-in ClusterRoles, quick reference:
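- view: read-only on most namespaced resources (no Secrets, no Roles/RoleBindings)
- edit: read + write on most namespaced resources, Secrets included (no Roles/RoleBindings)
- admin: edit plus managing Roles and RoleBindings inside the namespace (cannot change resource quotas or the namespace itself)
- cluster-admin: truly everything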
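The scope-down combination (ClusterRole + RoleBinding) can be sketched like this, reusing the built-in view ClusterRole for the reader SA. The binding name reader-view is hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: reader-view          # hypothetical name
  namespace: tight           # the binding's namespace is what limits the scope
subjects:
- kind: ServiceAccount
  name: reader
  namespace: tight
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole          # cluster-wide definition...
  name: view                 # ...built in, reused as-is
```

With only this binding in place, kubectl auth can-i list configmaps --as=system:serviceaccount:tight:reader should answer yes with -n tight and no with -n default, even though the rules come from a ClusterRole.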
3.7 Experiment 4: scope expansion
On top of the existing Role+RoleBinding, add a ClusterRole + ClusterRoleBinding (binding the same SA reader):
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-reader           # cluster scoped, no namespace field
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: pod-reader-cluster-binding
subjects:
- kind: ServiceAccount
  name: reader
  namespace: tight
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole          # ClusterRoleBinding must reference a ClusterRole
  name: pod-reader
Re-run the same four checks:
| Check | Before CRB | After CRB |
|---|---|---|
| list pods in tight | yes | yes |
| delete pods in tight | no | no |
| list pods in default | no | yes ← changed |
| list secrets in tight | no | no |
The third row flipping to yes proves the ClusterRoleBinding extends the permission to any namespace.
3.8 RBAC evaluation = union, not override
The single most important design point: when multiple bindings reference the same SA, permissions are a union.
Evaluating "can reader list pods in default?"
├ RoleBinding tight/pod-reader-binding: binding is in tight, asking about default → skip
└ ClusterRoleBinding pod-reader-cluster-binding: applies cluster-wide → allow → YES
Any rule that allows → yes. No deny rules. No override semantics.
Why the design:
- Multi-team cluster ownership — union lets each team add its own permissions without trampling others
- The SA name isn’t a unique key — one SA being referenced by many bindings is normal
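The only way to take permissions away is to remove the grant: delete the binding, drop the subject from it, or trim the role’s rules. RBAC has no deny rule. Blocking a specific write despite an allow takes an admission webhook (Kyverno / OPA Gatekeeper) outside RBAC; admission only sees write requests, so it cannot deny reads such as get / list / watch.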
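The union behavior also applies within a single namespace. As a sketch, a second team could add the following Role and RoleBinding for the same reader SA without touching pod-reader; the names configmap-reader and configmap-reader-binding are hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: configmap-reader     # hypothetical name
  namespace: tight
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: configmap-reader-binding   # hypothetical name
  namespace: tight
subjects:
- kind: ServiceAccount
  name: reader               # same subject as pod-reader-binding
  namespace: tight
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: configmap-reader
```

After applying it, reader can list both pods and configmaps in tight. Neither binding mentions the other, which is also why auditing has to look at every binding that names a subject, not at any single one.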
3.9 Experiment 5: deleting a binding to verify “only adds, never overrides”
kubectl delete clusterrolebinding pod-reader-cluster-binding
kubectl auth can-i list pods --as=system:serviceaccount:tight:reader -n default # no (rolled back)
kubectl auth can-i list pods --as=system:serviceaccount:tight:reader -n tight # yes (local RoleBinding still in place)
The pods [list get watch] line in --list output for the default namespace disappears, confirming that deleting the ClusterRoleBinding actually rescinds the permission.
3.10 RBAC audit cheat sheet
# What can this identity do right now?
kubectl auth can-i --list --as=system:serviceaccount:<ns>:<name> -n <ns>
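# Which bindings reference this SA directly? (grants made via groups such as system:serviceaccounts:<ns> are not matched)
# any(...) tests kind, name and namespace on the SAME subject entry and emits each binding at most once
kubectl get rolebinding,clusterrolebinding -A -o json \
  | jq '.items[] | select(any(.subjects[]?; .kind=="ServiceAccount" and .name=="<sa>" and .namespace=="<ns>")) | {kind, name: .metadata.name, ns: .metadata.namespace, roleRef}'
# Reverse: who can do X? (needs the who-can krew plugin)
kubectl who-can list pods -n default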
In can-i --list output the cloud of selfsubject* / /healthz / etc. entries is the baseline given to every authenticated user (via the system:authenticated group). Filter those out when auditing.
4. Takeaways: seeds for the Stripe project
4.1 K8s security = “Linux mechanisms wrapped in declarative API”
K8s didn’t invent new security primitives. It packaged the existing Linux ones (capabilities, seccomp, AppArmor, namespaces) into pod spec fields:
securityContext.capabilities.drop → Day 5 cap layer
securityContext.seccompProfile → Day 3-4 seccomp filter
securityContext.appArmorProfile → Day 5 AppArmor profile
pod.spec.hostNetwork / hostPID → namespace isolation
If you understand Week 1’s kernel mechanisms, you can read a K8s pod spec and reason about exactly which kernel-level state is being requested.
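Read as a single manifest, that mapping could look like the annotated sketch below. The pod name is hypothetical, the appArmorProfile field assumes K8s 1.30 or later and nodes with AppArmor enabled (drop that block otherwise), and the comments describe the kernel-level intent, not a guarantee about any particular runtime.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: annotated            # hypothetical name
spec:
  hostNetwork: false         # stay in the pod's own network namespace (the default)
  hostPID: false             # stay in the pod's own PID namespace (the default)
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    seccompProfile:
      type: RuntimeDefault   # seccomp filter: the container runtime's default syscall filter
    appArmorProfile:         # assumes K8s 1.30+ and AppArmor on the node
      type: RuntimeDefault
  containers:
  - name: c
    image: busybox
    command: ["sleep", "3600"]
    securityContext:
      allowPrivilegeEscalation: false   # sets no_new_privs on the container process
      capabilities:
        drop: ["ALL"]                   # start with no Linux capabilities
```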
4.2 fail-open vs fail-closed is a core design choice
PSA label typo fails open vs RBAC resource typo fails closed — the same K8s, two opposite trade-offs. Production hygiene: identify which points are fail-open, then put a stricter admission layer (webhook) on top of them.
4.3 Layered admission as defense in depth
PSA gates the spec; RBAC gates the caller; admission webhooks gate business policies that neither can express. This is Day 5’s five-layer sandbox idea translated to the K8s admission layer.
4.4 Extending LLM eval dimensions
When the Stripe project asks an LLM to generate K8s security configs, the evaluation axes generalize to:
- PSA: pick the right profile (baseline vs restricted — don’t blindly restricted everything and break images)
- RBAC: least privilege; no verbs: ["*"], no resources: ["*"]
- Cumulative binding effect: each binding looks small individually, but does their union add up to admin?
- Fail-open traps: does a typoed label / nonexistent resource make the policy silently inactive?
5. Takeaways
- K8s = Linux security mechanisms wrapped in declarative spec — no new primitives
- PSA: 3 profiles (privileged/baseline/restricted) × 3 modes (enforce/audit/warn), activated by namespace label
- restricted requires 4 fields: runAsNonRoot / allowPrivilegeEscalation: false / drop ALL caps / seccompProfile
- securityContext has two tiers: pod-level (shared) + container-level (per-process)
- Passing admission ≠ actually starting: the image’s USER must also be non-root, or set runAsUser explicitly
- PSA label typo fails open (vs RBAC fails closed), so it must be backed by Kyverno enforcement
- RBAC four objects: Role/ClusterRole + RoleBinding/ClusterRoleBinding
- Of the four combinations, Role+CRB is rejected outright; ClusterRole+RoleBinding is the “scope-down” pattern for built-in ClusterRoles
- RBAC evaluation is union, not override; the only way to take permissions away is to remove a grant (e.g. delete a binding)
- kubectl auth can-i [--list] --as=... is the first-line RBAC debugging tool; no pod deployment required
6. Tomorrow’s preview (Day 2)
NetworkPolicy + Secrets. Continuing on today’s cluster + tight namespace:
- Cluster is flat / fully connected by default; once a NetworkPolicy selects a pod, that pod becomes default-deny + explicit allow (for the traffic directions the policy covers)
- Secret vs ConfigMap; env-mount vs volume-mount and the security difference
- Same fail-open / fail-closed lens: what does a NetworkPolicy with no ingress rule actually do?