Week2 Day 4: K8s security — AppArmor
AppArmor as the path-based complement to seccomp’s syscall-based filtering — and a hands-on lesson in why that complement is harder to deploy. This post covers how K8s wraps AppArmor (securityContext.appArmorProfile, symmetric with seccomp’s three types), why an AppArmor profile is heavier than a seccomp one (it must be pre-loaded into the node’s kernel via apparmor_parser, not just dropped as a file), why deny /data/** w blocks every syscall that writes there (closing the hole where touch bypassed yesterday’s mkdir block), and the day’s most valuable lesson: the experiment couldn’t run at all because a Mac/Docker-Desktop kind node has no AppArmor LSM in its kernel — a firsthand encounter with the fail-open trap and the portability problem of AppArmor-based mitigations.
Environment note: today’s real enforcement experiment couldn’t be done hands-on. The kind node under Mac + Docker Desktop has no AppArmor LSM in its kernel (/sys/module/apparmor doesn’t exist). AppArmor depends on the host kernel having the LSM compiled in, and Docker Desktop’s slim Linux VM kernel doesn’t. So this post is concept-focused plus a full contrast with seccomp, with real enforcement verification deferred to an Ubuntu environment. This “can’t do it” is itself one of today’s most important lessons (the environment-dependency / portability problem of AppArmor mitigations).
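The check that exposed the problem can be scripted so it runs before any experiment. A minimal sketch, assuming the usual sysfs location of the AppArmor module parameter; the function name and its optional root-prefix argument are made up here for illustration and testing:

```shell
# Report whether the running kernel has the AppArmor LSM enabled.
# The optional first argument is a root prefix (hypothetical, only useful
# for pointing the check at a fake tree); it defaults to the real filesystem.
apparmor_enabled() {
    root="${1:-}"
    flag="$root/sys/module/apparmor/parameters/enabled"
    # The module parameter reads "Y" when AppArmor is built in and enabled.
    [ -r "$flag" ] && [ "$(cat "$flag")" = "Y" ]
}

if apparmor_enabled; then
    echo "AppArmor: enabled"
else
    echo "AppArmor: not available on this kernel"
fi
```

On a kind node this would be run inside the node container (for example via docker exec), since the node's kernel is what matters, not the laptop's.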
0. Where this fits
Day 3 was seccomp (block by syscall); today is AppArmor (block by path) — the last kernel security mechanism of Week 2, forming a complete contrast with seccomp. The core: seccomp blocks “the kind of action” (syscall), AppArmor blocks “the target of the action” (path) — their weaknesses are orthogonal and they compose complementarily.
Week 1 Day 5 already covered AppArmor’s kernel mechanics (LSM hook, profile syntax, enforce/complain, path-based weakness); today’s focus is “how K8s wraps it” + “the engineering lesson of environment dependency.”
1. AppArmor’s K8s integration
1.1 Recap (Week 1 Day 5 bridge)
AppArmor = an LSM (Linux Security Module) intercepting at kernel LSM hooks, doing path-based MAC. The profile attaches at execve time (the bprm_creds_for_exec hook on current kernels), keyed by binary path.
Week 1 key points:
- path-based (authorizes by file path, not by syscall)
- intercepts at LSM hooks, sees the resolved path (seccomp’s blind spot)
- enforce (block) / complain (log only) modes
- path-based weakness: hardlink / bind mount can bypass
1.2 The core split with seccomp
| | seccomp | AppArmor |
|---|---|---|
| Blocks | syscall number + integer args | resolved path / network |
| Example | “forbid the mkdir syscall” | “forbid writing any file under /data” |
| Weakness | syscall-variant bypass (touch uses openat to bypass a mkdir block) | path-remap bypass (hardlink/bind mount) |
| Blind spot | can’t dereference pointers (can’t see path content) | can’t see namespace transitions |
| Portability | present in any Linux kernel | depends on the host kernel having the LSM enabled |
Core mental model: seccomp blocks “which syscall you use,” AppArmor blocks “which path you touch.” Yesterday touch (openat) bypassed the mkdir seccomp block, but with AppArmor blocking writes to /data, touch is blocked too (regardless of syscall, it only looks at the path).
A precision point: seccomp can see integer args (it can distinguish socket(AF_INET) vs socket(AF_UNIX)), it just can’t see pointer-dereferenced content (string paths, structs). So “block socket by protocol family” is doable with seccomp, “block open by path” needs AppArmor. Which layer a given attack’s mitigation goes to depends on “is the arg you want to filter an integer or a pointer.”
1.3 The K8s field for attaching a profile
New (field added in 1.30, GA in 1.31): securityContext
spec:
  securityContext:
    appArmorProfile:
      type: Localhost                    # or RuntimeDefault / Unconfined
      localhostProfile: k8s-deny-data    # name of a profile already loaded on the node
Three types symmetric with seccomp: RuntimeDefault (the container runtime’s default profile, e.g. docker-default under Docker or cri-containerd.apparmor.d under containerd) / Localhost (custom) / Unconfined (none).
Old (pre-1.30): annotation (deprecated, seen in old YAML)
metadata:
  annotations:
    container.apparmor.security.beta.kubernetes.io/<container>: localhost/k8s-deny-data
The annotation is set per container (the container name is part of the key); the securityContext field can be set at pod level as a default and overridden per container, so the new form is cleaner.
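When migrating old YAML, the annotation value maps onto the new fields mechanically. A small sketch of that mapping, assuming the three documented annotation value forms (runtime/default, localhost/<name>, unconfined); the function name is invented for illustration:

```shell
# Translate a legacy AppArmor annotation value into the equivalent
# securityContext.appArmorProfile fields, printed as YAML lines.
annotation_to_field() {
    case "$1" in
        runtime/default) printf 'type: RuntimeDefault\n' ;;
        unconfined)      printf 'type: Unconfined\n' ;;
        localhost/?*)    printf 'type: Localhost\nlocalhostProfile: %s\n' "${1#localhost/}" ;;
        *)               echo "unrecognized annotation value: $1" >&2; return 1 ;;
    esac
}

annotation_to_field localhost/k8s-deny-data
```

Rejecting unknown values instead of guessing keeps a typo from turning into a silently unconfined container.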
1.4 The key difference: how a profile gets to the node (heavier than seccomp)
- seccomp Localhost: a JSON file placed at /var/lib/kubelet/seccomp/, runc reads and compiles it into cBPF. The file is the profile.
- AppArmor Localhost: the profile must first be loaded into the node’s kernel via apparmor_parser, becoming a named profile; the pod only references that name already in the kernel.
Flow:
1. Write the profile text (AppArmor syntax)
2. On the node, run apparmor_parser -r profile.txt → load into the kernel
3. The kernel now has a named profile (e.g. "k8s-deny-data")
4. The pod spec references it via localhostProfile: k8s-deny-data
5. runc, at exec, attaches the process to that in-kernel profile
Architectural root cause: a seccomp filter is per-process and self-carried (runc installs it on the about-to-exec process); an AppArmor profile is a kernel-global named object (registered into the kernel first, processes attach to it). This gives AppArmor an extra “pre-load into kernel” step, and requires the node kernel to have the AppArmor LSM enabled — exactly what the kind node lacks.
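The pre-load step and its verification can be sketched as two small helpers. This assumes a node with AppArmor enabled, root access, and the securityfs profile list at its usual path; the function names and the optional list-path argument are illustrative, not part of any tool:

```shell
# Load (or replace) a profile into the kernel. Needs root and the
# apparmor_parser tool on the node; defined here but not invoked.
load_profile() {
    apparmor_parser -r "$1"
}

# Check whether a named profile appears in the kernel's loaded-profile list.
# Each line of that list has the form "<name> (<mode>)". The second argument
# (hypothetical, for testing) overrides the list location.
profile_loaded() {
    list="${2:-/sys/kernel/security/apparmor/profiles}"
    [ -r "$list" ] && grep -q "^$1 (" "$list"
}

if profile_loaded k8s-deny-data; then
    echo "k8s-deny-data is loaded"
else
    echo "k8s-deny-data is not loaded (or AppArmor is unavailable)"
fi
```

Because the profile lives in each node's kernel, both helpers would have to run on every node the pod might be scheduled to, which is the operational weight the seccomp file approach avoids.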
2. Profile syntax + enforce/complain + the fail-open trap
2.1 A K8s-scenario profile
The syntax is Week 1 Day 5’s (same underlying AppArmor). “Deny writing /data”:
#include <tunables/global>
profile k8s-deny-data flags=(attach_disconnected) {
  #include <abstractions/base>
  file,                 # allow file access by default
  network,              # allow network
  deny /data/** w,      # core: block all writes to /data
  deny /data/ w,
}
- the profile name k8s-deny-data = what localhostProfile references in the pod spec
- flags=(attach_disconnected) is common in containers: in a container’s mount namespace AppArmor can’t always resolve a complete (“connected”) path; this flag lets it still handle disconnected paths
- deny /data/** w is path-based: regardless of mkdir/openat/mknod, any write under /data is blocked, so one line closes the hole touch bypassed yesterday
2.2 enforce vs complain (Week 1 Day 5 bridge)
- enforce: violation denied (EACCES/EPERM) + audit log
- complain: violation allowed, audit log only (apparmor="ALLOWED")
complain is for developing a profile (observe which paths the app touches, then tighten); enforce is for production. It parallels seccomp’s SCMP_ACT_LOG and follows the same allowlist workflow: run real traffic in log-only mode first to collect what the app needs, then observe → converge → enforce.
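During the observe phase, the useful number is how many events a profile produced in each mode. A rough sketch that counts them from kernel log text on stdin, assuming the audit lines carry apparmor="…" and profile="…" fields; the function name is made up:

```shell
# Count AppArmor audit events of one kind ("DENIED" or "ALLOWED") for a
# given profile, reading kernel log text from stdin.
count_events() {
    awk -v k="apparmor=\"$1\"" -v p="profile=\"$2\"" \
        'index($0, k) && index($0, p) { n++ } END { print n + 0 }'
}

# Typical use on a node (needs permission to read the kernel log):
#   dmesg | count_events ALLOWED k8s-deny-data   # complain-mode hits
#   dmesg | count_events DENIED  k8s-deny-data   # enforce-mode blocks
```

A complain-mode count that stays at zero under realistic traffic is the signal that the profile has converged and can be switched to enforce.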
2.3 Today’s most important lesson: fail-open when the node doesn’t support it
fail-open (allow when the mechanism is missing, favors availability) vs fail-closed (deny when missing, favors security).
Scenario: a pod spec requires an AppArmor profile, but the node it’s scheduled to has no AppArmor in its kernel (like the kind node):
- fail-open (pre-1.30 behavior): the pod starts normally, the profile silently doesn’t take effect → you think there’s protection, but it’s actually wide open, with no error
- fail-closed (1.30+ securityContext): K8s detects the node doesn’t support it and refuses to start the pod → you find out immediately
Why fail-open is dangerous: the problem isn’t the missing protection itself, but that you believe it’s there and relax other defenses accordingly. That false sense of security is worse than having no protection configured at all (with none, you know to be careful; with fail-open, you’re oblivious).
The K8s community migrated AppArmor from annotation (fail-open) to securityContext (fail-closed) precisely because “a security mechanism silently failing” is the most dangerous class of bug.
This is the same trap running through all of Week 2:
- Day 1: PSA label typo enforece → policy silently inactive → fail-open
- Day 2: kindnet not enforcing NetworkPolicy → policy inert → fail-open
- Day 4: node without AppArmor → profile silently gone (old behavior) → fail-open
All three are “you declared a security control, but the underlying environment made it silently fail, and you don’t know.”
Trade-off: fail-closed has a cost too (pods that can’t start hurt availability). Security-sensitive scenarios pick fail-closed (AppArmor is right to); availability-sensitive ones might pick fail-open (e.g. if a rate limiter dies, you might rather let traffic through than take the whole site down). “When a component fails, favor security or availability” is a core architectural decision.
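The two failure policies can be written down as a tiny decision function, for example as a gate in a deploy script. This is a toy model, not how the kubelet implements admission; the function name and its "yes/no" and "open/closed" arguments are invented for the sketch:

```shell
# Decide whether to go ahead with a deployment that declares an AppArmor
# profile, given whether the node supports AppArmor and the chosen policy.
#   $1: "yes" or "no"       does the node support AppArmor?
#   $2: "closed" or "open"  what to do when it does not
gate() {
    if [ "$1" = "yes" ]; then
        echo "proceed: profile will be enforced"
    elif [ "$2" = "open" ]; then
        # Fail-open: continue, but say so loudly instead of silently.
        echo "proceed: WARNING, profile will NOT be enforced"
    else
        # Fail-closed: refuse rather than run unprotected.
        echo "refuse: node cannot enforce the declared profile"
        return 1
    fi
}

gate no closed || echo "deployment stopped"
```

Any policy value other than "open" falls through to the refusing branch, so the default is fail-closed and fail-open has to be chosen explicitly.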
3. The path-based weakness + the seccomp/AppArmor combination
3.1 The path-based weakness in K8s
Week 1 Day 5: rules bind to paths, so changing the “path↔inode mapping” bypasses. In containers:
- hardlink: ln /data/target /tmp/safe; the profile allows writing /tmp/safe but the inode is /data’s → bypass
- bind mount: a container with CAP_SYS_ADMIN can bind-mount a whole directory to bypass
- mount namespace complicates path resolution: a container has its own mount ns, so /data in the profile can resolve to a different real path from the container’s view than from the host’s. That’s why the attach_disconnected flag exists; path semantics get complex under namespace nesting, making AppArmor more error-prone in containers than on bare metal
seccomp doesn’t have this weakness (syscall numbers are kernel-fixed integers, no “renaming”), but has its own blind spots (can’t see paths, many syscall variants). The weaknesses are orthogonal.
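The hardlink case needs no AppArmor to see: two names really are one inode. A self-contained sketch on a scratch directory (the data/ and tmp/ names only mimic the example above, and both must sit on the same filesystem for a hard link to work):

```shell
# Print the inode number of a file.
inode_of() {
    ls -i "$1" | awk '{print $1}'
}

# Demonstrate on a scratch directory: two names, one inode.
demo=$(mktemp -d)
mkdir "$demo/data" "$demo/tmp"
echo secret > "$demo/data/target"
ln "$demo/data/target" "$demo/tmp/safe"      # hard link, not a copy

echo "data/target inode: $(inode_of "$demo/data/target")"
echo "tmp/safe    inode: $(inode_of "$demo/tmp/safe")"

# A write through the second name changes what the first name reads.
echo changed > "$demo/tmp/safe"
cat "$demo/data/target"
rm -rf "$demo"
```

A rule keyed on the /data path never sees the write, because the path presented to the check is the tmp/ one.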
3.2 seccomp + AppArmor combination (Week 1 Day 5’s “five layers” returns)
Attack: write a malicious file to /data
├ seccomp blocking mkdir only → openat(O_CREAT) bypasses ❌
├ AppArmor blocking /data writes only → hardlink /data to an allowed path bypasses ❌
└ seccomp + AppArmor together:
seccomp limits the syscall set (shrinks the variant space)
AppArmor limits the path (regardless of syscall)
+ cap drop / namespace blocks CAP_SYS_ADMIN (prevents bind mount)
→ bypass paths closed off one by one ✅
Attached together in a pod spec:
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault               # limit syscall set
    appArmorProfile:
      type: Localhost
      localhostProfile: k8s-deny-data    # limit path
  containers:
  - name: c
    securityContext:
      capabilities:
        drop: ["ALL"]                    # block CAP_SYS_ADMIN (prevents bind mount)
      allowPrivilegeEscalation: false
This is exactly what PSA restricted wants to enforce (Week 2 Day 1): it requires seccomp + drop caps + non-root + no privilege escalation, which is assembling this multi-layer complementary combination. PSA restricted ≈ packaging the key layers of Week 1’s five-layer sandbox into an admission standard.
The cognitive loop: Week 1 learned the kernel mechanisms → Week 2 learned how they become pod spec fields → PSA restricted packages the security combination into a declarative standard. Three layers of abstraction (kernel mechanism → pod spec field → admission policy) are the same thing wrapped at different heights. Once you understand the bottom, PSA restricted’s four requirements aren’t a “magic checklist” but “the minimal set assembling multi-layer complementary defense.”
3.3 The key to defense-in-depth: orthogonal weaknesses
Not stacking two same-kind defenses (that’s just redundancy), but layering two with orthogonal weaknesses — A’s blind spot is exactly B’s strength:
- seccomp’s blind spot (path / syscall variants) = AppArmor’s strength (by path, covering all syscalls)
- AppArmor’s blind spot (path remapping) is closed by cap drop + namespace
The key to multi-layer defense isn’t “more layers” but “the layers’ blind spots don’t overlap.” This is the core framework for evaluating “how many mitigation layers an attack needs, and at which layer each goes.”
4. Takeaways: seeds for the Stripe project
4.1 Mitigation portability is an independent dimension
AppArmor depends on the host kernel having the LSM enabled — today’s kind node lacks exactly that, verified firsthand. An LLM-generated AppArmor profile mitigation silently doesn’t take effect on a node without AppArmor (fail-open). seccomp is in any Linux kernel — far more portable. “Which layer to pick a mitigation” isn’t only about security, but also portability.
4.2 fail-open is a more dangerous failure mode than “no protection”
A fail-open mitigation (assuming the node has AppArmor) is a time bomb on a heterogeneous fleet. An ideal mitigation either uses a universally-supported mechanism (seccomp), or explicitly requires fail-closed (don’t schedule to a node that doesn’t support it). “Does the mitigation fail open or closed when the target mechanism is unavailable” should be an independent eval dimension.
4.3 Multi-layer defense needs orthogonal weaknesses, not more layers
When evaluating a mitigation combination, look at whether the layers’ blind spots overlap. seccomp + AppArmor is the textbook case (syscall dimension + path dimension, weaknesses offset).
4.4 Reasoning from the bottom up to the top design
Understanding Week 1’s kernel mechanics lets you see through what PSA restricted’s four requirements actually block and miss at the bottom. Evaluating a mitigation’s quality requires this “see through the wrapping” ability.
5. Takeaways
- AppArmor = an LSM, blocks by path — complements seccomp’s pointer/path blind spot
- K8s integration: securityContext.appArmorProfile, three types symmetric with seccomp
- Heavier than seccomp: the profile must be pre-loaded into the node kernel via apparmor_parser, and the node kernel must have AppArmor enabled
- path-based advantage: deny /data/** w in one line blocks every syscall writing to /data (closing the touch-bypass hole)
- path-based weakness: hardlink/bind mount remapping can bypass; container mount ns complicates path resolution (needs attach_disconnected)
- node unsupported → fail-open (old) → fail-closed (1.30+) — a silently-failing security mechanism is the most dangerous bug; today’s most valuable lesson
- seccomp + AppArmor weaknesses are orthogonal, compose complementarily — syscall dimension + path dimension, exactly what PSA restricted assembles
- Portability: AppArmor depends on the host kernel LSM, worse than seccomp — the kind node lacks this layer outright
- fail-open vs fail-closed trade-off — security-sensitive picks fail-closed, availability-sensitive may pick fail-open
- Cognitive loop: kernel mechanism (W1) → pod spec field (W2) → admission policy (PSA), three wrappings of the same thing
6. Deferred: the real experiment on Ubuntu
The hands-on part skipped today due to environment limits, to be done on Ubuntu (cloud VM / bare metal, kernel with AppArmor enabled):
- aa-status to confirm AppArmor is enabled
- Write the k8s-deny-data profile, apparmor_parser -r to load it into the kernel
- Deploy a pod referencing the profile
- From inside the pod, touch /data/x to verify the block (EACCES)
- Contrast with yesterday’s seccomp: the same touch bypasses a mkdir block but is stopped by the /data path rule
- On the node, check dmesg | grep apparmor for apparmor="DENIED" records
- Check /proc/<pid>/attr/current to confirm profile attachment (analogous to Day 3’s /proc Seccomp field)
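The last check can be wrapped in a helper for the Ubuntu run. A sketch assuming the file holds "<profile> (<mode>)" for a confined process and "unconfined" otherwise; on a host where a different LSM owns that file the content will be that LSM's label instead. The function name and the optional /proc-root argument are made up for illustration:

```shell
# Print the confinement of a process as the kernel reports it.
# The second argument (hypothetical, for testing) replaces the /proc root.
attached_profile() {
    proc="${2:-/proc}"
    cat "$proc/$1/attr/current" 2>/dev/null || echo "unknown"
}

attached_profile self
```

For a container, the pid to pass is the container process's pid as seen from the node, and the expected value for this experiment would be the k8s-deny-data profile in enforce mode.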
7. Tomorrow’s preview (Day 5)
Admission controllers + Kyverno/OPA Gatekeeper. This is the Week 2 finale; it doesn’t depend on AppArmor, so the current environment can run it:
- ValidatingAdmissionWebhook / MutatingAdmissionWebhook mechanics (interception/rewriting before the request hits etcd)
- Install Kyverno, write a policy requiring every pod to set seccomp (can also require AppArmor)
- Submit a non-compliant pod to verify the block
- This is the layer that tightens all the prior mechanisms (PSA/RBAC/NetworkPolicy/seccomp/AppArmor) into “production-grade enforcement” — and the tool that solves the Day 1-4 fail-open traps (use policy to require “every ns must have a valid PSA label,” etc.)