Week2 Day 4: K8s security — AppArmor

AppArmor is the path-based complement to seccomp’s syscall-based filtering, and today was a hands-on lesson in why that complement is harder to deploy. This post covers: how K8s wraps AppArmor (securityContext.appArmorProfile, with three types symmetric with seccomp’s); why an AppArmor profile is heavier than a seccomp one (it must be pre-loaded into the node’s kernel via apparmor_parser, not just dropped on disk as a file); why deny /data/** w blocks a write under /data no matter which syscall performs it (closing the hole where touch bypassed yesterday’s mkdir block); and the day’s most valuable lesson: the experiment couldn’t run at all, because a kind node on Mac + Docker Desktop has no AppArmor LSM in its kernel. That was a firsthand encounter with the fail-open trap and the portability problem of AppArmor-based mitigations.

Environment note: today’s real enforcement experiment couldn’t be done hands-on — the kind node under Mac + Docker Desktop has no AppArmor LSM in its kernel (/sys/module/apparmor doesn’t exist). AppArmor depends on the host kernel having the LSM compiled in, and Docker Desktop’s slim Linux VM kernel doesn’t. So this post is concept-focused + a full contrast with seccomp, with real enforcement verification deferred to an Ubuntu environment. This “can’t do it” is itself one of today’s most important lessons (the environment-dependency / portability problem of AppArmor mitigations).
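The check behind that note can be scripted. A minimal sketch, assuming the usual sysfs location /sys/module/apparmor/parameters/enabled (which contains Y when the LSM is active); the root parameter is a hypothetical addition so the function can be pointed at a fake directory tree:

```python
from pathlib import Path


def apparmor_enabled(root: str = "/") -> bool:
    """Return True if the kernel visible under `root` reports AppArmor as enabled."""
    flag = Path(root) / "sys/module/apparmor/parameters/enabled"
    try:
        return flag.read_text().strip() == "Y"
    except OSError:
        # No such file: the apparmor module is not present in this kernel.
        return False


if __name__ == "__main__":
    print("AppArmor enabled:", apparmor_enabled())
```

Run inside the kind node container (or on the Docker Desktop VM), this should report False, which matches the missing /sys/module/apparmor directory.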

0. Where this fits

Day 3 was seccomp (block by syscall); today is AppArmor (block by path) — the last kernel security mechanism of Week 2, forming a complete contrast with seccomp. The core: seccomp blocks “the kind of action” (syscall), AppArmor blocks “the target of the action” (path) — their weaknesses are orthogonal and they compose complementarily.

Week 1 Day 5 already covered AppArmor’s kernel mechanics (LSM hook, profile syntax, enforce/complain, path-based weakness); today’s focus is “how K8s wraps it” + “the engineering lesson of environment dependency.”


1. AppArmor’s K8s integration

1.1 Recap (Week 1 Day 5 bridge)

AppArmor = an LSM (Linux Security Module) intercepting at kernel LSM hooks, doing path-based MAC. The profile attaches at execve time (via the exec-time bprm hooks; bprm_creds_for_exec in recent kernels), keyed by binary path.

Week 1 key points:

  • path-based (authorizes by file path, not by syscall)
  • intercepts at LSM hooks, sees the resolved path (seccomp’s blind spot)
  • enforce (block) / complain (log only) modes
  • path-based weakness: hardlink / bind mount can bypass

1.2 The core split with seccomp

| | seccomp | AppArmor |
|---|---|---|
| Blocks | syscall number + integer args | resolved path / network |
| Example | “forbid the mkdir syscall” | “forbid writing any file under /data” |
| Weakness | syscall-variant bypass (touch uses openat to bypass a mkdir block) | path-remap bypass (hardlink/bind mount) |
| Blind spot | can’t dereference pointers (can’t see path content) | can’t see namespace transitions |
| Portability | present in any Linux kernel | depends on the host kernel having the LSM enabled |

Core mental model: seccomp blocks “which syscall you use,” AppArmor blocks “which path you touch.” Yesterday touch (openat) bypassed the mkdir seccomp block, but with AppArmor blocking writes to /data, touch is blocked too (regardless of syscall, it only looks at the path).

A precision point: seccomp can see integer args (it can distinguish socket(AF_INET) vs socket(AF_UNIX)), it just can’t see pointer-dereferenced content (string paths, structs). So “block socket by protocol family” is doable with seccomp, “block open by path” needs AppArmor. Which layer a given attack’s mitigation goes to depends on “is the arg you want to filter an integer or a pointer.”
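The split can be made concrete with a toy model. This is a sketch, not how either mechanism is implemented: Syscall, seccomp_allows and apparmor_allows are hypothetical names, and the "filters" are plain Python predicates. The point is only which fields each side is allowed to look at.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Syscall:
    name: str        # stands in for the syscall number
    int_args: tuple  # integer arguments: visible to a seccomp filter
    path: str        # what the pointer argument resolves to: invisible to seccomp


def seccomp_allows(call: Syscall, blocked_names: set) -> bool:
    # Decides on syscall identity only; never reads call.path.
    return call.name not in blocked_names


def apparmor_allows(call: Syscall, denied_prefix: str) -> bool:
    # Decides on the resolved path only; ignores which syscall was used.
    return not call.path.startswith(denied_prefix)


mkdir_call = Syscall("mkdir", (0o755,), "/data/newdir")
touch_call = Syscall("openat", (os.O_CREAT | os.O_WRONLY,), "/data/x")

if __name__ == "__main__":
    for call in (mkdir_call, touch_call):
        print(call.name,
              "seccomp:", seccomp_allows(call, {"mkdir"}),
              "apparmor:", apparmor_allows(call, "/data/"))
```

With only mkdir blocked, the seccomp predicate lets the openat call through, while the path predicate rejects both calls because both land under /data/.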

1.3 The K8s field for attaching a profile

New (field added in 1.30, GA in 1.31): securityContext

spec:
  securityContext:
    appArmorProfile:
      type: Localhost                  # or RuntimeDefault / Unconfined
      localhostProfile: k8s-deny-data  # name of a profile already loaded on the node

Three types symmetric with seccomp: RuntimeDefault (the container runtime’s default profile, e.g. cri-containerd.apparmor.d under containerd or docker-default under Docker) / Localhost (custom) / Unconfined (none).

Old (pre-1.30): annotation (deprecated, seen in old YAML)

metadata:
  annotations:
    container.apparmor.security.beta.kubernetes.io/<container>: localhost/k8s-deny-data

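The annotation is keyed by container name; the securityContext field can be set at pod level (applies to every container) or per container (overrides the pod-level value), so the new form is cleaner.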
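When migrating old YAML, the annotation value maps mechanically onto the new field. A minimal sketch of that mapping, assuming the three legacy value forms (runtime/default, localhost/<name>, unconfined); annotation_to_field is a hypothetical helper name, not a Kubernetes API:

```python
def annotation_to_field(value: str) -> dict:
    """Translate a legacy AppArmor annotation value into an appArmorProfile dict."""
    if value == "runtime/default":
        return {"type": "RuntimeDefault"}
    if value == "unconfined":
        return {"type": "Unconfined"}
    prefix = "localhost/"
    if value.startswith(prefix) and len(value) > len(prefix):
        return {"type": "Localhost", "localhostProfile": value[len(prefix):]}
    # Refuse to guess: an unrecognized value must not turn into "no profile".
    raise ValueError(f"unrecognized AppArmor annotation value: {value!r}")


if __name__ == "__main__":
    print(annotation_to_field("localhost/k8s-deny-data"))
```

Raising on unknown input is a deliberate choice here: a converter that quietly dropped a malformed value would reproduce the silent-failure problem discussed later in this post.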

1.4 The key difference: how a profile gets to the node (heavier than seccomp)

  • seccomp Localhost: a JSON file placed at /var/lib/kubelet/seccomp/, runc reads and compiles it into cBPF. The file is the profile.
  • AppArmor Localhost: the profile must first be loaded into the node’s kernel via apparmor_parser, becoming a named profile; the pod only references that name already in the kernel.

Flow:

1. Write the profile text (AppArmor syntax)
2. On the node, run apparmor_parser -r profile.txt → load into the kernel
3. The kernel now has a named profile (e.g. "k8s-deny-data")
4. The pod spec references it via localhostProfile: k8s-deny-data
5. runc, at exec, attaches the process to that in-kernel profile

Architectural root cause: a seccomp filter is per-process and self-carried (runc installs it on the about-to-exec process); an AppArmor profile is a kernel-global named object (registered into the kernel first, processes attach to it). This gives AppArmor an extra “pre-load into kernel” step, and requires the node kernel to have the AppArmor LSM enabled — exactly what the kind node lacks.
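Because the pod only references a name, it is worth checking that the name really is in the kernel before deploying (between steps 3 and 4 of the flow). A sketch, assuming the securityfs listing at /sys/kernel/security/apparmor/profiles with one "name (mode)" entry per line; the sample text below is hand-written for illustration, not captured from a node:

```python
def loaded_profiles(listing: str) -> dict:
    """Parse 'name (mode)' lines into a {name: mode} mapping."""
    profiles = {}
    for line in listing.splitlines():
        line = line.strip()
        if not line.endswith(")") or " (" not in line:
            continue  # skip blank or unexpected lines
        name, mode = line[:-1].rsplit(" (", 1)
        profiles[name] = mode
    return profiles


SAMPLE = """\
k8s-deny-data (enforce)
docker-default (enforce)
/usr/sbin/tcpdump (complain)
"""

if __name__ == "__main__":
    profiles = loaded_profiles(SAMPLE)
    print("k8s-deny-data loaded:", "k8s-deny-data" in profiles)
```

On a real node the input would come from reading that securityfs file (this normally needs root); on the kind node the file does not exist at all, which is the same finding as the missing LSM.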


2. Profile syntax + enforce/complain + the fail-open trap

2.1 A K8s-scenario profile

The syntax is Week 1 Day 5’s (same underlying AppArmor). “Deny writing /data”:

#include <tunables/global>

profile k8s-deny-data flags=(attach_disconnected) {
  #include <abstractions/base>

  file,                          # allow file access by default
  network,                       # allow network

  deny /data/** w,               # core: block all writes to /data
  deny /data/ w,
}

  • the profile name k8s-deny-data = what localhostProfile references in the pod spec
  • flags=(attach_disconnected) is common in containers: in a container’s mount namespace AppArmor can’t always resolve a complete (“connected”) path, and this flag lets it still handle disconnected paths
  • deny /data/** w is path-based: regardless of mkdir/openat/mknod, any write under /data is blocked. One line closes the hole touch bypassed yesterday
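Why the profile carries both deny /data/** w and deny /data/ w becomes clearer with a small matcher. This is a simplified approximation of AppArmor globbing, written under the assumption that * stays within one path component, ** crosses /, and a glob that forms a whole path component must match at least one character; it ignores {a,b} alternation, character classes, and escapes. aa_glob_to_regex and matches are hypothetical helper names.

```python
import re


def aa_glob_to_regex(glob: str) -> "re.Pattern":
    """Approximate an AppArmor path glob as a Python regex (simplified)."""
    out = []
    i = 0
    while i < len(glob):
        ch = glob[i]
        if ch == "*":
            j = i
            while j < len(glob) and glob[j] == "*":
                j += 1
            double = (j - i) >= 2
            # A glob that is an entire path component cannot match the empty string.
            whole = i > 0 and glob[i - 1] == "/" and (j == len(glob) or glob[j] == "/")
            if double:
                out.append("[^/].*" if whole else ".*")
            else:
                out.append("[^/]+" if whole else "[^/]*")
            i = j
        elif ch == "?":
            out.append("[^/]")
            i += 1
        else:
            out.append(re.escape(ch))
            i += 1
    return re.compile("".join(out))


def matches(glob: str, path: str) -> bool:
    return aa_glob_to_regex(glob).fullmatch(path) is not None


if __name__ == "__main__":
    for path in ("/data/x", "/data/a/b", "/data/", "/database/x"):
        print(path, matches("/data/**", path))
```

Under these assumptions /data/** covers everything below /data but not the directory /data/ itself, and does not spill over to /database; the second rule handles the directory.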

2.2 enforce vs complain (Week 1 Day 5 bridge)

  • enforce: violation denied (EACCES/EPERM) + audit log
  • complain: violation allowed, audit log only (apparmor="ALLOWED")

complain is for developing a profile (observe which paths the app touches, then tighten); enforce is for production. This parallels seccomp’s SCMP_ACT_LOG and the allowlist workflow built on it: run real traffic in log-only mode to collect what the app does, converge on a profile, then enforce.
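The "observe" half of that workflow is mostly log parsing. A sketch that pulls key=value fields out of AppArmor audit lines and collects the paths seen under a given verdict; the two sample lines are hand-written in the usual audit format for illustration, not output captured from a node, and the function names are hypothetical:

```python
import re

_KV = re.compile(r'(\w+)=("([^"]*)"|\S+)')


def parse_apparmor_event(line: str):
    """Return the key/value fields of an AppArmor audit line, or None."""
    fields = {}
    for key, raw, quoted in _KV.findall(line):
        fields[key] = quoted if raw.startswith('"') else raw
    return fields if "apparmor" in fields else None


def observed_paths(lines, verdict: str = "ALLOWED") -> list:
    """Sorted, de-duplicated 'name' fields of events with the given verdict."""
    seen = set()
    for line in lines:
        event = parse_apparmor_event(line)
        if event and event["apparmor"] == verdict and "name" in event:
            seen.add(event["name"])
    return sorted(seen)


SAMPLE_LOG = [
    'audit: type=1400 audit(1700000000.000:42): apparmor="ALLOWED" operation="mknod" '
    'profile="k8s-deny-data" name="/data/x" pid=4242 comm="touch" requested_mask="c" denied_mask="c"',
    'audit: type=1400 audit(1700000001.000:43): apparmor="DENIED" operation="mkdir" '
    'profile="k8s-deny-data" name="/data/newdir/" pid=4243 comm="mkdir" requested_mask="c" denied_mask="c"',
    'kernel: unrelated message',
]

if __name__ == "__main__":
    print(observed_paths(SAMPLE_LOG, "ALLOWED"))
    print(observed_paths(SAMPLE_LOG, "DENIED"))
```

In complain mode the ALLOWED list is the raw material for tightening the profile; in enforce mode the DENIED list tells you what the profile actually stopped.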

2.3 Today’s most important lesson: fail-open when the node doesn’t support it

fail-open (allow when the mechanism is missing, favors availability) vs fail-closed (deny when missing, favors security).

Scenario: a pod spec requires an AppArmor profile, but the node it’s scheduled to has no AppArmor in its kernel (like the kind node):

  • fail-open (pre-1.30 behavior): the pod starts normally, the profile silently doesn’t take effect → you think there’s protection, but it’s actually wide open, with no error
  • fail-closed (1.30+ securityContext): K8s detects the node doesn’t support it and refuses to start the pod → you find out immediately
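The two behaviors can be written down as a tiny decision table. This is a model of the bullets above, not kubelet code; Outcome and start_pod_requiring_apparmor are hypothetical names:

```python
from enum import Enum


class Outcome(Enum):
    CONFINED = "running, profile enforced"
    SILENTLY_UNCONFINED = "running, profile NOT enforced, no error"
    REJECTED = "refused to start"


def start_pod_requiring_apparmor(node_has_apparmor: bool, fail_closed: bool) -> Outcome:
    """What happens to a pod that asks for an AppArmor profile."""
    if node_has_apparmor:
        return Outcome.CONFINED
    # Mechanism missing: the policy choice decides between a visible failure
    # and an invisible one.
    return Outcome.REJECTED if fail_closed else Outcome.SILENTLY_UNCONFINED


if __name__ == "__main__":
    for has_lsm in (True, False):
        for closed in (True, False):
            result = start_pod_requiring_apparmor(has_lsm, closed)
            print(f"apparmor={has_lsm} fail_closed={closed}: {result.value}")
```

Only one of the four combinations is dangerous, and it is the one that produces no signal at all.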

Why fail-open is dangerous: the problem is not the missing protection itself, but that you believe the protection is there and relax other defenses accordingly. That is false security, and it is worse than configuring no protection at all (with none, you know to be careful; with fail-open, you’re oblivious).

The K8s community migrated AppArmor from annotation (fail-open) to securityContext (fail-closed) precisely because “a security mechanism silently failing” is the most dangerous class of bug.

This is the same trap running through all of Week 2:

  • Day 1: PSA label typo enforece → policy silently inactive → fail-open
  • Day 2: kindnet not enforcing NetworkPolicy → policy inert → fail-open
  • Day 4: node without AppArmor → profile silently gone (old behavior) → fail-open

All three are “you declared a security control, but the underlying environment made it silently fail, and you don’t know.”

Trade-off: fail-closed has a cost too (pods that can’t start hurt availability). Security-sensitive scenarios pick fail-closed (AppArmor is right to); availability-sensitive ones might pick fail-open (e.g. if a rate limiter dies, you might rather let traffic through than take the whole site down). “When a component fails, favor security or availability” is a core architectural decision.


3. The path-based weakness + the seccomp/AppArmor combination

3.1 The path-based weakness in K8s

Week 1 Day 5: rules bind to paths, so anything that changes the path↔inode mapping can bypass them. In containers:

  • hardlink: ln /data/target /tmp/safe; the profile allows writing /tmp/safe but the inode is /data’s → bypass (this only works if the profile also lets the process create that link, since AppArmor mediates link creation through the l permission)
  • bind mount: a container with CAP_SYS_ADMIN can bind-mount a whole directory to bypass
  • mount namespace complicates path resolution: a container has its own mount ns, so /data in the profile resolves to a different real path in the container’s view than in the host’s. That’s why the attach_disconnected flag exists; path semantics get complex under namespace nesting, making AppArmor more error-prone in containers than on bare metal
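The hardlink item is easy to see without any AppArmor involved: two names, one inode. A sketch in a throwaway temp directory (the data/ and tmp/ subdirectories stand in for /data and /tmp; hardlink_demo is a hypothetical name, and the demo assumes a filesystem that supports hard links):

```python
import os
import tempfile


def hardlink_demo() -> dict:
    """Write through a second name and show the original file changed."""
    with tempfile.TemporaryDirectory() as root:
        data_dir = os.path.join(root, "data")
        tmp_dir = os.path.join(root, "tmp")
        os.mkdir(data_dir)
        os.mkdir(tmp_dir)

        target = os.path.join(data_dir, "target")
        alias = os.path.join(tmp_dir, "safe")
        with open(target, "w") as f:
            f.write("original")

        os.link(target, alias)  # second directory entry for the same inode

        with open(alias, "w") as f:  # the path says tmp/, the inode is data/'s
            f.write("changed via alias")
        with open(target) as f:
            content = f.read()

        same_inode = os.stat(target).st_ino == os.stat(alias).st_ino
        return {"same_inode": same_inode, "target_content": content}


if __name__ == "__main__":
    print(hardlink_demo())
```

A rule that names only the data/ path never sees the write, because the write arrived under a different name. That is the path↔inode gap in miniature.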

seccomp doesn’t have this weakness (syscall numbers are kernel-fixed integers, no “renaming”), but has its own blind spots (can’t see paths, many syscall variants). The weaknesses are orthogonal.

3.2 seccomp + AppArmor combination (Week 1 Day 5’s “five layers” returns)

Attack: write a malicious file to /data
  ├ seccomp blocking mkdir only → openat(O_CREAT) bypasses ❌
  ├ AppArmor blocking /data writes only → hardlink /data to an allowed path bypasses ❌
  └ seccomp + AppArmor together:
       seccomp limits the syscall set (shrinks the variant space)
       AppArmor limits the path (regardless of syscall)
       + cap drop / namespace blocks CAP_SYS_ADMIN (prevents bind mount)
       → bypass paths closed off one by one ✅

Attached together in a pod spec:

spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault          # limit syscall set
    appArmorProfile:
      type: Localhost
      localhostProfile: k8s-deny-data  # limit path
  containers:
  - name: c
    securityContext:
      capabilities:
        drop: ["ALL"]               # block CAP_SYS_ADMIN (prevents bind mount)
      allowPrivilegeEscalation: false

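This is close to what PSA restricted wants to enforce (Week 2 Day 1): it requires seccomp + drop caps + allowPrivilegeEscalation: false + non-root, which is assembling this multi-layer complementary combination. Two caveats: the fragment above would still need runAsNonRoot: true to pass restricted, and PSA does not require an AppArmor profile (it only forbids setting it to Unconfined). PSA restricted ≈ packaging the key layers of Week 1’s five-layer sandbox into an admission standard.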
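A quick way to see how the fragment above lines up with those requirements is a small checker. This is a deliberately reduced sketch covering only the four fields discussed in this post, not the real Pod Security Admission logic (which checks more, such as volume types); restricted_gaps is a hypothetical name:

```python
def restricted_gaps(pod_spec: dict) -> list:
    """List which of four restricted-style settings each container is missing."""
    gaps = []
    pod_sc = pod_spec.get("securityContext", {})
    for container in pod_spec.get("containers", []):
        sc = container.get("securityContext", {})
        name = container.get("name", "?")

        # Container-level settings override pod-level ones.
        seccomp = sc.get("seccompProfile", pod_sc.get("seccompProfile", {}))
        if seccomp.get("type") not in ("RuntimeDefault", "Localhost"):
            gaps.append(f"{name}: seccompProfile")
        if "ALL" not in sc.get("capabilities", {}).get("drop", []):
            gaps.append(f"{name}: capabilities.drop ALL")
        if sc.get("allowPrivilegeEscalation") is not False:
            gaps.append(f"{name}: allowPrivilegeEscalation")
        if sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot")) is not True:
            gaps.append(f"{name}: runAsNonRoot")
    return gaps


# The pod spec fragment from this section, as a dict.
POD_FROM_POST = {
    "securityContext": {
        "seccompProfile": {"type": "RuntimeDefault"},
        "appArmorProfile": {"type": "Localhost", "localhostProfile": "k8s-deny-data"},
    },
    "containers": [
        {
            "name": "c",
            "securityContext": {
                "capabilities": {"drop": ["ALL"]},
                "allowPrivilegeEscalation": False,
            },
        }
    ],
}

if __name__ == "__main__":
    print(restricted_gaps(POD_FROM_POST))
```

On the fragment as written, the only gap this checker reports is runAsNonRoot.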

The cognitive loop: Week 1 learned the kernel mechanisms → Week 2 learned how they become pod spec fields → PSA restricted packages the security combination into a declarative standard. Three layers of abstraction (kernel mechanism → pod spec field → admission policy) are the same thing wrapped at different heights. Once you understand the bottom, PSA restricted’s four requirements aren’t a “magic checklist” but “the minimal set assembling multi-layer complementary defense.”

3.3 The key to defense-in-depth: orthogonal weaknesses

Defense-in-depth does not mean stacking two defenses of the same kind (that’s just redundancy); it means layering two whose weaknesses are orthogonal, so that A’s blind spot is exactly B’s strength:

  • seccomp’s blind spot (path / syscall variants) = AppArmor’s strength (by path, covering all syscalls)
  • AppArmor’s blind spot (path remapping) is closed by cap drop + namespace

The key to multi-layer defense isn’t “more layers” but “the layers’ blind spots don’t overlap.” This is the core framework for evaluating “how many mitigation layers an attack needs, and at which layer each goes.”
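"Blind spots don’t overlap" can be checked with set arithmetic. A toy sketch: the technique names and the sets assigned to each layer are illustrative assumptions drawn from this post’s running example (writing into /data), not a measured coverage model.

```python
# Ways to get a write into /data in the running example (illustrative labels).
TECHNIQUES = {"mkdir", "openat_creat", "bind_mount_remap"}

# What each layer stops, per the discussion above (assumed, simplified).
SECCOMP_MKDIR_BLOCK = {"mkdir"}
APPARMOR_DENY_DATA = {"mkdir", "openat_creat"}
CAP_DROP_ALL = {"bind_mount_remap"}


def uncovered(techniques: set, layers: list) -> set:
    """Techniques that no layer in the stack blocks."""
    blocked = set().union(*layers)
    return techniques - blocked


if __name__ == "__main__":
    stacks = {
        "seccomp only": [SECCOMP_MKDIR_BLOCK],
        "seccomp twice": [SECCOMP_MKDIR_BLOCK, SECCOMP_MKDIR_BLOCK],
        "apparmor only": [APPARMOR_DENY_DATA],
        "apparmor + cap drop": [APPARMOR_DENY_DATA, CAP_DROP_ALL],
    }
    for label, layers in stacks.items():
        print(f"{label}: uncovered = {sorted(uncovered(TECHNIQUES, layers))}")
```

Doubling the same layer leaves the same gaps; adding a layer with a different blind spot shrinks them. The count of layers is identical in both two-layer stacks, only the overlap differs.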


4. Takeaways: seeds for the Stripe project

4.1 Mitigation portability is an independent dimension

AppArmor depends on the host kernel having the LSM enabled — today’s kind node lacks exactly that, verified firsthand. An LLM-generated AppArmor profile mitigation silently doesn’t take effect on a node without AppArmor (fail-open). seccomp is in any Linux kernel — far more portable. “Which layer to pick a mitigation” isn’t only about security, but also portability.

4.2 fail-open is a more dangerous failure mode than “no protection”

A fail-open mitigation (assuming the node has AppArmor) is a time bomb on a heterogeneous fleet. An ideal mitigation either uses a universally-supported mechanism (seccomp), or explicitly requires fail-closed (don’t schedule to a node that doesn’t support it). “Does the mitigation fail open or closed when the target mechanism is unavailable” should be an independent eval dimension.

4.3 Multi-layer defense needs orthogonal weaknesses, not more layers

When evaluating a mitigation combination, look at whether the layers’ blind spots overlap. seccomp + AppArmor is the textbook case (syscall dimension + path dimension, weaknesses offset).

4.4 Reasoning from the bottom up to the top design

Understanding Week 1’s kernel mechanics lets you see through what PSA restricted’s four requirements actually block and miss at the bottom. Evaluating a mitigation’s quality requires this “see through the wrapping” ability.


5. Takeaways

  1. AppArmor = an LSM, blocks by path — complements seccomp’s pointer/path blind spot
  2. K8s integration: securityContext.appArmorProfile, three types symmetric with seccomp
  3. Heavier than seccomp: the profile must be pre-loaded into the node kernel via apparmor_parser, and the node kernel must have AppArmor enabled
  4. path-based advantage: deny /data/** w in one line blocks every syscall writing to /data (closing the touch-bypass hole)
  5. path-based weakness: hardlink/bind mount remapping can bypass; container mount ns complicates path resolution (needs attach_disconnected)
  6. node unsupported → fail-open (old) → fail-closed (1.30+) — a silently-failing security mechanism is the most dangerous bug; today’s most valuable lesson
  7. seccomp + AppArmor weaknesses are orthogonal, compose complementarily — syscall dimension + path dimension, exactly what PSA restricted assembles
  8. Portability: AppArmor depends on the host kernel LSM, worse than seccomp — the kind node lacks this layer outright
  9. fail-open vs fail-closed trade-off — security-sensitive picks fail-closed, availability-sensitive may pick fail-open
  10. Cognitive loop: kernel mechanism (W1) → pod spec field (W2) → admission policy (PSA), three wrappings of the same thing

6. Deferred: the real experiment on Ubuntu

The hands-on part skipped today due to environment limits, to be done on Ubuntu (cloud VM / bare metal, kernel with AppArmor enabled):

  1. aa-status to confirm AppArmor is enabled
  2. Write the k8s-deny-data profile, apparmor_parser -r to load it into the kernel
  3. Deploy a pod referencing the profile
  4. From inside the pod, touch /data/x to verify the block (EACCES)
  5. Contrast with yesterday’s seccomp: the same touch bypasses a mkdir syscall block but is stopped by the /data path rule
  6. On the node, check dmesg | grep apparmor for apparmor="DENIED" records
  7. Check /proc/<pid>/attr/current to confirm profile attachment (analogous to Day 3’s /proc Seccomp field)
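For step 7, the attr file holds either unconfined or a "name (mode)" string. A sketch of reading and interpreting it, assuming that format; the attr_path parameter is a hypothetical addition so the function can be exercised against a plain file, and confinement_of is a hypothetical name:

```python
def parse_attr_current(text: str) -> tuple:
    """Split the contents of /proc/<pid>/attr/current into (profile, mode)."""
    text = text.strip("\x00 \n")
    if text.endswith(")") and " (" in text:
        name, mode = text[:-1].rsplit(" (", 1)
        return (name, mode)
    # "unconfined" and anything unexpected carry no mode.
    return (text, None)


def confinement_of(pid: int, attr_path: str = "") -> tuple:
    """Read and parse the AppArmor label of a process."""
    path = attr_path or f"/proc/{pid}/attr/current"
    with open(path) as f:
        return parse_attr_current(f.read())


if __name__ == "__main__":
    print(parse_attr_current("k8s-deny-data (enforce)\n"))
    print(parse_attr_current("unconfined\n"))
```

For the deferred experiment, the expected result for the container’s main process is the profile name from the pod spec with mode enforce; unconfined would mean the profile never attached.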

7. Tomorrow’s preview (Day 5)

Admission controllers + Kyverno/OPA Gatekeeper. This is the Week 2 finale; it doesn’t depend on AppArmor, so the current environment can run it:

  • ValidatingAdmissionWebhook / MutatingAdmissionWebhook mechanics (interception/rewriting before the request hits etcd)
  • Install Kyverno, write a policy requiring every pod to set seccomp (can also require AppArmor)
  • Submit a non-compliant pod to verify the block
  • This is the layer that tightens all the prior mechanisms (PSA/RBAC/NetworkPolicy/seccomp/AppArmor) into “production-grade enforcement” — and the tool that solves the Day 1-4 fail-open traps (use policy to require “every ns must have a valid PSA label,” etc.)