Kubernetes has become the default compute platform for containerized workloads at companies of every size. But Kubernetes security is hard — the attack surface is expansive, the security primitives are sophisticated, and the default configurations prioritize usability over security in ways that create significant risk in production environments. The NSA/CISA Kubernetes Hardening Guide remains the most comprehensive government guidance on Kubernetes security. This guide synthesizes those recommendations with current best practices for 2026 production environments.
The Kubernetes Attack Surface: What Attackers Target
- Compromised container images: Malicious code injected into base images or application dependencies distributed through public registries.
- Misconfigured RBAC: Overly permissive ClusterRole bindings that allow attackers to escalate privileges once they've compromised any pod in the cluster.
- Exposed API server: The Kubernetes API server is the crown jewel target. If it is publicly accessible or improperly authenticated, compromising it gives an attacker full control of the cluster.
- Container breakout: Vulnerabilities in container runtimes (runc, containerd) or kernel exploits that allow a compromised container to gain host-level access.
- Network-based lateral movement: Once inside the cluster, an attacker can use a compromised pod to reach other pods and services, then move laterally to access sensitive data.
Cluster Hardening: Control Plane Security
API Server Access Controls
The Kubernetes API server should never be publicly accessible from the internet. If you're running a managed Kubernetes service (EKS, GKE, AKS), configure the API server endpoint as private and restrict access to a VPN or bastion host. Enable audit logging on the API server; these logs are essential for detecting and investigating security incidents. Enable admission controllers: NodeRestriction (limits each kubelet to modifying its own Node object and the pods bound to it), PodSecurity (enforces Pod Security Standards), and optionally OPA Gatekeeper or Kyverno, which run as admission webhooks, for declarative custom policy enforcement.
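On self-managed clusters, the audit logging and admission controller settings above map to kube-apiserver command-line flags. The following is a minimal sketch of a check over such an argument list. The function name `check_apiserver_flags` is hypothetical, and the sketch assumes flags are written in `--name=value` form. It does not check for PodSecurity, because recent Kubernetes releases enable that plugin by default. On managed services these flags are usually not visible, so the equivalent settings have to be verified through the provider's own configuration.

```python
def check_apiserver_flags(args):
    """Return findings for a kube-apiserver argument list (hypothetical helper)."""
    # Collect "--name=value" arguments into a dict; anything else is ignored.
    flags = {}
    for arg in args:
        if arg.startswith("--") and "=" in arg:
            key, value = arg[2:].split("=", 1)
            flags[key] = value

    findings = []
    if not flags.get("audit-log-path"):
        findings.append("audit log path is not set (--audit-log-path)")
    if not flags.get("audit-policy-file"):
        findings.append("audit policy file is not set (--audit-policy-file)")

    plugins = flags.get("enable-admission-plugins", "").split(",")
    if "NodeRestriction" not in plugins:
        findings.append("NodeRestriction is not in --enable-admission-plugins")
    return findings


if __name__ == "__main__":
    example_args = [
        "kube-apiserver",
        "--enable-admission-plugins=NodeRestriction",
        "--audit-log-path=/var/log/kubernetes/audit.log",
    ]
    for finding in check_apiserver_flags(example_args):
        print(finding)
```

The file paths in the example are placeholders, not recommended locations.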
RBAC: Least Privilege Strictly Enforced
RBAC misconfiguration is among the most common Kubernetes security failures in production environments. Key principles:
- No wildcard permissions: Never grant * on resources or verbs in production. Audit all ClusterRoles for wildcard grants.
- Prefer Roles over ClusterRoles: Use namespace-scoped Roles wherever possible. A ClusterRole bound through a ClusterRoleBinding grants its permissions cluster-wide, so use that combination only when truly necessary.
- Never bind cluster-admin to service accounts: A service account bound to cluster-admin is a critical risk, because any pod using that service account becomes a full cluster takeover path for attackers.
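The first and third principles above can be checked mechanically. The following is a minimal sketch, assuming RBAC objects have already been loaded as Python dictionaries (for example from `kubectl get clusterroles -o json`). The function names `find_wildcard_rules` and `binds_cluster_admin_to_service_account` are hypothetical, not part of any Kubernetes client library.

```python
def find_wildcard_rules(role):
    """Return the rules of a Role or ClusterRole that use '*' in resources or verbs."""
    flagged = []
    for rule in role.get("rules") or []:
        if "*" in (rule.get("resources") or []) or "*" in (rule.get("verbs") or []):
            flagged.append(rule)
    return flagged


def binds_cluster_admin_to_service_account(binding):
    """True if a RoleBinding or ClusterRoleBinding grants cluster-admin to a ServiceAccount."""
    role_ref = binding.get("roleRef") or {}
    if role_ref.get("kind") != "ClusterRole" or role_ref.get("name") != "cluster-admin":
        return False
    return any(
        subject.get("kind") == "ServiceAccount"
        for subject in binding.get("subjects") or []
    )


if __name__ == "__main__":
    # Illustrative objects only; the names are made up for this example.
    role = {
        "kind": "ClusterRole",
        "metadata": {"name": "too-broad"},
        "rules": [{"apiGroups": [""], "resources": ["*"], "verbs": ["get"]}],
    }
    binding = {
        "kind": "ClusterRoleBinding",
        "roleRef": {"kind": "ClusterRole", "name": "cluster-admin"},
        "subjects": [{"kind": "ServiceAccount", "name": "ci", "namespace": "build"}],
    }
    print(len(find_wildcard_rules(role)), "wildcard rule(s)")
    print("cluster-admin bound to service account:",
          binds_cluster_admin_to_service_account(binding))
```

Built-in roles such as cluster-admin itself contain wildcards by design, so an audit like this normally needs an allowlist for system roles.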
Pod Security: Hardening Workload Configurations
Kubernetes' PodSecurity admission controller enforces Pod Security Standards at the namespace level with three profiles: Privileged (no restrictions), Baseline (prevents known privilege escalations), and Restricted (heavily restricted, best practice for most workloads). Label namespaces with the restricted profile as the default and grant exceptions only to namespaces with documented justification.
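Profiles are selected per namespace through the `pod-security.kubernetes.io/<mode>` labels, where the mode is `enforce`, `warn`, or `audit`. The sketch below builds a Namespace manifest with all three modes set to the same level. The helper name `namespace_with_pod_security` and the namespace name `payments` are illustrative assumptions. The output is JSON, which `kubectl apply -f` accepts alongside YAML.

```python
import json

VALID_LEVELS = ("privileged", "baseline", "restricted")


def namespace_with_pod_security(name, level="restricted"):
    """Build a Namespace manifest labelled for Pod Security admission (hypothetical helper)."""
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown Pod Security level: {level!r}")
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            "labels": {
                # enforce rejects violating pods; warn and audit only report them.
                "pod-security.kubernetes.io/enforce": level,
                "pod-security.kubernetes.io/warn": level,
                "pod-security.kubernetes.io/audit": level,
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(namespace_with_pod_security("payments"), indent=2))
```

When migrating an existing namespace, one common approach is to set `warn` and `audit` to restricted first and raise `enforce` once the warnings are resolved.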
What every production pod should prevent:
- Running as root (runAsNonRoot: true)
- Privilege escalation (allowPrivilegeEscalation: false)
- Privileged containers (privileged: false)
- Writing to the root filesystem (readOnlyRootFilesystem: true)
- Retaining Linux capabilities that aren't specifically required (drop ALL under capabilities.drop, then add back only what the workload needs)
- Host namespace access via hostPID, hostIPC, or hostNetwork (allow only for specific system-level workloads)
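The checklist above can be expressed as a small audit over a pod spec. This is a minimal sketch, assuming the spec has been loaded as a Python dictionary (the `spec` field of a Pod, or `spec.template.spec` of a Deployment). The function name `audit_pod_spec` is hypothetical. For brevity it checks only `containers`, not `initContainers` or `ephemeralContainers`, and it does not replace the PodSecurity admission controller.

```python
def audit_pod_spec(pod_spec):
    """Return a list of findings for settings that violate the checklist."""
    findings = []

    # Host namespace sharing is configured at the pod level.
    for field in ("hostPID", "hostIPC", "hostNetwork"):
        if pod_spec.get(field):
            findings.append(f"pod: {field} is enabled")

    pod_sc = pod_spec.get("securityContext") or {}
    for container in pod_spec.get("containers") or []:
        name = container.get("name", "<unnamed>")
        sc = container.get("securityContext") or {}

        # runAsNonRoot may be set on the pod; a container-level value overrides it.
        if sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot")) is not True:
            findings.append(f"{name}: runAsNonRoot is not true")
        if sc.get("allowPrivilegeEscalation") is not False:
            findings.append(f"{name}: allowPrivilegeEscalation is not false")
        if sc.get("privileged"):
            findings.append(f"{name}: privileged is true")
        if sc.get("readOnlyRootFilesystem") is not True:
            findings.append(f"{name}: readOnlyRootFilesystem is not true")
        dropped = (sc.get("capabilities") or {}).get("drop") or []
        if "ALL" not in dropped:
            findings.append(f"{name}: capabilities.drop does not include ALL")
    return findings


if __name__ == "__main__":
    # Illustrative spec; the container name and image are placeholders.
    spec = {
        "hostNetwork": True,
        "containers": [{"name": "web", "image": "example/web:1.0",
                        "securityContext": {"privileged": True}}],
    }
    for finding in audit_pod_spec(spec):
        print(finding)
```

The checks treat an unset field as a finding, since the defaults for these fields are the permissive values.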
Supply Chain Security: Container Image Hardening
Every container image should be scanned for known CVEs before being deployed to production. Trivy (open-source, from Aqua Security) is one of the most widely adopted container vulnerability scanners. It can be integrated into GitHub Actions, GitLab CI, or Jenkins pipelines to fail builds when critical CVEs are detected in base images or application dependencies. Set a policy: no images with Critical CVEs may be deployed to production.
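One way to enforce that policy in a pipeline is to gate on Trivy's JSON report. The sketch below assumes a report produced with `trivy image --format json` and relies on the `Results`, `Vulnerabilities`, `VulnerabilityID`, and `Severity` fields of that output; check these against the Trivy version in use. The function name `critical_vulnerabilities` and the file name `report.json` are hypothetical.

```python
import json
import sys


def critical_vulnerabilities(report):
    """Return the IDs of all CRITICAL findings in a parsed Trivy JSON report."""
    ids = []
    for result in report.get("Results") or []:
        # "Vulnerabilities" can be absent or null when a target has no findings.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") == "CRITICAL":
                ids.append(vuln.get("VulnerabilityID"))
    return ids


def main(path):
    with open(path, encoding="utf-8") as handle:
        report = json.load(handle)
    critical = critical_vulnerabilities(report)
    if critical:
        print("Blocking deploy, critical CVEs found:", ", ".join(critical))
        return 1
    print("No critical CVEs found")
    return 0


if __name__ == "__main__":
    # Usage (assumed file name): python gate.py report.json
    sys.exit(main(sys.argv[1]) if len(sys.argv) > 1 else 0)
```

For the simple case, Trivy's own `--severity CRITICAL --exit-code 1` options give the same pass/fail behavior without a wrapper; a script like this is mainly useful when the gate needs extra logic such as an exceptions list.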
Sigstore's Cosign tool enables cryptographic signing of container images at build time. Deployed alongside a Kubernetes admission controller, this ensures only images signed by your CI/CD system can be deployed to your cluster — blocking deployments of unverified or tampered images even if an attacker gains registry write access.
Runtime Security: Detecting the Unexpected
Falco (a CNCF graduated project originally created by Sysdig) is the de facto standard for Kubernetes runtime security. It monitors Linux system calls at the kernel level and triggers alerts when applications behave unexpectedly, for example a web server process spawning a shell, outbound connections to unexpected IP addresses, or reads of credential files. Running Falco as a DaemonSet with a well-tuned ruleset provides the behavioral anomaly detection layer that static security controls cannot.
Network Security: Cluster-Internal Traffic Control
By default, all pods in a Kubernetes cluster can communicate with all other pods, across namespaces and across nodes. For production workloads, this flat network model is a significant lateral movement risk. NetworkPolicies restrict pod-to-pod communication to only the flows required for application functionality. They are enforced by the cluster's network plugin, so confirm that your CNI supports NetworkPolicy; on a plugin that does not, policies are accepted by the API but have no effect. Start with a default-deny-all policy in each namespace and add explicit allow rules for required service-to-service communication paths.
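The default-deny-then-allow pattern can be sketched as two generated manifests. The helper names `default_deny_all` and `allow_ingress`, the namespace `shop`, and the `app` labels are illustrative assumptions. The manifests use the `networking.k8s.io/v1` NetworkPolicy API and are printed as JSON, which `kubectl apply -f` accepts.

```python
import json


def default_deny_all(namespace):
    """NetworkPolicy that selects every pod in the namespace and allows no traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},  # an empty selector matches all pods in the namespace
            "policyTypes": ["Ingress", "Egress"],
        },
    }


def allow_ingress(namespace, name, to_labels, from_labels, port):
    """NetworkPolicy allowing TCP ingress on one port between two sets of pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": to_labels},
            "policyTypes": ["Ingress"],
            "ingress": [{
                "from": [{"podSelector": {"matchLabels": from_labels}}],
                "ports": [{"protocol": "TCP", "port": port}],
            }],
        },
    }


if __name__ == "__main__":
    policies = [
        default_deny_all("shop"),
        allow_ingress("shop", "frontend-to-api",
                      to_labels={"app": "api"},
                      from_labels={"app": "frontend"},
                      port=8080),
    ]
    for policy in policies:
        print(json.dumps(policy, indent=2))
```

Because the default-deny policy also covers egress, the frontend pods in this example would still need an egress rule toward the API pods, and most workloads need an explicit egress rule for DNS, before traffic actually flows.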
For teams requiring more advanced network security than Kubernetes NetworkPolicies provide, service mesh solutions — Istio, Linkerd, or Cilium with mTLS — provide network encryption and more granular traffic control at the cost of additional operational complexity.