Docker interview questions covering containerisation fundamentals, image building, orchestration, networking, security, and production best practices.
Go beyond the surface: "Containers share the host kernel and use namespaces for isolation. This means no boot sequence, no guest OS memory overhead, and millisecond startup. The trade-off is weaker isolation than VMs — a kernel exploit escapes all containers."
VMs virtualise the hardware layer — each VM runs a full guest OS on a hypervisor. Containers virtualise the OS layer — they share the host kernel and isolate processes using namespaces and cgroups. Containers are more efficient because they skip the guest OS overhead: faster startup (milliseconds vs minutes), lower memory footprint (MBs vs GBs), and higher density (hundreds of containers vs tens of VMs on the same host). Trade-offs: containers share a kernel, so a kernel vulnerability affects all containers. VMs provide stronger isolation. Strong candidates mention: OCI runtime spec, the role of namespaces (pid, net, mnt, uts, ipc, user) and cgroups (resource limits), and that containers are not lightweight VMs — they are isolated processes.
Fundamental containerisation knowledge. Candidates who say containers are lightweight VMs have a mental model that will cause security and architecture mistakes. Those who mention namespaces and cgroups understand what Docker actually does.
Show the caching strategy: "I COPY package.json and install dependencies before copying application code. That way, code changes do not invalidate the dependency cache. Multi-stage builds keep the final image free of build tools."
Each Dockerfile instruction creates a layer. Layers are cached and reused — if an instruction and its inputs have not changed, Docker reuses the cached layer, but the first changed layer invalidates every layer after it. Optimisation: order instructions from least to most frequently changing (COPY package.json before COPY . so dependency install is cached), combine RUN commands to reduce layers, use multi-stage builds to separate build-time dependencies from the runtime image, use specific base image tags (not latest), and clean up in the same RUN command (apt-get install && rm -rf /var/lib/apt/lists/*). Strong candidates discuss: the union filesystem (overlay2), why a changed file in a COPY invalidates all subsequent layers, using .dockerignore to exclude unnecessary files, and choosing slim or alpine base images.
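A minimal sketch of that ordering, assuming a hypothetical Node.js service with a lockfile (the filenames and start command are illustrative):

```dockerfile
FROM node:20-slim

WORKDIR /app

# Dependency files first: this layer is only invalidated when
# package.json or the lockfile actually changes.
COPY package.json package-lock.json ./
RUN npm ci --omit=dev

# Application code last: frequent code edits reuse the cached
# npm ci layer above instead of reinstalling dependencies.
COPY . .

CMD ["node", "server.js"]
```

Pair this with a .dockerignore excluding node_modules, .git, and build output so the final COPY does not invalidate the cache on irrelevant changes.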
Core Docker optimisation skill. Candidates who write naive Dockerfiles with no layer ordering produce slow builds and bloated images. Those who understand the layer cache and use multi-stage builds create efficient CI pipelines.
Show the pattern: "Build stage installs dependencies and compiles. Production stage starts from a minimal image and copies only the binary or build output. My Go images go from 1.2GB with the toolchain to 15MB with just the binary on scratch."
Multi-stage builds use multiple FROM instructions. Earlier stages install build tools, compile code, and run tests. The final stage copies only the compiled artifacts into a minimal base image. Benefits: the production image contains no build tools, compilers, source code, or dev dependencies — dramatically reducing image size and attack surface. Example: stage 1 uses node:20 to npm install and npm run build, stage 2 uses nginx:alpine and copies only the dist/ folder. Strong candidates discuss: naming stages (FROM golang:1.22 AS builder), copying between stages (COPY --from=builder), using build stages for running tests in CI, and the security benefit of excluding source code from production images.
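A sketch of the Go pattern mentioned above — the module path ./cmd/server is an assumption; the shape is what matters:

```dockerfile
# Build stage: full toolchain, named so later stages can reference it.
FROM golang:1.22 AS builder
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# CGO_ENABLED=0 produces a static binary that can run on scratch.
RUN CGO_ENABLED=0 go build -o /app ./cmd/server

# Final stage: only the compiled binary — no toolchain, no source.
FROM scratch
COPY --from=builder /app /app
ENTRYPOINT ["/app"]
```

The final image contains a single file, which is how a 1.2GB toolchain image shrinks to roughly the size of the binary itself.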
Essential for production Docker usage. Candidates shipping images with build tools and source code are creating bloated, insecure images. Those who use multi-stage builds demonstrate understanding of the production container lifecycle.
Clarify the key difference: "Custom bridge networks give automatic DNS — containers find each other by name. The default bridge does not. I always create a named network in Compose rather than relying on the default."
Bridge (default): creates a private network on the host. Containers on the same bridge can communicate; on a custom bridge they resolve each other by container name via Docker's embedded DNS server. Published ports are mapped to the host with -p. Host: the container shares the host network namespace directly — no network isolation, but no port mapping overhead. Overlay: spans multiple Docker hosts for Swarm or multi-node setups. Containers communicate via virtual networks. Key concepts: each bridge gets a subnet, containers get IP addresses, port publishing (host:container mapping), and custom bridge networks provide automatic DNS resolution (the default bridge does not). Strong candidates discuss: using custom networks for service isolation, the difference between expose and publish, container-to-container communication within Compose, and DNS-based service discovery.
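A Compose sketch of name-based discovery on a custom bridge — the image names are placeholders:

```yaml
services:
  api:
    image: my-api:latest        # assumption: your application image
    ports:
      - "8080:8080"             # published to the host
    environment:
      DB_HOST: db               # DNS name = service name on the custom network
    networks:
      - app-net
  db:
    image: postgres:16
    networks:
      - app-net                 # no ports: reachable only from inside app-net

networks:
  app-net:
    driver: bridge              # custom bridge => automatic DNS resolution
```

The database is never published to the host — isolation by network membership rather than firewall rules.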
Tests understanding of container networking. Candidates who only use port mapping without understanding bridge networks will struggle with multi-container setups. Those who understand DNS resolution and network isolation can design proper service architectures.
Match mount type to use case: "Named volumes for database data — portable and Docker-managed. Bind mounts for development — I see code changes instantly. tmpfs for secrets or scratch data that should never touch disk."
Volumes: managed by Docker, stored in /var/lib/docker/volumes/, portable, and the recommended approach for persistent data. Bind mounts: map a specific host path into the container — useful for development (live code reloading) but tied to the host filesystem structure. tmpfs: stored in memory only, never written to disk — good for sensitive data that should not persist. Key considerations: volumes survive container removal, bind mounts depend on host paths, and data in the container writable layer is lost when the container is removed. Strong candidates discuss: named volumes for databases, bind mounts for development workflows, volume drivers for cloud storage (EFS, Azure Files), and the importance of not storing application state in the container layer.
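All three mount types in one hypothetical Compose file (paths and image names are illustrative):

```yaml
services:
  db:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data   # named volume: Docker-managed, survives container removal
  web:
    image: node:20-slim
    volumes:
      - ./src:/app/src                    # bind mount: live code changes during development
    tmpfs:
      - /app/scratch                      # tmpfs: in-memory only, never written to disk

volumes:
  pgdata:
```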
Tests data management understanding. Candidates who lose data because they did not mount a volume for their database have a fundamental gap. Those who understand the three mount types and their trade-offs handle stateful containers correctly.
Show production awareness: "I use healthcheck on the database so the app service waits until Postgres is actually accepting connections, not just until the container starts. depends_on alone is not enough."
Compose defines services, networks, and volumes in a YAML file. Service dependencies: depends_on controls startup order but does not wait for readiness — use healthcheck with condition: service_healthy for true dependency management. Environment: use env_file for shared variables, environment for service-specific overrides, and .env for Compose variable substitution. Best practices: one service per container, custom networks for isolation, named volumes for persistence, and profiles for optional services (debug tools, admin panels). Strong candidates discuss: Compose override files (docker-compose.override.yml) for dev-specific config, the watch feature for live reloading, resource limits (deploy.resources), and build caching with cache_from.
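A sketch of the healthcheck pattern described above — the app image and credentials are placeholders:

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example            # illustration only — use secrets in practice
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 10
  app:
    image: my-app:latest                    # assumption: your application image
    depends_on:
      db:
        condition: service_healthy          # wait for readiness, not merely container start
```

Without the condition, depends_on only orders startup — the app would still race a Postgres instance that is running but not yet accepting connections.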
Tests practical multi-container development. Candidates who have race conditions because they rely on depends_on without health checks are missing a critical pattern. Those who structure Compose files with proper health checks, networks, and override files manage complex stacks reliably.
Layer the defences: "Non-root user, minimal base image, read-only filesystem, dropped capabilities, resource limits, and image scanning in CI. Each layer catches what the others miss."
Key practices: run as non-root user (USER directive), use minimal base images (alpine, distroless, scratch), scan images for vulnerabilities (Trivy, Snyk, Docker Scout), do not store secrets in images or environment variables (use Docker secrets or external vaults), set read-only root filesystem where possible, drop capabilities (--cap-drop=ALL, then add back only what is needed), limit resources (memory, CPU), use signed images with Docker Content Trust, and keep the host Docker daemon updated. Strong candidates discuss: the principle of least privilege applied to containers, seccomp profiles, AppArmor/SELinux, the risk of mounting the Docker socket, and that running as root inside a container is root on the host without user namespace remapping.
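Several of these layers expressed as a Compose sketch — image name and UID are assumptions, and the capability set depends on what the app actually needs:

```yaml
services:
  api:
    image: my-api:latest            # assumption: image built with a non-root user
    user: "10001:10001"             # run as an unprivileged UID:GID
    read_only: true                 # read-only root filesystem
    tmpfs:
      - /tmp                        # writable scratch space despite read_only
    cap_drop:
      - ALL                         # drop every capability...
    cap_add:
      - NET_BIND_SERVICE            # ...then add back only what is required
    security_opt:
      - no-new-privileges:true      # block privilege escalation via setuid binaries
```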
Critical for production container security. Candidates who run containers as root with full capabilities are creating unnecessary risk. Those who understand the layered security model and can articulate specific mitigations build secure container deployments.
Follow a systematic approach: "First docker logs for application errors. If the container dies immediately, I check the exit code — 137 means OOM, 1 means the app crashed. Then docker run with an interactive shell to inspect the filesystem and test commands manually."
Diagnostic steps: check container logs (docker logs), inspect the container state (docker inspect), check exit code for clues (137 = OOM killed, 1 = application error), try running interactively (docker run -it --entrypoint sh) to explore the filesystem, check if the image builds correctly, verify environment variables are set, and check volume mounts and permissions. Common causes: missing environment variables, wrong file permissions on mounted volumes, port conflicts, insufficient memory limits, missing dependencies in the image, and entrypoint scripts failing. Strong candidates discuss: docker events for daemon-level debugging, exec into running containers, using ephemeral debug containers (docker debug), and checking dmesg for OOM kills.
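The sequence above as concrete commands, assuming a hypothetical crashing container named api:

```shell
docker logs --tail 100 api                 # application errors first
docker inspect api \
  --format '{{.State.ExitCode}} {{.State.OOMKilled}}'
                                           # 137 with OOMKilled=true => memory limit hit
docker run -it --rm --entrypoint sh my-api:latest
                                           # explore the filesystem, run the app by hand
docker events --since 10m                  # daemon-level view: die, oom, restart events
```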
Tests operational skills. Candidates who cannot debug a crashing container will be stuck in production incidents. Those who follow a systematic approach from logs to exit codes to interactive inspection resolve issues efficiently.
Show the full pipeline: "CI builds with BuildKit, runs tests in the image, scans with Trivy, tags with the git SHA, and pushes to ECR. Deployment pulls the specific SHA tag — never latest — and rolls out with health checks."
Pipeline stages: build the image (docker build with build args for version/commit), run tests inside a container (docker run with test command or docker compose for integration tests), scan for vulnerabilities, tag with version and commit SHA, push to a registry (Docker Hub, ECR, GCR, GHCR), and deploy (pull and restart, or update orchestrator). Best practices: use BuildKit for parallel builds and cache mounts, cache layers between CI runs (--cache-from), use immutable tags (never overwrite latest in production), and sign images. Strong candidates discuss: multi-platform builds with buildx, registry authentication in CI, image promotion between environments (dev → staging → prod tags), and the difference between building in CI versus building in Docker (Docker-in-Docker vs Docker-out-of-Docker).
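A sketch of such a pipeline as a GitHub Actions job — the image name, test command, and $REGISTRY variable are assumptions, and the Trivy step assumes the CLI is installed on the runner:

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - name: Build image
        run: docker build -t myapp:${{ github.sha }} .
      - name: Run tests inside the image
        run: docker run --rm myapp:${{ github.sha }} npm test
      - name: Scan before pushing
        run: trivy image --exit-code 1 --severity HIGH,CRITICAL myapp:${{ github.sha }}
      - name: Push immutable SHA tag
        run: |
          docker tag myapp:${{ github.sha }} $REGISTRY/myapp:${{ github.sha }}
          docker push $REGISTRY/myapp:${{ github.sha }}
```

The SHA tag is never overwritten; deployment references it explicitly, so a rollback is just pulling an older SHA.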
Senior DevOps question. Candidates who build and push without testing or scanning are shipping unvalidated images. Those who understand layer caching in CI, immutable tags, and image promotion have production-grade pipelines.
Show the key difference: "Memory limits are hard — exceed them and the OOM killer terminates your process. CPU limits are soft — exceed them and your container gets throttled. I profile the application first, then set limits with headroom."
Memory: set with --memory flag or deploy.resources.limits.memory in Compose. When a container exceeds its memory limit, the kernel OOM-kills the process (exit code 137). Memory reservation (--memory-reservation) is a soft limit for scheduling. CPU: --cpus caps how much CPU time the container may use (fractional values like 1.5 are allowed), --cpu-shares sets relative weight for scheduling. Unlike memory, hitting the CPU cap does not kill the container — it gets throttled. Strong candidates discuss: the difference between limits (hard ceiling) and reservations (scheduling hint), swap limits (--memory-swap), the impact of CPU throttling on application latency, monitoring container resource usage with docker stats, and setting appropriate limits based on application profiling rather than guessing.
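Both limit types in a Compose sketch — the image and the specific numbers are placeholders to be replaced with profiled values:

```yaml
services:
  worker:
    image: my-worker:latest       # assumption: your application image
    deploy:
      resources:
        limits:
          memory: 512M            # hard ceiling: exceed it and the OOM killer fires (exit 137)
          cpus: "1.5"             # cap: exceeding it throttles, never kills
        reservations:
          memory: 256M            # scheduling hint, not enforced at runtime
```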
Tests operational understanding. Candidates who do not set resource limits risk one container consuming all host resources. Those who understand the difference between memory kills and CPU throttling can troubleshoot production performance issues.
Show a practical strategy: "Every image gets a git SHA tag for traceability and a semver tag for releases. CI scans before push. A cron job deletes images older than 90 days except tagged releases. Production always pulls a specific SHA, never latest."
Tagging strategies: semantic version tags (v1.2.3) for releases, git SHA tags for traceability, environment tags (staging, production) as mutable pointers, and never use latest in production (it is ambiguous and mutable). Image cleanup: set retention policies to delete old images (by age or count), use registry garbage collection, and remove untagged manifests. Security scanning: integrate Trivy, Snyk, or Docker Scout into CI to scan before pushing, and periodically scan existing images for newly discovered CVEs. Strong candidates discuss: multi-architecture manifests, registry mirroring for availability, pull-through caches for rate limit avoidance, and image signing with cosign or Docker Content Trust for supply chain security.
Tests production maturity. Candidates who push all images as latest and never clean up will have registries full of untracked, potentially vulnerable images. Those with a clear tagging, scanning, and retention strategy manage images professionally.
Clarify both layers: "USER in the Dockerfile sets who runs inside the container. Rootless Docker sets who runs the daemon on the host. You want both — non-root in the container AND a rootless daemon — for maximum isolation."
Rootless Docker runs the Docker daemon itself as a non-root user on the host, using user namespaces to map container root to an unprivileged host user. This is different from USER in a Dockerfile, which sets the in-container user but still requires a root-owned daemon. Rootless Docker provides defence in depth: even if an attacker escapes the container, they land as an unprivileged user on the host. Trade-offs: some features are limited (no privileged ports below 1024 without workarounds, no AppArmor, limited storage drivers), and networking uses slirp4netns or pasta which can be slower. Strong candidates discuss: the difference between rootless mode and userns-remap, Podman as a rootless-by-default alternative, when rootless is mandatory (compliance, multi-tenant environments), and the performance implications for networking and storage.
Advanced security question. Candidates who think a non-root USER directive is sufficient do not understand container escapes. Those who understand rootless mode and can articulate the trade-offs demonstrate deep container security knowledge.
Show the production pattern: "Apps log JSON to stdout. Docker captures it. In dev, I use the json-file driver with rotation. In production, I ship to CloudWatch or Loki via a logging driver so logs are centralised and searchable."
Applications in containers should log to stdout/stderr — Docker captures these streams. Logging drivers determine where the output goes: json-file (default, local files), syslog, journald, fluentd, awslogs, gcplogs. For production: ship logs to a centralised system (ELK, Loki, CloudWatch) using a logging driver or a sidecar collector. Monitoring: docker stats for basic metrics, cAdvisor for detailed container metrics, Prometheus + Grafana for dashboards and alerting. Strong candidates discuss: the log-opts for controlling rotation and size with the json-file driver, structured logging (JSON) for parseability, the trade-off between blocking and non-blocking logging modes, and correlating logs across multiple containers with request IDs or trace context.
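A Compose sketch of the development setup described — json-file with rotation; the image name is a placeholder:

```yaml
services:
  api:
    image: my-api:latest          # assumption: app logs JSON to stdout
    logging:
      driver: json-file
      options:
        max-size: "10m"           # rotate at 10 MB per file
        max-file: "3"             # keep at most 3 rotated files
```

For production, the same logging block swaps the driver for awslogs, gcplogs, or fluentd so output is shipped to a centralised system instead of local files.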
Tests operational maturity. Candidates who log to files inside containers lose logs when containers restart. Those who understand logging drivers, structured logging, and centralised collection build observable systems.
Show CI-specific optimisation: "I use BuildKit with --cache-from pointing to the registry so CI runners pull cached layers. Cache mounts persist the npm/pip cache across builds. Dependency changes rebuild one layer, not everything."
BuildKit features: --mount=type=cache for persistent package manager caches (pip, npm, apt) across builds, --cache-from and --cache-to for exporting and importing layer caches from registries, and parallel execution of independent build stages. Registry-based caching: push cache to the registry so CI runners without local state can reuse layers. Cache invalidation: use COPY for dependency files before source code, leverage .dockerignore to prevent unnecessary invalidation, and use build arguments carefully (each unique ARG value busts the cache). Strong candidates discuss: the difference between inline and registry cache backends, using GitHub Actions cache with BuildKit, cache key strategies for monorepos, and the time/storage trade-off of caching large layers.
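A cache-mount sketch for a hypothetical Node.js build — the target path is npm's default cache directory for root:

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-slim
WORKDIR /app
COPY package.json package-lock.json ./
# Cache mount: npm's download cache persists across builds even when
# this layer is rebuilt, so only changed packages are re-downloaded.
RUN --mount=type=cache,target=/root/.npm \
    npm ci --omit=dev
COPY . .
```

For stateless CI runners, pair this with registry-backed layer caching, e.g. `docker buildx build --cache-from type=registry,ref=registry.example.com/myapp:cache --cache-to type=registry,ref=registry.example.com/myapp:cache,mode=max .` (registry name assumed).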
Senior build optimisation question. Candidates whose CI builds take 10 minutes because they rebuild everything from scratch do not understand Docker caching. Those who use BuildKit cache mounts and registry-based caching have fast, efficient pipelines.
Match complexity to need: "Compose for single-server deployments — most startups live here longer than they think. Swarm when I need multi-node with minimal ceremony. Kubernetes when I need auto-scaling, complex networking policies, or the team already has the expertise."
Docker Compose: single-host development and simple production deployments. No clustering, no auto-scaling, but minimal complexity. Docker Swarm: built-in orchestration for multi-node clusters. Simpler than Kubernetes, good for small teams who need basic service scaling, rolling updates, and load balancing without the Kubernetes learning curve. Kubernetes: full-featured orchestration for large-scale deployments. Auto-scaling, self-healing, service mesh, RBAC, and a massive ecosystem — but significant operational complexity. Decision factors: team size, number of services, scaling requirements, uptime SLAs, and existing expertise. Strong candidates explain: most applications do not need Kubernetes, Compose is production-viable for many workloads behind a reverse proxy, and the operational cost of Kubernetes often exceeds the benefit for small teams.
Tests architectural judgement. Candidates who default to Kubernetes for every project are over-engineering. Those who never consider orchestration beyond Compose may struggle at scale. Look for the ability to assess project needs and match the tool to the complexity budget.