Kubernetes & Docker: Container Orchestration Mastery 2026
Learn Kubernetes and Docker from basics to production deployment. Includes real-world examples, scaling strategies, and DevOps best practices for 2026.
Container orchestration has become essential for modern application deployment. Having managed hundreds of Kubernetes clusters in production, I've distilled what you need to know to master container orchestration in 2026.
Related reading: Check out our guides on micro-frontends architecture and Docker development setup for more deployment insights.
Why Container Orchestration Matters#
The Container Revolution#
Before containers:
- "Works on my machine" syndrome
- Complex deployment processes
- Environment inconsistencies
- Difficult scaling
- Resource waste
With containers:
- Consistent environments
- Rapid deployment
- Efficient resource usage
- Easy scaling
- Microservices enablement
Why Kubernetes Won#
- Market dominance: 88% of organizations use Kubernetes
- Cloud native: native support from all major cloud providers
- Ecosystem: massive tooling and community support
- Flexibility: runs anywhere (cloud, on-prem, hybrid)
- Production proven: powers the world's largest applications
Docker Fundamentals#
Creating Docker Images#
# Multi-stage build for Node.js app
FROM node:18-alpine AS builder
WORKDIR /app
# Copy package files
COPY package*.json ./
# Install all dependencies (the build step needs devDependencies)
RUN npm ci
# Copy source code
COPY . .
# Build, then prune devDependencies before the production stage copies node_modules
RUN npm run build && npm prune --omit=dev
# Production stage
FROM node:18-alpine
WORKDIR /app
# Copy only necessary files from builder
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
# The HEALTHCHECK below runs healthcheck.js, so it must exist in this stage
COPY --from=builder /app/healthcheck.js ./
# Create non-root user
RUN addgroup -g 1001 -S nodejs && \
adduser -S nodejs -u 1001
USER nodejs
EXPOSE 3000
# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
CMD node healthcheck.js
CMD ["node", "dist/server.js"]
Docker Compose for Development#
# docker-compose.yml
version: '3.8' # the top-level version key is ignored by Compose v2 and may be omitted
services:
app:
build:
context: .
dockerfile: Dockerfile.dev
ports:
- "3000:3000"
volumes:
- .:/app
- /app/node_modules
environment:
- NODE_ENV=development
- DATABASE_URL=postgresql://postgres:password@db:5432/myapp
- REDIS_URL=redis://redis:6379
depends_on:
- db
- redis
networks:
- app-network
command: npm run dev
db:
image: postgres:15-alpine
ports:
- "5432:5432"
environment:
- POSTGRES_USER=postgres
- POSTGRES_PASSWORD=password
- POSTGRES_DB=myapp
volumes:
- postgres-data:/var/lib/postgresql/data
networks:
- app-network
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres"]
interval: 10s
timeout: 5s
retries: 5
redis:
image: redis:7-alpine
ports:
- "6379:6379"
volumes:
- redis-data:/data
networks:
- app-network
command: redis-server --appendonly yes
nginx:
image: nginx:alpine
ports:
- "80:80"
volumes:
- ./nginx.conf:/etc/nginx/nginx.conf:ro
depends_on:
- app
networks:
- app-network
volumes:
postgres-data:
redis-data:
networks:
app-network:
driver: bridge
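The app receives its database and cache endpoints as connection-string environment variables. In Node, the built-in WHATWG URL class can split DATABASE_URL into connection parameters (shown with the compose file's example value; real credentials belong in secrets, not compose files):

```javascript
// Split the compose file's DATABASE_URL into connection parameters.
const dbUrl = new URL('postgresql://postgres:password@db:5432/myapp');

const config = {
  user: dbUrl.username,             // 'postgres'
  password: dbUrl.password,         // 'password'
  host: dbUrl.hostname,             // 'db' — the compose service name
  port: Number(dbUrl.port),         // 5432
  database: dbUrl.pathname.slice(1) // 'myapp'
};
```

Note the hostname is the service name (`db`), resolved by Docker's embedded DNS on the shared `app-network`.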
Docker Best Practices#
# Optimized Dockerfile with best practices
# Use specific version tags
FROM node:18.17.0-alpine3.18 AS builder
# Set working directory
WORKDIR /app
# Install security updates
RUN apk update && apk upgrade && \
apk add --no-cache dumb-init
# Copy dependency files first (better caching)
COPY package*.json ./
COPY yarn.lock ./
# Install dependencies with a BuildKit cache mount (verify the cache path with: yarn cache dir)
RUN --mount=type=cache,target=/root/.yarn \
yarn install --frozen-lockfile --production=false
# Copy source code
COPY . .
# Build application
RUN yarn build && \
yarn install --production --ignore-scripts --prefer-offline
# Production stage
FROM node:18.17.0-alpine3.18
# Install dumb-init for proper signal handling
RUN apk add --no-cache dumb-init
# Create app directory
WORKDIR /app
# Copy built application
COPY --from=builder --chown=node:node /app/dist ./dist
COPY --from=builder --chown=node:node /app/node_modules ./node_modules
COPY --from=builder --chown=node:node /app/package.json ./
# Use non-root user
USER node
# Expose port
EXPOSE 3000
# Use dumb-init to handle signals properly
ENTRYPOINT ["dumb-init", "--"]
# Start application
CMD ["node", "dist/server.js"]
Kubernetes Architecture#
Core Components#
Control Plane:
- API Server: Central management point
- etcd: Distributed key-value store
- Scheduler: Assigns pods to nodes
- Controller Manager: Maintains desired state
- Cloud Controller Manager: Cloud provider integration
Worker Nodes:
- Kubelet: Node agent
- Container Runtime: containerd or CRI-O (dockershim, which let the kubelet drive Docker Engine directly, was removed in Kubernetes 1.24; Docker-built images still run unchanged)
- Kube-proxy: Network proxy
Basic Kubernetes Objects#
# Namespace
apiVersion: v1
kind: Namespace
metadata:
name: production
labels:
environment: production
---
# ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
name: app-config
namespace: production
data:
APP_ENV: "production"
LOG_LEVEL: "info"
API_URL: "https://api.example.com"
---
# Secret
apiVersion: v1
kind: Secret
metadata:
name: app-secrets
namespace: production
type: Opaque
stringData:
DATABASE_URL: "postgresql://user:pass@db:5432/myapp"
JWT_SECRET: "your-secret-key"
API_KEY: "your-api-key"
---
# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: web-app
namespace: production
labels:
app: web-app
spec:
replicas: 3
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
selector:
matchLabels:
app: web-app
template:
metadata:
labels:
app: web-app
version: v1.0.0
spec:
containers:
- name: app
image: myregistry/web-app:v1.0.0
ports:
- containerPort: 3000
name: http
env:
- name: NODE_ENV
value: "production"
envFrom:
- configMapRef:
name: app-config
- secretRef:
name: app-secrets
resources:
requests:
memory: "256Mi"
cpu: "250m"
limits:
memory: "512Mi"
cpu: "500m"
livenessProbe:
httpGet:
path: /health
port: 3000
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /ready
port: 3000
initialDelaySeconds: 10
periodSeconds: 5
timeoutSeconds: 3
failureThreshold: 3
volumeMounts:
- name: app-storage
mountPath: /app/data
volumes:
- name: app-storage
persistentVolumeClaim:
claimName: app-pvc
---
# Service
apiVersion: v1
kind: Service
metadata:
name: web-app-service
namespace: production
spec:
type: ClusterIP
selector:
app: web-app
ports:
- port: 80
targetPort: 3000
protocol: TCP
name: http
---
# Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: web-app-ingress
namespace: production
annotations:
cert-manager.io/cluster-issuer: letsencrypt-prod
nginx.ingress.kubernetes.io/rate-limit: "100"
spec:
ingressClassName: nginx
tls:
- hosts:
- app.example.com
secretName: app-tls
rules:
- host: app.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: web-app-service
port:
number: 80
---
# HorizontalPodAutoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: web-app-hpa
namespace: production
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: web-app
minReplicas: 3
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 50
periodSeconds: 60
scaleUp:
stabilizationWindowSeconds: 0
policies:
- type: Percent
value: 100
periodSeconds: 30
- type: Pods
value: 2
periodSeconds: 30
selectPolicy: Max
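The HPA works from a simple ratio, per the Kubernetes docs: desiredReplicas = ceil(currentReplicas × currentMetric ÷ targetMetric), clamped to the min/max bounds. A quick illustration of the arithmetic:

```javascript
// HPA scaling formula, clamped to the minReplicas/maxReplicas bounds above.
function desiredReplicas(current, currentUtil, targetUtil, min, max) {
  const raw = Math.ceil(current * (currentUtil / targetUtil));
  return Math.min(max, Math.max(min, raw));
}

// 3 replicas averaging 90% CPU against the 70% target -> scale to 4.
// 3 replicas at 30% would compute 2, but the floor of 3 holds.
```

The `behavior` block then rate-limits how fast those computed targets are applied (for example, at most 50% of pods removed per minute when scaling down).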
Advanced Kubernetes Patterns#
StatefulSet for Databases#
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: postgres
namespace: production
spec:
serviceName: postgres
replicas: 3
selector:
matchLabels:
app: postgres
template:
metadata:
labels:
app: postgres
spec:
containers:
- name: postgres
image: postgres:15-alpine
ports:
- containerPort: 5432
name: postgres
env:
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: postgres-secret
key: password
- name: PGDATA
value: /var/lib/postgresql/data/pgdata
volumeMounts:
- name: postgres-storage
mountPath: /var/lib/postgresql/data
resources:
requests:
memory: "1Gi"
cpu: "500m"
limits:
memory: "2Gi"
cpu: "1000m"
volumeClaimTemplates:
- metadata:
name: postgres-storage
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: fast-ssd
resources:
requests:
storage: 10Gi
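Unlike Deployment pods, StatefulSet pods get stable ordinal names and per-pod DNS entries under the headless service named by `serviceName` (the headless Service itself isn't shown above). The naming pattern:

```javascript
// Stable per-pod DNS: <statefulset>-<ordinal>.<service>.<namespace>.svc.cluster.local
function statefulPodDns(setName, ordinal, service, namespace) {
  return `${setName}-${ordinal}.${service}.${namespace}.svc.cluster.local`;
}

// By convention the primary would be postgres-0; replicas follow as postgres-1, postgres-2.
```

This stability is why StatefulSets suit databases: postgres-0 keeps its name and its volume across restarts and rescheduling.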
DaemonSet for Monitoring#
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: node-exporter
namespace: monitoring
spec:
selector:
matchLabels:
app: node-exporter
template:
metadata:
labels:
app: node-exporter
spec:
hostNetwork: true
hostPID: true
containers:
- name: node-exporter
image: prom/node-exporter:v1.6.1 # pin a specific version rather than :latest
ports:
- containerPort: 9100
name: metrics
volumeMounts:
- name: proc
mountPath: /host/proc
readOnly: true
- name: sys
mountPath: /host/sys
readOnly: true
args:
- '--path.procfs=/host/proc'
- '--path.sysfs=/host/sys'
- '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
volumes:
- name: proc
hostPath:
path: /proc
- name: sys
hostPath:
path: /sys
Job and CronJob#
# One-time Job
apiVersion: batch/v1
kind: Job
metadata:
name: database-migration
namespace: production
spec:
template:
spec:
containers:
- name: migration
image: myregistry/migrations:latest
command: ["npm", "run", "migrate"]
envFrom:
- secretRef:
name: app-secrets
restartPolicy: OnFailure
backoffLimit: 3
activeDeadlineSeconds: 600
---
# Scheduled CronJob
apiVersion: batch/v1
kind: CronJob
metadata:
name: backup-database
namespace: production
spec:
schedule: "0 2 * * *" # Daily at 2 AM
jobTemplate:
spec:
template:
spec:
containers:
- name: backup
image: myregistry/backup:latest
command: ["/bin/sh", "-c"]
args:
- |
pg_dump $DATABASE_URL | gzip > /backup/db-$(date +%Y%m%d).sql.gz
aws s3 cp /backup/db-$(date +%Y%m%d).sql.gz s3://backups/
envFrom:
- secretRef:
name: app-secrets
volumeMounts:
- name: backup-storage
mountPath: /backup
volumes:
- name: backup-storage
emptyDir: {}
restartPolicy: OnFailure
successfulJobsHistoryLimit: 3
failedJobsHistoryLimit: 1
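The backup script stamps filenames with `date +%Y%m%d`. If you later write a Node script to verify or prune those backups (an illustrative helper, not part of the original), the same stamp looks like this:

```javascript
// Format a Date as YYYYMMDD, matching the shell's `date +%Y%m%d` (UTC).
function ymd(d) {
  const pad = (n) => String(n).padStart(2, '0');
  return `${d.getUTCFullYear()}${pad(d.getUTCMonth() + 1)}${pad(d.getUTCDate())}`;
}

// e.g. a backup object key: `db-${ymd(new Date())}.sql.gz`
```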
Helm Charts#
Creating a Helm Chart#
# Chart.yaml
apiVersion: v2
name: web-app
description: A Helm chart for web application
type: application
version: 1.0.0
appVersion: "1.0.0"
dependencies:
- name: postgresql
version: "12.x.x"
repository: "https://charts.bitnami.com/bitnami"
condition: postgresql.enabled
- name: redis
version: "17.x.x"
repository: "https://charts.bitnami.com/bitnami"
condition: redis.enabled
# values.yaml
replicaCount: 3
image:
repository: myregistry/web-app
pullPolicy: IfNotPresent
tag: "1.0.0"
service:
type: ClusterIP
port: 80
targetPort: 3000
ingress:
enabled: true
className: nginx
annotations:
cert-manager.io/cluster-issuer: letsencrypt-prod
hosts:
- host: app.example.com
paths:
- path: /
pathType: Prefix
tls:
- secretName: app-tls
hosts:
- app.example.com
resources:
limits:
cpu: 500m
memory: 512Mi
requests:
cpu: 250m
memory: 256Mi
autoscaling:
enabled: true
minReplicas: 3
maxReplicas: 10
targetCPUUtilizationPercentage: 70
targetMemoryUtilizationPercentage: 80
postgresql:
enabled: true
auth:
username: myapp
password: changeme
database: myapp
primary:
persistence:
size: 10Gi
redis:
enabled: true
auth:
enabled: false
master:
persistence:
size: 1Gi
# templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "web-app.fullname" . }}
labels:
{{- include "web-app.labels" . | nindent 4 }}
spec:
{{- if not .Values.autoscaling.enabled }}
replicas: {{ .Values.replicaCount }}
{{- end }}
selector:
matchLabels:
{{- include "web-app.selectorLabels" . | nindent 6 }}
template:
metadata:
labels:
{{- include "web-app.selectorLabels" . | nindent 8 }}
spec:
containers:
- name: {{ .Chart.Name }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
- name: http
containerPort: {{ .Values.service.targetPort }}
protocol: TCP
livenessProbe:
httpGet:
path: /health
port: http
readinessProbe:
httpGet:
path: /ready
port: http
resources:
{{- toYaml .Values.resources | nindent 12 }}
env:
{{- if .Values.postgresql.enabled }}
- name: DATABASE_URL
value: "postgresql://{{ .Values.postgresql.auth.username }}:{{ .Values.postgresql.auth.password }}@{{ include "web-app.fullname" . }}-postgresql:5432/{{ .Values.postgresql.auth.database }}"
{{- end }}
{{- if .Values.redis.enabled }}
- name: REDIS_URL
value: "redis://{{ include "web-app.fullname" . }}-redis-master:6379"
{{- end }}
Using Helm#
# Install chart
helm install my-app ./web-app -n production --create-namespace
# Upgrade release
helm upgrade my-app ./web-app -n production
# Rollback
helm rollback my-app 1 -n production
# List releases
helm list -n production
# Uninstall
helm uninstall my-app -n production
# Install with custom values
helm install my-app ./web-app -n production \
--set replicaCount=5 \
--set image.tag=v2.0.0 \
--values custom-values.yaml
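The `--set key.subkey=value` flags override nested entries in values.yaml. Conceptually it is a dotted-path merge into the values object; this is a simplified sketch (real Helm also handles lists, commas, escaping, and type coercion):

```javascript
// Apply a dotted-path override like Helm's --set (simplified: scalar values only).
function setPath(values, path, value) {
  const keys = path.split('.');
  let node = values;
  for (const k of keys.slice(0, -1)) {
    node[k] = node[k] ?? {};  // create intermediate objects as needed
    node = node[k];
  }
  node[keys[keys.length - 1]] = value;
  return values;
}

const values = { replicaCount: 3, image: { tag: '1.0.0' } };
setPath(values, 'image.tag', 'v2.0.0');
setPath(values, 'replicaCount', '5');
```

Precedence in real Helm: chart defaults < `--values` files < `--set` flags, applied in order.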
Monitoring and Logging#
Prometheus Monitoring#
# ServiceMonitor for Prometheus
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: web-app-metrics
namespace: production
spec:
selector:
matchLabels:
app: web-app
endpoints:
- port: metrics
interval: 30s
path: /metrics
---
# PrometheusRule for Alerts
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: web-app-alerts
namespace: production
spec:
groups:
- name: web-app
interval: 30s
rules:
- alert: HighErrorRate
expr: |
rate(http_requests_total{status=~"5.."}[5m]) > 0.05
for: 5m
labels:
severity: critical
annotations:
summary: "High error rate detected"
description: "Error rate is {{ $value }} for {{ $labels.instance }}"
- alert: HighMemoryUsage
expr: |
container_memory_usage_bytes{pod=~"web-app-.*"} /
container_spec_memory_limit_bytes{pod=~"web-app-.*"} > 0.9
for: 5m
labels:
severity: warning
annotations:
summary: "High memory usage"
description: "Memory usage is {{ $value | humanizePercentage }}"
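The HighErrorRate alert compares a per-second rate against 0.05. PromQL's rate() over a counter is, roughly, the counter delta divided by the window length (the real function also handles counter resets and extrapolation):

```javascript
// Simplified rate(): counter delta over the window, per second.
function rate(startCount, endCount, windowSeconds) {
  return (endCount - startCount) / windowSeconds;
}

// 20 5xx responses in a 5-minute window is about 0.067/s — above the 0.05 threshold.
```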
Centralized Logging#
# Fluentd DaemonSet
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: fluentd
namespace: logging
spec:
selector:
matchLabels:
app: fluentd
template:
metadata:
labels:
app: fluentd
spec:
serviceAccountName: fluentd
containers:
- name: fluentd
image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
env:
- name: FLUENT_ELASTICSEARCH_HOST
value: "elasticsearch.logging.svc.cluster.local"
- name: FLUENT_ELASTICSEARCH_PORT
value: "9200"
volumeMounts:
- name: varlog
mountPath: /var/log
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
volumes:
- name: varlog
hostPath:
path: /var/log
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
CI/CD Pipeline#
GitLab CI/CD#
# .gitlab-ci.yml
stages:
- build
- test
- deploy
variables:
DOCKER_REGISTRY: registry.example.com
IMAGE_NAME: $DOCKER_REGISTRY/web-app
KUBE_NAMESPACE: production
build:
stage: build
image: docker:latest
services:
- docker:dind
script:
- docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $DOCKER_REGISTRY
- docker build -t $IMAGE_NAME:$CI_COMMIT_SHA .
- docker tag $IMAGE_NAME:$CI_COMMIT_SHA $IMAGE_NAME:latest
- docker push $IMAGE_NAME:$CI_COMMIT_SHA
- docker push $IMAGE_NAME:latest
only:
- main
test:
stage: test
image: node:18-alpine
# Run tests from source; the pushed production image has devDependencies pruned
script:
- npm ci
- npm test
- npm run lint
only:
- main
deploy:
stage: deploy
image: bitnami/kubectl:latest
script:
# Prefer supplying the cluster CA certificate; --insecure-skip-tls-verify is for illustration only
- kubectl config set-cluster k8s --server="$KUBE_URL" --insecure-skip-tls-verify=true
- kubectl config set-credentials admin --token="$KUBE_TOKEN"
- kubectl config set-context default --cluster=k8s --user=admin
- kubectl config use-context default
- kubectl set image deployment/web-app app=$IMAGE_NAME:$CI_COMMIT_SHA -n $KUBE_NAMESPACE
- kubectl rollout status deployment/web-app -n $KUBE_NAMESPACE
only:
- main
when: manual
GitHub Actions#
# .github/workflows/deploy.yml
name: Build and Deploy
on:
push:
branches: [main]
env:
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}
jobs:
build-and-push:
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
steps:
- uses: actions/checkout@v3
- name: Log in to Container Registry
uses: docker/login-action@v2
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Extract metadata
id: meta
uses: docker/metadata-action@v4
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
- name: Build and push
uses: docker/build-push-action@v4
with:
context: .
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
deploy:
needs: build-and-push
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up kubectl
uses: azure/setup-kubectl@v3
- name: Configure kubectl
run: |
echo "${{ secrets.KUBE_CONFIG }}" | base64 -d > kubeconfig
echo "KUBECONFIG=$PWD/kubeconfig" >> "$GITHUB_ENV" # persists across steps, unlike export
- name: Deploy to Kubernetes
run: |
kubectl set image deployment/web-app \
app=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }} \
-n production
kubectl rollout status deployment/web-app -n production
Production Best Practices#
Resource Management#
# ResourceQuota
apiVersion: v1
kind: ResourceQuota
metadata:
name: production-quota
namespace: production
spec:
hard:
requests.cpu: "100"
requests.memory: 200Gi
limits.cpu: "200"
limits.memory: 400Gi
persistentvolumeclaims: "50"
services.loadbalancers: "5"
---
# LimitRange
apiVersion: v1
kind: LimitRange
metadata:
name: production-limits
namespace: production
spec:
limits:
- max:
cpu: "2"
memory: "4Gi"
min:
cpu: "100m"
memory: "128Mi"
default:
cpu: "500m"
memory: "512Mi"
defaultRequest:
cpu: "250m"
memory: "256Mi"
type: Container
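LimitRange and ResourceQuota values use Kubernetes quantity notation: 500m CPU means half a core, and binary memory suffixes like Mi and Gi are powers of 1024. A parser sketch covering the forms used in this article's manifests:

```javascript
// Parse the CPU and memory quantity forms used in the manifests above.
function parseCpu(q) {
  // "500m" -> 0.5 cores; "2" -> 2 cores
  return q.endsWith('m') ? Number(q.slice(0, -1)) / 1000 : Number(q);
}

function parseMemoryBytes(q) {
  // Binary suffixes only: Ki, Mi, Gi (1024-based)
  const m = /^(\d+)(Ki|Mi|Gi)?$/.exec(q);
  const scale = { Ki: 1024, Mi: 1024 ** 2, Gi: 1024 ** 3 };
  return Number(m[1]) * (m[2] ? scale[m[2]] : 1);
}
```

(Kubernetes also accepts decimal suffixes like M and G, which are powers of 1000; they are omitted here for brevity.)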
Network Policies#
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: web-app-network-policy
namespace: production
spec:
podSelector:
matchLabels:
app: web-app
policyTypes:
- Ingress
- Egress
ingress:
- from:
- namespaceSelector:
matchLabels:
name: ingress-nginx
ports:
- protocol: TCP
port: 3000
egress:
- to:
- podSelector:
matchLabels:
app: postgres
ports:
- protocol: TCP
port: 5432
- to:
- podSelector:
matchLabels:
app: redis
ports:
- protocol: TCP
port: 6379
- to:
- namespaceSelector: {}
ports:
- protocol: TCP
port: 53
- protocol: UDP
port: 53
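NetworkPolicies are additive: pods not selected by any policy allow all traffic. A common companion is a namespace-wide default-deny, so the allow rules above become the only open paths. A typical pattern (not from the original article):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}   # selects every pod in the namespace
  policyTypes:
    - Ingress
```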
Security Best Practices#
# PodSecurityPolicy was removed in Kubernetes v1.25. Enforce the equivalent
# "restricted" Pod Security Standard with the built-in Pod Security Admission
# controller by labeling the namespace:
apiVersion: v1
kind: Namespace
metadata:
name: production
labels:
pod-security.kubernetes.io/enforce: restricted
pod-security.kubernetes.io/warn: restricted
---
# And harden workloads directly via securityContext (the same fields apply
# inside a Deployment's pod template)
apiVersion: v1
kind: Pod
metadata:
name: hardened-example
namespace: production
spec:
securityContext:
runAsNonRoot: true
seccompProfile:
type: RuntimeDefault
containers:
- name: app
image: myregistry/web-app:v1.0.0
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop: ["ALL"]
Troubleshooting#
Common kubectl Commands#
# Get resources
kubectl get pods -n production
kubectl get deployments -n production
kubectl get services -n production
kubectl get ingress -n production
# Describe resources
kubectl describe pod web-app-xxx -n production
kubectl describe deployment web-app -n production
# Logs
kubectl logs web-app-xxx -n production
kubectl logs -f web-app-xxx -n production # Follow
kubectl logs web-app-xxx -n production --previous # Previous container
# Execute commands
kubectl exec -it web-app-xxx -n production -- /bin/sh
kubectl exec web-app-xxx -n production -- env
# Port forwarding
kubectl port-forward web-app-xxx 8080:3000 -n production
# Copy files
kubectl cp web-app-xxx:/app/data/file.txt ./file.txt -n production
# Debug
kubectl debug web-app-xxx -it --image=busybox -n production
# Events
kubectl get events -n production --sort-by='.lastTimestamp'
# Resource usage
kubectl top nodes
kubectl top pods -n production
Frequently Asked Questions#
Q: Kubernetes or Docker Swarm? A: Kubernetes won. It has a richer ecosystem, more features, and far broader industry adoption; Swarm is simpler but much less capable.
Q: How many replicas should I run? A: Minimum 3 for high availability. Use HPA to scale based on metrics. Consider costs vs. reliability needs.
Q: What about serverless containers? A: AWS Fargate, Google Cloud Run, and Azure Container Instances offer serverless containers. Good for simpler workloads without K8s complexity.
Q: How do I handle secrets securely? A: Use external secret managers (AWS Secrets Manager, HashiCorp Vault) with tools like External Secrets Operator. Don't commit secrets to Git.
Q: What's the learning curve? A: Steep. Start with Docker, then basic K8s concepts. Use managed services (EKS, GKE, AKS) to reduce operational burden.
Container orchestration with Kubernetes is complex but essential for modern cloud-native applications. Start small, learn incrementally, and leverage managed services to reduce operational overhead.