Container Orchestration Cheat Sheet: The Ultimate Guide for DevOps Engineers

Introduction: What is Container Orchestration?

Container orchestration automates the deployment, scaling, networking, and management of containerized applications across clusters of hosts. Instead of manually managing individual containers, orchestration tools like Kubernetes, Docker Swarm, and others provide a framework to control container lifecycles at scale, ensuring high availability, optimal resource utilization, and simplified operations.

Why Container Orchestration Matters:

  • Manages complex, multi-container applications across distributed environments
  • Automates deployment, scaling, and failover processes
  • Optimizes resource utilization across your infrastructure
  • Provides self-healing capabilities for improved reliability
  • Simplifies networking between containers and external systems

Core Concepts & Principles

Key Components of Container Orchestration

| Component | Description |
|---|---|
| Containers | Lightweight, portable units that package application code with dependencies |
| Cluster | Group of machines (nodes) that run containerized applications |
| Control Plane | Brain of the orchestration system that makes global decisions |
| Worker Nodes | Machines that actually run the containers |
| Services | Definitions of how applications should run and be accessed |
| Pods | Smallest deployable units that can contain one or more containers (Kubernetes) |
| Tasks/Jobs | Units of work assigned to worker nodes |
| Overlay Networks | Allow containers to communicate across multiple hosts |
| Volumes | Persistent storage that can be attached to containers |
| Registries | Repositories for storing and distributing container images |

Core Principles

  • Declarative Configuration: Define the desired state, let the system figure out how to achieve it
  • Immutability: Running containers are not modified in place; a change ships as a new image and a new deployment
  • Self-Healing: Automatically recover from failures by restarting, rescheduling, or replacing containers
  • Service Discovery: Automatically detect and connect to services regardless of location
  • Load Balancing: Distribute traffic across containers for performance and availability
  • Scaling: Increase or decrease container instances based on demand
  • Rolling Updates: Replace instances incrementally so the application stays available during a release
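The declarative and self-healing principles both come down to a reconciliation loop: compare the desired state with what is observed, then act on the difference. The following is a minimal Python sketch of that idea, not the code of any real orchestrator. The service names and the `reconcile` function are hypothetical.

```python
def reconcile(desired: dict[str, int], actual: dict[str, int]) -> list[str]:
    """Return the actions that move observed replica counts toward the desired ones."""
    actions = []
    for service in sorted(set(desired) | set(actual)):
        want = desired.get(service, 0)  # services absent from the desired state should not run
        have = actual.get(service, 0)
        if have < want:
            actions.append(f"start {want - have} x {service}")
        elif have > want:
            actions.append(f"stop {have - want} x {service}")
    return actions


# Hypothetical desired state (what the config declares) and observed state.
desired_state = {"web": 3, "api": 2}
observed_state = {"web": 1, "api": 2, "old-worker": 1}

print(reconcile(desired_state, observed_state))
# ['stop 1 x old-worker', 'start 2 x web']
```

A real control plane runs this comparison continuously, so a crashed container simply shows up as a new difference to correct.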

Major Container Orchestration Platforms Compared

| Feature | Kubernetes | Docker Swarm | Amazon ECS | Nomad (HashiCorp) |
|---|---|---|---|---|
| Complexity | High | Low | Medium | Medium |
| Scalability | Excellent | Good | Excellent | Excellent |
| Auto-scaling | Yes | No (manual or external tooling) | Yes | Yes |
| Service Discovery | DNS/Environment Variables | DNS/VIP | ALB/Service Connect | Consul Integration |
| Load Balancing | Internal or External | Internal | ELB Integration | Requires Integration |
| Self-Healing | Yes | Yes | Yes | Yes |
| Rolling Updates | Yes | Yes | Yes | Yes |
| Secrets Management | Native | Native | AWS Secrets Manager | Vault Integration |
| Community Support | Excellent | Good | AWS Only | Growing |
| Learning Curve | Steep | Gentle | Moderate | Moderate |
| Multi-cloud Support | Yes | Yes | No (AWS Only) | Yes |

Kubernetes Quick Reference

Architecture Components

  • Control Plane Components:

    • API Server: REST API for cluster interaction
    • etcd: Consistent key-value store for all cluster data
    • Scheduler: Assigns pods to nodes
    • Controller Manager: Runs controller processes
    • Cloud Controller Manager: Integrates with cloud providers
  • Node Components:

    • kubelet: Ensures containers are running in a pod
    • kube-proxy: Maintains network rules for service communication
    • Container Runtime: Software for running containers (containerd, CRI-O; Docker Engine needs the cri-dockerd adapter since Kubernetes 1.24)
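To make the scheduler's role more concrete, here is a deliberately simplified Python sketch of a filter-then-score placement decision. It is only an illustration. The real kube-scheduler weighs many more factors (affinity, taints, topology), and the `Node` class, its fields, and the scoring rule are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    free_cpu_millicores: int
    free_memory_mib: int


def pick_node(cpu_request: int, memory_request: int, nodes: list[Node]) -> Optional[str]:
    """Return the name of the node chosen for a pod, or None if nothing fits."""
    # Filter step: keep only nodes with enough free CPU and memory.
    feasible = [
        n for n in nodes
        if n.free_cpu_millicores >= cpu_request and n.free_memory_mib >= memory_request
    ]
    if not feasible:
        return None  # the pod would stay Pending
    # Score step (assumed rule): prefer the node with the most headroom left.
    best = max(
        feasible,
        key=lambda n: (n.free_cpu_millicores - cpu_request, n.free_memory_mib - memory_request),
    )
    return best.name


nodes = [Node("a", 400, 1024), Node("b", 2000, 512), Node("c", 1500, 4096)]
print(pick_node(500, 1024, nodes))  # c
```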

Essential kubectl Commands

# Cluster Information
kubectl cluster-info                 # Display cluster info
kubectl get nodes                    # List all nodes
kubectl describe node <node-name>    # Show detailed node info

# Pod Management
kubectl get pods                     # List all pods in current namespace
kubectl get pods -A                  # List pods across all namespaces
kubectl describe pod <pod-name>      # Show detailed pod info
kubectl logs <pod-name>              # View pod logs
kubectl exec -it <pod-name> -- sh    # Open shell in pod

# Deployments
kubectl create deployment <name> --image=<image>  # Create deployment
kubectl get deployments                           # List all deployments
kubectl scale deployment <name> --replicas=<n>    # Scale deployment
kubectl rollout status deployment <name>          # Check rollout status
kubectl rollout undo deployment <name>            # Rollback deployment

# Services
kubectl expose deployment <name> --port=<port>    # Create service
kubectl get services                              # List all services
kubectl describe service <name>                   # Show service details

# Configuration
kubectl apply -f <file.yaml>                      # Apply config from file
kubectl delete -f <file.yaml>                     # Delete resources from file
kubectl create configmap <name> --from-file=<file>  # Create ConfigMap
kubectl create secret generic <name> --from-literal=key=value  # Create Secret

# Namespace Management
kubectl create namespace <name>                   # Create namespace
kubectl get namespaces                            # List namespaces
kubectl config set-context --current --namespace=<name>  # Switch namespace

# Troubleshooting
kubectl get events                                # View events in current namespace (add -A for all)
kubectl describe pod <pod-name>                   # Check pod events/errors
kubectl logs <pod-name> -p                        # View logs from the previous (restarted) container
kubectl port-forward <pod-name> 8080:80           # Forward local port to pod
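The commands above can also emit JSON with `-o json`, which makes them easy to script. The following Python sketch filters the output of `kubectl get pods -o json` for pods that are not in the Running or Succeeded phase. The sample data is hypothetical and trimmed to the fields the function reads.

```python
import json


def pods_not_running(pod_list_json: str) -> list[str]:
    """Return names of pods whose phase is neither Running nor Succeeded."""
    pod_list = json.loads(pod_list_json)
    return [
        item["metadata"]["name"]
        for item in pod_list.get("items", [])
        if item.get("status", {}).get("phase") not in ("Running", "Succeeded")
    ]


# Hypothetical sample shaped like `kubectl get pods -o json` output.
SAMPLE = json.dumps({
    "kind": "List",
    "items": [
        {"metadata": {"name": "web-1"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "web-2"}, "status": {"phase": "Pending"}},
        {"metadata": {"name": "job-1"}, "status": {"phase": "Succeeded"}},
    ],
})

print(pods_not_running(SAMPLE))  # ['web-2']
```

In practice the JSON would come from the CLI, for example via `subprocess.run(["kubectl", "get", "pods", "-o", "json"], capture_output=True, text=True)`. A pod in the Running phase can still contain a crash-looping container, so this check is a first filter rather than a health verdict.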

Common Kubernetes Resource YAML Structure

# Deployment Example
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.21
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: "1"
            memory: "512Mi"
          requests:
            cpu: "0.5"
            memory: "256Mi"
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 30
          periodSeconds: 10
---
# Service Example
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
  type: ClusterIP
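The Service above finds its pods purely by labels: every key/value pair in `spec.selector` must be present on the pod. Here is a small Python sketch of that equality-based matching rule. The function name is hypothetical, and the sketch covers only non-empty equality selectors.

```python
def selector_matches(selector: dict[str, str], pod_labels: dict[str, str]) -> bool:
    """True if every selector key/value pair is present in the pod's labels."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


service_selector = {"app": "nginx"}

# Pods created by the Deployment carry the template labels plus extra ones.
print(selector_matches(service_selector, {"app": "nginx", "pod-template-hash": "abc123"}))  # True
print(selector_matches(service_selector, {"app": "apache"}))                                # False
```

A mismatch between the Service selector and the pod template labels is a common reason for a Service having no endpoints.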

Docker Swarm Quick Reference

Architecture Components

  • Manager Nodes: Maintain cluster state, schedule services, serve API
  • Worker Nodes: Execute containers (tasks)
  • Services: Definitions of tasks to execute on nodes
  • Tasks: Docker containers running on nodes
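Manager nodes keep cluster state consistent using the Raft consensus algorithm, so the swarm stays manageable only while a majority of managers is reachable. The arithmetic is sketched below in Python. The function names are hypothetical.

```python
def raft_quorum(managers: int) -> int:
    """Smallest number of managers that forms a majority."""
    return managers // 2 + 1


def tolerated_failures(managers: int) -> int:
    """How many managers can be lost while a majority remains."""
    return (managers - 1) // 2


for count in (1, 3, 4, 5, 7):
    print(count, raft_quorum(count), tolerated_failures(count))
```

Note that four managers tolerate no more failures than three, which is why an odd number of managers is the usual recommendation.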

Essential Docker Swarm Commands

# Swarm Initialization
docker swarm init --advertise-addr <MANAGER-IP>  # Initialize a swarm
docker swarm join-token worker                   # Get worker join token
docker swarm join-token manager                  # Get manager join token

# Node Management
docker node ls                                   # List all nodes
docker node inspect <NODE-ID>                    # Inspect a node
docker node promote <NODE-ID>                    # Promote to manager
docker node demote <NODE-ID>                     # Demote to worker
docker node update --availability drain <NODE-ID>  # Drain a node

# Service Management
docker service create --name <name> --replicas <n> <image>  # Create service
docker service ls                                # List services
docker service ps <service-name>                 # List tasks in a service
docker service inspect <service-name>            # Inspect a service
docker service scale <service-name>=<n>          # Scale a service
docker service update --image <new-image> <service-name>  # Update service
docker service rm <service-name>                 # Remove service

# Stack Management (using docker-compose.yml)
docker stack deploy -c docker-compose.yml <stack-name>  # Deploy stack
docker stack ls                                  # List stacks
docker stack services <stack-name>               # List services in stack
docker stack ps <stack-name>                     # List stack tasks
docker stack rm <stack-name>                     # Remove stack

# Networks
docker network create --driver overlay <network-name>  # Create overlay network
docker network ls                                # List networks

Docker Stack YAML Example

version: '3.8'

services:
  webapp:
    image: nginx:latest
    ports:
      - "80:80"
    deploy:
      replicas: 3
      restart_policy:
        condition: on-failure
      update_config:
        delay: 5s
        order: start-first
    networks:
      - frontend

  api:
    image: myapp/api:latest
    deploy:
      replicas: 2
    networks:
      - frontend
      - backend

  database:
    image: postgres:14
    environment:
      POSTGRES_PASSWORD: secret  # Example only; in production use Docker secrets via POSTGRES_PASSWORD_FILE
    volumes:
      - db-data:/var/lib/postgresql/data
    networks:
      - backend
    deploy:
      placement:
        constraints:
          - node.role == manager

networks:
  frontend:
  backend:

volumes:
  db-data:
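The `order: start-first` setting in `update_config` controls whether a replacement task starts before or after the old one stops. The rough Python simulation below shows the effect on capacity during an update. It assumes one task is updated at a time (Swarm's default parallelism) and that every new task starts successfully. The function is hypothetical.

```python
def lowest_running_during_update(replicas: int, order: str) -> int:
    """Return the lowest number of running tasks seen while updating all replicas."""
    if order not in ("start-first", "stop-first"):
        raise ValueError(f"unknown order: {order}")
    tasks = ["old"] * replicas
    lowest = len(tasks)
    for _ in range(replicas):
        if order == "start-first":
            tasks.append("new")   # start the replacement first
            tasks.remove("old")   # then stop one old task
        else:
            tasks.remove("old")   # stop one old task first
            lowest = min(lowest, len(tasks))
            tasks.append("new")   # then start its replacement
        lowest = min(lowest, len(tasks))
    return lowest


print(lowest_running_during_update(3, "start-first"))  # 3
print(lowest_running_during_update(3, "stop-first"))   # 2
```

The trade-off is that `start-first` briefly runs one extra task, so the node needs spare capacity, while `stop-first` (the default) temporarily runs one task short.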

Common Patterns & Best Practices

Multi-Container Design Patterns

  1. Sidecar Pattern: Add functionality to a primary container

    • Example: Log collector, metrics collector, proxy
  2. Ambassador Pattern: Proxy connections to outside world

    • Example: Service mesh proxies (Istio, Linkerd)
  3. Adapter Pattern: Standardize output of main container

    • Example: Monitoring adapters that format metrics for central systems
  4. Init Containers: Run setup tasks before main container starts

    • Example: Schema setup, dependency checks, permission adjustments
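As an illustration of the adapter pattern, the Python sketch below converts an application's own JSON stats into simple `name value` lines in the style of the Prometheus text format. The input shape, the `app` prefix, and the function name are assumptions for this example. A real adapter would typically run as a separate container and serve this text over HTTP.

```python
import json
import re


def to_prometheus_text(app_stats_json: str, prefix: str = "app") -> str:
    """Convert a flat JSON object of numeric stats into 'name value' lines."""
    stats = json.loads(app_stats_json)
    lines = []
    for key in sorted(stats):
        value = stats[key]
        # Skip non-numeric values (bool is excluded explicitly because it is an int subclass).
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        # Metric names may not contain dots or dashes, so replace them with underscores.
        name = re.sub(r"[^a-zA-Z0-9_]", "_", f"{prefix}_{key}")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


print(to_prometheus_text('{"requests.total": 42, "version": "1.2", "queue-depth": 3}'))
```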

Resource Management Best Practices

  • Always set resource requests and limits for containers
  • Use namespace resource quotas to control resource consumption
  • Plan for capacity based on peak usage plus buffer
  • Implement pod disruption budgets for critical applications
  • Use horizontal pod autoscaling based on CPU/memory metrics
  • Consider vertical pod autoscaling for applications that can’t scale horizontally
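Horizontal pod autoscaling is driven by a simple ratio: the desired replica count is the current count scaled by how far the observed metric is from its target, rounded up. Here is a Python sketch of that core formula with replica bounds. The real controller additionally applies a tolerance band and stabilization windows, which are omitted here. The function name and default bounds are hypothetical.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    """ceil(current_replicas * current_metric / target_metric), clamped to the bounds."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


print(desired_replicas(3, current_metric=90, target_metric=60))  # 5
print(desired_replicas(4, current_metric=30, target_metric=60))  # 2
```

With 3 replicas at 90% average CPU against a 60% target, the ratio is 1.5, so the autoscaler would aim for 5 replicas.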

Security Best Practices

  • Follow principle of least privilege for containers and services
  • Use non-root users inside containers
  • Implement network policies to control traffic between pods
  • Use secrets management for sensitive information
  • Scan container images for vulnerabilities
  • Enforce the Pod Security Standards via Pod Security Admission (PodSecurityPolicy was removed in Kubernetes 1.25)
  • Use RBAC for access control to the orchestration platform
  • Regularly update base images and orchestration components
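Several of these practices can be checked mechanically before a manifest is applied. The Python sketch below inspects a pod spec, loaded as a dictionary, for three common `securityContext` gaps. It is a simplified illustration rather than a replacement for an admission controller, and the function name and finding messages are hypothetical.

```python
def audit_pod_spec(pod_spec: dict) -> list[str]:
    """Return human-readable findings for containers with weak security settings."""
    findings = []
    pod_context = pod_spec.get("securityContext", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        context = container.get("securityContext", {})
        # Container-level runAsNonRoot overrides the pod-level value when present.
        if not context.get("runAsNonRoot", pod_context.get("runAsNonRoot", False)):
            findings.append(f"{name}: runAsNonRoot is not set")
        if context.get("privileged", False):
            findings.append(f"{name}: runs privileged")
        # Privilege escalation is allowed unless explicitly disabled.
        if context.get("allowPrivilegeEscalation", True):
            findings.append(f"{name}: allowPrivilegeEscalation is not disabled")
    return findings


print(audit_pod_spec({"containers": [{"name": "web"}]}))
# ['web: runAsNonRoot is not set', 'web: allowPrivilegeEscalation is not disabled']
```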

High Availability Strategies

  • Run multiple replicas of critical services
  • Deploy across multiple availability zones
  • Use anti-affinity rules to distribute replicas
  • Implement proper liveness and readiness probes
  • Plan for graceful degradation during partial outages
  • Use persistent storage with proper backup strategies
  • Implement service mesh for resilient communication

Common Challenges & Solutions

Networking Issues

| Challenge | Solution |
|---|---|
| Service Discovery | Use platform’s built-in DNS or service mesh |
| Container Connectivity | Check network policies, DNS resolution, service definitions |
| External Access | Configure ingress controllers or load balancers properly |
| Cross-Cluster Communication | Implement service mesh or federation capabilities |
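For Kubernetes, built-in DNS discovery follows a predictable naming scheme: a Service is reachable at `<service>.<namespace>.svc.<cluster-domain>`. The Python helper below builds that name. The function is hypothetical, and `cluster.local` is the common default domain, which a given cluster may override.

```python
def service_dns_name(service: str, namespace: str = "default",
                     cluster_domain: str = "cluster.local") -> str:
    """Build the fully qualified in-cluster DNS name of a Kubernetes Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


print(service_dns_name("nginx-service"))
# nginx-service.default.svc.cluster.local
print(service_dns_name("api", namespace="payments"))
# api.payments.svc.cluster.local
```

Pods in the same namespace can use the short name (`nginx-service`). Calls across namespaces need at least `<service>.<namespace>`.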

Storage Challenges

| Challenge | Solution |
|---|---|
| Data Persistence | Use persistent volumes with appropriate storage classes |
| Performance | Match storage class to application needs (SSD vs HDD) |
| Backup & Recovery | Implement volume snapshots or external backup solutions |
| Multi-container Access | Use ReadWriteMany volumes where supported |

Scaling & Performance

| Challenge | Solution |
|---|---|
| Resource Contention | Set appropriate requests/limits, use priority classes |
| Slow Deployments | Optimize image size, use local registries, implement caching |
| Cold Starts | Pre-warm critical services, optimize container startup |
| Autoscaling Lag | Adjust scaling thresholds, implement predictive scaling |

Troubleshooting Steps

  1. Check pod status and events:

    kubectl get pods
    kubectl describe pod <pod-name>
    
  2. Examine logs:

    kubectl logs <pod-name>
    kubectl logs <pod-name> -c <container-name>  # Multi-container pods
    
  3. Verify network connectivity:

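    kubectl exec -it <pod-name> -- nslookup <service-name>     # Check DNS; ClusterIP addresses typically do not answer ping
    kubectl exec -it <pod-name> -- curl <service-name>:<port>  # Requires curl in the container image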
    
  4. Check resource utilization:

    kubectl top pods     # Requires the metrics-server add-on
    kubectl top nodes
    
  5. Debug with temporary pods:

    kubectl run debug --image=busybox --rm -it -- sh
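The status shown by `kubectl get pods` usually points to which of these steps to try first. The Python sketch below encodes a few common status reasons as a lookup table. The mapping is a rough starting point assumed for illustration, not an exhaustive or official triage guide.

```python
# Common pod status reasons mapped to a first diagnostic step (assumed mapping).
NEXT_STEPS = {
    "CrashLoopBackOff": "kubectl logs <pod-name> -p",
    "ImagePullBackOff": "kubectl describe pod <pod-name>  (check image name, tag, registry credentials)",
    "ErrImagePull": "kubectl describe pod <pod-name>  (check image name, tag, registry credentials)",
    "OOMKilled": "kubectl top pods  (compare usage with the container's memory limit)",
    "CreateContainerConfigError": "kubectl describe pod <pod-name>  (check referenced ConfigMaps and Secrets)",
}
DEFAULT_STEP = "kubectl describe pod <pod-name>"


def suggest_next_step(reason: str) -> str:
    """Return a first command to try for a given pod status reason."""
    return NEXT_STEPS.get(reason, DEFAULT_STEP)


print(suggest_next_step("CrashLoopBackOff"))  # kubectl logs <pod-name> -p
print(suggest_next_step("Pending"))           # kubectl describe pod <pod-name>
```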
    

Advanced Topics & Extensions

Service Mesh Integration

  • Istio: Provides traffic management, security, and observability
  • Linkerd: Lightweight service mesh focused on simplicity
  • Consul Connect: Service mesh with robust service discovery
  • Kuma: Universal service mesh for Kubernetes and VMs

CI/CD Integration

  • GitOps Workflow: Use Git as source of truth for deployments
  • Helm: Package manager for Kubernetes deployments
  • Argo CD: Declarative GitOps CD tool for Kubernetes
  • Flux: GitOps toolkit for continuous delivery
  • Jenkins X: CI/CD solution for cloud native applications

Observability Stack

  • Prometheus: Metrics collection and alerting
  • Grafana: Visualization dashboards
  • Jaeger/Zipkin: Distributed tracing
  • Fluentd/Loki: Log aggregation
  • ELK Stack: Elasticsearch, Logstash, Kibana for logging

Multi-Cluster Management

  • Kubernetes Federation (KubeFed): Control multiple Kubernetes clusters; the project has been archived, so check its status before adopting it
  • Rancher: Multi-cluster management platform
  • Anthos: Google’s multi-cloud Kubernetes platform
  • Karmada: Kubernetes resource distributor across clusters
  • Cluster API: Kubernetes cluster lifecycle management

Resources for Further Learning

Documentation

Books

  • “Kubernetes: Up and Running” by Kelsey Hightower et al.
  • “Docker in Practice” by Ian Miell and Aidan Hobson Sayers
  • “The Kubernetes Book” by Nigel Poulton
  • “Cloud Native DevOps with Kubernetes” by John Arundel and Justin Domingus

Courses & Certifications

  • Certified Kubernetes Administrator (CKA)
  • Certified Kubernetes Application Developer (CKAD)
  • Docker Certified Associate (DCA)
  • Cloud Provider Specific: AWS ECS/EKS, GCP GKE, Azure AKS certifications

Community Resources

  • Kubernetes Slack Community
  • CNCF (Cloud Native Computing Foundation) projects
  • DockerCon and KubeCon conferences
  • GitHub repositories and example projects

Practice Environments

  • Minikube: Local Kubernetes cluster
  • Kind: Kubernetes in Docker
  • Play with Kubernetes: Browser-based learning environment
  • Play with Docker: Docker learning playground

This cheat sheet serves as a starting point for container orchestration. As technology evolves rapidly, always refer to official documentation for the most up-to-date information.
