Complete DevOps Practices Cheat Sheet: Essential Guide for Modern Software Delivery

Introduction

DevOps is a cultural and technical movement that bridges the gap between software development (Dev) and IT operations (Ops). It emphasizes collaboration, automation, and continuous improvement to deliver software faster, more reliably, and with higher quality. DevOps practices are essential for modern organizations seeking competitive advantage through rapid, reliable software delivery and improved operational efficiency.

Core DevOps Principles

The Three Ways of DevOps

Way | Focus | Description | Key Practices
First Way | Flow | Optimize work flow from Dev to Ops | CI/CD, Automation, Small batches
Second Way | Feedback | Amplify feedback loops | Monitoring, Testing, Fast recovery
Third Way | Continuous Learning | Foster experimentation and learning | Blameless postmortems, Risk-taking

CALMS Framework

  • Culture: Collaboration, shared responsibility, trust
  • Automation: Eliminate manual, repetitive tasks
  • Lean: Focus on value, eliminate waste
  • Measurement: Data-driven decisions and improvements
  • Sharing: Knowledge sharing and transparency

DevOps Lifecycle and Practices

Plan Phase

Purpose: Define requirements, plan work, track progress

Key Practices:

  • Agile methodologies (Scrum, Kanban)
  • User story mapping
  • Sprint planning and backlog management
  • Requirements traceability
  • Risk assessment and mitigation planning

Tools: Jira, Azure DevOps, Trello, Asana, Monday.com


Code Phase

Purpose: Write, review, and version control code

Key Practices:

  • Version Control: Git workflows (GitFlow, GitHub Flow)
  • Code Reviews: Pull/merge request processes
  • Pair Programming: Collaborative coding
  • Code Standards: Linting, formatting, style guides
  • Documentation: README files, API docs, inline comments

Git Workflow Best Practices:

Main Branch Strategy:
├── main (production-ready code)
├── develop (integration branch)
├── feature/* (new features)
├── release/* (release preparation)
└── hotfix/* (urgent production fixes)

Tools: Git, GitHub, GitLab, Bitbucket, Azure Repos


Build Phase

Purpose: Compile, package, and prepare applications

Key Practices:

  • Automated Builds: Triggered by code commits
  • Build Optimization: Parallel builds, caching
  • Artifact Management: Storing build outputs
  • Dependency Management: Package managers, lock files
  • Build Reproducibility: Consistent build environments

Build Pipeline Components:

  1. Source code checkout
  2. Dependency installation
  3. Code compilation
  4. Unit test execution
  5. Code quality analysis
  6. Artifact creation
  7. Artifact storage
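
The ordered steps above can be sketched as a small runner that executes stages in sequence and stops at the first failure. This is only an illustration: `run_pipeline` and the stage callables are hypothetical names, not the API of any CI tool listed here.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first one that fails.

    Returns (succeeded, names of the stages that were executed).
    """
    executed: List[str] = []
    for name, action in stages:
        executed.append(name)
        if not action():
            return False, executed  # fail fast: later stages are skipped
    return True, executed


# Placeholder stage bodies; a real pipeline would invoke build tools here.
example_stages: List[Stage] = [
    ("checkout", lambda: True),
    ("install dependencies", lambda: True),
    ("compile", lambda: True),
    ("unit tests", lambda: False),  # simulated failure
    ("quality analysis", lambda: True),
    ("create artifact", lambda: True),
    ("store artifact", lambda: True),
]
```

Calling `run_pipeline(example_stages)` reports a failure and lists only the first four stages, since nothing after the failing unit tests is run.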

Tools: Jenkins, GitLab CI, GitHub Actions, Azure Pipelines, TeamCity


Test Phase

Purpose: Validate code quality and functionality

Testing Pyramid:

         /\
        /  \  Manual/Exploratory Tests
       /____\
      /      \  Integration Tests
     /________\
    /          \  Unit Tests
   /____________\

Testing Types and Strategies:

Test Type | Scope | Automation Level | Tools
Unit Tests | Individual functions/methods | High | JUnit, pytest, Jest
Integration Tests | Component interactions | High | TestNG, Postman, REST Assured
System Tests | End-to-end workflows | Medium | Selenium, Cypress, Playwright
Performance Tests | Load, stress, scalability | Medium | JMeter, LoadRunner, K6
Security Tests | Vulnerabilities, compliance | High | OWASP ZAP, SonarQube, Snyk

Test Automation Best Practices:

  • Maintain test pyramid ratios (70% unit, 20% integration, 10% E2E)
  • Implement shift-left testing
  • Use test data management strategies
  • Maintain test environment consistency
  • Implement parallel test execution
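
One way to keep an eye on the 70/20/10 guideline is to compare a suite's actual mix against it. The sketch below assumes test counts are already available per type; the 5-point tolerance and the function names are illustrative choices, not part of the guideline.

```python
from typing import Dict

# Target shares taken from the 70/20/10 guideline above.
TARGET = {"unit": 0.70, "integration": 0.20, "e2e": 0.10}


def pyramid_shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each test type's share of the whole suite (0.0 to 1.0)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tests counted")
    return {kind: n / total for kind, n in counts.items()}


def pyramid_drift(counts: Dict[str, int], tolerance: float = 0.05) -> Dict[str, float]:
    """Return the types whose share is off target by more than tolerance.

    Values are (actual share - target share), rounded to three decimals.
    """
    shares = pyramid_shares(counts)
    return {
        kind: round(shares.get(kind, 0.0) - target, 3)
        for kind, target in TARGET.items()
        if abs(shares.get(kind, 0.0) - target) > tolerance
    }
```

A suite with 500 unit, 200 integration, and 300 E2E tests would be flagged as light on unit tests and heavy on E2E tests.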

Release Phase

Purpose: Deploy applications to various environments

Deployment Strategies:

Strategy | Description | Pros | Cons | Use Case
Blue-Green | Two identical environments, switch traffic | Zero downtime, easy rollback | High resource cost | Critical applications
Rolling | Gradual replacement of instances | Resource efficient | Partial downtime risk | Most applications
Canary | Small traffic percentage to new version | Risk mitigation | Complex setup | High-risk changes
Feature Flags | Control feature visibility | Fine-grained control | Code complexity | A/B testing
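
To make the canary row concrete, here is a minimal sketch of percentage-based routing. It assumes each request carries a stable user ID; hashing that ID keeps a given user on the same version as the percentage grows. In practice this decision usually lives in a load balancer or service mesh, so the `route` function below is purely illustrative.

```python
import hashlib


def bucket_for(user_id: str, buckets: int = 100) -> int:
    """Map a user ID to a stable bucket in [0, buckets) using SHA-256."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def route(user_id: str, canary_percent: int) -> str:
    """Send roughly canary_percent of users to the new version."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return "canary" if bucket_for(user_id) < canary_percent else "stable"
```

Because a user's bucket never changes, raising `canary_percent` from 5 to 25 only adds users to the canary group; nobody already on the new version is moved back.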

Release Management:

  • Environment Promotion: Dev → Test → Staging → Production
  • Release Planning: Coordination, communication, rollback plans
  • Change Management: Approval processes, documentation
  • Deployment Automation: Infrastructure as Code (IaC)

Tools: Kubernetes, Docker, Ansible, Terraform, Helm, Spinnaker


Deploy Phase

Purpose: Install and configure applications in target environments

Deployment Best Practices:

  • Immutable Infrastructure: Replace rather than modify
  • Configuration Management: Externalized, environment-specific
  • Health Checks: Readiness and liveness probes
  • Gradual Rollouts: Minimize blast radius
  • Automated Rollbacks: Quick recovery mechanisms
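
Externalized configuration, as listed above, often means the same build reads its settings from the environment at startup. A minimal sketch follows; the variable names `DATABASE_URL`, `LOG_LEVEL`, and `PORT` and the defaults are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AppConfig:
    database_url: str
    log_level: str
    port: int


def load_config(env: Mapping[str, str] = os.environ) -> AppConfig:
    """Build configuration from environment variables.

    The variable names are illustrative, not a standard.
    """
    try:
        database_url = env["DATABASE_URL"]  # required: no safe default exists
    except KeyError:
        raise RuntimeError("DATABASE_URL must be set") from None
    return AppConfig(
        database_url=database_url,
        log_level=env.get("LOG_LEVEL", "INFO"),
        port=int(env.get("PORT", "3000")),
    )
```

Passing the environment in as a parameter keeps the function easy to test, and failing loudly on a missing required value surfaces misconfiguration at startup rather than at first use.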

Container Deployment Pattern:

Application Code + Dependencies → Container Image → Registry → Orchestrator → Running Container

Operate Phase

Purpose: Run and maintain applications in production

Key Practices:

  • Infrastructure Monitoring: CPU, memory, disk, network
  • Application Monitoring: Performance metrics, error rates
  • Log Management: Centralized logging, log analysis
  • Alerting: Proactive issue detection
  • Incident Response: On-call procedures, escalation

Site Reliability Engineering (SRE) Principles:

  • Service Level Objectives (SLOs)
  • Error budgets
  • Toil reduction
  • Reliability engineering
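
An error budget is the amount of unreliability an SLO permits. As a rough worked example, assuming an availability SLO measured over a rolling 30-day window (both the window and the function names are assumptions):

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over the window."""
    if not 0.0 < slo <= 1.0:
        raise ValueError("slo must be in (0, 1]")
    window_minutes = window_days * 24 * 60
    return (1.0 - slo) * window_minutes


def budget_remaining(slo: float, downtime_minutes: float, window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo, window_days)
    if budget == 0:
        raise ValueError("a 100% SLO leaves no error budget")
    return 1.0 - downtime_minutes / budget
```

A 99.9% SLO over 30 days allows about 43.2 minutes of downtime, so 21.6 minutes of outage leaves roughly half the budget for risky changes.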

Monitor Phase

Purpose: Observe system behavior and gather insights

Observability Pillars:

Pillar | Purpose | Examples | Tools
Metrics | Quantitative measurements | Response time, throughput, error rate | Prometheus, Grafana, Datadog
Logs | Discrete event records | Application logs, system logs | ELK Stack, Splunk, Fluentd
Traces | Request flow tracking | Distributed tracing | Jaeger, Zipkin, New Relic

Key Metrics to Monitor:

  • Golden Signals: Latency, Traffic, Errors, Saturation
  • Business Metrics: User engagement, conversion rates
  • Technical Metrics: Infrastructure utilization, deployment frequency

Continuous Integration/Continuous Deployment (CI/CD)

CI/CD Pipeline Stages

Code Commit → Build → Test → Security Scan → Package → Deploy → Monitor
     ↑                                                                ↓
     └─────────────────── Feedback Loop ──────────────────────────────┘

CI Best Practices

  • Frequent Commits: Small, focused changes
  • Fast Builds: Optimize build times (<10 minutes)
  • Fail Fast: Stop pipeline on first failure
  • Parallel Execution: Run tests concurrently
  • Build Once, Deploy Many: Promote same artifact
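
"Build Once, Deploy Many" depends on being able to show that the artifact promoted to each environment is the one that was built. One possible check, assuming the build step records a SHA-256 digest alongside the artifact (the function names are hypothetical):

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_digest: str) -> None:
    """Raise ValueError if the artifact differs from the recorded digest."""
    actual = sha256_of(path)
    if actual != expected_digest:
        raise ValueError(f"artifact digest mismatch: {actual} != {expected_digest}")
```

A deployment job could call `verify_artifact` before each promotion and refuse to continue on a mismatch.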

CD Best Practices

  • Automated Deployments: Minimize manual intervention
  • Environment Parity: Keep environments similar
  • Progressive Delivery: Gradual feature rollouts
  • Monitoring Integration: Deploy with observability
  • Rollback Capability: Quick recovery options

Infrastructure as Code (IaC)

IaC Principles

  • Declarative: Describe desired state, not steps
  • Idempotent: Same result regardless of execution count
  • Version Controlled: Track infrastructure changes
  • Testable: Validate infrastructure configurations
  • Modular: Reusable, composable components
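
The declarative and idempotent principles can be illustrated with a toy reconciler: it compares a desired state with the actual state, plans the difference, and applies it. This is a simplified sketch in the spirit of a plan/apply workflow, not how any specific tool is implemented; resources are plain dictionaries keyed by name.

```python
from typing import Dict, List, Tuple

State = Dict[str, dict]


def plan(desired: State, actual: State) -> List[Tuple[str, str]]:
    """Compute the changes needed to move actual toward desired."""
    changes: List[Tuple[str, str]] = []
    for name, spec in desired.items():
        if name not in actual:
            changes.append(("create", name))
        elif actual[name] != spec:
            changes.append(("update", name))
    for name in actual:
        if name not in desired:
            changes.append(("delete", name))
    return changes


def apply(desired: State, actual: State) -> State:
    """Return the new actual state after applying the plan."""
    new_state = dict(actual)
    for action, name in plan(desired, actual):
        if action == "delete":
            del new_state[name]
        else:
            new_state[name] = dict(desired[name])
    return new_state
```

Applying the same desired state a second time produces an empty plan and leaves the state unchanged, which is what idempotence means here.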

IaC Tools Comparison

Tool | Type | Strengths | Best For
Terraform | Declarative | Multi-cloud, large ecosystem | Complex infrastructure
Ansible | Imperative/Declarative | Agentless, easy learning curve | Configuration management
CloudFormation | Declarative | AWS native, deep integration | AWS-only environments
Pulumi | Declarative (written in general-purpose languages) | Real programming languages | Developer-friendly IaC

IaC Best Practices

  • Use modules/roles for reusability
  • Implement state management (remote backends)
  • Validate configurations before applying
  • Use secrets management for sensitive data
  • Document infrastructure decisions

Containerization and Orchestration

Docker Best Practices

Dockerfile Optimization:

# Use a specific, minimal base image on a supported Node.js LTS release
FROM node:22-alpine

# Set working directory
WORKDIR /app

# Copy package files first (layer caching)
COPY package*.json ./
# --omit=dev replaces the deprecated --only=production flag (npm 8+)
RUN npm ci --omit=dev

# Copy application code
COPY . .

# Use non-root user
USER node

# Expose port
EXPOSE 3000

# Use exec form for CMD
CMD ["node", "server.js"]

Container Security:

  • Scan images for vulnerabilities
  • Use minimal base images
  • Run as non-root user
  • Implement resource limits
  • Keep containers stateless

Kubernetes Best Practices

Resource Management:

resources:
  requests:
    memory: "64Mi"
    cpu: "250m"
  limits:
    memory: "128Mi"
    cpu: "500m"

Health Checks:

livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 30
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /ready
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 5
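
On the application side, the probes above need two endpoints to call. The sketch below uses only the Python standard library and assumes the `/health` and `/ready` paths and port 3000 from the probe configuration; the `ready` flag and `make_server` helper are illustrative names.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Set once startup work (for example cache warm-up) has finished.
ready = threading.Event()


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path == "/health":
            self._reply(200, b"ok")  # liveness: the process can answer
        elif self.path == "/ready":
            if ready.is_set():
                self._reply(200, b"ready")
            else:
                self._reply(503, b"starting")  # not yet able to take traffic
        else:
            self._reply(404, b"not found")

    def _reply(self, status: int, body: bytes) -> None:
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args) -> None:
        pass  # keep the example quiet


def make_server(host: str = "0.0.0.0", port: int = 3000) -> ThreadingHTTPServer:
    """Create (but do not start) a server exposing the two probe endpoints."""
    return ThreadingHTTPServer((host, port), HealthHandler)
```

To run it, create the server with `make_server()`, call `ready.set()` once initialization is done, and then call `serve_forever()`. Keeping liveness and readiness separate lets the orchestrator hold traffic back during startup without restarting the container.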

Monitoring and Observability

Monitoring Strategy

Four Golden Signals:

  1. Latency: Time to process requests
  2. Traffic: Amount of demand on system
  3. Errors: Rate of failed requests
  4. Saturation: Resource utilization
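
As an illustration, the four signals can be summarized from a window of request records. This sketch assumes each record carries a duration and an HTTP status, treats 5xx responses as errors, uses nearest-rank p95 for latency, and takes saturation as a CPU utilization value supplied by the caller; all of these are choices made for the example.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    duration_ms: float
    status: int


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile; pct is in (0, 100]."""
    if not values:
        raise ValueError("no values")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def golden_signals(requests: List[Request], window_seconds: float, cpu_utilization: float) -> dict:
    """Summarize latency, traffic, errors, and saturation for one window."""
    errors = sum(1 for r in requests if r.status >= 500)
    return {
        "latency_p95_ms": percentile([r.duration_ms for r in requests], 95),
        "traffic_rps": len(requests) / window_seconds,
        "error_rate": errors / len(requests),
        "saturation": cpu_utilization,
    }
```

A percentile is used for latency rather than a mean because a small number of very slow requests can hide behind a healthy-looking average.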

SLI/SLO Framework:

  • SLI (Service Level Indicator): Quantitative measure
  • SLO (Service Level Objective): Target value/range for SLI
  • SLA (Service Level Agreement): Business agreement
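
The relationship between an SLI and its SLO can be shown with a few lines of arithmetic. The sketch assumes an event-based availability SLI (good events divided by total events); the burn-rate helper is a common companion calculation and the function names are illustrative.

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """SLI: fraction of events that met the success criterion."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return good_events / total_events


def meets_slo(sli: float, slo_target: float) -> bool:
    """True when the measured SLI is at or above the SLO target."""
    return sli >= slo_target


def burn_rate(sli: float, slo_target: float) -> float:
    """How fast the error budget is being spent; 1.0 means exactly on budget."""
    allowed_error = 1.0 - slo_target
    if allowed_error <= 0:
        raise ValueError("slo_target must be below 1.0")
    return (1.0 - sli) / allowed_error
```

With a 99.9% target, a measured SLI of 99.8% misses the objective and corresponds to a burn rate of about 2, meaning the budget is being consumed twice as fast as the target allows.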

Alerting Best Practices

  • Alert on symptoms, not causes
  • Use multiple severity levels
  • Implement alert fatigue prevention
  • Include runbook links in alerts
  • Test alerting mechanisms regularly

Security in DevOps (DevSecOps)

Security Integration Points

Phase | Security Practices | Tools
Plan | Threat modeling, security requirements | OWASP Threat Dragon
Code | Static analysis, secure coding practices | SonarQube, Checkmarx
Build | Dependency scanning, SAST | Snyk, WhiteSource
Test | DAST, penetration testing | OWASP ZAP, Burp Suite
Deploy | Infrastructure scanning, compliance | Terraform security, Falco
Monitor | Runtime security, anomaly detection | Falco, Sysdig

Security Best Practices

  • Shift Left Security: Integrate early in pipeline
  • Principle of Least Privilege: Minimal required permissions
  • Defense in Depth: Multiple security layers
  • Secrets Management: Vault, encrypted storage
  • Compliance as Code: Automated compliance checks

Common Challenges and Solutions

Technical Challenges

Challenge: Slow Build Times
Solutions:

  • Implement build caching
  • Use parallel execution
  • Optimize dependencies
  • Use incremental builds

Challenge: Environment Inconsistencies
Solutions:

  • Use containerization
  • Implement IaC
  • Standardize base images
  • Use configuration management

Challenge: Deployment Failures
Solutions:

  • Implement automated testing
  • Use deployment strategies (blue-green, canary)
  • Create rollback procedures
  • Monitor deployment health

Cultural Challenges

Challenge: Dev/Ops Silos
Solutions:

  • Cross-functional teams
  • Shared responsibilities
  • Regular communication
  • Joint metrics and goals

Challenge: Resistance to Change
Solutions:

  • Start with pilot projects
  • Demonstrate quick wins
  • Provide training and support
  • Leadership buy-in

Best Practices and Tips

Team Practices

  • Cross-functional Collaboration: Break down silos
  • Shared Ownership: Everyone responsible for production
  • Blameless Postmortems: Focus on system improvements
  • Continuous Learning: Regular retrospectives and training
  • Documentation: Keep runbooks and procedures updated

Technical Practices

  • Everything as Code: Infrastructure, configuration, policies
  • Immutable Infrastructure: Replace, don’t modify
  • Microservices Architecture: Loosely coupled, independently deployable
  • API-First Design: Enable integration and automation
  • Test Automation: Comprehensive, reliable test suites

Process Practices

  • Small Batch Sizes: Frequent, small releases
  • Fast Feedback: Quick detection and resolution
  • Continuous Improvement: Regular process optimization
  • Risk Management: Gradual rollouts, feature flags
  • Metrics-Driven Decisions: Use data to guide improvements

DevOps Metrics and KPIs

DORA Metrics (DevOps Research and Assessment)

Metric | Description | Elite Performers | High Performers
Deployment Frequency | How often code is deployed | On-demand (multiple per day) | Between once per week and once per month
Lead Time for Changes | Time from commit to production | Less than one hour | Between one day and one week
Mean Time to Recovery | Time to recover from failures | Less than one hour | Less than one day
Change Failure Rate | Percentage of deployments causing failures | 0-15% | 16-30%

Note: Thresholds follow the 2021 State of DevOps report; the bands shift between report years.
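
The four metrics can be derived from deployment records. The sketch below assumes each record holds a commit time, a deploy time, a failure flag, and an optional restore time; the `Deployment` type and `dora_summary` function are hypothetical, and the median is used for lead time as one reasonable choice.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import List, Optional


@dataclass
class Deployment:
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None  # set once a failed deploy is recovered


def dora_summary(deployments: List[Deployment], period_days: int) -> dict:
    """Compute the four metrics for the deployments in one period."""
    if not deployments:
        raise ValueError("no deployments in period")
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    recoveries = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_recovery": (
            sum(recoveries, timedelta()) / len(recoveries) if recoveries else None
        ),
    }
```

In practice the records would come from the CI/CD system and the incident tracker; the calculation itself stays this simple.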

Additional Metrics

  • Mean Time Between Failures (MTBF)
  • System Availability/Uptime
  • Code Coverage Percentage
  • Technical Debt Ratio
  • Customer Satisfaction Scores

Tool Ecosystem

CI/CD Platforms

  • Jenkins: Open-source, highly customizable
  • GitLab CI/CD: Integrated with GitLab
  • GitHub Actions: Native GitHub integration
  • Azure DevOps: Microsoft ecosystem integration
  • CircleCI: Cloud-native, fast builds

Monitoring and Observability

  • Prometheus + Grafana: Open-source monitoring stack
  • Datadog: Comprehensive APM platform
  • New Relic: Application performance monitoring
  • Splunk: Log analysis and SIEM
  • ELK Stack: Elasticsearch, Logstash, Kibana

Container and Orchestration

  • Docker: Containerization platform
  • Kubernetes: Container orchestration
  • OpenShift: Enterprise Kubernetes platform
  • Docker Swarm: Docker-native orchestration
  • Amazon ECS/EKS: AWS container services

Infrastructure as Code

  • Terraform: Multi-cloud IaC
  • Ansible: Configuration management
  • Chef: Infrastructure automation
  • Puppet: Configuration management
  • AWS CloudFormation: AWS-native IaC

Getting Started Roadmap

Phase 1: Foundation (Months 1-3)

  1. Version Control: Implement Git workflows
  2. Basic CI: Automated builds and tests
  3. Containerization: Dockerize applications
  4. Monitoring: Basic application monitoring

Phase 2: Automation (Months 4-6)

  1. CD Pipeline: Automated deployments
  2. Infrastructure as Code: Terraform/Ansible
  3. Security Integration: SAST/DAST tools
  4. Enhanced Monitoring: Logging and alerting

Phase 3: Optimization (Months 7-12)

  1. Advanced Deployment: Blue-green, canary
  2. Microservices: Service decomposition
  3. Observability: Distributed tracing
  4. Culture: Cross-functional teams

Phase 4: Excellence (Ongoing)

  1. Site Reliability Engineering: SLOs, error budgets
  2. Chaos Engineering: Resilience testing
  3. AI/ML Integration: Intelligent operations
  4. Continuous Improvement: Regular optimization

Resources for Further Learning

Essential Books

  • “The Phoenix Project” by Gene Kim, Kevin Behr, George Spafford
  • “The DevOps Handbook” by Gene Kim, Jez Humble, Patrick Debois, John Willis
  • “Accelerate” by Nicole Forsgren, Jez Humble, Gene Kim
  • “Site Reliability Engineering” by Google
  • “Continuous Delivery” by Jez Humble and David Farley

Online Platforms

  • Coursera: DevOps specializations
  • Udemy: Hands-on DevOps courses
  • A Cloud Guru: Cloud and DevOps training
  • Pluralsight: Technology skills platform
  • Linux Academy: Cloud and DevOps learning (since merged into A Cloud Guru)

Certifications

  • AWS Certified DevOps Engineer
  • Microsoft Azure DevOps Engineer Expert
  • Google Professional Cloud DevOps Engineer
  • Docker Certified Associate
  • Certified Kubernetes Administrator (CKA)

Communities and Conferences

  • DevOps Enterprise Summit
  • DockerCon
  • KubeCon + CloudNativeCon
  • DevOps.com Community
  • Reddit r/devops

Tools and Documentation

  • Kubernetes Documentation
  • Docker Documentation
  • Terraform Documentation
  • Jenkins User Handbook
  • CNCF Landscape

Quick Reference Commands

Git Commands

# Feature branch workflow
git checkout -b feature/new-feature
git add .
git commit -m "Add new feature"
git push origin feature/new-feature

# Create pull request, then merge
git checkout main
git pull origin main
git branch -d feature/new-feature

Docker Commands

# Build and run container
docker build -t myapp:latest .
docker run -p 3000:3000 myapp:latest

# Container management
docker ps                    # List running containers
docker logs <container-id>   # View logs
docker exec -it <container-id> sh   # Access container shell (Alpine images ship sh, not bash)

Kubernetes Commands

# Deployment management
kubectl apply -f deployment.yaml
kubectl get pods
kubectl describe pod <pod-name>
kubectl logs <pod-name>

# Service management
kubectl expose deployment myapp --port=80 --target-port=3000
kubectl get services

Terraform Commands

# Infrastructure management
terraform init
terraform plan
terraform apply
terraform destroy

# State management
terraform state list
terraform state show <resource>

Memory Aids

DevOps Acronyms

  • CALMS: Culture, Automation, Lean, Measurement, Sharing
  • DORA: DevOps Research and Assessment
  • SRE: Site Reliability Engineering
  • IaC: Infrastructure as Code
  • CI/CD: Continuous Integration/Continuous Deployment

Remember the Three Ways

  1. Flow: Optimize the entire value stream
  2. Feedback: Amplify feedback loops
  3. Continuous Learning: Foster experimentation

This comprehensive cheatsheet covers the essential DevOps practices, tools, and methodologies needed for successful software delivery in modern organizations.