Development Environment & Tools - 2/2

From Basic Containerization to Production-Grade Orchestration

You’ve mastered containerization fundamentals that eliminate “works on my machine” problems: Docker best practices with multi-stage builds and security hardening, professional container networking that handles service discovery and load balancing, data persistence strategies with comprehensive backup and recovery, and Docker Compose orchestration for complex multi-service applications. Your containerized applications now run consistently across all environments with proper resource management and monitoring. But here’s the production reality that separates hobby containers from enterprise-grade systems: basic containerization is just the foundation. Production deployment requires orchestration at scale, bulletproof security, optimized performance, and enterprise registry management that can handle hundreds of services across multiple data centers.

The production container nightmare that destroys scalable systems:

# Your production container horror story
# Operations Team: "We need to scale the API to handle Black Friday traffic"

# Attempt 1: Manual container scaling
$ docker-compose up -d --scale app=20
Creating myapp_app_2 ... error
Error: Cannot create container for service app: port 3000 already in use
# Only one replica can bind a fixed host port mapping

# Attempt 2: Manual port management
$ for i in {3001..3020}; do
    docker run -d -p $i:3000 myapp:latest
  done
# Load balancer configuration nightmare begins
# No health checking, no service discovery, no graceful shutdowns

# Attempt 3: Deploy to production without proper testing
$ docker push myapp:latest
$ ssh production-server
production$ docker pull myapp:latest
# 2GB image takes 10 minutes to download
# No rollback strategy when deployment fails
# No security scanning, no vulnerability management

# The cascading production disasters:
# - Manual scaling hits resource limits and port conflicts
# - No health checking causes traffic to failed containers
# - Image vulnerabilities expose production systems
# - No centralized logging makes debugging impossible
# - Manual deployments cause inconsistent application state
# - No registry management leads to storage explosion
# - Container security bypass allows privilege escalation
# - No resource limits cause memory exhaustion crashes
# - Missing monitoring makes performance problems invisible

# Black Friday result: Complete system failure
# 6-hour outage, $2M revenue loss, emergency all-hands meeting
# CTO resignation, competitor gains market share
# The painful lesson: Containers without orchestration don't scale

The uncomfortable production truth: Perfect containerization with Docker Compose can’t save you from production disasters when your deployment strategy lacks orchestration, security scanning, performance optimization, and enterprise-grade registry management. Professional container operations require thinking beyond single-host deployments.
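
A hedged aside on attempt 1: the port collision comes from publishing a fixed host port for every replica. A compose fragment like the following (illustrative names) lets Docker assign an ephemeral host port to each replica instead, so a reverse proxy in front can route to all of them:

```yaml
# docker-compose.yml fragment (illustrative)
services:
  app:
    image: myapp:latest
    ports:
      - "3000"   # container port only; host port auto-assigned per replica
```

This only postpones the real problem, of course: you still need something doing health checks and discovery, which is exactly the orchestration gap this article addresses.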

Real-world container production failure consequences:

// What happens when container production practices are amateur:
const containerProductionFailureImpact = {
  scalingDisasters: {
    problem: "Manual container scaling fails during traffic spikes",
    cause: "No orchestration platform for automatic scaling and load balancing",
    impact:
      "Website crashes during peak traffic, customers can't complete purchases",
    cost: "$500K lost revenue per hour of downtime",
  },

  securityBreaches: {
    problem: "Container escape vulnerability exploited in production",
    cause: "No security scanning, privileged containers, outdated base images",
    impact: "Attacker gains host system access, steals customer data",
    consequences: "GDPR fines, lawsuit settlements, brand reputation destroyed",
  },

  performanceCrises: {
    problem: "Application performance degrades 10x after containerization",
    cause: "Unoptimized images, no resource limits, inefficient networking",
    impact: "Customer complaints, SLA violations, support team overwhelmed",
    reality: "Competitors with optimized containers capture market share",
  },

  operationalChaos: {
    problem: "Deploy breaks production, no rollback strategy exists",
    cause: "No registry versioning, manual deployment process",
    impact: "4-hour emergency recovery, entire engineering team mobilized",
    prevention:
      "Professional registry management would cost $100/month to implement",
  },

  // Perfect containerization means nothing when production operations
  // lack orchestration, security, optimization, and management discipline
};

Advanced containerization mastery requires understanding:

  • Container orchestration basics that provide automatic scaling, health checking, and service discovery beyond single-host deployments
  • Docker production deployment with zero-downtime strategies, rollback capabilities, and enterprise-grade reliability
  • Container security hardening that prevents privilege escalation, scans for vulnerabilities, and implements defense-in-depth
  • Image optimization techniques that reduce deployment time, minimize attack surface, and optimize runtime performance
  • Registry management that handles versioning, access control, and storage optimization at enterprise scale

This article transforms your containers from development toys into production-grade infrastructure that scales, performs, and operates reliably under real-world conditions.


Container Orchestration Basics: Beyond Single-Host Deployments

The Evolution from Docker Compose to Real Orchestration

Understanding why Docker Compose isn’t enough for production:

// Docker Compose vs Real Orchestration: The scalability cliff
const containerOrchestrationNeed = {
  dockerComposeGoodFor: {
    environment: "Single host development/testing",
    services: "< 10 containers",
    scalability: "Manual scaling with port conflicts",
    reliability: "Single point of failure (the host)",
    networking: "Bridge networks on one machine",
    deployment: "Manual docker-compose up/down",
    monitoring: "Basic health checks only",
    security: "Host-level isolation only",
  },

  productionRequirements: {
    environment: "Multi-host clusters across regions",
    services: "100s-1000s of containers",
    scalability: "Auto-scaling based on metrics",
    reliability: "High availability across failures",
    networking: "Service mesh with load balancing",
    deployment: "Rolling updates, blue-green, canary",
    monitoring: "Centralized logging and metrics",
    security: "Network policies, secrets management",
  },

  theOrchestrationGap: [
    "No automatic failover when hosts crash",
    "No cluster-wide resource scheduling",
    "No service discovery across hosts",
    "No centralized configuration management",
    "No rolling deployment strategies",
    "No cluster networking and security policies",
    "No centralized logging and monitoring",
    "No automatic scaling based on load",
  ],
};

Introduction to Kubernetes concepts for backend engineers:

# Kubernetes: The production container orchestration standard
# Think of Kubernetes as your datacenter operating system

# ========================================
# Core Kubernetes Concepts
# ========================================

# 1. Cluster: Set of machines running containerized applications
#    - Control plane nodes: API server, scheduler, etcd
#    - Worker nodes: Run application containers

# 2. Pod: Smallest deployable unit (usually one container)
#    - Pods are ephemeral - they come and go
#    - Shared network and storage within pod

# 3. Deployment: Manages replica sets and rolling updates
#    - Ensures desired number of pods are running
#    - Handles updates and rollbacks

# 4. Service: Stable network endpoint for pods
#    - Load balances traffic across healthy pods
#    - Provides service discovery

# 5. ConfigMap/Secret: Configuration and sensitive data
#    - Decoupled from container images
#    - Centrally managed and versioned

# Sample Kubernetes deployment for backend API
# backend-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-backend
  labels:
    app: myapp-backend
spec:
  replicas: 3  # Start with 3 instances
  selector:
    matchLabels:
      app: myapp-backend
  template:
    metadata:
      labels:
        app: myapp-backend
    spec:
      containers:
      - name: backend
        image: myapp:v1.2.3
        ports:
        - containerPort: 3000
        env:
        - name: NODE_ENV
          value: "production"
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: database-secret
              key: url
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        readinessProbe:
          httpGet:
            path: /health
            port: 3000
          initialDelaySeconds: 10
          periodSeconds: 5
        livenessProbe:
          httpGet:
            path: /health
            port: 3000
          initialDelaySeconds: 30
          periodSeconds: 10

---
# backend-service.yaml - Load balancer for backend pods
apiVersion: v1
kind: Service
metadata:
  name: myapp-backend-service
spec:
  selector:
    app: myapp-backend
  ports:
  - protocol: TCP
    port: 80
    targetPort: 3000
  type: LoadBalancer  # Provisions an external load balancer (requires cloud provider support)

# The magic: Kubernetes automatically handles:
# - Pod scheduling across available nodes
# - Health checking and replacement of failed pods
# - Load balancing traffic across healthy pods
# - Rolling updates without downtime
# - Resource allocation and limits
# - Service discovery between services
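
The “automatic scaling based on load” called out earlier is usually wired up with a HorizontalPodAutoscaler. A minimal sketch (assumes the metrics server is installed; targets the myapp-backend Deployment from the sample above):

```yaml
# backend-hpa.yaml - scale myapp-backend between 3 and 20 replicas on CPU
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-backend-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp-backend
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # add pods when average CPU exceeds 70% of requests
```

Note the interaction with the resource requests in the Deployment: utilization is computed against `requests.cpu`, so an HPA without resource requests has nothing to measure.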

Docker Swarm: Lightweight orchestration alternative:

#!/bin/bash
# docker-swarm-setup.sh - Simple orchestration with Docker Swarm

initialize_swarm_cluster() {
    echo "🐝 Initializing Docker Swarm cluster..."

    # Initialize swarm on manager node
    docker swarm init --advertise-addr $(hostname -I | awk '{print $1}')

    # Get join token for workers
    local join_token=$(docker swarm join-token worker -q)

    echo "✅ Swarm cluster initialized"
    echo "Worker join command:"
    echo "docker swarm join --token $join_token $(hostname -I | awk '{print $1}'):2377"
}

deploy_swarm_stack() {
    echo "🚀 Deploying application stack to Swarm..."

    # Create overlay network for services
    docker network create --driver overlay myapp-network

    # Deploy stack using docker-compose file
    cat > docker-stack.yml << 'EOF'
version: '3.8'
services:
  app:
    image: myapp:latest
    deploy:
      replicas: 5
      resources:
        limits:
          memory: 512M
          cpus: '0.5'
        reservations:
          memory: 256M
          cpus: '0.25'
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
      update_config:
        parallelism: 1
        delay: 10s
        failure_action: rollback
    networks:
      - myapp-network
    secrets:
      - db_password  # available in the container at /run/secrets/db_password

  nginx:
    image: nginx:alpine
    deploy:
      replicas: 2
      placement:
        constraints: [node.role == manager]
    ports:
      - "80:80"
    networks:
      - myapp-network
    configs:
      - source: nginx_config
        target: /etc/nginx/nginx.conf

networks:
  myapp-network:
    driver: overlay

configs:
  nginx_config:
    external: true

secrets:
  db_password:
    external: true
EOF

    # Deploy the stack
    docker stack deploy -c docker-stack.yml myapp

    echo "✅ Stack deployed to Swarm"

    # Show service status
    docker service ls
    docker stack ps myapp
}

scale_swarm_services() {
    local service="${1:-myapp_app}"
    local replicas="${2:-10}"

    echo "📈 Scaling $service to $replicas replicas..."

    docker service scale "$service=$replicas"

    # Monitor scaling progress
    watch docker service ps "$service"
}

rolling_update_swarm() {
    local service="${1:-myapp_app}"
    local image="${2:-myapp:latest}"

    echo "🔄 Performing rolling update of $service to $image..."

    docker service update \
        --image "$image" \
        --update-parallelism 1 \
        --update-delay 30s \
        "$service"

    # Monitor update progress
    docker service ps "$service"
}

# Docker Swarm advantages:
# - Native Docker integration
# - Simple setup and management
# - Built-in load balancing
# - Rolling updates and rollbacks
# - Secrets and config management
# - Multi-host networking

# Docker Swarm limitations:
# - Less feature-rich than Kubernetes
# - Smaller ecosystem
# - Limited auto-scaling capabilities
# - Less sophisticated networking options

Container orchestration decision matrix:

// Choosing the right orchestration platform
class OrchestrationDecisionMatrix {
  evaluateOptions(requirements) {
    const platforms = {
      dockerCompose: {
        complexity: "Low",
        scalability: "Single host only",
        features: "Basic container management",
        ecosystem: "Limited",
        operationalOverhead: "Minimal",
        bestFor: ["Development", "Small deployments", "Testing"],
        limitations: ["No multi-host", "No auto-scaling", "No HA"],
      },

      dockerSwarm: {
        complexity: "Medium",
        scalability: "Multi-host clusters",
        features: "Built-in orchestration",
        ecosystem: "Growing",
        operationalOverhead: "Low",
        bestFor: ["Medium deployments", "Docker-native shops", "Quick setup"],
        limitations: ["Limited advanced features", "Smaller ecosystem"],
      },

      kubernetes: {
        complexity: "High",
        scalability: "Massive clusters",
        features: "Comprehensive orchestration",
        ecosystem: "Huge",
        operationalOverhead: "High",
        bestFor: ["Large deployments", "Complex requirements", "Enterprise"],
        limitations: ["Learning curve", "Operational complexity"],
      },

      managedServices: {
        complexity: "Medium",
        scalability: "Cloud provider limits",
        features: "Fully managed",
        ecosystem: "Integrated cloud services",
        operationalOverhead: "Low",
        bestFor: ["Cloud-first", "Reduced ops burden", "Fast time-to-market"],
        examples: ["EKS", "GKE", "AKS", "Fargate", "Cloud Run"],
      },
    };

    return this.recommendPlatform(requirements, platforms);
  }

  recommendPlatform(requirements, platforms) {
    if (requirements.team < 5 && requirements.services < 10) {
      return "dockerCompose";
    }

    if (
      requirements.team < 20 &&
      requirements.dockerExperience &&
      !requirements.k8sComplexity
    ) {
      return "dockerSwarm";
    }

    if (requirements.cloudFirst && requirements.managedServices) {
      return "managedServices";
    }

    if (
      requirements.scale > 100 ||
      requirements.complexNetworking ||
      requirements.enterprise
    ) {
      return "kubernetes";
    }

    return "dockerSwarm"; // Safe middle ground
  }
}

// Usage example
const requirements = {
  team: 15,
  services: 25,
  scale: 50,
  dockerExperience: true,
  k8sComplexity: false,
  cloudFirst: true,
  managedServices: true,
  complexNetworking: false,
  enterprise: false,
};

const decision = new OrchestrationDecisionMatrix();
const recommendation = decision.evaluateOptions(requirements);
console.log(`Recommended platform: ${recommendation}`);
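
One subtlety worth seeing in isolation: `recommendPlatform` evaluates its rules top-down, so the Swarm rule wins for the sample requirements even though `cloudFirst` and `managedServices` are both true. A minimal re-implementation (assumption: same check order as the class above) makes the tie-breaking explicit:

```javascript
// Standalone copy of the decision logic to show that rule order decides
// ties: the sample matches the Swarm rule before the managed-services
// rule is ever reached.
function recommendPlatform(r) {
  if (r.team < 5 && r.services < 10) return "dockerCompose";
  if (r.team < 20 && r.dockerExperience && !r.k8sComplexity) return "dockerSwarm";
  if (r.cloudFirst && r.managedServices) return "managedServices";
  if (r.scale > 100 || r.complexNetworking || r.enterprise) return "kubernetes";
  return "dockerSwarm"; // Safe middle ground
}

const sample = {
  team: 15, services: 25, scale: 50,
  dockerExperience: true, k8sComplexity: false,
  cloudFirst: true, managedServices: true,
  complexNetworking: false, enterprise: false,
};

console.log(recommendPlatform(sample)); // "dockerSwarm", not "managedServices"
```

If managed services should trump Docker experience for cloud-first teams, move that check above the Swarm check; the matrix is a starting point, not a verdict.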

Docker in Production: Deployment Strategies That Don’t Fail

Zero-Downtime Deployment Patterns

Professional deployment strategies that actually work in production:

#!/bin/bash
# production-deployment.sh - Zero-downtime deployment strategies

set -euo pipefail

# ========================================
# Blue-Green Deployment
# ========================================
blue_green_deployment() {
    local new_version="${1:-latest}"
    local current_env="${2:-blue}"
    local target_env="green"

    if [ "$current_env" = "green" ]; then
        target_env="blue"
    fi

    echo "🔄 Starting blue-green deployment: $current_env -> $target_env"

    # Step 1: Deploy to inactive environment
    echo "📦 Deploying $new_version to $target_env environment..."

    docker-compose -f docker-compose.prod.yml \
        -f docker-compose.$target_env.yml \
        pull app

    docker-compose -f docker-compose.prod.yml \
        -f docker-compose.$target_env.yml \
        up -d app

    # Step 2: Wait for new environment to be healthy
    echo "🏥 Waiting for $target_env environment to be healthy..."
    wait_for_health "app-$target_env" 300

    # Step 3: Run smoke tests
    echo "🧪 Running smoke tests against $target_env..."
    run_smoke_tests "http://app-$target_env:3000"

    # Step 4: Switch traffic
    echo "🔀 Switching traffic from $current_env to $target_env..."
    update_load_balancer "$target_env"

    # Step 5: Monitor for issues
    echo "📊 Monitoring deployment for 2 minutes..."
    sleep 120

    # Step 6: Verify success
    if check_deployment_success; then
        echo "✅ Blue-green deployment successful"

        # Step 7: Cleanup old environment
        echo "🧹 Cleaning up $current_env environment..."
        docker-compose -f docker-compose.prod.yml \
            -f docker-compose.$current_env.yml \
            stop app  # 'down' takes no service name; stop just this service

        echo "Current active environment: $target_env"
    else
        echo "❌ Deployment failed, rolling back..."
        rollback_blue_green "$current_env" "$target_env"
        exit 1
    fi
}

# ========================================
# Rolling Deployment
# ========================================
rolling_deployment() {
    local new_version="${1:-latest}"
    local total_instances="${2:-5}"
    local batch_size="${3:-1}"

    echo "🔄 Starting rolling deployment to $new_version..."
    echo "Total instances: $total_instances, Batch size: $batch_size"

    # Update image in deployment configuration
    update_deployment_image "$new_version"

    # Rolling update in batches
    for (( i=1; i<=total_instances; i+=batch_size )); do
        local end_instance=$((i + batch_size - 1))
        if [ $end_instance -gt $total_instances ]; then
            end_instance=$total_instances
        fi

        echo "📦 Updating instances $i to $end_instance..."

        # Stop old instances
        for (( j=i; j<=end_instance; j++ )); do
            docker stop "myapp-$j" || true
            docker rm "myapp-$j" || true
        done

        # Start new instances
        for (( j=i; j<=end_instance; j++ )); do
            docker run -d \
                --name "myapp-$j" \
                --network production \
                --health-cmd "curl -f http://localhost:3000/health || exit 1" \
                --health-interval 10s \
                --health-retries 3 \
                --health-start-period 30s \
                "myapp:$new_version"
        done

        # Wait for new instances to be healthy
        for (( j=i; j<=end_instance; j++ )); do
            wait_for_health "myapp-$j" 60
        done

        # Brief pause between batches
        if [ $end_instance -lt $total_instances ]; then
            echo "⏸️  Pausing 30 seconds before next batch..."
            sleep 30
        fi
    done

    echo "✅ Rolling deployment completed successfully"
}

# ========================================
# Canary Deployment
# ========================================
canary_deployment() {
    local new_version="${1:-latest}"
    local canary_percent="${2:-10}"
    local total_instances="${3:-10}"

    echo "🐤 Starting canary deployment: $canary_percent% traffic to $new_version"

    local canary_instances=$(( (total_instances * canary_percent) / 100 ))
    if [ $canary_instances -eq 0 ]; then
        canary_instances=1
    fi

    echo "Deploying $canary_instances canary instances out of $total_instances total"

    # Step 1: Deploy canary instances
    echo "📦 Deploying canary instances..."
    for (( i=1; i<=canary_instances; i++ )); do
        docker run -d \
            --name "myapp-canary-$i" \
            --network production \
            --label deployment=canary \
            --label version="$new_version" \
            "myapp:$new_version"
    done

    # Step 2: Configure load balancer for weighted routing
    configure_canary_routing "$canary_percent"

    # Step 3: Monitor canary metrics
    echo "📊 Monitoring canary deployment for 10 minutes..."
    monitor_canary_metrics 600

    # Step 4: Analyze results
    if analyze_canary_success; then
        echo "✅ Canary deployment successful, promoting to full deployment"
        promote_canary_to_full "$new_version"
    else
        echo "❌ Canary deployment failed, rolling back"
        rollback_canary
        exit 1
    fi
}

# ========================================
# Deployment Utilities
# ========================================
wait_for_health() {
    local container="$1"
    local timeout="${2:-60}"
    local counter=0

    while [ $counter -lt $timeout ]; do
        if docker inspect "$container" --format='{{.State.Health.Status}}' 2>/dev/null | grep -q "healthy"; then
            echo "✅ $container is healthy"
            return 0
        fi

        sleep 2
        ((counter += 2))
        echo -n "."
    done

    echo "❌ $container failed to become healthy within $timeout seconds"
    return 1
}

run_smoke_tests() {
    local endpoint="$1"

    echo "🧪 Running smoke tests against $endpoint..."

    # Test 1: Health check
    if ! curl -f "$endpoint/health" &>/dev/null; then
        echo "❌ Health check failed"
        return 1
    fi

    # Test 2: API authentication
    if ! curl -f -H "Authorization: Bearer test-token" "$endpoint/api/auth/verify" &>/dev/null; then
        echo "❌ Auth verification failed"
        return 1
    fi

    # Test 3: Database connectivity
    if ! curl -f "$endpoint/api/health/database" &>/dev/null; then
        echo "❌ Database connectivity failed"
        return 1
    fi

    echo "✅ All smoke tests passed"
    return 0
}

update_load_balancer() {
    local active_env="$1"

    # Update Nginx upstream configuration
    cat > /etc/nginx/conf.d/upstream.conf << EOF
upstream backend {
    server app-${active_env}:3000;
}
EOF

    # Reload Nginx configuration
    docker exec nginx-lb nginx -s reload

    echo "✅ Load balancer updated to route to $active_env"
}

check_deployment_success() {
    # Check error rate
    local error_rate=$(get_error_rate_percentage)
    if [ "$error_rate" -gt 5 ]; then
        echo "❌ Error rate too high: $error_rate%"
        return 1
    fi

    # Check response time
    local avg_response_time=$(get_average_response_time)
    if [ "$avg_response_time" -gt 2000 ]; then
        echo "❌ Response time too high: ${avg_response_time}ms"
        return 1
    fi

    return 0
}

rollback_deployment() {
    local rollback_version="${1:-previous}"

    echo "🔙 Rolling back to $rollback_version..."

    # Get previous version from deployment history
    if [ "$rollback_version" = "previous" ]; then
        rollback_version=$(get_previous_version)
    fi

    # Perform rollback using appropriate strategy
    case "${DEPLOYMENT_STRATEGY:-}" in  # default to empty so 'set -u' does not abort
        blue-green)
            rollback_blue_green
            ;;
        rolling)
            rolling_deployment "$rollback_version"
            ;;
        canary)
            rollback_canary
            ;;
        *)
            echo "❌ Unknown deployment strategy: $DEPLOYMENT_STRATEGY"
            exit 1
            ;;
    esac

    echo "✅ Rollback to $rollback_version completed"
}

# Command routing
case "${1:-help}" in
    blue-green)
        blue_green_deployment "${2:-latest}" "${3:-blue}"
        ;;
    rolling)
        rolling_deployment "${2:-latest}" "${3:-5}" "${4:-1}"
        ;;
    canary)
        canary_deployment "${2:-latest}" "${3:-10}" "${4:-10}"
        ;;
    rollback)
        rollback_deployment "${2:-previous}"
        ;;
    help|*)
        cat << EOF
Production Deployment Manager

Usage: $0 <strategy> [options]

Deployment Strategies:
    blue-green [version] [current-env]  Blue-green deployment
    rolling [version] [instances] [batch] Rolling deployment
    canary [version] [percent] [total]   Canary deployment
    rollback [version]                   Rollback to previous version

Examples:
    $0 blue-green v1.2.3 blue          # Blue-green deployment
    $0 rolling v1.2.3 10 2             # Rolling update 10 instances, 2 at a time
    $0 canary v1.2.3 20 10             # Canary deployment with 20% traffic
    $0 rollback v1.2.2                 # Rollback to specific version
EOF
        ;;
esac
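
The batch and canary arithmetic buried in the script above is easy to get wrong off by one, so here is the same math as standalone helpers (hypothetical function names, mirroring `rolling_deployment` and `canary_deployment`):

```javascript
// rollingBatches(5, 2) walks instances 1..5 in batches of 2,
// clamping the last batch, exactly like the bash loop above.
function rollingBatches(totalInstances, batchSize) {
  const batches = [];
  for (let start = 1; start <= totalInstances; start += batchSize) {
    batches.push([start, Math.min(start + batchSize - 1, totalInstances)]);
  }
  return batches;
}

// canaryInstances floors the percentage but never deploys zero canaries,
// matching the "if [ $canary_instances -eq 0 ]" guard in the script.
function canaryInstances(totalInstances, canaryPercent) {
  return Math.max(1, Math.floor((totalInstances * canaryPercent) / 100));
}

console.log(rollingBatches(5, 2));      // [ [1,2], [3,4], [5,5] ]
console.log(canaryInstances(10, 10));   // 1
console.log(canaryInstances(3, 10));    // 1, never 0
```

Pulling the arithmetic out like this also makes it unit-testable, which the inline bash version is not.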

Production-ready Docker configuration:

# Dockerfile.production - Production-optimized container
FROM node:18-alpine AS base

# Security updates and essential tools
RUN apk update && apk upgrade && \
    apk add --no-cache curl ca-certificates tini && \
    rm -rf /var/cache/apk/* /tmp/* /var/tmp/*

# Create non-root user
RUN addgroup -g 10001 -S appgroup && \
    adduser -S appuser -u 10001 -G appgroup

# Production dependencies stage
FROM base AS production-deps
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev --no-audit --no-fund && \
    npm cache clean --force && \
    rm -rf ~/.npm /tmp/*

# Build stage
FROM base AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --include=dev
COPY . .
RUN npm run build && \
    npm run test:ci

# Final production stage
FROM base AS production
WORKDIR /app

# Install production dependencies (WORKDIR must be set first so the
# relative destination resolves to /app/node_modules, not /node_modules)
COPY --from=production-deps --chown=appuser:appgroup /app/node_modules ./node_modules

# Copy built application
COPY --from=builder --chown=appuser:appgroup /app/dist ./dist
COPY --from=builder --chown=appuser:appgroup /app/public ./public
COPY --chown=appuser:appgroup package.json ./

# Production environment configuration
ENV NODE_ENV=production \
    PORT=3000 \
    LOG_LEVEL=info \
    METRICS_ENABLED=true \
    SHUTDOWN_TIMEOUT=30000

# Switch to non-root user
USER appuser

# Expose port
EXPOSE 3000

# Health check configuration
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
    CMD curl -f http://localhost:3000/health/ready || exit 1

# Signal handling and graceful shutdown
ENTRYPOINT ["tini", "--"]
CMD ["node", "--enable-source-maps", "--max-old-space-size=512", "dist/server.js"]

# Production optimizations applied:
# - Multi-stage build reduces final image size
# - Security updates and minimal attack surface
# - Non-root user for security
# - Proper signal handling with tini
# - Resource constraints and monitoring
# - Comprehensive health checking
# - Optimized Node.js flags

Container Security: Defense in Depth

Comprehensive Container Security Framework

Container security that actually protects production systems:

#!/bin/bash
# container-security.sh - Comprehensive container security hardening

set -euo pipefail

# ========================================
# Image Security Scanning
# ========================================
security_scan_image() {
    local image="${1:-myapp:latest}"

    echo "🔍 Running comprehensive security scan on $image..."

    # Scan 1: Trivy vulnerability scanner
    if command -v trivy &> /dev/null; then
        echo "📊 Running Trivy vulnerability scan..."
        trivy image \
            --severity HIGH,CRITICAL \
            --format table \
            --exit-code 1 \
            "$image"
    else
        echo "💡 Installing Trivy for vulnerability scanning..."
        docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
            aquasec/trivy image --severity HIGH,CRITICAL "$image"
    fi

    # Scan 2: Docker Scout (if available)
    if docker scout version &> /dev/null; then
        echo "🛡️  Running Docker Scout analysis..."
        docker scout cves "$image"
        docker scout recommendations "$image"
    fi

    # Scan 3: Custom security checks
    echo "🔎 Running custom security validation..."
    validate_image_security "$image"

    echo "✅ Security scan completed"
}

validate_image_security() {
    local image="$1"
    local temp_container="security-check-$$"

    echo "🔍 Validating image security configuration..."

    # Start temporary container for inspection
    docker run -d --name "$temp_container" "$image" sleep 300

    # Check 1: User privileges
    local user=$(docker exec "$temp_container" whoami)
    if [ "$user" = "root" ]; then
        echo "❌ SECURITY ISSUE: Container running as root"
        docker rm -f "$temp_container"
        return 1
    else
        echo "✅ Container running as non-root user: $user"
    fi

    # Check 2: Writable filesystem
    local writable_dirs=$(docker exec "$temp_container" find / -writable -type d 2>/dev/null | grep -v '/tmp\|/proc\|/sys\|/dev' || true)
    if [ -n "$writable_dirs" ]; then
        echo "⚠️  Writable directories found outside expected locations:"
        echo "$writable_dirs"
    fi

    # Check 3: Unnecessary packages
    local package_count=$(docker exec "$temp_container" sh -c 'apk list 2>/dev/null | wc -l' || echo "0")
    echo "ℹ️  Package count: $package_count"

    # Check 4: SUID/SGID binaries
    local suid_files=$(docker exec "$temp_container" find / -perm -4000 -o -perm -2000 2>/dev/null || true)
    if [ -n "$suid_files" ]; then
        echo "⚠️  SUID/SGID binaries found:"
        echo "$suid_files"
    fi

    # Cleanup
    docker rm -f "$temp_container"

    echo "✅ Image security validation completed"
}

# ========================================
# Runtime Security Configuration
# ========================================
secure_container_run() {
    local image="${1:-myapp:latest}"
    local name="${2:-myapp-secure}"

    echo "🔒 Starting container with security hardening..."

    docker run -d \
        --name "$name" \
        `# Security: Drop all capabilities, add only needed ones` \
        --cap-drop=ALL \
        --cap-add=NET_BIND_SERVICE \
        `# Security: No new privileges` \
        --security-opt=no-new-privileges:true \
        `# Security: Read-only root filesystem` \
        --read-only \
        --tmpfs /tmp:rw,noexec,nosuid,size=100m \
        --tmpfs /run:rw,noexec,nosuid,size=100m \
        `# Security: Run as an unprivileged UID:GID` \
        --user 10001:10001 \
        `# Security: Resource limits` \
        --memory=512m \
        --cpus=1.0 \
        --pids-limit=100 \
        `# Security: Network isolation` \
        --network=secure-network \
        `# Security: AppArmor/SELinux profiles` \
        --security-opt apparmor=myapp-profile \
        `# Monitoring: Log driver` \
        --log-driver=json-file \
        --log-opt max-size=10m \
        --log-opt max-file=3 \
        "$image"

    echo "✅ Secure container started: $name"
}

create_apparmor_profile() {
    echo "🛡️  Creating AppArmor security profile..."

    cat > /etc/apparmor.d/myapp-profile << 'EOF'
#include <tunables/global>

profile myapp-profile flags=(attach_disconnected,mediate_deleted) {
    #include <abstractions/base>

    # Allow network access
    network inet tcp,
    network inet udp,

    # Allow file access to application directory
    /app/** r,
    /app/uploads/** rw,
    /app/logs/** rw,

    # Allow temporary file access
    /tmp/** rw,
    /run/** rw,

    # Deny dangerous operations
    deny /etc/shadow r,
    deny /etc/passwd w,
    deny /proc/sys/** w,
    deny mount,
    deny umount,
    deny ptrace,
    deny capability sys_admin,
    deny capability dac_override,

    # Allow necessary capabilities
    capability net_bind_service,
    capability setuid,
    capability setgid,
}
EOF

    # Load the profile
    apparmor_parser -r /etc/apparmor.d/myapp-profile

    echo "✅ AppArmor profile created and loaded"
}

# ========================================
# Network Security
# ========================================
setup_network_security() {
    echo "🌐 Setting up secure container networking..."

    # Create isolated network
    docker network create \
        --driver bridge \
        --subnet 172.30.0.0/16 \
        --gateway 172.30.0.1 \
        --opt com.docker.network.bridge.enable_icc=false \
        --opt com.docker.network.bridge.enable_ip_masquerade=true \
        --opt com.docker.network.bridge.host_binding_ipv4=127.0.0.1 \
        secure-network

    # Create network policies using iptables
    create_network_policies

    echo "✅ Secure networking configured"
}

create_network_policies() {
    echo "🔥 Configuring network policies..."

    # Allow only required container-to-host traffic, then drop the rest.
    # Order matters: the ACCEPT rules must precede the DROP, because iptables
    # evaluates rules top-down and stops at the first match.
    iptables -I DOCKER-USER 1 -i docker0 -o eth0 -p tcp --dport 80 -j ACCEPT
    iptables -I DOCKER-USER 2 -i docker0 -o eth0 -p tcp --dport 443 -j ACCEPT
    iptables -I DOCKER-USER 3 -i docker0 -o eth0 -j DROP

    # Allow traffic within the secure network, then block other
    # inter-container communication on the default bridge
    iptables -I DOCKER-USER 4 -s 172.30.0.0/16 -d 172.30.0.0/16 -j ACCEPT
    iptables -I DOCKER-USER 5 -i docker0 -o docker0 -j DROP

    echo "✅ Network policies configured"
}

# ========================================
# Secrets Management
# ========================================
setup_secrets_management() {
    echo "🔐 Setting up secure secrets management..."

    # Initialize Docker secrets (Swarm mode)
    if docker info | grep -q "Swarm: active"; then
        echo "Using Docker Swarm secrets"
        setup_swarm_secrets
    else
        echo "Using external secrets management"
        setup_external_secrets
    fi
}

setup_swarm_secrets() {
    # Create secrets from files
    echo "Creating Docker Swarm secrets..."

    # Use printf to avoid embedding a trailing newline in the secret value
    printf '%s' "$DB_PASSWORD" | docker secret create db_password -
    printf '%s' "$JWT_SECRET" | docker secret create jwt_secret -
    printf '%s' "$API_KEY" | docker secret create api_key -

    # Deploy service with secrets
    docker service create \
        --name myapp-secure \
        --secret db_password \
        --secret jwt_secret \
        --secret api_key \
        --env DB_PASSWORD_FILE=/run/secrets/db_password \
        --env JWT_SECRET_FILE=/run/secrets/jwt_secret \
        --env API_KEY_FILE=/run/secrets/api_key \
        myapp:latest
}
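```shell
# ----------------------------------------------------------------------
# Hedged sketch of the consumer side: resolve either VAR or VAR_FILE (the
# *_FILE convention the service above relies on) into a plain value inside
# the container's entrypoint. Variable names are illustrative.
# ----------------------------------------------------------------------
read_secret() {
    # $1 = variable base name, e.g. DB_PASSWORD
    local direct="${!1:-}"
    local file_var="${1}_FILE"
    local file="${!file_var:-}"
    if [ -n "$direct" ]; then
        printf '%s' "$direct"
    elif [ -n "$file" ] && [ -r "$file" ]; then
        cat "$file"
    fi
}

# Example: DB_PASSWORD_FILE=/run/secrets/db_password read_secret DB_PASSWORD
```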

setup_external_secrets() {
    echo "Configuring external secrets integration..."

    # Use a HashiCorp Vault agent for secret delivery (assumes an agent
    # configuration exists at ./vault/config/agent.hcl)
    docker run -d \
        --name vault-agent \
        --network secure-network \
        --volume ./vault/config:/vault/config:ro \
        --volume vault-secrets:/vault/secrets \
        hashicorp/vault:latest vault agent -config=/vault/config/agent.hcl

    # Application container with Vault integration
    docker run -d \
        --name myapp-vault \
        --network secure-network \
        --volume vault-secrets:/vault/secrets:ro \
        --env VAULT_SECRETS_PATH=/vault/secrets \
        myapp:latest
}

# ========================================
# Security Monitoring
# ========================================
setup_security_monitoring() {
    echo "📊 Setting up security monitoring..."

    # Falco for runtime security monitoring
    docker run -d \
        --name falco \
        --privileged \
        --volume /var/run/docker.sock:/host/var/run/docker.sock \
        --volume /dev:/host/dev \
        --volume /proc:/host/proc:ro \
        --volume /boot:/host/boot:ro \
        --volume /lib/modules:/host/lib/modules:ro \
        --volume /usr:/host/usr:ro \
        --volume /etc:/host/etc:ro \
        falcosecurity/falco:latest

    # ClamAV for malware scanning
    docker run -d \
        --name clamav \
        --volume /var/run/docker.sock:/var/run/docker.sock:ro \
        --volume clamav-db:/var/lib/clamav \
        clamav/clamav:latest

    echo "✅ Security monitoring configured"
}

# ========================================
# Compliance and Auditing
# ========================================
generate_security_report() {
    echo "📋 Generating security compliance report..."

    local report_file="security-report-$(date +%Y%m%d-%H%M%S).json"

    # Container security assessment
    {
        echo "{"
        echo '  "timestamp": "'$(date -Iseconds)'",'
        echo '  "containers": ['

        first=true
        docker ps --format "{{.Names}}" | while read -r container; do
            if [ "$first" = false ]; then
                echo ","
            fi
            first=false

            echo "    {"
            echo '      "name": "'$container'",'
            echo '      "image": "'$(docker inspect "$container" --format '{{.Config.Image}}')'",'
            echo '      "user": "'$(docker exec "$container" whoami 2>/dev/null || echo "unknown")'",'
            echo '      "privileged": '$(docker inspect "$container" --format '{{.HostConfig.Privileged}}' | tr '[:upper:]' '[:lower:]')','
            echo '      "capabilities": '$(docker inspect "$container" --format '{{json .HostConfig.CapAdd}}')','
            echo '      "readonly_rootfs": '$(docker inspect "$container" --format '{{.HostConfig.ReadonlyRootfs}}' | tr '[:upper:]' '[:lower:]')'
            echo "    }"
        done

        echo "  ]"
        echo "}"
    } > "$report_file"

    echo "✅ Security report generated: $report_file"
}

# Command routing
case "${1:-help}" in
    scan)
        security_scan_image "${2:-myapp:latest}"
        ;;
    run-secure)
        secure_container_run "${2:-myapp:latest}" "${3:-myapp-secure}"
        ;;
    network-security)
        setup_network_security
        ;;
    secrets)
        setup_secrets_management
        ;;
    monitoring)
        setup_security_monitoring
        ;;
    report)
        generate_security_report
        ;;
    full-setup)
        echo "🛡️  Setting up comprehensive container security..."
        security_scan_image "${2:-myapp:latest}"
        create_apparmor_profile
        setup_network_security
        setup_secrets_management
        setup_security_monitoring
        echo "✅ Full security setup completed"
        ;;
    help|*)
        cat << EOF
Container Security Manager

Usage: $0 <command> [options]

Commands:
    scan [image]                 Security scan container image
    run-secure [image] [name]    Run container with security hardening
    network-security             Set up secure container networking
    secrets                      Configure secrets management
    monitoring                   Set up security monitoring
    report                       Generate security compliance report
    full-setup [image]           Complete security setup

Examples:
    $0 scan myapp:v1.2.3        # Scan image for vulnerabilities
    $0 run-secure myapp:latest   # Run with security hardening
    $0 full-setup myapp:latest   # Complete security configuration
EOF
        ;;
esac
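The hardening flags used by `secure_container_run` above can also be collected into a reusable bash array, so every service launches with the same baseline. A minimal sketch — the specific limits (memory, PID count, UID) are illustrative assumptions, not requirements:

```shell
#!/usr/bin/env bash
# Shared baseline of container-hardening flags; values are examples only.
hardening_flags() {
    local flags=(
        --security-opt no-new-privileges:true
        --read-only
        --tmpfs /tmp:rw,noexec,nosuid,size=100m
        --user 10001:10001
        --memory 512m
        --pids-limit 100
        --cap-drop ALL
        --cap-add NET_BIND_SERVICE
    )
    # One flag per line, ready for word-splitting at the call site
    printf '%s\n' "${flags[@]}"
}
```

Usage sketch: `docker run -d --name myapp-secure $(hardening_flags) myapp:latest`. Centralizing the flags means a policy change (say, tightening the PID limit) lands in every service at once instead of drifting per deployment script.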

Image Optimization: Performance and Efficiency

Advanced Image Size and Performance Optimization

Image optimization techniques that actually matter in production:

#!/bin/bash
# image-optimization.sh - Advanced container image optimization

set -euo pipefail

# ========================================
# Multi-stage Build Optimization
# ========================================
create_optimized_dockerfile() {
    echo "🏗️  Creating optimized multi-stage Dockerfile..."

    cat > Dockerfile.optimized << 'EOF'
# ========================================
# Stage 1: Build Dependencies
# ========================================
FROM node:18-alpine AS build-deps

# Install build dependencies only when needed
RUN apk add --no-cache \
    python3 \
    make \
    g++ \
    git \
    && rm -rf /var/cache/apk/*

WORKDIR /app
COPY package*.json ./

# Install all dependencies for building
RUN npm ci --include=dev

# ========================================
# Stage 2: Source Build
# ========================================
FROM build-deps AS builder

# Copy source code
COPY . .

# Build application
RUN npm run build

# Remove development dependencies
RUN npm prune --production

# Remove unnecessary files
RUN rm -rf \
    src \
    tests \
    docs \
    *.md \
    .git \
    .github \
    .vscode \
    .eslintrc* \
    .prettierrc* \
    tsconfig.json \
    jest.config.js

# ========================================
# Stage 3: Runtime Base
# ========================================
FROM node:18-alpine AS runtime-base

# Install only runtime dependencies
RUN apk add --no-cache \
    dumb-init \
    curl \
    ca-certificates \
    && rm -rf /var/cache/apk/* /tmp/* /var/tmp/*

# Create non-root user
RUN addgroup -g 10001 -S appgroup && \
    adduser -S appuser -u 10001 -G appgroup

# ========================================
# Stage 4: Production Image
# ========================================
FROM runtime-base AS production

WORKDIR /app

# Copy production dependencies
COPY --from=builder --chown=appuser:appgroup /app/node_modules ./node_modules

# Copy built application
COPY --from=builder --chown=appuser:appgroup /app/dist ./dist
COPY --from=builder --chown=appuser:appgroup /app/public ./public
COPY --from=builder --chown=appuser:appgroup /app/package.json ./package.json

# Switch to non-root user
USER appuser

# Configure runtime
ENV NODE_ENV=production \
    PORT=3000 \
    NODE_OPTIONS="--enable-source-maps --max-old-space-size=512"

EXPOSE 3000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:3000/health || exit 1

# Start application
ENTRYPOINT ["dumb-init", "--"]
CMD ["node", "dist/server.js"]

# Optimization results:
# - Build stage isolation prevents dev dependencies in production
# - Alpine Linux reduces base image size by 90%+
# - Multi-stage reduces final image size by 70%+
# - Layer optimization improves build cache hits
# - Security hardening with non-root user
EOF

    echo "✅ Optimized Dockerfile created"
}

# ========================================
# Layer Optimization Analysis
# ========================================
analyze_image_layers() {
    local image="${1:-myapp:latest}"

    echo "📊 Analyzing image layers for $image..."

    # Show layer information
    echo "📋 Layer breakdown:"
    docker history "$image" --human --no-trunc

    echo ""
    echo "💾 Layer sizes:"
    docker history "$image" --format "table {{.CreatedBy}}\t{{.Size}}" | head -20

    # Use dive if available for detailed analysis
    if command -v dive &> /dev/null; then
        echo ""
        echo "🔍 Running detailed layer analysis with dive..."
        dive "$image" --ci
    else
        echo ""
        echo "💡 Install dive for detailed layer analysis:"
        echo "    curl -OL https://github.com/wagoodman/dive/releases/download/v0.10.0/dive_0.10.0_linux_amd64.deb"
        echo "    sudo apt install ./dive_0.10.0_linux_amd64.deb"
        echo ""
        echo "Or run with Docker:"
        echo "    docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock wagoodman/dive:latest $image"
    fi
}

optimize_layer_caching() {
    echo "🚀 Optimizing Dockerfile for layer caching..."

    cat > Dockerfile.cache-optimized << 'EOF'
FROM node:18-alpine AS base

# ========================================
# Install system dependencies (changes rarely)
# ========================================
RUN apk update && apk upgrade && \
    apk add --no-cache \
        dumb-init \
        curl \
        ca-certificates \
    && rm -rf /var/cache/apk/*

# ========================================
# Create user (changes rarely)
# ========================================
RUN addgroup -g 10001 -S appgroup && \
    adduser -S appuser -u 10001 -G appgroup

WORKDIR /app

# ========================================
# Install dependencies (changes when package.json changes)
# ========================================
COPY package*.json ./
RUN npm ci --omit=dev && npm cache clean --force

# ========================================
# Copy application code (changes frequently)
# ========================================
COPY --chown=appuser:appgroup . .

# ========================================
# Build application (changes when code changes)
# ========================================
RUN npm run build

# ========================================
# Runtime configuration (changes rarely)
# ========================================
USER appuser
ENV NODE_ENV=production PORT=3000
EXPOSE 3000

HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:3000/health || exit 1

ENTRYPOINT ["dumb-init", "--"]
CMD ["node", "dist/server.js"]

# Layer optimization strategy:
# 1. System packages first (rarely change)
# 2. User creation next (rarely changes)
# 3. Dependencies based on package.json (moderate changes)
# 4. Application code last (changes frequently)
# Result: 80%+ cache hit rate on typical builds
EOF

    echo "✅ Cache-optimized Dockerfile created"
}

# ========================================
# Dependency Optimization
# ========================================
optimize_dependencies() {
    echo "📦 Optimizing application dependencies..."

    # Analyze dependency tree
    echo "🔍 Current dependency analysis:"
    npm ls --depth=0 2>/dev/null || true

    # Check for duplicate dependencies
    echo ""
    echo "🔍 Checking for duplicate dependencies:"
    npm ls --depth=0 | grep -E '(deduped|extraneous)' || echo "No duplicates found"

    # Bundle analyzer for size analysis
    if [ -f "package.json" ]; then
        echo ""
        echo "📊 Running bundle size analysis..."

        # Install bundle analyzer if not present
        if ! npm list webpack-bundle-analyzer &>/dev/null; then
            npm install --save-dev webpack-bundle-analyzer
        fi

        # Generate bundle analysis
        npm run build 2>/dev/null || echo "Build script not found"

        # Create dependency optimization script
        cat > optimize-deps.js << 'EOF'
const fs = require('fs');
const path = require('path');

class DependencyOptimizer {
    constructor() {
        this.packageJson = JSON.parse(fs.readFileSync('package.json', 'utf8'));
    }

    analyzeUnusedDependencies() {
        console.log('🔍 Analyzing unused dependencies...');

        const dependencies = Object.keys(this.packageJson.dependencies || {});
        const devDependencies = Object.keys(this.packageJson.devDependencies || {});

        const usedDeps = new Set();

        // Scan source files for imports
        this.scanDirectory('src', usedDeps);
        this.scanDirectory('dist', usedDeps);

        const unusedDeps = dependencies.filter(dep => !usedDeps.has(dep));
        const unusedDevDeps = devDependencies.filter(dep => !usedDeps.has(dep));

        if (unusedDeps.length > 0) {
            console.log('📦 Potentially unused production dependencies:');
            unusedDeps.forEach(dep => console.log(`  - ${dep}`));
        }

        if (unusedDevDeps.length > 0) {
            console.log('🔧 Potentially unused development dependencies:');
            unusedDevDeps.forEach(dep => console.log(`  - ${dep}`));
        }

        return { unusedDeps, unusedDevDeps };
    }

    scanDirectory(dir, usedDeps) {
        if (!fs.existsSync(dir)) return;

        const files = fs.readdirSync(dir, { withFileTypes: true });

        files.forEach(file => {
            const fullPath = path.join(dir, file.name);

            if (file.isDirectory()) {
                this.scanDirectory(fullPath, usedDeps);
            } else if (file.name.match(/\.(js|ts|jsx|tsx)$/)) {
                const content = fs.readFileSync(fullPath, 'utf8');

                // Find require() and import statements
                const importMatches = content.match(/(?:require|import).*?['"]([^'"]+)['"]/g) || [];

                importMatches.forEach(match => {
                    const quoted = match.match(/['"]([^'"]+)['"]/);
                    if (!quoted) return;
                    const moduleName = quoted[1];

                    // Skip relative and absolute imports
                    if (moduleName.startsWith('.') || moduleName.startsWith('/')) return;

                    // Extract package name (handle scoped packages)
                    let packageName = moduleName.split('/')[0];
                    if (packageName.startsWith('@')) {
                        packageName = moduleName.split('/').slice(0, 2).join('/');
                    }

                    usedDeps.add(packageName);
                });
            }
        });
    }

    optimizePackageJson() {
        console.log('⚡ Creating optimized package.json...');

        const optimized = {
            ...this.packageJson,
            scripts: {
                start: this.packageJson.scripts.start,
                build: this.packageJson.scripts.build,
                test: this.packageJson.scripts.test
            }
        };

        // Remove unnecessary fields
        delete optimized.devDependencies;
        delete optimized.scripts.dev;
        delete optimized.scripts['build:dev'];

        fs.writeFileSync('package.prod.json', JSON.stringify(optimized, null, 2));
        console.log('✅ Optimized package.json created as package.prod.json');
    }
}

const optimizer = new DependencyOptimizer();
optimizer.analyzeUnusedDependencies();
optimizer.optimizePackageJson();
EOF

        node optimize-deps.js
        rm optimize-deps.js
    fi

    echo "✅ Dependency optimization completed"
}

# ========================================
# Image Compression and Registry Optimization
# ========================================
compress_and_optimize_image() {
    local image="${1:-myapp:latest}"
    local optimized_tag="${image%:*}:optimized"

    echo "🗜️  Compressing and optimizing image: $image"

    # Method 1: docker-slim for automatic optimization
    if command -v docker-slim &> /dev/null; then
        echo "Using docker-slim for automatic optimization..."

        docker-slim build \
            --target "$image" \
            --tag "$optimized_tag" \
            --http-probe=false \
            --continue-after=10 \
            --include-path='/app' \
            --include-path='/usr/local/bin/node' \
            --include-path='/usr/local/lib/node_modules' \
            "$image"

    else
        echo "💡 Install docker-slim for automatic image optimization:"
        echo "    curl -L https://downloads.dockerslim.com/releases/1.40.0/dist_linux.tar.gz | tar -xzf -"
        echo "    sudo mv dist_linux/docker-slim /usr/local/bin/"
        echo "    sudo mv dist_linux/docker-slim-sensor /usr/local/bin/"
    fi

    # Method 2: Manual optimization with flattening
    echo "Creating manually optimized image..."

    # Export and import to flatten layers (container metadata is lost on
    # export, so ENTRYPOINT/ENV/etc. are re-applied via --change)
    local cid
    cid=$(docker create "$image")
    docker export "$cid" | \
        docker import \
            --change 'USER appuser' \
            --change 'WORKDIR /app' \
            --change 'ENV NODE_ENV=production' \
            --change 'EXPOSE 3000' \
            --change 'ENTRYPOINT ["dumb-init", "--"]' \
            --change 'CMD ["node", "dist/server.js"]' \
            - "${image%:*}:flattened"
    docker rm "$cid" >/dev/null

    # Compare sizes
    echo ""
    echo "📊 Image size comparison:"
    docker images | grep -E "(${image%:*}|SIZE)" | head -5

    echo "✅ Image optimization completed"
}

# ========================================
# Performance Testing
# ========================================
test_image_performance() {
    local image="${1:-myapp:latest}"

    echo "⚡ Testing image performance..."

    # Test 1: Container startup time
    echo "🚀 Testing container startup time..."

    for i in {1..5}; do
        local start_time=$(date +%s%N)

        docker run -d --name "perf-test-$i" "$image" >/dev/null

        # Wait (up to ~30s) for the container's health endpoint to respond
        local tries=0
        while ! docker exec "perf-test-$i" curl -sf http://localhost:3000/health &>/dev/null; do
            sleep 0.1
            tries=$((tries + 1))
            if [ "$tries" -ge 300 ]; then
                echo "  Run $i: container did not become healthy in time"
                break
            fi
        done

        local end_time=$(date +%s%N)
        local startup_time=$(( (end_time - start_time) / 1000000 )) # Convert to milliseconds

        echo "  Run $i: ${startup_time}ms"

        docker stop "perf-test-$i" &>/dev/null
        docker rm "perf-test-$i" &>/dev/null
    done

    # Test 2: Memory usage
    echo ""
    echo "💾 Testing memory usage..."

    docker run -d --name memory-test "$image"
    sleep 10

    local memory_usage=$(docker stats memory-test --no-stream --format "{{.MemUsage}}")
    echo "  Memory usage: $memory_usage"

    docker stop memory-test &>/dev/null
    docker rm memory-test &>/dev/null

    # Test 3: Image pull time
    echo ""
    echo "⬇️  Testing image pull performance..."

    docker rmi "$image" &>/dev/null || true

    local pull_start=$(date +%s)
    docker pull "$image" >/dev/null
    local pull_end=$(date +%s)
    local pull_time=$((pull_end - pull_start))

    echo "  Pull time: ${pull_time}s"

    echo "✅ Performance testing completed"
}

# Command routing
case "${1:-help}" in
    create-dockerfile)
        create_optimized_dockerfile
        ;;
    analyze)
        analyze_image_layers "${2:-myapp:latest}"
        ;;
    cache-optimize)
        optimize_layer_caching
        ;;
    optimize-deps)
        optimize_dependencies
        ;;
    compress)
        compress_and_optimize_image "${2:-myapp:latest}"
        ;;
    test-performance)
        test_image_performance "${2:-myapp:latest}"
        ;;
    full-optimize)
        echo "🚀 Running full image optimization..."
        create_optimized_dockerfile
        optimize_layer_caching
        optimize_dependencies
        docker build -t myapp:optimized -f Dockerfile.optimized .
        compress_and_optimize_image "myapp:optimized"
        test_image_performance "myapp:optimized"
        echo "✅ Full optimization completed"
        ;;
    help|*)
        cat << EOF
Image Optimization Manager

Usage: $0 <command> [options]

Commands:
    create-dockerfile           Create optimized multi-stage Dockerfile
    analyze [image]            Analyze image layers and sizes
    cache-optimize             Create cache-optimized Dockerfile
    optimize-deps              Optimize application dependencies
    compress [image]           Compress and optimize existing image
    test-performance [image]   Test image performance metrics
    full-optimize              Run complete optimization workflow

Examples:
    $0 analyze myapp:v1.2.3    # Analyze image structure
    $0 compress myapp:latest   # Compress existing image
    $0 full-optimize           # Complete optimization workflow
EOF
        ;;
esac
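The size claims in the optimized Dockerfile's comments are easy to sanity-check: given a before and after size, the reduction is simple integer arithmetic. A minimal sketch (the function name and the sizes in the comment are illustrative):

```shell
#!/usr/bin/env bash
# Percentage reduction between two image sizes (same unit, e.g. MB).
# Bash integer arithmetic only, so the result is truncated toward zero.
size_reduction_pct() {
    local before="$1" after="$2"
    echo $(( (before - after) * 100 / before ))
}

# e.g. a 1200 MB single-stage image trimmed to 250 MB by multi-stage builds:
# size_reduction_pct 1200 250   -> 79
```

Feeding in the numbers from `docker images` before and after a rebuild gives a concrete figure to put next to the "70%+ reduction" claim for your own application.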

Registry Management: Enterprise-Grade Container Storage

Professional Container Registry Operations

Registry management that scales with your organization:

#!/bin/bash
# registry-management.sh - Enterprise container registry management

set -euo pipefail

# Configuration
REGISTRY_HOST="${REGISTRY_HOST:-registry.company.com}"
REGISTRY_PORT="${REGISTRY_PORT:-5000}"
DOCKER_REGISTRY_VERSION="${DOCKER_REGISTRY_VERSION:-2.8}"

# ========================================
# Private Registry Setup
# ========================================
setup_private_registry() {
    echo "🏢 Setting up private Docker registry..."

    # Create registry directories
    mkdir -p registry/{data,certs,auth,config}

    # Generate TLS certificates
    generate_registry_certificates

    # Set up authentication
    setup_registry_authentication

    # Create registry configuration
    create_registry_config

    # Start registry with Docker Compose
    create_registry_compose_file
    docker-compose -f registry-compose.yml up -d

    # Verify registry is running
    test_registry_connectivity

    echo "✅ Private registry setup completed"
    echo "🌐 Registry available at: https://$REGISTRY_HOST:$REGISTRY_PORT"
}

generate_registry_certificates() {
    echo "🔐 Generating TLS certificates for registry..."

    # Generate CA private key
    openssl genrsa -out registry/certs/ca-key.pem 4096

    # Generate CA certificate
    openssl req -new -x509 -days 365 -key registry/certs/ca-key.pem \
        -sha256 -out registry/certs/ca.pem \
        -subj "/C=US/ST=CA/L=San Francisco/O=Company/CN=Registry CA"

    # Generate server private key
    openssl genrsa -out registry/certs/server-key.pem 4096

    # Generate server certificate signing request
    openssl req -subj "/CN=$REGISTRY_HOST" -sha256 -new \
        -key registry/certs/server-key.pem \
        -out registry/certs/server.csr

    # Create extensions file
    cat > registry/certs/server-extensions.cnf << EOF
subjectAltName = DNS:$REGISTRY_HOST,IP:127.0.0.1
extendedKeyUsage = serverAuth
EOF

    # Generate server certificate
    openssl x509 -req -days 365 -sha256 \
        -in registry/certs/server.csr \
        -CA registry/certs/ca.pem \
        -CAkey registry/certs/ca-key.pem \
        -out registry/certs/server.pem \
        -extfile registry/certs/server-extensions.cnf \
        -CAcreateserial

    # Set appropriate permissions
    chmod 400 registry/certs/*-key.pem
    chmod 444 registry/certs/*.pem

    echo "✅ TLS certificates generated"
}

setup_registry_authentication() {
    echo "🔑 Setting up registry authentication..."

    # Install htpasswd if not available
    if ! command -v htpasswd &> /dev/null; then
        apt-get update && apt-get install -y apache2-utils
    fi

    # Create user accounts
    local users=("admin:admin123" "developer:dev123" "ci:ci123")

    for user_pass in "${users[@]}"; do
        local user="${user_pass%%:*}"
        local pass="${user_pass##*:}"

        if [ ! -f "registry/auth/htpasswd" ]; then
            htpasswd -Bbn "$user" "$pass" > registry/auth/htpasswd
        else
            htpasswd -Bb registry/auth/htpasswd "$user" "$pass"
        fi

        echo "Created user: $user"
    done

    echo "✅ Authentication configured"
}

create_registry_config() {
    echo "⚙️  Creating registry configuration..."

    cat > registry/config/config.yml << EOF
version: 0.1
log:
  fields:
    service: registry
  level: info
storage:
  cache:
    blobdescriptor: inmemory
  filesystem:
    rootdirectory: /var/lib/registry
  delete:
    enabled: true
  maintenance:
    uploadpurging:
      enabled: true
      age: 168h
      interval: 24h
      dryrun: false
http:
  addr: :5000
  headers:
    X-Content-Type-Options: [nosniff]
    Access-Control-Allow-Origin: ['*']
    Access-Control-Allow-Methods: ['HEAD', 'GET', 'OPTIONS', 'DELETE']
    Access-Control-Allow-Headers: ['Authorization', 'Accept']
    Access-Control-Max-Age: [1728000]
    Access-Control-Allow-Credentials: [true]
    Access-Control-Expose-Headers: ['Docker-Content-Digest']
  tls:
    certificate: /certs/server.pem
    key: /certs/server-key.pem
auth:
  htpasswd:
    realm: basic-realm
    path: /auth/htpasswd
validation:
  manifests:
    urls:
      allow:
        - ^https?://
health:
  storagedriver:
    enabled: true
    interval: 10s
    threshold: 3
# Optional pull-through cache for Docker Hub. Note that a proxy registry is
# read-only (pushes are rejected), so leave this disabled for the private
# registry used by the push/pull helpers below.
# proxy:
#   remoteurl: https://registry-1.docker.io
#   username: [username]
#   password: [password]
EOF

    echo "✅ Registry configuration created"
}

create_registry_compose_file() {
    echo "🐳 Creating Docker Compose configuration..."

    cat > registry-compose.yml << EOF
version: '3.8'

services:
  registry:
    image: registry:${DOCKER_REGISTRY_VERSION}
    container_name: docker-registry
    restart: unless-stopped
    ports:
      - "${REGISTRY_PORT}:5000"
    command: ["registry", "serve", "/etc/registry/config.yml"]
    volumes:
      - ./registry/data:/var/lib/registry
      - ./registry/certs:/certs:ro
      - ./registry/auth:/auth:ro
      - ./registry/config/config.yml:/etc/registry/config.yml:ro
    healthcheck:
      test: ["CMD", "wget", "--quiet", "--no-check-certificate", "--tries=1", "--spider", "https://localhost:5000/v2/"]
      interval: 30s
      timeout: 10s
      retries: 3

  registry-ui:
    image: joxit/docker-registry-ui:latest
    container_name: registry-ui
    restart: unless-stopped
    ports:
      - "8080:80"
    environment:
      - REGISTRY_TITLE=Company Docker Registry
      - REGISTRY_URL=https://registry:5000
      - DELETE_IMAGES=true
      - SHOW_CONTENT_DIGEST=true
      - NGINX_PROXY_PASS_URL=https://registry:5000
    depends_on:
      - registry

  registry-cleaner:
    image: mortensrasmussen/docker-registry-cleaner:latest
    container_name: registry-cleaner
    restart: unless-stopped
    environment:
      - REGISTRY_URL=https://registry:5000
      - REGISTRY_USERNAME=admin
      - REGISTRY_PASSWORD=admin123
      - CLEAN_INTERVAL=86400  # 24 hours
      - KEEP_TAGS=10
      - DRY_RUN=false
    depends_on:
      - registry

volumes:
  registry-data:
    driver: local
EOF

    echo "✅ Docker Compose file created"
}

# ========================================
# Registry Operations
# ========================================
push_image_to_registry() {
    local image="${1:-myapp:latest}"
    local registry_image="$REGISTRY_HOST:$REGISTRY_PORT/myapp:${image##*:}"

    echo "📤 Pushing image to private registry..."

    # Tag image for private registry
    docker tag "$image" "$registry_image"

    # Login to private registry
    echo "🔐 Logging in to registry..."
    docker login "$REGISTRY_HOST:$REGISTRY_PORT"

    # Push image
    echo "⬆️  Pushing $registry_image..."
    docker push "$registry_image"

    # Verify push
    echo "✅ Image pushed successfully"
    echo "📋 Image available at: $registry_image"

    # Show image information
    get_image_info "$registry_image"
}

pull_image_from_registry() {
    local tag="${1:-latest}"
    local registry_image="$REGISTRY_HOST:$REGISTRY_PORT/myapp:$tag"

    echo "📥 Pulling image from private registry..."

    # Login to private registry
    docker login "$REGISTRY_HOST:$REGISTRY_PORT"

    # Pull image
    docker pull "$registry_image"

    # Tag as local image
    docker tag "$registry_image" "myapp:$tag"

    echo "✅ Image pulled and tagged as myapp:$tag"
}

list_registry_repositories() {
    echo "📋 Listing registry repositories..."

    # Get repository list
    curl -s -k -u admin:admin123 \
        "https://$REGISTRY_HOST:$REGISTRY_PORT/v2/_catalog" | \
        jq -r '.repositories[]' | \
        sort
}

list_image_tags() {
    local repository="${1:-myapp}"

    echo "🏷️  Listing tags for repository: $repository"

    curl -s -k -u admin:admin123 \
        "https://$REGISTRY_HOST:$REGISTRY_PORT/v2/$repository/tags/list" | \
        jq -r '.tags[]' | \
        sort -V
}

get_image_info() {
    local image="${1:-myapp:latest}"
    local repository="${image%:*}"
    local tag="${image##*:}"

    echo "ℹ️  Getting image information for $image..."

    # Get manifest
    local manifest=$(curl -s -k -u admin:admin123 \
        -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
        "https://$REGISTRY_HOST:$REGISTRY_PORT/v2/$repository/manifests/$tag")

    echo "📊 Image details:"
    # Note: schema v2 manifests carry no creation date; fetch the config
    # blob if you need it
    echo "$manifest" | jq '{
        schemaVersion: .schemaVersion,
        mediaType: .mediaType,
        size: (.config.size + ([.layers[].size] | add)),
        layers: (.layers | length)
    }'
}

# ========================================
# Registry Maintenance
# ========================================
cleanup_registry() {
    local keep_tags="${1:-5}"
    local repository="${2:-}"

    echo "🧹 Cleaning up registry (keeping $keep_tags recent tags per repository)..."

    if [ -n "$repository" ]; then
        cleanup_repository "$repository" "$keep_tags"
    else
        # Cleanup all repositories
        list_registry_repositories | while read -r repo; do
            cleanup_repository "$repo" "$keep_tags"
        done
    fi

    # Garbage collect to free space
    echo "🗑️  Running garbage collection..."
    docker exec docker-registry registry garbage-collect \
        /etc/registry/config.yml

    echo "✅ Registry cleanup completed"
}

cleanup_repository() {
    local repository="$1"
    local keep_tags="$2"

    echo "🧹 Cleaning up repository: $repository (keeping $keep_tags tags)"

    # Get all tags sorted by version (oldest first)
    local tags=$(curl -s -k -u admin:admin123 \
        "https://$REGISTRY_HOST:$REGISTRY_PORT/v2/$repository/tags/list" | \
        jq -r '.tags[]' | \
        sort -V)

    local tag_count=$(echo "$tags" | wc -l)

    if [ "$tag_count" -le "$keep_tags" ]; then
        echo "  Repository has $tag_count tags, keeping all"
        return
    fi

    local delete_count=$((tag_count - keep_tags))
    echo "  Deleting $delete_count old tags..."

    # Delete old tags
    echo "$tags" | head -n "$delete_count" | while read -r tag; do
        delete_image_tag "$repository" "$tag"
    done
}
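The retention math in cleanup_repository is easy to check in isolation. With a hypothetical tag list and a keep count of 2, the three oldest version-sorted tags become deletion candidates:

```shell
# Hypothetical tag list; sort -V orders v1.10.0 after v1.2.0 (version sort,
# not lexicographic), so the head of the list really is the oldest releases
tags=$(printf 'v2.0.0\nv1.10.0\nv1.0.0\nv1.2.0\nv1.1.0\n' | sort -V)
keep_tags=2

tag_count=$(echo "$tags" | wc -l)
delete_count=$((tag_count - keep_tags))

# Same selection cleanup_repository makes: the oldest N go, the newest survive
echo "$tags" | head -n "$delete_count"
```

Version sort matters here: plain `sort` would place v1.10.0 before v1.2.0 and delete the wrong tags.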

delete_image_tag() {
    local repository="$1"
    local tag="$2"

    echo "🗑️  Deleting $repository:$tag..."

    # Get the manifest digest via a HEAD request; deleting by digest only
    # works if the registry runs with deletes enabled
    # (REGISTRY_STORAGE_DELETE_ENABLED=true on the official image)
    local digest=$(curl -s -k -u admin:admin123 \
        -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
        -I "https://$REGISTRY_HOST:$REGISTRY_PORT/v2/$repository/manifests/$tag" | \
        grep -i docker-content-digest | \
        awk '{print $2}' | tr -d '\r')

    if [ -n "$digest" ]; then
        # Delete by digest
        curl -s -k -u admin:admin123 -X DELETE \
            "https://$REGISTRY_HOST:$REGISTRY_PORT/v2/$repository/manifests/$digest"
        echo "  ✅ Deleted $repository:$tag"
    else
        echo "  ❌ Could not get digest for $repository:$tag"
    fi
}
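The header-parsing step in delete_image_tag can be exercised without a live registry by feeding the same pipeline a canned HEAD response. The digest value below is made up:

```shell
# Canned HEAD response with CRLF line endings, as curl -I would print it
# (the sha256 value is a placeholder, not a real digest)
headers=$(printf 'HTTP/1.1 200 OK\r\nDocker-Content-Digest: sha256:0123abcd\r\nContent-Length: 1469\r\n')

# Same grep/awk/tr pipeline delete_image_tag runs on the real curl output;
# tr strips the trailing carriage return that HTTP headers carry
digest=$(echo "$headers" | grep -i docker-content-digest | awk '{print $2}' | tr -d '\r')
echo "$digest"
```

Without the `tr -d '\r'` step the stray carriage return rides along into the DELETE URL and the request 404s, which is an easy bug to miss.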

# ========================================
# Registry Monitoring
# ========================================
monitor_registry() {
    echo "📊 Registry monitoring dashboard"

    while true; do
        clear
        echo "=================================="
        echo "Registry Status Dashboard"
        echo "$(date)"
        echo "=================================="

        # Registry health
        if curl -s -k "https://$REGISTRY_HOST:$REGISTRY_PORT/v2/" &>/dev/null; then
            echo "✅ Registry: Online"
        else
            echo "❌ Registry: Offline"
        fi

        # Storage usage
        echo ""
        echo "💾 Storage Usage:"
        du -sh registry/data 2>/dev/null || echo "  Could not read storage usage"

        # Repository count
        echo ""
        echo "📦 Repositories:"
        local repo_count=$(list_registry_repositories 2>/dev/null | wc -l)
        echo "  Total repositories: $repo_count"

        # Recent activity (last 5 request-log lines)
        echo ""
        echo "📋 Recent Activity:"
        docker logs docker-registry --tail=5 2>/dev/null | \
            grep -E "(GET|PUT|DELETE)" | \
            tail -5

        echo ""
        echo "Press Ctrl+C to exit monitoring..."
        sleep 30
    done
}

test_registry_connectivity() {
    echo "🔍 Testing registry connectivity..."

    # Test 1: Health check
    if curl -s -k "https://$REGISTRY_HOST:$REGISTRY_PORT/v2/" | grep -q '{}'; then
        echo "✅ Registry API responding"
    else
        echo "❌ Registry API not responding"
        return 1
    fi

    # Test 2: Authentication
    if curl -s -k -u admin:admin123 "https://$REGISTRY_HOST:$REGISTRY_PORT/v2/_catalog" &>/dev/null; then
        echo "✅ Authentication working"
    else
        echo "❌ Authentication failed"
        return 1
    fi

    # Test 3: TLS certificate
    if openssl s_client -connect "$REGISTRY_HOST:$REGISTRY_PORT" -verify_return_error </dev/null &>/dev/null; then
        echo "✅ TLS certificate valid"
    else
        echo "⚠️  TLS certificate issues (may be self-signed)"
    fi

    echo "✅ Registry connectivity tests completed"
}
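Test 3 only reports whether the chain verifies; certificate inspection can go further, and the same checks work offline. A sketch using a throwaway self-signed certificate (the CN and file paths are placeholders):

```shell
# Generate a disposable self-signed cert (hypothetical CN and paths), the
# same shape as the cert a local registry would present
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
    -subj "/CN=registry.local" \
    -keyout /tmp/registry-demo.key -out /tmp/registry-demo.crt 2>/dev/null

# -checkend 0 exits 0 while the cert is still inside its validity window
if openssl x509 -in /tmp/registry-demo.crt -checkend 0 > /dev/null; then
    echo "✅ Certificate not expired"
fi

# Inspect subject and expiry, as you would for a live registry cert
openssl x509 -in /tmp/registry-demo.crt -noout -subject -enddate
```

Against a running registry, the cert can be captured first with `openssl s_client -connect "$REGISTRY_HOST:$REGISTRY_PORT" </dev/null` and piped into the same `openssl x509` inspection.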

# Command routing
case "${1:-help}" in
    setup)
        setup_private_registry
        ;;
    push)
        push_image_to_registry "${2:-myapp:latest}"
        ;;
    pull)
        pull_image_from_registry "${2:-latest}"
        ;;
    list)
        list_registry_repositories
        ;;
    tags)
        list_image_tags "${2:-myapp}"
        ;;
    info)
        get_image_info "${2:-myapp:latest}"
        ;;
    cleanup)
        cleanup_registry "${2:-5}" "${3:-}"
        ;;
    monitor)
        monitor_registry
        ;;
    test)
        test_registry_connectivity
        ;;
    help|*)
        cat << EOF
Registry Management System

Usage: $0 <command> [options]

Commands:
    setup                       Set up private Docker registry
    push [image]               Push image to private registry
    pull [tag]                 Pull image from private registry
    list                       List all repositories
    tags [repository]          List tags for repository
    info [image]               Get detailed image information
    cleanup [keep] [repo]      Clean up old images
    monitor                    Real-time registry monitoring
    test                       Test registry connectivity

Examples:
    $0 setup                   # Set up private registry
    $0 push myapp:v1.2.3      # Push specific version
    $0 tags myapp             # List all tags for myapp
    $0 cleanup 10 myapp       # Keep 10 recent tags for myapp
EOF
        ;;
esac

Key Takeaways

Advanced containerization transforms basic Docker usage into production-grade infrastructure that scales, stays secure, and operates reliably under enterprise conditions. Professional container operations require thinking beyond single-host deployments to orchestration platforms, comprehensive security hardening, performance optimization, and centralized registry management.

The advanced containerization mastery mindset:

  • Orchestration enables scale: Single-host containers don’t scale—orchestration platforms handle real-world complexity
  • Security requires depth: Container security is layered—image scanning, runtime hardening, network policies, and secrets management
  • Performance needs optimization: Default containers are bloated—production requires optimized images, efficient layers, and performance testing
  • Registries need management: Docker Hub isn’t enough—enterprise requires private registries with access control, cleanup, and monitoring

What distinguishes production-grade containerization:

  • Orchestration strategies that handle automatic scaling, health checking, and zero-downtime deployments
  • Comprehensive security hardening that prevents container escapes and vulnerability exploitation
  • Image optimization techniques that reduce deployment time and runtime resource consumption
  • Enterprise registry management with authentication, cleanup policies, and operational monitoring

What’s Next

This article covered advanced containerization with orchestration concepts, production deployment strategies, security hardening, image optimization, and registry management. The next article advances to deployment infrastructure with CI/CD pipeline automation, infrastructure as code, monitoring and alerting systems, log aggregation, and disaster recovery planning that supports containerized applications at scale.

You’re no longer just running containers—you’re operating production-grade containerized infrastructure that scales, performs, and secures applications for enterprise environments. The containerization foundation is complete. Now we build the deployment and operational infrastructure around it.