
Whisper Docker Setup: Complete Guide to Running OpenAI Whisper in Docker
Eric King
Introduction
Running OpenAI Whisper in Docker containers provides a consistent, isolated environment that simplifies deployment and eliminates "it works on my machine" issues. Docker makes it easy to:
- Deploy anywhere - Run the same container on any Docker-compatible platform
- Isolate dependencies - Avoid conflicts with system packages
- Scale easily - Spin up multiple containers for parallel processing
- Version control - Pin specific Whisper versions and configurations
- Simplify deployment - One command to run everything
This guide covers everything you need to set up Whisper in Docker, from basic containers to production-ready configurations with GPU support.
Why Use Docker for Whisper?
Benefits of Containerization
1. Consistency
- Same environment across development, staging, and production
- No dependency conflicts
- Reproducible builds
2. Portability
- Run on any platform that supports Docker
- Easy migration between servers
- Cloud-agnostic deployment
3. Isolation
- No interference with host system
- Clean uninstall (just remove container)
- Security through isolation
4. Scalability
- Easy horizontal scaling
- Load balancing across containers
- Resource limits per container
5. DevOps Integration
- Works with CI/CD pipelines
- Kubernetes-ready
- Cloud deployment friendly
Prerequisites
Before starting, ensure you have:
- Docker installed (version 20.10+)
- Docker Compose (optional, for multi-container setups)
- NVIDIA Docker (optional, for GPU support)
- Basic knowledge of Docker commands
Install Docker
macOS:
# Install Docker Desktop from docker.com
# Or using Homebrew
brew install --cask docker
Ubuntu/Debian:
sudo apt update
sudo apt install docker.io docker-compose
sudo systemctl start docker
sudo systemctl enable docker
Windows:
Download Docker Desktop from docker.com
Verify Installation
docker --version
docker-compose --version
Basic Dockerfile for Whisper
Let's start with a simple Dockerfile that sets up Whisper:
FROM python:3.10-slim
# Set working directory
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y \
    ffmpeg \
    git \
    && rm -rf /var/lib/apt/lists/*
# Install Python dependencies
RUN pip install --no-cache-dir \
    openai-whisper \
    torch \
    torchaudio
# Copy application code (if you have custom scripts)
# COPY . .
# Set default command
CMD ["whisper", "--help"]
Build the Image
docker build -t whisper:latest .
Run Basic Container
# Confirm the Whisper CLI runs inside the container
docker run --rm whisper:latest whisper --help
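To transcribe a real file, mount a local directory into the container. A minimal example, assuming an audio.mp3 in the current directory (the named model-cache volume is optional but avoids re-downloading models on every run):
docker run --rm \
  -v "$(pwd):/data" \
  -v whisper-models:/root/.cache/whisper \
  whisper:latest \
  whisper /data/audio.mp3 --model base --output_dir /data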
Dockerfile with API Server
For production use, you'll likely want an API server. Here's a more complete Dockerfile:
FROM python:3.10-slim
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y \
    ffmpeg \
    git \
    curl \
    && rm -rf /var/lib/apt/lists/*
# Install Python dependencies
RUN pip install --no-cache-dir \
    openai-whisper \
    torch \
    torchaudio \
    fastapi \
    uvicorn \
    python-multipart
# Create directories for audio and output
RUN mkdir -p /app/audio /app/output
# Copy application code
COPY app.py .
COPY requirements.txt .
# Expose API port
EXPOSE 8000
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1
# Run API server
CMD ["uvicorn", "app.py:app", "--host", "0.0.0.0", "--port", "8000"]
Example API Server (app.py)
from fastapi import FastAPI, File, UploadFile
from fastapi.responses import JSONResponse
import whisper
import os

app = FastAPI()

# Load Whisper model (can be configured via env)
model_name = os.getenv("WHISPER_MODEL", "base")
model = whisper.load_model(model_name)

@app.get("/health")
def health():
    return {"status": "healthy"}

@app.post("/transcribe")
async def transcribe(file: UploadFile = File(...)):
    # Save uploaded file
    file_path = f"/app/audio/{file.filename}"
    with open(file_path, "wb") as f:
        content = await file.read()
        f.write(content)

    # Transcribe
    result = model.transcribe(file_path)

    # Clean up
    os.remove(file_path)

    return JSONResponse(content={
        "text": result["text"],
        "language": result["language"]
    })
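Once the API container is running with port 8000 published, you can exercise the endpoint with curl; meeting.wav here is a placeholder file name:
curl -X POST -F "file=@meeting.wav" http://localhost:8000/transcribe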
requirements.txt
fastapi==0.104.1
uvicorn[standard]==0.24.0
python-multipart==0.0.6
openai-whisper
torch
torchaudio
Docker Compose Setup
For a complete setup with multiple services, use Docker Compose:
docker-compose.yml
version: '3.8'
services:
  whisper-api:
    build: .
    container_name: whisper-api
    ports:
      - "8000:8000"
    volumes:
      - ./audio:/app/audio
      - ./output:/app/output
    environment:
      - WHISPER_MODEL=base
      - CUDA_VISIBLE_DEVICES=0
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

  # Optional: Redis for queue management
  redis:
    image: redis:7-alpine
    container_name: whisper-redis
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data
    restart: unless-stopped

volumes:
  redis-data:
Run with Docker Compose
# Start services
docker-compose up -d
# View logs
docker-compose logs -f whisper-api
# Stop services
docker-compose down
GPU Support with Docker
To use GPU acceleration, you need NVIDIA Docker runtime:
Install NVIDIA Docker
Ubuntu/Debian:
# Add NVIDIA Docker repository
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
    sudo tee /etc/apt/sources.list.d/nvidia-docker.list
# Install
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker
Dockerfile with GPU Support
FROM nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu22.04
WORKDIR /app
# Install Python
RUN apt-get update && apt-get install -y \
    python3.10 \
    python3-pip \
    ffmpeg \
    git \
    curl \
    && rm -rf /var/lib/apt/lists/*
# Install PyTorch with CUDA support from the cu118 wheel index,
# then Whisper from PyPI (it is not hosted on the PyTorch index)
RUN pip3 install --no-cache-dir \
    torch \
    torchaudio \
    --index-url https://download.pytorch.org/whl/cu118
RUN pip3 install --no-cache-dir openai-whisper
# Install API dependencies
RUN pip3 install --no-cache-dir \
    fastapi \
    uvicorn \
    python-multipart
COPY app.py .
EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
Run with GPU
# Using docker run
docker run --gpus all -p 8000:8000 whisper-gpu:latest
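Before relying on GPU acceleration, it's worth verifying that PyTorch inside the container can actually see the GPU. A quick check, assuming the image built above is tagged whisper-gpu:latest:
docker run --rm --gpus all whisper-gpu:latest \
  python3 -c "import torch; print(torch.cuda.is_available())"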
docker-compose.yml with GPU
version: '3.8'
services:
  whisper-api:
    build: .
    container_name: whisper-api-gpu
    runtime: nvidia
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
    ports:
      - "8000:8000"
    volumes:
      - ./audio:/app/audio
      - ./output:/app/output
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
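The Compose file gives the container access to the GPU, but the application must also load the model onto it. A small sketch of how app.py might select the device; whisper.load_model accepts a device argument:
import os
import torch
import whisper

# Use the GPU when the container can see one, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
model_name = os.getenv("WHISPER_MODEL", "base")
model = whisper.load_model(model_name, device=device)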
Optimized Dockerfile for Production
Here's a production-ready Dockerfile with optimizations:
# Multi-stage build for smaller image
FROM python:3.10-slim as builder
WORKDIR /app
# Install build dependencies
RUN apt-get update && apt-get install -y \
    gcc \
    g++ \
    git \
    && rm -rf /var/lib/apt/lists/*
# Install Python packages
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt
# Final stage
FROM python:3.10-slim
WORKDIR /app
# Install runtime dependencies only
RUN apt-get update && apt-get install -y \
    ffmpeg \
    curl \
    && rm -rf /var/lib/apt/lists/* \
    && apt-get clean
# Create non-root user for security
RUN useradd -m -u 1000 whisper && \
    mkdir -p /app/audio /app/output && \
    chown -R whisper:whisper /app
# Copy Python packages from builder into the non-root user's home
# (packages under /root/.local would be unreadable once we drop root)
COPY --from=builder --chown=whisper:whisper /root/.local /home/whisper/.local
# Make sure scripts in .local are usable
ENV PATH=/home/whisper/.local/bin:$PATH
USER whisper
# Copy application code
COPY --chown=whisper:whisper app.py .
COPY --chown=whisper:whisper requirements.txt .
EXPOSE 8000
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
Benefits of Multi-Stage Build
- Smaller image size - Only runtime dependencies in final image
- Faster builds - Cache build dependencies separately
- Better security - Non-root user, minimal attack surface
Environment Variables Configuration
Make your Docker setup configurable with environment variables:
Dockerfile
FROM python:3.10-slim
WORKDIR /app
RUN apt-get update && apt-get install -y \
    ffmpeg \
    git \
    curl \
    && rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir \
    openai-whisper \
    torch \
    torchaudio \
    fastapi \
    uvicorn \
    python-multipart
COPY app.py .
# Environment variables with defaults
ENV WHISPER_MODEL=base
ENV MAX_FILE_SIZE=100MB
ENV LOG_LEVEL=INFO
EXPOSE 8000
CMD ["uvicorn", "app.py:app", "--host", "0.0.0.0", "--port", "8000"]
docker-compose.yml with Environment Variables
version: '3.8'
services:
  whisper-api:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - ./audio:/app/audio
      - ./output:/app/output
    environment:
      - WHISPER_MODEL=small
      - MAX_FILE_SIZE=200MB
      - LOG_LEVEL=DEBUG
      - CUDA_VISIBLE_DEVICES=0
    env_file:
      - .env
    restart: unless-stopped
.env file
WHISPER_MODEL=small
MAX_FILE_SIZE=200MB
LOG_LEVEL=INFO
CUDA_VISIBLE_DEVICES=0
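These variables only take effect if the application reads them. A sketch of how app.py might consume them; parse_size is a hypothetical helper, and the earlier app.py does not actually enforce a size limit:
import logging
import os

# Read configuration from the environment, with the same defaults as the Dockerfile
MODEL_NAME = os.getenv("WHISPER_MODEL", "base")
LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")

# Parse a value like "200MB" into bytes (illustrative helper, not part of Whisper)
def parse_size(value: str) -> int:
    value = value.strip().upper()
    if value.endswith("MB"):
        return int(value[:-2]) * 1024 * 1024
    return int(value)

MAX_FILE_SIZE = parse_size(os.getenv("MAX_FILE_SIZE", "100MB"))
logging.basicConfig(level=getattr(logging, LOG_LEVEL, logging.INFO))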
Volume Management
Proper volume configuration ensures data persistence:
docker-compose.yml with Volumes
version: '3.8'
services:
  whisper-api:
    build: .
    ports:
      - "8000:8000"
    volumes:
      # Bind mount for development
      - ./audio:/app/audio
      - ./output:/app/output
      # Named volume for model cache (persists across containers)
      - whisper-models:/root/.cache/whisper
      # Config volume
      - ./config:/app/config:ro
    environment:
      - WHISPER_MODEL=base

volumes:
  whisper-models:
    driver: local
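After the first run, you can confirm the model cache volume exists (Compose prefixes volume names with the project name, typically the directory name):
docker volume ls | grep whisper-models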
Benefits
- Model caching - Models downloaded once, reused across containers
- Data persistence - Output files survive container restarts
- Configuration - Easy to update configs without rebuilding
Health Checks and Monitoring
Dockerfile with Health Check
FROM python:3.10-slim
WORKDIR /app
RUN apt-get update && apt-get install -y \
    ffmpeg \
    curl \
    && rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir \
    openai-whisper \
    fastapi \
    uvicorn
COPY app.py .
# Health check endpoint
HEALTHCHECK --interval=30s \
    --timeout=10s \
    --start-period=40s \
    --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1
EXPOSE 8000
CMD ["uvicorn", "app.py:app", "--host", "0.0.0.0", "--port", "8000"]
Health Check Endpoint
from fastapi import FastAPI
from fastapi.responses import JSONResponse
import whisper

app = FastAPI()
model = whisper.load_model("base")

@app.get("/health")
def health():
    try:
        # Confirm the model is loaded and responsive
        assert model is not None
        return {"status": "healthy", "model": "base"}
    except Exception as e:
        return JSONResponse(
            status_code=503,
            content={"status": "unhealthy", "error": str(e)}
        )
Common Use Cases
Use Case 1: Development Environment
version: '3.8'
services:
  whisper-dev:
    build:
      context: .
      dockerfile: Dockerfile.dev
    volumes:
      - .:/app
      - /app/__pycache__
    ports:
      - "8000:8000"
    environment:
      - WHISPER_MODEL=tiny
      - DEBUG=true
    command: uvicorn app:app --reload --host 0.0.0.0 --port 8000
Use Case 2: Production with Queue
version: '3.8'
services:
  whisper-api:
    build: .
    ports:
      - "8000:8000"
    environment:
      - REDIS_URL=redis://redis:6379
      - WHISPER_MODEL=small
    depends_on:
      - redis
      - worker
  worker:
    build: .
    command: python worker.py
    environment:
      - REDIS_URL=redis://redis:6379
      - WHISPER_MODEL=small
    volumes:
      - ./audio:/app/audio
    depends_on:
      - redis
  redis:
    image: redis:7-alpine
    volumes:
      - redis-data:/data

volumes:
  redis-data:
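The Compose file references a worker.py that this guide doesn't define. A minimal sketch of what such a worker could look like, using a Redis list as the job queue; the queue name and job format are assumptions, and the redis Python package would need to be added to the image:
import os

import redis  # not installed by the Dockerfiles above; add it to the image
import whisper

r = redis.from_url(os.getenv("REDIS_URL", "redis://localhost:6379"))
model = whisper.load_model(os.getenv("WHISPER_MODEL", "small"))

while True:
    # Block until a job arrives; each job is a path to a file under /app/audio
    _, audio_path = r.blpop("transcribe:jobs")
    path = audio_path.decode()
    result = model.transcribe(path)
    # Store the transcript under a key derived from the file path
    r.set(f"transcribe:result:{path}", result["text"])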
Use Case 3: Multi-Model Setup
version: '3.8'
services:
  whisper-fast:
    build: .
    ports:
      - "8001:8000"
    environment:
      - WHISPER_MODEL=tiny
      - PORT=8000
  whisper-balanced:
    build: .
    ports:
      - "8002:8000"
    environment:
      - WHISPER_MODEL=base
      - PORT=8000
  whisper-accurate:
    build: .
    ports:
      - "8003:8000"
    environment:
      - WHISPER_MODEL=large
      - PORT=8000
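Clients (or a reverse proxy in front of these services) can then choose the speed/accuracy trade-off per request by targeting the matching port, assuming each container serves the /transcribe endpoint from earlier:
# Fast, lower-accuracy transcription (tiny model)
curl -F "file=@meeting.wav" http://localhost:8001/transcribe
# Slower, higher-accuracy transcription (large model)
curl -F "file=@meeting.wav" http://localhost:8003/transcribe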
Best Practices
1. Use Specific Base Images
Bad:
FROM python:latest
Good:
FROM python:3.10-slim
2. Minimize Layers
Bad:
RUN apt-get update
RUN apt-get install -y ffmpeg
RUN apt-get install -y git
Good:
RUN apt-get update && apt-get install -y \
    ffmpeg \
    git \
    && rm -rf /var/lib/apt/lists/*
3. Use .dockerignore
Create a .dockerignore file:
__pycache__
*.pyc
*.pyo
*.pyd
.Python
.env
.venv
venv/
.git
.gitignore
README.md
*.md
.DS_Store
4. Set Resource Limits
services:
  whisper-api:
    build: .
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 4G
        reservations:
          cpus: '1'
          memory: 2G
5. Use Health Checks
Always include health checks for production containers:
HEALTHCHECK --interval=30s --timeout=10s CMD curl -f http://localhost:8000/health || exit 1
6. Non-Root User
Run containers as non-root:
RUN useradd -m -u 1000 whisper
USER whisper
7. Cache Models
Use volumes to cache downloaded models:
volumes:
  - whisper-models:/root/.cache/whisper
Troubleshooting Common Issues
Issue 1: Container Exits Immediately
Problem: Container starts then exits
Solution:
# Check logs
docker logs <container-id>
# Run interactively to debug
docker run -it whisper:latest /bin/bash
Issue 2: GPU Not Available
Problem: GPU not detected in container
Solution:
# Verify NVIDIA Docker
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
# Check runtime
docker info | grep -i runtime
Issue 3: Out of Memory
Problem: Container runs out of memory
Solution:
# Increase memory limit
deploy:
  resources:
    limits:
      memory: 8G
Issue 4: Slow Model Download
Problem: Models download every time container starts
Solution:
# Use volume for model cache
volumes:
  - whisper-models:/root/.cache/whisper
Issue 5: Permission Denied
Problem: Cannot write to volumes
Solution:
# Fix permissions in Dockerfile
RUN chown -R whisper:whisper /app
Performance Optimization
1. Model Preloading
Preload models in Dockerfile:
# Download model during build
RUN python -c "import whisper; whisper.load_model('base')"
2. Use Faster-Whisper
For better performance, consider faster-whisper, a CTranslate2-based reimplementation that is typically several times faster and uses less memory:
RUN pip install --no-cache-dir faster-whisper
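faster-whisper exposes a different API than openai-whisper, so application code must change with it. A minimal sketch:
from faster_whisper import WhisperModel

# int8 quantization keeps memory low on CPU; "float16" is common on GPU
model = WhisperModel("base", device="cpu", compute_type="int8")

# transcribe() returns a generator of segments plus metadata
segments, info = model.transcribe("/app/audio/meeting.wav")
text = "".join(segment.text for segment in segments)
print(info.language, text)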
3. Multiple Workers
Run several worker processes to handle concurrent requests (each uvicorn worker is a separate process that loads its own copy of the model, so memory use scales with the worker count):
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4"]
4. Resource Allocation
Allocate appropriate resources:
deploy:
  resources:
    limits:
      cpus: '4'
      memory: 8G
Security Considerations
1. Use Official Base Images
# Official Python image
FROM python:3.10-slim
2. Scan for Vulnerabilities
# The older "docker scan" command has been removed; use Docker Scout
docker scout cves whisper:latest
3. Keep Images Updated
Regularly update base images and dependencies:
# Use the latest patch version of the base image
FROM python:3.10-slim
RUN pip install --upgrade pip
4. Limit Network Access
services:
  whisper-api:
    build: .
    networks:
      - internal
    # No external ports if accessed via a reverse proxy

networks:
  internal:
    internal: true
Conclusion
Dockerizing Whisper provides a robust, scalable solution for speech-to-text transcription. Key takeaways:
- Start simple - Begin with a basic Dockerfile
- Use Docker Compose - Simplify multi-service setups
- Enable GPU - For production performance
- Follow best practices - Security, optimization, monitoring
- Test thoroughly - Before deploying to production
With proper Docker setup, you can deploy Whisper consistently across any environment, from local development to cloud production.
Next Steps
- Build your first container - Start with the basic Dockerfile
- Add GPU support - If you have NVIDIA GPUs available
- Set up Docker Compose - For complete application stack
- Deploy to cloud - Use cloud container services (ECS, GKE, AKS)
For more deployment strategies, check out our guides on Whisper Cloud Deployment and Whisper API vs Local Deployment.
