🎉 We're live! All services are free during our trial period—pricing plans coming soon.

Whisper Docker Setup: Complete Guide to Running OpenAI Whisper in Docker

Whisper Docker Setup: Complete Guide to Running OpenAI Whisper in Docker

Eric King

Eric King

Author


Introduction

Running OpenAI Whisper in Docker containers provides a consistent, isolated environment that simplifies deployment and eliminates "it works on my machine" issues. Docker makes it easy to:
  • Deploy anywhere - Run the same container on any Docker-compatible platform
  • Isolate dependencies - Avoid conflicts with system packages
  • Scale easily - Spin up multiple containers for parallel processing
  • Version control - Pin specific Whisper versions and configurations
  • Simplify deployment - One command to run everything
This guide covers everything you need to set up Whisper in Docker, from basic containers to production-ready configurations with GPU support.

Why Use Docker for Whisper?

Benefits of Containerization

1. Consistency
  • Same environment across development, staging, and production
  • No dependency conflicts
  • Reproducible builds
2. Portability
  • Run on any platform that supports Docker
  • Easy migration between servers
  • Cloud-agnostic deployment
3. Isolation
  • No interference with host system
  • Clean uninstall (just remove container)
  • Security through isolation
4. Scalability
  • Easy horizontal scaling
  • Load balancing across containers
  • Resource limits per container
5. DevOps Integration
  • Works with CI/CD pipelines
  • Kubernetes-ready
  • Cloud deployment friendly

Prerequisites

Before starting, ensure you have:
  • Docker installed (version 20.10+)
  • Docker Compose (optional, for multi-container setups)
  • NVIDIA Docker (optional, for GPU support)
  • Basic knowledge of Docker commands

Install Docker

macOS:
# Install Docker Desktop from docker.com
# Or using Homebrew
brew install --cask docker
Ubuntu/Debian:
sudo apt update
sudo apt install docker.io docker-compose
sudo systemctl start docker
sudo systemctl enable docker
Windows: Download Docker Desktop from docker.com

Verify Installation

docker --version
docker-compose --version

Basic Dockerfile for Whisper

Let's start with a simple Dockerfile that sets up Whisper:
FROM python:3.10-slim

# Set working directory
WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    ffmpeg \
    git \
    && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
RUN pip install --no-cache-dir \
    openai-whisper \
    torch \
    torchaudio

# Copy application code (if you have custom scripts)
# COPY . .

# Set default command
CMD ["whisper", "--help"]

Build the Image

docker build -t whisper:latest .

Run Basic Container

docker run --rm whisper:latest whisper --version

Dockerfile with API Server

For production use, you'll likely want an API server. Here's a more complete Dockerfile:
FROM python:3.10-slim

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    ffmpeg \
    git \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
RUN pip install --no-cache-dir \
    openai-whisper \
    torch \
    torchaudio \
    fastapi \
    uvicorn \
    python-multipart

# Create directories for audio and output
RUN mkdir -p /app/audio /app/output

# Copy application code
COPY app.py .
COPY requirements.txt .

# Expose API port
EXPOSE 8000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

# Run API server
CMD ["uvicorn", "app.py:app", "--host", "0.0.0.0", "--port", "8000"]

Example API Server (app.py)

from fastapi import FastAPI, File, UploadFile
from fastapi.responses import JSONResponse
import whisper
import os

app = FastAPI()

# Load Whisper model (can be configured via env)
model_name = os.getenv("WHISPER_MODEL", "base")
model = whisper.load_model(model_name)

@app.get("/health")
def health():
    return {"status": "healthy"}

@app.post("/transcribe")
async def transcribe(file: UploadFile = File(...)):
    # Save uploaded file
    file_path = f"/app/audio/{file.filename}"
    with open(file_path, "wb") as f:
        content = await file.read()
        f.write(content)
    
    # Transcribe
    result = model.transcribe(file_path)
    
    # Clean up
    os.remove(file_path)
    
    return JSONResponse(content={
        "text": result["text"],
        "language": result["language"]
    })

requirements.txt

fastapi==0.104.1
uvicorn[standard]==0.24.0
python-multipart==0.0.6
openai-whisper
torch
torchaudio

Docker Compose Setup

For a complete setup with multiple services, use Docker Compose:

docker-compose.yml

version: '3.8'

services:
  whisper-api:
    build: .
    container_name: whisper-api
    ports:
      - "8000:8000"
    volumes:
      - ./audio:/app/audio
      - ./output:/app/output
    environment:
      - WHISPER_MODEL=base
      - CUDA_VISIBLE_DEVICES=0
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

  # Optional: Redis for queue management
  redis:
    image: redis:7-alpine
    container_name: whisper-redis
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data
    restart: unless-stopped

volumes:
  redis-data:

Run with Docker Compose

# Start services
docker-compose up -d

# View logs
docker-compose logs -f whisper-api

# Stop services
docker-compose down

GPU Support with Docker

To use GPU acceleration, you need NVIDIA Docker runtime:

Install NVIDIA Docker

Ubuntu/Debian:
# Add NVIDIA Docker repository
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
    sudo tee /etc/apt/sources.list.d/nvidia-docker.list

# Install
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker

Dockerfile with GPU Support

FROM nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu22.04

WORKDIR /app

# Install Python
RUN apt-get update && apt-get install -y \
    python3.10 \
    python3-pip \
    ffmpeg \
    git \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Install Python dependencies with CUDA support
RUN pip3 install --no-cache-dir \
    openai-whisper \
    torch \
    torchaudio \
    --index-url https://download.pytorch.org/whl/cu118

# Install API dependencies
RUN pip3 install --no-cache-dir \
    fastapi \
    uvicorn \
    python-multipart

COPY app.py .
EXPOSE 8000

CMD ["uvicorn", "app.py:app", "--host", "0.0.0.0", "--port", "8000"]

Run with GPU

# Using docker run
docker run --gpus all -p 8000:8000 whisper-gpu:latest

# Using docker-compose

docker-compose.yml with GPU

version: '3.8'

services:
  whisper-api:
    build: .
    container_name: whisper-api-gpu
    runtime: nvidia
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
    ports:
      - "8000:8000"
    volumes:
      - ./audio:/app/audio
      - ./output:/app/output
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

Optimized Dockerfile for Production

Here's a production-ready Dockerfile with optimizations:
# Multi-stage build for smaller image
FROM python:3.10-slim as builder

WORKDIR /app

# Install build dependencies
RUN apt-get update && apt-get install -y \
    gcc \
    g++ \
    git \
    && rm -rf /var/lib/apt/lists/*

# Install Python packages
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

# Final stage
FROM python:3.10-slim

WORKDIR /app

# Install runtime dependencies only
RUN apt-get update && apt-get install -y \
    ffmpeg \
    curl \
    && rm -rf /var/lib/apt/lists/* \
    && apt-get clean

# Copy Python packages from builder
COPY --from=builder /root/.local /root/.local

# Make sure scripts in .local are usable
ENV PATH=/root/.local/bin:$PATH

# Create non-root user for security
RUN useradd -m -u 1000 whisper && \
    mkdir -p /app/audio /app/output && \
    chown -R whisper:whisper /app

USER whisper

# Copy application code
COPY --chown=whisper:whisper app.py .
COPY --chown=whisper:whisper requirements.txt .

EXPOSE 8000

HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

CMD ["uvicorn", "app.py:app", "--host", "0.0.0.0", "--port", "8000"]

Benefits of Multi-Stage Build

  • Smaller image size - Only runtime dependencies in final image
  • Faster builds - Cache build dependencies separately
  • Better security - Non-root user, minimal attack surface

Environment Variables Configuration

Make your Docker setup configurable with environment variables:

Dockerfile

FROM python:3.10-slim

WORKDIR /app

RUN apt-get update && apt-get install -y \
    ffmpeg \
    git \
    curl \
    && rm -rf /var/lib/apt/lists/*

RUN pip install --no-cache-dir \
    openai-whisper \
    torch \
    torchaudio \
    fastapi \
    uvicorn \
    python-multipart

COPY app.py .

# Environment variables with defaults
ENV WHISPER_MODEL=base
ENV MAX_FILE_SIZE=100MB
ENV LOG_LEVEL=INFO

EXPOSE 8000

CMD ["uvicorn", "app.py:app", "--host", "0.0.0.0", "--port", "8000"]

docker-compose.yml with Environment Variables

version: '3.8'

services:
  whisper-api:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - ./audio:/app/audio
      - ./output:/app/output
    environment:
      - WHISPER_MODEL=small
      - MAX_FILE_SIZE=200MB
      - LOG_LEVEL=DEBUG
      - CUDA_VISIBLE_DEVICES=0
    env_file:
      - .env
    restart: unless-stopped

.env file

WHISPER_MODEL=small
MAX_FILE_SIZE=200MB
LOG_LEVEL=INFO
CUDA_VISIBLE_DEVICES=0

Volume Management

Proper volume configuration ensures data persistence:

docker-compose.yml with Volumes

version: '3.8'

services:
  whisper-api:
    build: .
    ports:
      - "8000:8000"
    volumes:
      # Bind mount for development
      - ./audio:/app/audio
      - ./output:/app/output
      
      # Named volume for model cache (persists across containers)
      - whisper-models:/root/.cache/whisper
      
      # Config volume
      - ./config:/app/config:ro
    environment:
      - WHISPER_MODEL=base

volumes:
  whisper-models:
    driver: local

Benefits

  • Model caching - Models downloaded once, reused across containers
  • Data persistence - Output files survive container restarts
  • Configuration - Easy to update configs without rebuilding

Health Checks and Monitoring

Dockerfile with Health Check

FROM python:3.10-slim

WORKDIR /app

RUN apt-get update && apt-get install -y \
    ffmpeg \
    curl \
    && rm -rf /var/lib/apt/lists/*

RUN pip install --no-cache-dir \
    openai-whisper \
    fastapi \
    uvicorn

COPY app.py .

# Health check endpoint
HEALTHCHECK --interval=30s \
            --timeout=10s \
            --start-period=40s \
            --retries=3 \
            CMD curl -f http://localhost:8000/health || exit 1

EXPOSE 8000
CMD ["uvicorn", "app.py:app", "--host", "0.0.0.0", "--port", "8000"]

Health Check Endpoint

from fastapi import FastAPI
import whisper

app = FastAPI()
model = whisper.load_model("base")

@app.get("/health")
def health():
    try:
        # Quick test transcription
        return {"status": "healthy", "model": "base"}
    except Exception as e:
        return {"status": "unhealthy", "error": str(e)}, 503

Common Use Cases

Use Case 1: Development Environment

version: '3.8'

services:
  whisper-dev:
    build:
      context: .
      dockerfile: Dockerfile.dev
    volumes:
      - .:/app
      - /app/__pycache__
    ports:
      - "8000:8000"
    environment:
      - WHISPER_MODEL=tiny
      - DEBUG=true
    command: uvicorn app.py:app --reload --host 0.0.0.0 --port 8000

Use Case 2: Production with Queue

version: '3.8'

services:
  whisper-api:
    build: .
    ports:
      - "8000:8000"
    environment:
      - REDIS_URL=redis://redis:6379
      - WHISPER_MODEL=small
    depends_on:
      - redis
      - worker

  worker:
    build: .
    command: python worker.py
    environment:
      - REDIS_URL=redis://redis:6379
      - WHISPER_MODEL=small
    volumes:
      - ./audio:/app/audio
    depends_on:
      - redis

  redis:
    image: redis:7-alpine
    volumes:
      - redis-data:/data

volumes:
  redis-data:

Use Case 3: Multi-Model Setup

version: '3.8'

services:
  whisper-fast:
    build: .
    ports:
      - "8001:8000"
    environment:
      - WHISPER_MODEL=tiny
      - PORT=8000

  whisper-balanced:
    build: .
    ports:
      - "8002:8000"
    environment:
      - WHISPER_MODEL=base
      - PORT=8000

  whisper-accurate:
    build: .
    ports:
      - "8003:8000"
    environment:
      - WHISPER_MODEL=large
      - PORT=8000

Best Practices

1. Use Specific Base Images

Bad:
FROM python:latest
Good:
FROM python:3.10-slim

2. Minimize Layers

Bad:
RUN apt-get update
RUN apt-get install -y ffmpeg
RUN apt-get install -y git
Good:
RUN apt-get update && apt-get install -y \
    ffmpeg \
    git \
    && rm -rf /var/lib/apt/lists/*

3. Use .dockerignore

Create .dockerignore:
__pycache__
*.pyc
*.pyo
*.pyd
.Python
.env
.venv
venv/
.git
.gitignore
README.md
*.md
.DS_Store

4. Set Resource Limits

services:
  whisper-api:
    build: .
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 4G
        reservations:
          cpus: '1'
          memory: 2G

5. Use Health Checks

Always include health checks for production containers:
HEALTHCHECK --interval=30s --timeout=10s CMD curl -f http://localhost:8000/health || exit 1

6. Non-Root User

Run containers as non-root:
RUN useradd -m -u 1000 whisper
USER whisper

7. Cache Models

Use volumes to cache downloaded models:
volumes:
  - whisper-models:/root/.cache/whisper

Troubleshooting Common Issues

Issue 1: Container Exits Immediately

Problem: Container starts then exits
Solution:
# Check logs
docker logs <container-id>

# Run interactively to debug
docker run -it whisper:latest /bin/bash

Issue 2: GPU Not Available

Problem: GPU not detected in container
Solution:
# Verify NVIDIA Docker
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi

# Check runtime
docker info | grep -i runtime

Issue 3: Out of Memory

Problem: Container runs out of memory
Solution:
# Increase memory limit
deploy:
  resources:
    limits:
      memory: 8G

Issue 4: Slow Model Download

Problem: Models download every time container starts
Solution:
# Use volume for model cache
volumes:
  - whisper-models:/root/.cache/whisper

Issue 5: Permission Denied

Problem: Cannot write to volumes
Solution:
# Fix permissions in Dockerfile
RUN chown -R whisper:whisper /app

Performance Optimization

1. Model Preloading

Preload models in Dockerfile:
# Download model during build
RUN python -c "import whisper; whisper.load_model('base')"

2. Use Faster-Whisper

For better performance, use faster-whisper:
RUN pip install --no-cache-dir faster-whisper

3. Multi-Threading

Configure worker processes:
CMD ["uvicorn", "app.py:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4"]

4. Resource Allocation

Allocate appropriate resources:
deploy:
  resources:
    limits:
      cpus: '4'
      memory: 8G

Security Considerations

1. Use Official Base Images

FROM python:3.10-slim  # Official Python image

2. Scan for Vulnerabilities

docker scan whisper:latest

3. Keep Images Updated

Regularly update base images and dependencies:
FROM python:3.10-slim  # Use latest patch version
RUN pip install --upgrade pip

4. Limit Network Access

services:
  whisper-api:
    build: .
    networks:
      - internal
    # No external ports if accessed via reverse proxy

Conclusion

Dockerizing Whisper provides a robust, scalable solution for speech-to-text transcription. Key takeaways:
  1. Start simple - Begin with a basic Dockerfile
  2. Use Docker Compose - Simplify multi-service setups
  3. Enable GPU - For production performance
  4. Follow best practices - Security, optimization, monitoring
  5. Test thoroughly - Before deploying to production
With proper Docker setup, you can deploy Whisper consistently across any environment, from local development to cloud production.

Next Steps

  • Build your first container - Start with the basic Dockerfile
  • Add GPU support - If you have NVIDIA GPUs available
  • Set up Docker Compose - For complete application stack
  • Deploy to cloud - Use cloud container services (ECS, GKE, AKS)
For more deployment strategies, check out our guides on Whisper Cloud Deployment and Whisper API vs Local Deployment.

Try It Free Now

Try our AI audio and video service! You can not only enjoy high-precision speech-to-text transcription, multilingual translation, and intelligent speaker diarization, but also realize automatic video subtitle generation, intelligent audio and video content editing, and synchronized audio-visual analysis. It covers all scenarios such as meeting recordings, short video creation, and podcast production—start your free trial now!

Convert MP3 to TextConvert Voice Recording to TextVoice Typing OnlineVoice to Text with TimestampsVoice to Text Real TimeVoice to Text for Long AudioVoice to Text for VideoVoice to Text for YouTubeVoice to Text for Video EditingVoice to Text for SubtitlesVoice to Text for PodcastsVoice to Text for InterviewsInterview Audio to TextVoice to Text for RecordingsVoice to Text for MeetingsVoice to Text for LecturesVoice to Text for NotesVoice to Text Multi LanguageVoice to Text AccurateVoice to Text FastPremiere Pro Voice to Text AlternativeDaVinci Voice to Text AlternativeVEED Voice to Text AlternativeInVideo Voice to Text AlternativeOtter.ai Voice to Text AlternativeDescript Voice to Text AlternativeTrint Voice to Text AlternativeRev Voice to Text AlternativeSonix Voice to Text AlternativeHappy Scribe Voice to Text AlternativeZoom Voice to Text AlternativeGoogle Meet Voice to Text AlternativeMicrosoft Teams Voice to Text AlternativeFireflies.ai Voice to Text AlternativeFathom Voice to Text AlternativeFlexClip Voice to Text AlternativeKapwing Voice to Text AlternativeCanva Voice to Text AlternativeSpeech to Text for Long AudioAI Voice to TextVoice to Text FreeVoice to Text No AdsVoice to Text for Noisy AudioVoice to Text with TimeGenerate Subtitles from AudioPodcast Transcription OnlineTranscribe Customer CallsTikTok Voice to TextTikTok Audio to TextYouTube Voice to TextYouTube Audio to TextMemo Voice to TextWhatsApp Voice Message to TextTelegram Voice to TextDiscord Call TranscriptionTwitch Voice to TextSkype Voice to TextMessenger Voice to TextLINE Voice Message to TextTranscribe Vlogs to TextConvert Sermon Audio to TextConvert Talking to WritingTranslate Audio to TextTurn Audio Notes to TextVoice TypingVoice Typing for MeetingsVoice Typing for YouTubeSpeak to TypeHands-Free TypingVoice to WordsSpeech to WordsSpeech to Text OnlineSpeech to Text for MeetingsFast Speech to TextTikTok Speech to TextTikTok Sound to TextTalking to WordsTalk to TextAudio to TypingSound to TextVoice Writing ToolSpeech Writing ToolVoice DictationLegal Transcription ToolMedical Voice Dictation ToolJapanese Audio TranscriptionKorean Meeting TranscriptionMeeting Transcription ToolMeeting Audio to TextLecture to Text ConverterLecture Audio to TextVideo to Text TranscriptionSubtitle Generator for TikTokCall Center TranscriptionReels Audio to Text ToolTranscribe MP3 to TextTranscribe WAV File to TextCapCut Voice to TextCapCut Speech to TextVoice to Text in EnglishAudio to Text EnglishVoice to Text in SpanishVoice to Text in FrenchAudio to Text FrenchVoice to Text in GermanAudio to Text GermanVoice to Text in JapaneseAudio to Text JapaneseVoice to Text in KoreanAudio to Text KoreanVoice to Text in PortugueseVoice to Text in ArabicVoice to Text in ChineseVoice to Text in HindiVoice to Text in RussianWeb Voice Typing ToolVoice Typing Website