open-webui/Dockerfile-cuda

# syntax=docker/dockerfile:1
ARG CUDA_VERSION=12.3.2

######## WebUI frontend ########
FROM node:21-alpine3.19 AS build

WORKDIR /app

COPY package.json package-lock.json ./
RUN npm ci

COPY . .
RUN npm run build

######## CPU-only WebUI backend ########
# To support both CPU and GPU backends, keep the ability to build a CPU-only image.
#FROM python:3.11-slim-bookworm as base
#FROM --platform=linux/amd64 ubuntu:22.04 AS cpu-builder-amd64
#FROM --platform=linux/amd64 cpu-builder-amd64 AS cpu-build-amd64
#RUN OPENWEBUI_CPU_TARGET="cpu" sh gen_linux.sh
#FROM --platform=linux/amd64 cpu-builder-amd64 AS cpu_avx-build-amd64
#RUN OPENWEBUI_CPU_TARGET="cpu_avx" sh gen_linux.sh
#FROM --platform=linux/amd64 cpu-builder-amd64 AS cpu_avx2-build-amd64
#RUN OPENWEBUI_CPU_TARGET="cpu_avx2" sh gen_linux.sh

######## CUDA WebUI backend ########
FROM --platform=linux/amd64 nvidia/cuda:${CUDA_VERSION}-devel-ubuntu22.04 AS cuda-build-amd64

# Set environment variables for NVIDIA Container Toolkit
ENV LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64 \
    NVIDIA_DRIVER_CAPABILITIES=all \
    NVIDIA_VISIBLE_DEVICES=all

ENV ENV=prod \
    PORT=8080

## Base URL Config ##
ENV OLLAMA_BASE_URL="/ollama" \
    OPENAI_API_BASE_URL=""

## API Key and Security Config ##
ENV OPENAI_API_KEY="" \
    WEBUI_SECRET_KEY="" \
    SCARF_NO_ANALYTICS=true \
    DO_NOT_TRACK=true

######## Preloaded models ########
# Whisper STT (speech-to-text) settings
ENV WHISPER_MODEL="base" \
    WHISPER_MODEL_DIR="/app/backend/data/cache/whisper/models"

# RAG Embedding Model Settings
# Any sentence-transformers model works; browse candidates at https://huggingface.co/models?library=sentence-transformers
# Leaderboard: https://huggingface.co/spaces/mteb/leaderboard
# For better performance and multilingual support, use "intfloat/multilingual-e5-large" (~2.5GB) or "intfloat/multilingual-e5-base" (~1.5GB)
# IMPORTANT: If you switch away from the default model (all-MiniLM-L6-v2) or back to it, RAG Chat cannot use documents already loaded in the WebUI until you re-embed them.
# Device type for the Whisper STT and embedding models: "cpu" (default), "cuda" (NVIDIA GPU and CUDA required), or "mps" (Apple silicon). Choosing the right one can improve performance.
ENV RAG_EMBEDDING_MODEL="all-MiniLM-L6-v2" \
    RAG_EMBEDDING_MODEL_DEVICE_TYPE="cuda" \
    RAG_EMBEDDING_MODEL_DIR="/app/backend/data/cache/embedding/models"
# Set in a separate ENV instruction: a variable defined in an ENV instruction is not visible to substitutions within that same instruction.
ENV SENTENCE_TRANSFORMERS_HOME="$RAG_EMBEDDING_MODEL_DIR"

######## Preloaded models ########
WORKDIR /app/backend

# Install Python & dependencies in the container
RUN apt-get update && \
    apt-get install -y --no-install-recommends python3.11 python3-pip ffmpeg libsm6 libxext6 pandoc netcat-openbsd && \
    rm -rf /var/lib/apt/lists/*

COPY ./backend/requirements.txt ./requirements.txt
RUN pip3 install torch torchvision torchaudio --no-cache-dir && \
    pip3 install -r requirements.txt --no-cache-dir

# copy built frontend files
COPY --from=build /app/build /app/build
COPY --from=build /app/CHANGELOG.md /app/CHANGELOG.md
COPY --from=build /app/package.json /app/package.json

# copy backend files
COPY ./backend .

EXPOSE 8080

CMD ["bash", "start.sh"]

Commit history (from the blame view, oldest first):

2024-03-17 00:26:21 +01:00  Create Dockerfile-cuda
    I created this file to help add CUDA support to open-webui for access to a GPU during embedding operations.
    Blame: the `# syntax` directive, the WebUI frontend stage, the section comments, `WORKDIR /app/backend`, the `apt-get` install step, the `COPY` instructions, and `CMD`.

2024-03-17 06:55:37 +01:00  Optimize Dockerfile for CUDA support
    Refactored the Dockerfile to better organize and streamline environment variable settings, emphasizing support for a CUDA-based WebUI backend while retaining the ability to build a CPU-only image. Consolidated ENV commands to reduce layers, improving build efficiency, and set a default PORT environment to enhance container usability. Enabled exposure of the backend service on port 8080 and leveraged combined RUN directives to minimize the image footprint. These changes facilitate a more robust deployment process, catering to both CPU and CUDA environments.
    Blame: the commented-out CPU-only backend section, the `ENV` blocks, the `pip3 install` step, and `EXPOSE 8080`.

2024-03-17 07:27:06 +01:00  Parametrize CUDA_VERSION in Dockerfile
    Standardized CUDA_VERSION as a global ARG to ensure consistency and facilitate version updates across the Dockerfile. This change allows the CUDA version to be defined once at the beginning and reused, reducing the chance of mismatched versions and easing maintenance when changing CUDA versions. It further streamlines the build process for potential multi-stage builds with varying CUDA dependencies. Refs #nvidia-update
    Blame: `ARG CUDA_VERSION=12.3.2` and the `FROM ... nvidia/cuda` line.
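
The "Parametrize CUDA_VERSION" commit describes defining the CUDA version once as a global ARG and reusing it across stages. A minimal sketch of that pattern follows. The stage names `cuda-build` and `cuda-runtime` and the second `runtime` stage are illustrative assumptions and are not part of this Dockerfile. Note that an ARG declared before the first FROM is only usable in FROM lines; a stage that needs the value must redeclare it.

```dockerfile
# syntax=docker/dockerfile:1
# Global ARG: declared before the first FROM, so every FROM line can use it.
ARG CUDA_VERSION=12.3.2

# Hypothetical build stage based on the CUDA development image.
FROM nvidia/cuda:${CUDA_VERSION}-devel-ubuntu22.04 AS cuda-build
# Redeclare the ARG (without a value) to bring the global default into this stage.
ARG CUDA_VERSION
RUN echo "Building against CUDA ${CUDA_VERSION}"

# Hypothetical runtime stage that reuses the same version without repeating it.
FROM nvidia/cuda:${CUDA_VERSION}-runtime-ubuntu22.04 AS cuda-runtime
```

With this layout, the version can be overridden at build time, for example `docker build -f Dockerfile-cuda --build-arg CUDA_VERSION=12.2.2 .`, provided a matching `nvidia/cuda` tag exists for the chosen version.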