text-generation-webui/requirements.txt

accelerate==1.1.*
bitsandbytes==0.44.*
colorama
datasets
einops
fastapi==0.112.4
gradio==4.26.*
jinja2==3.1.4
markdown
numba==0.59.*
numpy==1.26.*
pandas
peft==0.12.*
Pillow>=9.5.0
psutil
pydantic==2.8.2
pyyaml
requests
rich
safetensors==0.4.*
scipy
sentencepiece
tensorboard
transformers==4.46.*
tqdm
wandb

# API
SpeechRecognition==3.10.0
flask_cloudflared==0.0.14
sse-starlette==1.6.5
tiktoken

# llama-cpp-python (CPU only, AVX2)
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.3.2+cpuavx2-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.3.2+cpuavx2-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.3.2+cpuavx2-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.3.2+cpuavx2-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"

# llama-cpp-python (CUDA, no tensor cores)
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.3.2+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.3.2+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.3.2+cu121-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.3.2+cu121-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"

# llama-cpp-python (CUDA, tensor cores)
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.3.2+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.3.2+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.3.2+cu121-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.3.2+cu121-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"

# CUDA wheels
https://github.com/oobabooga/exllamav2/releases/download/v0.2.4/exllamav2-0.2.4+cu121.torch2.4.1-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/exllamav2/releases/download/v0.2.4/exllamav2-0.2.4+cu121.torch2.4.1-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/oobabooga/exllamav2/releases/download/v0.2.4/exllamav2-0.2.4+cu121.torch2.4.1-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/oobabooga/exllamav2/releases/download/v0.2.4/exllamav2-0.2.4+cu121.torch2.4.1-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
https://github.com/oobabooga/exllamav2/releases/download/v0.2.4/exllamav2-0.2.4-py3-none-any.whl; platform_system == "Linux" and platform_machine != "x86_64"
https://github.com/oobabooga/flash-attention/releases/download/v2.7.0.post2/flash_attn-2.7.0.post2+cu122torch2.4.1cxx11abiFALSE-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/flash-attention/releases/download/v2.7.0.post2/flash_attn-2.7.0.post2+cu122torch2.4.1cxx11abiFALSE-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.0.post2/flash_attn-2.7.0.post2+cu12torch2.4cxx11abiFALSE-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.0.post2/flash_attn-2.7.0.post2+cu12torch2.4cxx11abiFALSE-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
Update accelerate requirement from ==1.0.* to ==1.1.* (#6515) 2024-11-18 16:00:24 +01:00			`accelerate==1.1.*`
Bump bitsandbytes to 0.44 2024-09-28 01:59:30 +02:00			`bitsandbytes==0.44.*`
Add 4-bit LoRA support (#1200) 2023-04-17 04:26:52 +02:00			`colorama`
New yaml character format (#337 from TheTerrasque/feature/yaml-characters). This doesn't break backward compatibility with JSON characters. 2023-04-03 01:34:25 +02:00			`datasets`
Falcon support (trust-remote-code and autogptq checkboxes) (#2367). Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com> 2023-05-29 15:20:18 +02:00			`einops`
Pin fastapi/pydantic requirement versions 2024-09-07 03:38:39 +02:00			`fastapi==0.112.4`
Update gradio requirement from ==4.25.* to ==4.26.* (#5832) 2024-04-11 07:24:53 +02:00			`gradio==4.26.*`
Bump jinja2 from 3.1.2 to 3.1.4 (#6172) 2024-06-28 02:12:39 +02:00			`jinja2==3.1.4`
Sort the requirements 2023-03-15 16:40:03 +01:00			`markdown`
Add numba to requirements.txt 2024-03-11 00:13:29 +01:00			`numba==0.59.*`
Update numpy requirement from ==1.24.* to ==1.26.* (#5490) 2024-02-13 20:26:35 +01:00			`numpy==1.26.*`
Add an "Evaluate" tab to calculate the perplexities of models (#1322) 2023-04-21 05:20:33 +02:00			`pandas`
Update peft requirement from ==0.8.* to ==0.12.* (#6292) 2024-08-20 04:33:56 +02:00			`peft==0.12.*`
Add Pillow as a requirement 2023-04-08 23:48:46 +02:00			`Pillow>=9.5.0`
requirements: add psutil (#5819) 2024-04-07 04:02:20 +02:00			`psutil`
Pin fastapi/pydantic requirement versions 2024-09-07 03:38:39 +02:00			`pydantic==2.8.2`
Add an "Evaluate" tab to calculate the perplexities of models (#1322) 2023-04-21 05:20:33 +02:00			`pyyaml`
Add requests to requirements.txt 2023-03-11 18:47:30 +01:00			`requests`
Add rich requirement 2023-12-20 06:58:36 +01:00			`rich`
Bump safetensors version 2024-02-05 03:40:25 +01:00			`safetensors==0.4.*`
Add 'scipy' to requirements.txt #2335 (#2343). Unlisted dependency of bitsandbytes. 2023-05-26 04:26:25 +02:00			`scipy`
Add CUDA wheels for llama-cpp-python by jllllll 2023-07-20 04:31:19 +02:00			`sentencepiece`
Add Tensorboard/Weights and biases integration for training (#2624) 2023-07-12 16:53:31 +02:00			`tensorboard`
Bump transformers to 4.46 2024-10-24 20:09:09 +02:00			`transformers==4.46.*`
Add CUDA wheels for llama-cpp-python by jllllll 2023-07-20 04:31:19 +02:00			`tqdm`
			`wandb`
Pin aiofiles version to fix statvfs issue 2023-08-09 17:07:55 +02:00
Do not install extensions requirements by default (#5621) 2024-03-04 08:46:39 +01:00			`# API`
			`SpeechRecognition==3.10.0`
			`flask_cloudflared==0.0.14`
Revert sse-starlette version bump because it breaks API request cancellation (#5873) 2024-04-18 20:05:00 +02:00			`sse-starlette==1.6.5`
Do not install extensions requirements by default (#5621) 2024-03-04 08:46:39 +01:00			`tiktoken`

Add back my llama-cpp-python wheels, bump to 0.2.65 (#5964) 2024-04-30 14:11:31 +02:00			`# llama-cpp-python (CPU only, AVX2)`
Bump llama-cpp-python to 0.3.2 2024-11-18 15:51:06 +01:00			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.3.2+cpuavx2-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.3.2+cpuavx2-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.3.2+cpuavx2-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.3.2+cpuavx2-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"`
Add back my llama-cpp-python wheels, bump to 0.2.65 (#5964) 2024-04-30 14:11:31 +02:00
			`# llama-cpp-python (CUDA, no tensor cores)`
Bump llama-cpp-python to 0.3.2 2024-11-18 15:51:06 +01:00			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.3.2+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.3.2+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.3.2+cu121-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.3.2+cu121-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"`
Bump llama-cpp-python to 0.2.83, add back tensorcore wheels. Also add back the progress bar patch. 2024-07-23 03:05:11 +02:00
			`# llama-cpp-python (CUDA, tensor cores)`
Bump llama-cpp-python to 0.3.2 2024-11-18 15:51:06 +01:00			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.3.2+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.3.2+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.3.2+cu121-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.3.2+cu121-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"`
Add back my llama-cpp-python wheels, bump to 0.2.65 (#5964) 2024-04-30 14:11:31 +02:00
Create alternative requirements.txt with AMD and Metal wheels (#4052) 2023-09-24 14:58:29 +02:00			`# CUDA wheels`
Bump exllamav2 to 0.2.4 2024-11-18 15:51:56 +01:00			`https://github.com/oobabooga/exllamav2/releases/download/v0.2.4/exllamav2-0.2.4+cu121.torch2.4.1-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"`
			`https://github.com/oobabooga/exllamav2/releases/download/v0.2.4/exllamav2-0.2.4+cu121.torch2.4.1-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"`
			`https://github.com/oobabooga/exllamav2/releases/download/v0.2.4/exllamav2-0.2.4+cu121.torch2.4.1-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"`
			`https://github.com/oobabooga/exllamav2/releases/download/v0.2.4/exllamav2-0.2.4+cu121.torch2.4.1-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"`
			`https://github.com/oobabooga/exllamav2/releases/download/v0.2.4/exllamav2-0.2.4-py3-none-any.whl; platform_system == "Linux" and platform_machine != "x86_64"`
Bump flash-attention to 2.7.0.post2 2024-11-18 15:55:29 +01:00			`https://github.com/oobabooga/flash-attention/releases/download/v2.7.0.post2/flash_attn-2.7.0.post2+cu122torch2.4.1cxx11abiFALSE-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"`
			`https://github.com/oobabooga/flash-attention/releases/download/v2.7.0.post2/flash_attn-2.7.0.post2+cu122torch2.4.1cxx11abiFALSE-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"`
			`https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.0.post2/flash_attn-2.7.0.post2+cu12torch2.4cxx11abiFALSE-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"`
			`https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.0.post2/flash_attn-2.7.0.post2+cu12torch2.4cxx11abiFALSE-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"`
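
Each wheel line in the file has the form `URL; marker`, and the marker (for example `platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"`) decides whether pip installs that line on a given machine. The sketch below illustrates that selection using only the standard library. It is a simplified stand-in, not pip's implementation: the helper names (`split_requirement`, `marker_matches`, `current_env`) are hypothetical, it handles only `==` and `!=` clauses joined by `and` (the only forms used in this file), and it compares values as plain strings, whereas real PEP 508 evaluation treats `python_version` as a version.

```python
import platform
import re
import sys

# One of the wheel lines from the requirements file, used as sample input.
EXAMPLE_LINE = (
    "https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/"
    "llama_cpp_python-0.3.2+cpuavx2-cp311-cp311-linux_x86_64.whl; "
    'platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"'
)

# Matches a single clause such as: platform_system == "Linux"
_CLAUSE = re.compile(r'^\s*(\w+)\s*(==|!=)\s*"([^"]*)"\s*$')


def split_requirement(line):
    """Split 'URL; marker' into (url, marker); marker is '' when absent."""
    url, _, marker = line.partition(";")
    return url.strip(), marker.strip()


def marker_matches(marker, env):
    """Evaluate a marker built only from ==/!= clauses joined by 'and'.

    Simplified assumption: values are compared as plain strings.
    """
    if not marker:
        return True  # a line without a marker applies everywhere
    for clause in marker.split(" and "):
        match = _CLAUSE.match(clause)
        if match is None:
            raise ValueError(f"unsupported marker clause: {clause!r}")
        name, op, value = match.groups()
        is_equal = env[name] == value
        if is_equal != (op == "=="):
            return False
    return True


def current_env():
    """Marker variables for the running interpreter (only those this file uses)."""
    return {
        "platform_system": platform.system(),
        "platform_machine": platform.machine(),
        "python_version": f"{sys.version_info.major}.{sys.version_info.minor}",
    }


if __name__ == "__main__":
    url, marker = split_requirement(EXAMPLE_LINE)
    # Prints whether this wheel line would be selected on the current machine.
    print(url.rsplit("/", 1)[-1], marker_matches(marker, current_env()))
```

For anything beyond illustration, the `packaging` library's `packaging.markers.Marker(...).evaluate()` implements the full marker grammar and is the safer choice.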