text-generation-webui/requirements.txt

accelerate==0.32.*
aqlm[gpu,cpu]==1.1.6; platform_system == "Linux"
auto-gptq==0.7.1
bitsandbytes==0.43.*
colorama
datasets
einops
gradio==4.26.*
hqq==0.1.8
jinja2==3.1.4
lm_eval==0.3.0
markdown
numba==0.59.*
numpy==1.26.*
optimum==1.17.*
pandas
peft==0.8.*
Pillow>=9.5.0
psutil
pyyaml
requests
rich
safetensors==0.4.*
scipy
sentencepiece
tensorboard
transformers==4.42.*
tqdm
wandb

# API
SpeechRecognition==3.10.0
flask_cloudflared==0.0.14
sse-starlette==1.6.5
tiktoken

# llama-cpp-python (CPU only, AVX2)
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.82+cpuavx2-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.82+cpuavx2-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.82+cpuavx2-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.82+cpuavx2-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"

# llama-cpp-python (CUDA, no tensor cores)
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.82+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.82+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.82+cu121-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.82+cu121-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"

# llama-cpp-python (CUDA, tensor cores)
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.82+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.82+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.82+cu121-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.82+cu121-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"

# CUDA wheels
https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7+cu121.torch2.2.2-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7+cu121.torch2.2.2-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7+cu121.torch2.2.2-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7+cu121.torch2.2.2-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7-py3-none-any.whl; platform_system == "Linux" and platform_machine != "x86_64"
https://github.com/oobabooga/flash-attention/releases/download/v2.6.1/flash_attn-2.6.1+cu122torch2.2.2cxx11abiFALSE-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/flash-attention/releases/download/v2.6.1/flash_attn-2.6.1+cu122torch2.2.2cxx11abiFALSE-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/Dao-AILab/flash-attention/releases/download/v2.6.1/flash_attn-2.6.1+cu123torch2.2cxx11abiFALSE-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/Dao-AILab/flash-attention/releases/download/v2.6.1/flash_attn-2.6.1+cu123torch2.2cxx11abiFALSE-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
autoawq==0.2.5; platform_system == "Linux" or platform_system == "Windows"
Update accelerate requirement from ==0.31.* to ==0.32.* (#6217) 2024-07-12 00:56:42 +02:00			`accelerate==0.32.*`
Bump aqlm[cpu,gpu] from 1.1.5 to 1.1.6 (#6157) 2024-06-28 02:13:02 +02:00			`aqlm[gpu,cpu]==1.1.6; platform_system == "Linux"`
Backend cleanup (#6025) 2024-05-21 18:32:02 +02:00			`auto-gptq==0.7.1`
Bump bitsandbytes to 0.43, add official Windows wheel 2024-03-10 16:30:53 +01:00			`bitsandbytes==0.43.*`
Add 4-bit LoRA support (#1200) 2023-04-17 04:26:52 +02:00			`colorama`
New yaml character format (#337 from TheTerrasque/feature/yaml-characters). This doesn't break backward compatibility with JSON characters. 2023-04-03 01:34:25 +02:00			`datasets`
Falcon support (trust-remote-code and autogptq checkboxes) (#2367). Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com> 2023-05-29 15:20:18 +02:00			`einops`
Update gradio requirement from ==4.25.* to ==4.26.* (#5832) 2024-04-11 07:24:53 +02:00			`gradio==4.26.*`
Bump hqq from 0.1.7.post3 to 0.1.8 (#6238) 2024-07-20 23:20:26 +02:00			`hqq==0.1.8`
Bump jinja2 from 3.1.2 to 3.1.4 (#6172) 2024-06-28 02:12:39 +02:00			`jinja2==3.1.4`
Pin lm_eval package version 2023-12-24 18:22:31 +01:00			`lm_eval==0.3.0`
Sort the requirements 2023-03-15 16:40:03 +01:00			`markdown`
Add numba to requirements.txt 2024-03-11 00:13:29 +01:00			`numba==0.59.*`
Update numpy requirement from ==1.24.* to ==1.26.* (#5490) 2024-02-13 20:26:35 +01:00			`numpy==1.26.*`
Update optimum requirement from ==1.16.* to ==1.17.* (#5548) 2024-02-19 23:15:21 +01:00			`optimum==1.17.*`
Add an "Evaluate" tab to calculate the perplexities of models (#1322) 2023-04-21 05:20:33 +02:00			`pandas`
Revert "Update peft requirement from ==0.8.* to ==0.9.* (#5626)". This reverts commit 72a498ddd44a895205e53b5696742dc4ded9e12e. 2024-03-05 11:56:37 +01:00			`peft==0.8.*`
Add Pillow as a requirement 2023-04-08 23:48:46 +02:00			`Pillow>=9.5.0`
requirements: add psutil (#5819) 2024-04-07 04:02:20 +02:00			`psutil`
Add an "Evaluate" tab to calculate the perplexities of models (#1322) 2023-04-21 05:20:33 +02:00			`pyyaml`
Add requests to requirements.txt 2023-03-11 18:47:30 +01:00			`requests`
Add rich requirement 2023-12-20 06:58:36 +01:00			`rich`
Bump safetensors version 2024-02-05 03:40:25 +01:00			`safetensors==0.4.*`
Add 'scipy' to requirements.txt #2335 (#2343). Unlisted dependency of bitsandbytes. 2023-05-26 04:26:25 +02:00			`scipy`
Add CUDA wheels for llama-cpp-python by jllllll 2023-07-20 04:31:19 +02:00			`sentencepiece`
Add Tensorboard/Weights and biases integration for training (#2624) 2023-07-12 16:53:31 +02:00			`tensorboard`
Bump transformers to 4.42 (for gemma support) 2024-06-27 20:26:02 +02:00			`transformers==4.42.*`
Add CUDA wheels for llama-cpp-python by jllllll 2023-07-20 04:31:19 +02:00			`tqdm`
			`wandb`
Pin aiofiles version to fix statvfs issue 2023-08-09 17:07:55 +02:00
Do not install extensions requirements by default (#5621) 2024-03-04 08:46:39 +01:00			`# API`
			`SpeechRecognition==3.10.0`
			`flask_cloudflared==0.0.14`
Revert sse-starlette version bump because it breaks API request cancellation (#5873) 2024-04-18 20:05:00 +02:00			`sse-starlette==1.6.5`
Do not install extensions requirements by default (#5621) 2024-03-04 08:46:39 +01:00			`tiktoken`

Add back my llama-cpp-python wheels, bump to 0.2.65 (#5964) 2024-04-30 14:11:31 +02:00			`# llama-cpp-python (CPU only, AVX2)`
Bump llama-cpp-python to 0.2.82 2024-07-10 15:03:24 +02:00			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.82+cpuavx2-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.82+cpuavx2-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.82+cpuavx2-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.82+cpuavx2-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"`
Add back my llama-cpp-python wheels, bump to 0.2.65 (#5964) 2024-04-30 14:11:31 +02:00
			`# llama-cpp-python (CUDA, no tensor cores)`
Bump llama-cpp-python to 0.2.82 2024-07-10 15:03:24 +02:00			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.82+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.82+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.82+cu121-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.82+cu121-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"`
Add back my llama-cpp-python wheels, bump to 0.2.65 (#5964) 2024-04-30 14:11:31 +02:00
			`# llama-cpp-python (CUDA, tensor cores)`
Bump llama-cpp-python to 0.2.82 2024-07-10 15:03:24 +02:00			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.82+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.82+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.82+cu121-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.82+cu121-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"`
Add llama-cpp-python wheels with tensor cores support (#5003) 2023-12-19 21:30:53 +01:00
Create alternative requirements.txt with AMD and Metal wheels (#4052) 2023-09-24 14:58:29 +02:00			`# CUDA wheels`
Bump ExLlamaV2 to 0.1.7 2024-07-11 21:33:46 +02:00			`https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7+cu121.torch2.2.2-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"`
			`https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7+cu121.torch2.2.2-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"`
			`https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7+cu121.torch2.2.2-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"`
			`https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7+cu121.torch2.2.2-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"`
			`https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7-py3-none-any.whl; platform_system == "Linux" and platform_machine != "x86_64"`
Bump flash-attention to 2.6.1 2024-07-13 05:16:11 +02:00			`https://github.com/oobabooga/flash-attention/releases/download/v2.6.1/flash_attn-2.6.1+cu122torch2.2.2cxx11abiFALSE-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"`
			`https://github.com/oobabooga/flash-attention/releases/download/v2.6.1/flash_attn-2.6.1+cu122torch2.2.2cxx11abiFALSE-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"`
			`https://github.com/Dao-AILab/flash-attention/releases/download/v2.6.1/flash_attn-2.6.1+cu123torch2.2cxx11abiFALSE-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"`
			`https://github.com/Dao-AILab/flash-attention/releases/download/v2.6.1/flash_attn-2.6.1+cu123torch2.2cxx11abiFALSE-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"`
Backend cleanup (#6025) 2024-05-21 18:32:02 +02:00			`autoawq==0.2.5; platform_system == "Linux" or platform_system == "Windows"`
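
The wheel URLs in this file are gated by environment markers (`platform_system`, `platform_machine`, `python_version`), so pip should pick at most one llama-cpp-python CPU wheel per machine. The sketch below mirrors that selection for the four CPU-only lines using only the standard library. The function name `cpu_wheel_tag` is hypothetical and not part of text-generation-webui or pip; it is meant to illustrate the marker logic, not to replace pip's own marker evaluation.

```python
import platform
import sys


def cpu_wheel_tag(system, machine, python_version):
    """Return the wheel tag the CPU-only marker lines would select, or None.

    Hypothetical helper: it restates the markers on the four
    "llama-cpp-python (CPU only, AVX2)" lines, nothing more.
    """
    # Only CPython 3.10 and 3.11 wheels are listed in the file.
    if python_version not in ("3.10", "3.11"):
        return None
    cp = "cp" + python_version.replace(".", "")  # "3.11" -> "cp311"

    # Linux lines also require platform_machine == "x86_64".
    if system == "Linux" and machine == "x86_64":
        return f"{cp}-{cp}-linux_x86_64"
    # Windows lines carry no platform_machine condition.
    if system == "Windows":
        return f"{cp}-{cp}-win_amd64"
    # Any other platform matches none of the CPU wheel lines.
    return None


if __name__ == "__main__":
    # These stdlib values correspond to the marker variables pip evaluates.
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(cpu_wheel_tag(platform.system(), platform.machine(), version))
```

A result of `None` means that none of the listed CPU wheel lines applies to the current interpreter, in which case pip would skip all four of them.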