# ExLlama
### About
ExLlama is a heavily optimized GPTQ backend for LLaMA models. It uses far less VRAM and generates text much faster because it does not rely on the unoptimized transformers code.
### Usage
Configure text-generation-webui to use exllama via the UI or command line:
- In the "Model" tab, set "Loader" to "exllama"
- Specify `--loader exllama` on the command line
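
As a rough sketch of the command-line option above, a launch might look like the following. Only `--loader exllama` comes from this page; the `server.py` entry point and the `--model` flag are assumed from a typical text-generation-webui setup, and the model folder name is a hypothetical placeholder.

```shell
# Assumption: run from the text-generation-webui root, where server.py lives.
# "your-gptq-model" is a placeholder for a GPTQ model folder under models/.
python server.py --loader exllama --model your-gptq-model
```

Passing the loader on the command line avoids having to select it in the "Model" tab each time the server starts.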
### Manual setup
No additional installation steps are necessary, since an exllama package is already included in `requirements.txt`. If that package fails to install, you can set ExLlama up manually by cloning the original repository into your `repositories/` folder:
```
# Run from the text-generation-webui root directory.
# -p keeps mkdir from failing if repositories/ already exists.
mkdir -p repositories
cd repositories
git clone https://github.com/turboderp/exllama
```
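
To confirm that the manual clone landed where the steps above put it, a small check like this can help. It is only an illustrative sketch: `check_exllama_dir` is a hypothetical helper, not part of text-generation-webui, and the default path `repositories/exllama` is simply what the `git clone` above would produce.

```shell
# Hypothetical helper: report whether the ExLlama checkout exists.
# Takes an optional path; defaults to repositories/exllama relative
# to the current directory.
check_exllama_dir() {
    dir="${1:-repositories/exllama}"
    if [ -d "$dir" ]; then
        echo "found: $dir"
    else
        echo "missing: $dir"
        return 1
    fi
}
```

Calling `check_exllama_dir` with no arguments from the text-generation-webui root checks the default location. A non-zero return status means the directory is not there.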