Update ExLlama.md (#2729)

Add details for configuring exllama
Jonathan Yankovich 2023-06-16 21:46:25 -05:00 committed by GitHub
parent b27f83c0e9
commit a1ca1c04a1


@@ -2,7 +2,7 @@
## About
-ExLlama is an extremely optimized GPTQ backend for LLaMA models. It features much lower VRAM usage and much higher speeds due to not relying on unoptimized transformers code.
+ExLlama is an extremely optimized GPTQ backend ("loader") for LLaMA models. It features much lower VRAM usage and much higher speeds due to not relying on unoptimized transformers code.
## Installation:
@@ -15,3 +15,7 @@ git clone https://github.com/turboderp/exllama
``` ```
2) Follow the remaining set up instructions in the official README: https://github.com/turboderp/exllama#exllama
3) Configure text-generation-webui to use exllama via the UI or command line:
- In the "Model" tab, set "Loader" to "exllama"
- Specify `--loader exllama` on the command line
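A minimal sketch of the command-line route from step 3, assuming the web UI is launched with `server.py` from the text-generation-webui directory. The model folder name is a hypothetical placeholder and is not part of this commit. The snippet only prints the command so that it can be checked before running.

```shell
# Hypothetical model folder name under text-generation-webui/models/ (an assumption).
MODEL_NAME="llama-7b-4bit-128g"

# Select the ExLlama loader explicitly, as described in step 3.
LAUNCH_ARGS="--model $MODEL_NAME --loader exllama"

# Print the full launch command; run it from the text-generation-webui directory.
echo "python server.py $LAUNCH_ARGS"
```

Selecting "exllama" as the "Loader" in the "Model" tab should have the same effect without restarting from the command line.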