From a1ca1c04a15d32e9d0c035077c0020858091573a Mon Sep 17 00:00:00 2001
From: Jonathan Yankovich
Date: Fri, 16 Jun 2023 21:46:25 -0500
Subject: [PATCH] Update ExLlama.md (#2729)

Add details for configuring exllama

---
 docs/ExLlama.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/docs/ExLlama.md b/docs/ExLlama.md
index a9fd016d..1a51f188 100644
--- a/docs/ExLlama.md
+++ b/docs/ExLlama.md
@@ -2,7 +2,7 @@
 
 ## About
 
-ExLlama is an extremely optimized GPTQ backend for LLaMA models. It features much lower VRAM usage and much higher speeds due to not relying on unoptimized transformers code.
+ExLlama is an extremely optimized GPTQ backend ("loader") for LLaMA models. It features much lower VRAM usage and much higher speeds due to not relying on unoptimized transformers code.
 
 ## Installation:
 
@@ -15,3 +15,7 @@ git clone https://github.com/turboderp/exllama
 ```
 
 2) Follow the remaining set up instructions in the official README: https://github.com/turboderp/exllama#exllama
+
+3) Configure text-generation-webui to use exllama via the UI or command line:
+   - In the "Model" tab, set "Loader" to "exllama"
+   - Specify `--loader exllama` on the command line