Mirror of https://github.com/oobabooga/text-generation-webui.git (synced 2024-12-23 21:18:00 +01:00)
Update GPTQ-models-(4-bit-mode).md
parent b8d2f6d876
commit 540a161a08
@@ -6,7 +6,7 @@ AutoGPTQ is the recommended way to create new quantized models: https://github.com/PanQiWei/AutoGPTQ
 
 ### Installation
 
-To load a model quantized with AutoGPTQ in the web UI, manual installation is currently necessary:
+To load a model quantized with AutoGPTQ in the web UI, you need to first manually install the AutoGPTQ library:
 
 ```
 conda activate textgen
@@ -14,11 +14,11 @@ git clone https://github.com/PanQiWei/AutoGPTQ.git && cd AutoGPTQ
 pip install .
 ```
 
-You are going to need to have `nvcc` installed (see the [instructions below](https://github.com/oobabooga/text-generation-webui/blob/main/docs/GPTQ-models-(4-bit-mode).md#step-0-install-nvcc)).
+The last command requires `nvcc` to be installed (see the [instructions below](https://github.com/oobabooga/text-generation-webui/blob/main/docs/GPTQ-models-(4-bit-mode).md#step-0-install-nvcc)).
 
 ### Usage
 
-Place the output folder generated by AutoGPTQ in your `models/` folder and load it with the `--autogptq` flag:
+When you quantize a model using AutoGPTQ, a folder containing a file called `quantize_config.json` will be generated. Place that folder inside your `models/` folder and load it with the `--autogptq` flag:
 
 ```
 python server.py --autogptq --model model_name
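
Note on the install step: to check whether `nvcc` is already available, run `nvcc --version` inside the activated `textgen` environment; without it, the `pip install .` step cannot build AutoGPTQ's CUDA kernels, which is why the changed doc line points to the nvcc instructions.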
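
For context on the usage change: the `quantize_config.json` folder the new wording refers to is what AutoGPTQ writes out when it saves a quantized model. Below is a minimal sketch of that workflow using the AutoGPTQ Python API; the model name, calibration sentence, and output path are placeholders chosen for illustration, not part of the original document, and the exact API may differ between AutoGPTQ versions.

```python
# Minimal AutoGPTQ quantization sketch (placeholder model, text, and paths).
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

pretrained = "facebook/opt-125m"  # placeholder: any causal LM works
tokenizer = AutoTokenizer.from_pretrained(pretrained)

# 4-bit settings; these are saved to quantize_config.json with the weights.
quantize_config = BaseQuantizeConfig(bits=4, group_size=128)

model = AutoGPTQForCausalLM.from_pretrained(pretrained, quantize_config)

# A real run should use a proper calibration dataset, not a single sentence.
examples = [tokenizer("text-generation-webui is a web UI for running large language models.")]
model.quantize(examples)

# The resulting folder (including quantize_config.json) goes under models/.
model.save_quantized("models/opt-125m-4bit-gptq")
```

The saved folder is then loaded exactly as in the diff, e.g. `python server.py --autogptq --model opt-125m-4bit-gptq`.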