Document logits_all
This commit is contained in:
parent 5c0559da69
commit 322c170566
@@ -327,6 +327,7 @@ Optionally, you can use the following command-line flags:
| `--tensor_split TENSOR_SPLIT` | Split the model across multiple GPUs. Comma-separated list of proportions. Example: 18,17. |
| `--llama_cpp_seed SEED` | Seed for llama-cpp models. Default is 0 (random). |
| `--numa` | Activate NUMA task allocation for llama.cpp. |
| `--logits_all` | Needs to be set for perplexity evaluation to work. Otherwise, ignore it, as it makes prompt processing slower. |
| `--cache-capacity CACHE_CAPACITY` | Maximum cache capacity (llama-cpp-python). Examples: 2000MiB, 2GiB. When provided without units, bytes will be assumed. |
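For reference, these llama.cpp flags correspond to arguments of llama-cpp-python's `Llama` constructor. Below is a minimal sketch of roughly how the flags in the table above map onto that API; the model path and split values are placeholders, this is not the webui's actual loader code, and the exact parameter set depends on the installed llama-cpp-python version.

```python
from llama_cpp import Llama

# Hypothetical illustration of the flags documented above.
llm = Llama(
    model_path="models/example-13b.Q4_K_M.gguf",  # placeholder path
    seed=0,                  # --llama_cpp_seed SEED
    tensor_split=[18, 17],   # --tensor_split 18,17 (proportions across GPUs)
    logits_all=True,         # --logits_all: keep logits for every prompt token
)
```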
#### ExLlama
@@ -110,6 +110,10 @@ To use it, you need to download a tokenizer. There are two options:
1) Download `oobabooga/llama-tokenizer` under "Download model or LoRA". That's a default Llama tokenizer.

2) Place your .gguf in a subfolder of `models/` along with these 3 files: `tokenizer.model`, `tokenizer_config.json`, and `special_tokens_map.json`. This takes precedence over Option 1.
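To make Option 2 concrete, a model folder would look roughly like this; `my-model/` and the .gguf filename are placeholders, while the three tokenizer filenames are the ones listed above:

```
models/
└── my-model/
    ├── my-model.Q4_K_M.gguf
    ├── tokenizer.model
    ├── tokenizer_config.json
    └── special_tokens_map.json
```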
It has an additional parameter:

* **logits_all**: Needs to be checked if you want to evaluate the perplexity of the llama.cpp model using the "Training" > "Perplexity evaluation" tab. Otherwise, leave it unchecked, as it makes prompt processing slower.
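Why the flag matters: perplexity is computed from the model's predicted distribution at every position of the prompt, so the loader has to keep logits for all tokens rather than only the last one, which is also why ordinary prompt processing gets slower with the flag enabled. The following is a minimal, self-contained sketch of such a pass written directly against llama-cpp-python; the model path and text are placeholders, this is not the webui's evaluation code, and details such as the `scores` attribute depend on the llama-cpp-python version.

```python
import numpy as np
from llama_cpp import Llama

# Minimal perplexity sketch; requires logits_all=True so that logits are kept
# for every prompt position, not just the final one.
llm = Llama(model_path="models/example-13b.Q4_K_M.gguf", logits_all=True, verbose=False)

tokens = llm.tokenize("The quick brown fox jumps over the lazy dog.".encode("utf-8"))
llm.reset()
llm.eval(tokens)

# With logits_all=True, llm.scores[i] holds the logits produced after seeing
# token i, i.e. the predicted distribution over token i+1.
nll = 0.0
for i in range(len(tokens) - 1):
    logits = llm.scores[i]
    log_probs = logits - logits.max() - np.log(np.sum(np.exp(logits - logits.max())))
    nll -= log_probs[tokens[i + 1]]

print("perplexity:", float(np.exp(nll / (len(tokens) - 1))))
```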
### ctransformers
Loads: GGUF/GGML models.