text-generation-webui/docs/LLaMA-model.md

LLaMA is a Large Language Model developed by Meta AI. 

It was trained on more tokens than earlier models of comparable size. As a result, the smaller versions are competitive with much larger models: Meta reports that the version with 13 billion parameters outperforms GPT-3, which has 175 billion parameters, on most benchmarks.

This guide will cover usage through the official `transformers` implementation. For 4-bit mode, head over to [GPTQ models (4 bit mode)](GPTQ-models-(4-bit-mode).md).

## Getting the weights

### Option 1: pre-converted weights

* Direct download (recommended):

https://huggingface.co/Neko-Institute-of-Science/LLaMA-7B-HF

https://huggingface.co/Neko-Institute-of-Science/LLaMA-13B-HF

https://huggingface.co/Neko-Institute-of-Science/LLaMA-30B-HF

https://huggingface.co/Neko-Institute-of-Science/LLaMA-65B-HF

* Torrent:

https://github.com/oobabooga/text-generation-webui/pull/530#issuecomment-1484235789

The tokenizer files in the torrent above are outdated, in particular `tokenizer_config.json` and `special_tokens_map.json`. Up-to-date versions of those files are available here: https://huggingface.co/oobabooga/llama-tokenizer
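If you went the torrent route, replacing the two outdated files can be scripted. The following is a minimal sketch, assuming you have already downloaded the up-to-date files from the repository above into a local folder. The function name `refresh_tokenizer_files` and the `.bak` backup convention are illustrative choices, not part of the web UI.

```python
import shutil
from pathlib import Path

# File names taken from the note above; everything else is illustrative.
TOKENIZER_FILES = ("tokenizer_config.json", "special_tokens_map.json")


def refresh_tokenizer_files(fresh_dir, model_dir):
    """Copy up-to-date tokenizer files over the outdated ones in model_dir.

    fresh_dir: local folder with the files from oobabooga/llama-tokenizer
    model_dir: model folder obtained from the torrent
    Returns the list of file names that were replaced.
    """
    fresh_dir, model_dir = Path(fresh_dir), Path(model_dir)
    replaced = []
    for name in TOKENIZER_FILES:
        source = fresh_dir / name
        if not source.is_file():
            raise FileNotFoundError(f"missing {source}")
        target = model_dir / name
        if target.exists():
            # Keep a backup of the old file so the change can be undone.
            shutil.copy2(target, target.with_name(name + ".bak"))
        shutil.copy2(source, target)
        replaced.append(name)
    return replaced
```

For example, `refresh_tokenizer_files("llama-tokenizer", "models/llama-7b")` would update a model stored in `models/llama-7b`, assuming both folder names match your local layout.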

### Option 2: convert the weights yourself

1. Install the `protobuf` library (version 3.20.x or lower is needed):

```
pip install protobuf==3.20.1
```

2. Use the script below to convert the model in `.pth` format that you, a fellow academic, downloaded using Meta's official link.

If you have `transformers` installed in place:

```
python -m transformers.models.llama.convert_llama_weights_to_hf --input_dir /path/to/LLaMA --model_size 7B --output_dir /tmp/outputs/llama-7b
```

Otherwise download [convert_llama_weights_to_hf.py](https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/convert_llama_weights_to_hf.py) first and run:

```
python convert_llama_weights_to_hf.py --input_dir /path/to/LLaMA --model_size 7B --output_dir /tmp/outputs/llama-7b
```

3. Move the `llama-7b` folder inside your `text-generation-webui/models` folder.
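Steps 2 and 3 above can also be driven from Python. This is a rough sketch rather than an official tool: `conversion_command` and `install_model` are hypothetical helper names, and the sketch assumes `transformers` is installed so that the module form of the conversion script is available. Only the flags shown in the commands above are used.

```python
import shutil
import sys
from pathlib import Path


def conversion_command(input_dir, model_size, output_dir):
    """Build the argument list for the conversion command from step 2."""
    return [
        sys.executable, "-m",
        "transformers.models.llama.convert_llama_weights_to_hf",
        "--input_dir", str(input_dir),
        "--model_size", model_size,
        "--output_dir", str(output_dir),
    ]


def install_model(output_dir, models_dir):
    """Move a converted model folder into the web UI's models folder (step 3)."""
    output_dir, models_dir = Path(output_dir), Path(models_dir)
    if not output_dir.is_dir():
        raise FileNotFoundError(f"no converted model at {output_dir}")
    destination = models_dir / output_dir.name
    if destination.exists():
        # Refuse to overwrite a model that is already installed.
        raise FileExistsError(f"{destination} already exists")
    models_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(output_dir), str(destination)))
```

A typical use would be to pass the result of `conversion_command("/path/to/LLaMA", "7B", "/tmp/outputs/llama-7b")` to `subprocess.run(..., check=True)`, then call `install_model("/tmp/outputs/llama-7b", "text-generation-webui/models")`.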

## Starting the web UI

```
python server.py --model llama-7b
```
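If you start the web UI from a script, it can help to check that the model folder exists before launching. The sketch below assumes the layout described in this guide, with models stored in a `models` folder next to `server.py`. The helper name `launch_command` is hypothetical.

```python
import sys
from pathlib import Path


def launch_command(model_name, webui_dir="."):
    """Return the argument list that starts the web UI with the given model.

    Raises FileNotFoundError if <webui_dir>/models/<model_name> is missing,
    so a typo in the model name is caught before the server starts.
    """
    model_dir = Path(webui_dir) / "models" / model_name
    if not model_dir.is_dir():
        raise FileNotFoundError(f"no model folder at {model_dir}")
    return [sys.executable, "server.py", "--model", model_name]
```

The returned list can be passed to `subprocess.run(..., cwd=webui_dir, check=True)`, which is equivalent to running the command above from the `text-generation-webui` folder.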