export-lora

Apply LoRA adapters to a base model and export the resulting merged model.

usage: llama-export-lora [options]

options:
  -m,    --model                  model path from which to load the base model (default '')
         --lora FNAME             path to LoRA adapter (can be repeated to use multiple adapters)
         --lora-scaled FNAME S    path to LoRA adapter with user-defined scaling S (can be repeated to use multiple adapters)
  -t,    --threads N              number of threads to use during computation (default: 4)
  -o,    --output FNAME           output file (default: 'ggml-lora-merged-f16.gguf')

For example:

./bin/llama-export-lora \
    -m open-llama-3b-v2.gguf \
    -o open-llama-3b-v2-english2tokipona-chat.gguf \
    --lora lora-open-llama-3b-v2-english2tokipona-chat-LATEST.gguf
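Conceptually, the merge adds each adapter's low-rank update to the base weights: every adapted weight matrix W becomes W + S * (B x A), where A and B are the adapter's low-rank matrices and S is the scale (plain --lora applies scale 1.0). Below is a minimal C++ sketch of that arithmetic only; the real tool operates on GGUF tensors through ggml, and the function name, shapes, and layout here are illustrative assumptions.

#include <cstddef>
#include <vector>

// Illustrative sketch only (not the export-lora implementation):
// apply W' = W + scale * (B x A) for one adapter.
// W is rows x cols, B is rows x r, A is r x cols, all row-major.
void merge_lora(std::vector<float> & W,
                const std::vector<float> & A,
                const std::vector<float> & B,
                std::size_t rows, std::size_t cols, std::size_t r,
                float scale) {
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < r; ++k) {
                acc += B[i*r + k] * A[k*cols + j]; // (B x A)[i][j]
            }
            W[i*cols + j] += scale * acc;
        }
    }
}

int main() {
    // 2x2 weight matrix, rank-1 adapter, scale 1.0 (what plain --lora uses)
    std::vector<float> W = {1, 0, 0, 1};
    std::vector<float> B = {1, 2}; // 2 x 1
    std::vector<float> A = {3, 4}; // 1 x 2
    merge_lora(W, A, B, 2, 2, 1, 1.0f);
    // W is now {4, 4, 6, 9}
}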

Multiple LoRA adapters can be applied by passing --lora FNAME or --lora-scaled FNAME S multiple times on the command line:

./bin/llama-export-lora \
    -m your_base_model.gguf \
    -o your_merged_model.gguf \
    --lora-scaled lora_task_A.gguf 0.5 \
    --lora-scaled lora_task_B.gguf 0.5
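Because each adapter contributes an independent scaled delta, merging several adapters simply sums those deltas into the same base weights; for the example above the result is W + 0.5 * (B_A x A_A) + 0.5 * (B_B x A_B). A toy scalar illustration of that accumulation (hypothetical values, not the real code):

#include <cstdio>

int main() {
    float w   = 1.0f;  // one base weight
    float d_A = 0.2f;  // the (B_A x A_A) entry for this weight
    float d_B = -0.4f; // the (B_B x A_B) entry for this weight
    float merged = w + 0.5f*d_A + 0.5f*d_B; // scales from --lora-scaled
    std::printf("merged weight: %f\n", merged); // 1.0 + 0.1 - 0.2 = 0.9
}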