mirror of
https://github.com/ggerganov/llama.cpp.git
synced 2025-02-06 08:30:33 +01:00
* add glm edge chat model * use config partial_rotary_factor as rope ratio * support for glm edge model * vision model support * remove debug info * fix format * llava.cpp trailing whitespace * remove unused AutoTokenizer * Update src/llama.cpp for not contain <|end|> or </s> Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com> * add edge template * fix chat template * fix confict * fix confict * fix ci err * fix format err * fix template err * 9b hf chat support * format * format clip.cpp * fix format * Apply suggestions from code review * Apply suggestions from code review * Update examples/llava/clip.cpp * fix format * minor : style --------- Co-authored-by: liyuhang <yuhang.li@zhipuai.cn> Co-authored-by: piDack <pcdack@hotmail.co> Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com> Co-authored-by: liyuhang <yuhang.li@aminer.cn> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1.6 KiB
GLMV-EDGE
Currently this implementation supports glm-edge-v-2b and glm-edge-v-5b.
Usage
Build with cmake, or run `make llama-llava-cli` to build only this tool.
After building, run `./llama-llava-cli` with no arguments to see the usage. For example:
./llama-llava-cli -m model_path/ggml-model-f16.gguf --mmproj model_path/mmproj-model-f16.gguf --image img_path/image.jpg -p "<|system|>\n system prompt <image><|user|>\n prompt <|assistant|>\n"
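The prompt in the example command above follows a fixed layout: a system turn, the `<image>` placeholder, a user turn, and an opening assistant turn. A minimal Python sketch of assembling that string is shown here, assuming the layout from the example is the one the model expects. The helper name `build_glm_edge_prompt` is hypothetical and not part of llama.cpp. It emits the two-character sequence `\n` (backslash, n) rather than a real newline, so that the result matches the argument the shell passes in the example.

```python
# Hypothetical helper: mirrors the prompt layout of the example command.
# NL is the literal two-character sequence backslash + "n", exactly as the
# double-quoted shell string in the example passes it to the program.
NL = r"\n"


def build_glm_edge_prompt(system_prompt: str, user_prompt: str) -> str:
    """Return a prompt string in the layout used by the example command."""
    return (
        f"<|system|>{NL} {system_prompt} "
        f"<image><|user|>{NL} {user_prompt} "
        f"<|assistant|>{NL}"
    )


if __name__ == "__main__":
    # Reproduces the -p argument from the example command.
    print(build_glm_edge_prompt("system prompt", "prompt"))
```

The `<image>` placeholder is kept between the system and user turns, as in the example. Moving it elsewhere is untested here.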
Note: A lower temperature such as 0.1 is recommended for better quality. Add `--temp 0.1` to the command to set it.
Note: For GPU offloading, pass the `-ngl` flag as usual.
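The flags mentioned above can be combined into a single invocation. The following is a minimal Python sketch that builds the argument list. The function name `build_llava_command` and its parameters are hypothetical; only the flags themselves (`-m`, `--mmproj`, `--image`, `-p`, `--temp`, `-ngl`) come from this README. The layer count given to `-ngl` is an illustrative value, not a recommendation.

```python
import shlex
from typing import List, Optional


def build_llava_command(
    model: str,
    mmproj: str,
    image: str,
    prompt: str,
    temp: float = 0.1,                   # lower temperature, as recommended above
    n_gpu_layers: Optional[int] = None,  # only passed when GPU offloading is wanted
) -> List[str]:
    """Build the argument list for llama-llava-cli (hypothetical helper)."""
    cmd = [
        "./llama-llava-cli",
        "-m", model,
        "--mmproj", mmproj,
        "--image", image,
        "-p", prompt,
        "--temp", str(temp),
    ]
    if n_gpu_layers is not None:
        cmd += ["-ngl", str(n_gpu_layers)]
    return cmd


if __name__ == "__main__":
    cmd = build_llava_command(
        "model_path/ggml-model-f16.gguf",
        "model_path/mmproj-model-f16.gguf",
        "img_path/image.jpg",
        "prompt",
        n_gpu_layers=99,  # illustrative value only
    )
    # Print a shell-quoted form; pass cmd to subprocess.run(cmd) to execute it.
    print(shlex.join(cmd))
```

Building a list rather than a single string avoids shell quoting problems with the `<|...|>` markers in the prompt.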
GGUF conversion
- Clone one of the supported models, glm-edge-v-5b or glm-edge-v-2b:
git clone https://huggingface.co/THUDM/glm-edge-v-5b
or
git clone https://huggingface.co/THUDM/glm-edge-v-2b
- Use `glmedge-surgery.py` to split the GLMV-EDGE model into its LLM and multimodal projector constituents:
python ./examples/llava/glmedge-surgery.py -m ../model_path
- Use `glmedge-convert-image-encoder-to-gguf.py` to convert the GLMV-EDGE image encoder to GGUF:
python ./examples/llava/glmedge-convert-image-encoder-to-gguf.py -m ../model_path --llava-projector ../model_path/glm.projector --output-dir ../model_path
- Use `convert_hf_to_gguf.py` (in the repository root) to convert the LLM part of GLMV-EDGE to GGUF:
python convert_hf_to_gguf.py ../model_path
Now both the LLM part and the image encoder are in the `model_path` directory.
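The three conversion steps run in a fixed order: surgery first, then the image encoder, then the LLM. The sketch below collects them as argument lists in Python, assuming it is run from the repository root and that the surgery step writes `glm.projector` into the model directory, as the commands above imply. The function name `conversion_commands` is hypothetical.

```python
import shlex
from typing import List


def conversion_commands(model_dir: str) -> List[List[str]]:
    """Return the three conversion commands from this README, in order.

    Hypothetical helper; script paths and flags are copied from the steps above.
    """
    projector = f"{model_dir}/glm.projector"  # assumed output of the surgery step
    return [
        # 1. Split the model into LLM and projector parts.
        ["python", "./examples/llava/glmedge-surgery.py", "-m", model_dir],
        # 2. Convert the image encoder (with the projector) to GGUF.
        [
            "python",
            "./examples/llava/glmedge-convert-image-encoder-to-gguf.py",
            "-m", model_dir,
            "--llava-projector", projector,
            "--output-dir", model_dir,
        ],
        # 3. Convert the LLM part to GGUF.
        ["python", "convert_hf_to_gguf.py", model_dir],
    ]


if __name__ == "__main__":
    for cmd in conversion_commands("../model_path"):
        # Printed only; run each with subprocess.run(cmd, check=True) to execute.
        print(shlex.join(cmd))
```

Running the commands with `check=True` stops the sequence at the first failing step, which matters here because each step depends on files produced by the previous one.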