mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-27 06:39:25 +01:00

quantize

You can also use the GGUF-my-repo space on Hugging Face to build your own quants without any setup.

Note: The space is synced from the llama.cpp main branch every 6 hours.

Llama 2 7B