mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-13 05:42:22 +01:00

jiez 1966eb2615

quantize : add '--keep-split' to quantize model into shards (#6688)

* Implement '--keep-split' to quantize model into several shards

* Add test script

* Update examples/quantize/quantize.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Split model correctly even if tensor id is out-of-order

* Update llama_model_quantize_params

* Fix preci failures

---------

Co-authored-by: z5269887 <z5269887@unsw.edu.au>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

2024-04-25 13:29:35 +03:00

CMakeLists.txt

build : link against build info instead of compiling against it (#3879)

2023-11-02 08:50:16 +02:00

quantize.cpp

quantize : add '--keep-split' to quantize model into shards (#6688)

2024-04-25 13:29:35 +03:00

README.md

chore: Fix markdown warnings (#6625)

2024-04-12 10:52:36 +02:00

test.sh

quantize : add '--keep-split' to quantize model into shards (#6688)