jiez 1966eb2615
quantize : add '--keep-split' to quantize model into shards (#6688)
* Implement '--keep-split' to quantize model into several shards

* Add test script

* Update examples/quantize/quantize.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Split model correctly even if tensor id is out-of-order

* Update llama_model_quantize_params

* Fix preci failures

---------

Co-authored-by: z5269887 <z5269887@unsw.edu.au>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-04-25 13:29:35 +03:00
..
2024-04-09 13:44:08 -04:00
2024-04-14 13:12:59 +02:00
2024-04-09 13:44:08 -04:00
2023-08-30 09:29:32 +03:00
2024-03-07 11:41:53 +02:00