Commit Graph

55 Commits

Author SHA1 Message Date
Stephan Walter
3e6e70d8e8
Add enum llama_ftype, sync ggml_type to model files (#709) 2023-04-11 15:03:51 +00:00
Slaren
64bde3ffd4 Fix ggml_init_params in quantize 2023-03-30 12:28:25 -07:00
Stephan Walter
436e561931
all : be more strict about converting float to double (#458)
* Be more strict about converting float to double

* Test equivalence of round, SILU implementations

Test module is commented out in CMakeLists.txt because the tests may
take a long time, depending on how much the compiler optimizes.

* Fix softmax in perplexity.cpp

* all : prefer float over double where appropriate

* perplexity : add <cmath>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-28 19:48:20 +03:00
Stephan Walter
c1f885067c
ggml : introduce structs for the q4 data blocks (#356)
* Introduce structs for the q4 data blocks

* ggml : rename quant struct variables + fix ARM_NEON

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-28 18:56:03 +03:00
Georgi Gerganov
a316a425d0
Overhaul the examples structure
- main -> examples
- utils -> examples (renamed to "common")
- quantize -> examples
- separate tools for "perplexity" and "embedding"

Hope I didn't break something !
2023-03-25 20:26:40 +02:00