llama.cpp/gguf-py/gguf
Phillip Kravtsov 0e797c2fc5
llm : support Adept Persimmon 8B (#3410)
* Produces garbage output

* wip: correct tensors up to RoPE

* correct tensors thru RoPE

* Correct outputs through masked & softmax'd KQ

* fp32 works

* Rename adept->persimmon

* Produces correct outputs

* clean up convert scripts

* remove printing logic from ggml.c

* remove prints from llama.cpp & fix merge

* trivial cleanups

* Add offload funcs

* update conversion script to directly take adept artifacts rather than .saftensors file

* Fix norm eps bug

* Support sqr and concat on metal, persimmon-8b-q4 runs correctly

* Small changes from review

* Formatting changes

* Minor changes to conversion script

* Remove old script

* Fix editorconfig formatting

* Fix build

* add overlooked offload code ggml-ci
2023-10-07 10:12:43 +03:00
..
__init__.py gguf : export objects to user code (#2780) 2023-08-25 12:43:41 +03:00
gguf.py llm : support Adept Persimmon 8B (#3410) 2023-10-07 10:12:43 +03:00
py.typed convert : various script cleanups/fixes + merges and special token handling (#2842) 2023-08-30 11:25:50 +03:00