Commit Graph

5 Commits

Author SHA1 Message Date
Georgi Gerganov
e0429d38e4
convert-new.py : output gguf (#2635)
* convert-new.py : output gguf (WIP)

* convert-new.py : add gguf key-value pairs

* llama : add hparams.ctx_train + no longer print ftype

* convert-new.py : minor fixes

* convert-new.py : vocab-only option should work now

* llama : fix tokenizer to use llama_char_to_byte

* tests : add new ggml-vocab-llama.gguf

* convert-new.py : tensor name mapping

* convert-new.py : add map for skipping tensor serialization

* convert-new.py : convert script now works

* gguf.py : pick some of the refactoring from #2644

* convert-new.py : minor fixes
2023-08-17 17:19:52 +03:00
Georgi Gerganov
5ec18934ad
convert-new.py : pick #2427 for HF 70B support 2023-08-16 20:16:15 +03:00
goerch
afc4ca2889
convert : update convert-new.py with tokenizer fixes (#2614)
* Merge tokenizer fixes into the gguf branch.

* Add test vocabularies

* Adapt convert-new.py (and fix a clang-cl compiler error on windows)
2023-08-14 20:20:04 +03:00
Georgi Gerganov
d2bb3ac10b
convert.py : remove GGML vocab + other obsolete stuff 2023-07-27 16:36:35 +03:00
Georgi Gerganov
68f53485e4
convert.py : start a new simplified implementation by removing old stuff 2023-07-27 15:56:53 +03:00