201cc11afa
* add phi3 128k support in convert-hf-to-gguf * add phi3 128k support in cuda * address build warnings on llama.cpp * adjust index value in cuda long rope freq factors * add long rope support in ggml cpu backend * make freq factors only depend on ctx size * remove unused rope scaling type 'su' frin gguf converter * fix flint warnings on convert-hf-to-gguf.py * set to the short freq factor when context size is small than trained context size * add one line of comments * metal : support rope freq_factors * ggml : update ggml_rope_ext API to support freq. factors * backends : add dev messages to support rope freq. factors * minor : style * tests : update to use new rope API * backends : fix pragma semicolons * minor : cleanup * llama : move rope factors from KV header to tensors * llama : remove tmp assert * cuda : fix compile warning * convert : read/write n_head_kv * llama : fix uninitialized tensors --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> |
||
---|---|---|
.. | ||
examples | ||
gguf | ||
scripts | ||
tests | ||
LICENSE | ||
pyproject.toml | ||
README.md |
gguf
This is a Python package for writing binary files in the GGUF (GGML Universal File) format.
See convert-llama-hf-to-gguf.py as an example for its usage.
Installation
pip install gguf
API Examples/Simple Tools
examples/writer.py — Generates example.gguf
in the current directory to demonstrate generating a GGUF file. Note that this file cannot be used as a model.
scripts/gguf-dump.py — Dumps a GGUF file's metadata to the console.
scripts/gguf-set-metadata.py — Allows changing simple metadata values in a GGUF file by key.
scripts/gguf-convert-endian.py — Allows converting the endianness of GGUF files.
scripts/gguf-new-metadata.py — Copies a GGUF file with added/modified/removed metadata values.
Development
Maintainers who participate in development of this package are advised to install it in editable mode:
cd /path/to/llama.cpp/gguf-py
pip install --editable .
Note: This may require to upgrade your Pip installation, with a message saying that editable installation currently requires setup.py
.
In this case, upgrade Pip to the latest:
pip install --upgrade pip
Automatic publishing with CI
There's a GitHub workflow to make a release automatically upon creation of tags in a specified format.
- Bump the version in
pyproject.toml
. - Create a tag named
gguf-vx.x.x
wherex.x.x
is the semantic version number.
git tag -a gguf-v1.0.0 -m "Version 1.0 release"
- Push the tags.
git push origin --tags
Manual publishing
If you want to publish the package manually for any reason, you need to have twine
and build
installed:
pip install build twine
Then, follow these steps to release a new version:
- Bump the version in
pyproject.toml
. - Build the package:
python -m build
- Upload the generated distribution archives:
python -m twine upload dist/*
TODO
- Add tests
- Include conversion scripts as command line entry points in this package.