gguf

This is a Python package for reading and writing binary files in the GGUF (GGML Universal File) format.

See convert-llama-hf-to-gguf.py for an example of its usage.
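
To give a rough idea of what "binary files in the GGUF format" means at the byte level, here is a minimal sketch that builds and parses only the fixed-size file header using the standard library. It assumes the version 2/3 header layout (4-byte magic `GGUF`, 32-bit version, 64-bit tensor count, 64-bit metadata key/value count, little-endian). The helper names `pack_header` and `parse_header` are made up for illustration and are not part of this package's API; real files should be produced with the package itself.

```python
import struct

GGUF_MAGIC = b"GGUF"          # first four bytes of every GGUF file
HEADER_FORMAT = "<IQQ"        # version (u32), tensor count (u64), KV count (u64)


def pack_header(version: int, tensor_count: int, metadata_kv_count: int) -> bytes:
    """Build the 24-byte fixed header (hypothetical helper, little-endian)."""
    return GGUF_MAGIC + struct.pack(
        HEADER_FORMAT, version, tensor_count, metadata_kv_count
    )


def parse_header(data: bytes) -> dict:
    """Parse the fixed header back into its three fields."""
    if data[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file: bad magic")
    version, tensor_count, metadata_kv_count = struct.unpack_from(
        HEADER_FORMAT, data, 4
    )
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": metadata_kv_count,
    }


if __name__ == "__main__":
    header = pack_header(version=3, tensor_count=2, metadata_kv_count=5)
    print(parse_header(header))
```

The metadata key/value pairs and tensor data that follow the header are not covered by this sketch.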

Installation

pip install gguf

API Examples/Simple Tools

examples/writer.py — Generates example.gguf in the current directory to demonstrate generating a GGUF file. Note that this file cannot be used as a model.

scripts/gguf-dump.py — Dumps a GGUF file's metadata to the console.

scripts/gguf-set-metadata.py — Allows changing simple metadata values in a GGUF file by key.

scripts/gguf-convert-endian.py — Allows converting the endianness of GGUF files.
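
As an illustration of what an endianness conversion involves, the sketch below reverses the byte order of every element in a buffer of 32-bit floats using only the standard library. This is not the implementation of scripts/gguf-convert-endian.py, which also has to handle the file header, the metadata, and other tensor types; the function name `swap_float32` is hypothetical.

```python
import array
import struct


def swap_float32(raw: bytes) -> bytes:
    """Reverse the byte order of each 4-byte float in `raw`."""
    values = array.array("f")   # "f" is a 4-byte C float on common platforms
    values.frombytes(raw)       # raw copy of the bytes, no value conversion
    values.byteswap()           # swap the bytes of every element in place
    return values.tobytes()


if __name__ == "__main__":
    little = struct.pack("<3f", 1.0, 2.5, -4.0)
    big = swap_float32(little)
    # Interpreting the swapped bytes as big-endian recovers the same values.
    print(struct.unpack(">3f", big))
```

Applying the swap twice returns the original bytes, so the same routine converts in either direction.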

Development

Maintainers who participate in development of this package are advised to install it in editable mode:

cd /path/to/llama.cpp/gguf-py

pip install --editable .

Note: This may require upgrading your pip installation. The symptom is an error message saying that editable installation currently requires setup.py. In that case, upgrade pip to the latest version:

pip install --upgrade pip

Automatic publishing with CI

There's a GitHub workflow to make a release automatically upon creation of tags in a specified format.

  1. Bump the version in pyproject.toml.
  2. Create a tag named gguf-vx.x.x where x.x.x is the semantic version number.
git tag -a gguf-v1.0.0 -m "Version 1.0 release"
  3. Push the tags.
git push origin --tags
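
Because the release is triggered by the tag name, a small check before tagging can catch typos. The following sketch derives the tag from a version string and validates an existing tag. It assumes the version is exactly three dot-separated integers; the actual workflow may accept other forms, and the names `release_tag` and `parse_release_tag` are hypothetical helpers, not part of this package.

```python
import re
from typing import Tuple

# Assumption: tags look like gguf-v<major>.<minor>.<patch> with plain integers.
TAG_PATTERN = re.compile(r"^gguf-v(\d+)\.(\d+)\.(\d+)$")


def release_tag(version: str) -> str:
    """Return the tag name for a version string such as '1.0.0'."""
    tag = f"gguf-v{version}"
    parse_release_tag(tag)  # raises ValueError if the version is malformed
    return tag


def parse_release_tag(tag: str) -> Tuple[int, int, int]:
    """Split a release tag into (major, minor, patch)."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"unexpected tag format: {tag!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


if __name__ == "__main__":
    print(release_tag("1.0.0"))
```

Comparing the parsed tuple with the version in pyproject.toml before pushing helps keep the tag and the package version in sync.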

Manual publishing

If you want to publish the package manually for any reason, you need to have twine and build installed:

pip install build twine

Then, follow these steps to release a new version:

  1. Bump the version in pyproject.toml.
  2. Build the package:
python -m build
  3. Upload the generated distribution archives:
python -m twine upload dist/*

TODO

  • Add tests
  • Include conversion scripts as command line entry points in this package.