llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-27 04:23:06 +01:00

Ashish dbceec87c0 llama : add StableLM2 12B (#6635)	2024-04-16 18:48:35 +03:00
* StableLM2 12B support for huggingface -> GGUF
* StableLM12 tensormapping and constants
* StableLM-2-12b model support
* fix
* Added 12B support
* Removed autoformatting; resolved bug where model_arch was not selecting StableLM2
* Formatting
* Do QK norm stacking in model conversion step
* Converge StableLM and StableLM2 code to simplify graph construction
* Fix accidental removal
* Removed warnings
* Revert formatter
* Move QK norm stack to private function so it's easier to read
* refactor stablelm graph builder to support 1.6, 3b and 12b more efficiently
* Proper check for None type for new_name to avoid crash; formatting; revert change to base class `write_tensors()`
* Format
* Formatting
* format
Co-authored-by: compilade <git@compilade.net>
* Fix incorrect check for K norm
* space after commas; Keep indentation multiple of 4 spaces
* Flake8 format
* Removed unnecessary conditional branches
* Removed unused comment
* Fixed incorrect tensor passing
* Format
---------
Co-authored-by: compilade <git@compilade.net>
__init__.py	gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981)	2023-11-11 08:04:50 +03:00
constants.py	llama : add StableLM2 12B (#6635)	2024-04-16 18:48:35 +03:00
gguf_reader.py	gguf : add support for I64 and F64 arrays (#6062)	2024-03-15 10:46:51 +02:00
gguf_writer.py	gguf : add special tokens metadata for FIM/Infill (#6689)	2024-04-16 09:13:13 +03:00
gguf.py	gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981)	2023-11-11 08:04:50 +03:00
py.typed	convert : various script cleanups/fixes + merges and special token handling (#2842)	2023-08-30 11:25:50 +03:00
tensor_mapping.py	llama : add qwen2moe (#6074)	2024-04-16 18:40:48 +03:00
vocab.py	fix(gguf-py): special tokens are no longer skipped when add_<token>_token is set to false (#5487)	2024-02-15 14:14:37 +01:00