llama.cpp/tests at 15fbcb5df7b761ce8fb948ff25f18654dd1dbf42

Mirrors/llama.cpp

Mirror of https://github.com/ggerganov/llama.cpp.git, synced 2025-01-12 13:27:21 +01:00

Molly Sophia ee7136c6d1

llama: add support for QRWKV6 model architecture (#11001)

* WIP: Add support for RWKV6Qwen2

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

* RWKV: Some graph simplification

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

* Add support for RWKV6Qwen2 with cpu and cuda GLA

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

* RWKV6[QWEN2]: Concat lerp weights together to reduce cpu overhead

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

* Fix some typos

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

* code format changes

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

* Fix wkv test & add gla test

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

* Fix cuda warning

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

* Update README.md

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

* Update ggml/src/ggml-cuda/gla.cu

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Fix fused lerp weights loading with RWKV6

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

* better sanity check skipping for QRWKV6 in llama-quant

thanks @compilade

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
Co-authored-by: compilade <git@compilade.net>

---------

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: compilade <git@compilade.net>

2025-01-10 09:58:08 +08:00

.gitignore

tests : gitignore ggml-common.h

2024-03-09 14:17:11 +02:00

CMakeLists.txt

tests: add tests for GGUF (#10830)

2024-12-17 19:09:35 +01:00

get-model.cpp

ci : add model tests + script wrapper (#4586)

2024-01-26 14:18:00 +02:00

get-model.h

ci : add model tests + script wrapper (#4586)

2024-01-26 14:18:00 +02:00

run-json-schema-to-grammar.mjs

server : revamp chat UI with vuejs and daisyui (#10175)

2024-11-07 17:31:10 -04:00

test-arg-parser.cpp

speculative : refactor and add a simpler example (#10362)

2024-11-25 09:58:41 +02:00

test-autorelease.cpp

llama : update llama_model API names (#11063)

2025-01-06 10:55:18 +02:00

test-backend-ops.cpp

llama: add support for QRWKV6 model architecture (#11001)

2025-01-10 09:58:08 +08:00

test-barrier.cpp

ggml : move CPU backend to a separate file (#10144)

2024-11-03 19:34:08 +01:00

test-c.c

Nomic Vulkan backend (#4456)

2024-01-29 15:50:50 -05:00

test-chat-template.cpp

llama-chat : add phi 4 template (#11148)

2025-01-09 10:07:33 +01:00

test-double-float.cpp

ggml : minor naming changes (#8433)

2024-07-12 10:46:02 +03:00

test-gguf.cpp

GGUF: C++ refactor, backend support, misc fixes (#11030)

2025-01-07 18:01:58 +01:00

test-grammar-integration.cpp

llama : minor grammar refactor (#10897)

2024-12-19 17:42:13 +02:00

test-grammar-parser.cpp

llama : refactor sampling v2 (#9294)

2024-09-07 15:16:19 +03:00

test-json-schema-to-grammar.cpp

grammar : fix JSON Schema for string regex with top-level alt. (#9903)

2024-10-16 19:03:24 +03:00

test-llama-grammar.cpp

llama : minor grammar refactor (#10897)

2024-12-19 17:42:13 +02:00

test-log.cpp

common : use common_ prefix for common library functions (#9805)

2024-10-10 22:57:42 +02:00

test-lora-conversion-inference.sh

Fix HF repo commit to clone lora test models (#10649)

2024-12-04 10:45:48 +01:00

test-model-load-cancel.cpp

llama : update llama_model API names (#11063)

2025-01-06 10:55:18 +02:00

test-opt.cpp

ggml : inttypes.h -> cinttypes (#0)

2024-11-17 08:30:29 +02:00

test-quantize-fns.cpp

tests : fix compile warning

2024-11-25 15:17:32 +02:00

test-quantize-perf.cpp

ggml : inttypes.h -> cinttypes (#0)

2024-11-17 08:30:29 +02:00

test-rope.cpp

llama : add Qwen2VL support + multimodal RoPE (#10361)

2024-12-14 14:43:46 +02:00

test-sampling.cpp

sampling : refactor + optimize penalties sampler (#10803)

2024-12-16 12:31:14 +02:00

test-tokenizer-0.cpp

llama : update llama_model API names (#11063)

2025-01-06 10:55:18 +02:00

test-tokenizer-0.py

py : logging and flake8 suppression refactoring (#7081)

2024-05-05 08:07:48 +03:00

test-tokenizer-0.sh

tests : fix test-tokenizer-0.sh

2024-05-28 15:04:09 +03:00

test-tokenizer-1-bpe.cpp

llama : update llama_model API names (#11063)

2025-01-06 10:55:18 +02:00

test-tokenizer-1-spm.cpp

llama : update llama_model API names (#11063)

2025-01-06 10:55:18 +02:00

test-tokenizer-random.py

llama : fix pre-tokenization of non-special added tokens (#8228)

2024-07-13 23:35:10 -04:00