llama.cpp/mmq-instance-iq3_xxs.cu at a813badbbdf0d38705f249df7a0c99af5cdee678

Mirrors/llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-09 20:18:57 +01:00

Johannes Gäßler 69c487f4ed

CUDA: MMQ code deduplication + iquant support (#8495)

* CUDA: MMQ code deduplication + iquant support

* 1 less parallel job for CI build

2024-07-20 22:25:26 +02:00

6 lines

141 B

// This file has been autogenerated by generate_cu_files.py, do not edit manually.
#include "../mmq.cuh"
DECL_MMQ_CASE(GGML_TYPE_IQ3_XXS);
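
The file above contains only an include and a single macro invocation. Given the commit's stated goal of deduplicating the MMQ code, one plausible reading is that `DECL_MMQ_CASE` expands to an explicit template instantiation, so that each quantization type is compiled in its own translation unit. The definition in `mmq.cuh` is not shown here, so the host-only C++ sketch below illustrates that pattern under this assumption rather than reproducing the real macro; `demo_type`, `demo_mul_mat_case`, and `DECL_DEMO_CASE` are hypothetical names.

```cpp
// Hypothetical stand-in for a quantization type enum (not the real ggml_type).
enum demo_type { DEMO_TYPE_A = 0, DEMO_TYPE_B = 1 };

// Hypothetical stand-in for a per-type templated kernel launcher.
// In a real project this template would be defined in a shared header.
template <demo_type type>
int demo_mul_mat_case(int x) {
    // The body depends on the template parameter, so every type
    // produces a separate compiled function.
    return x + static_cast<int>(type);
}

// Assumed shape of a DECL_*_CASE macro: it expands to an explicit
// instantiation definition of the template for one type. The trailing
// semicolon is supplied at the point of use.
#define DECL_DEMO_CASE(type) \
    template int demo_mul_mat_case<type>(int x)

// One such line per generated file forces the compiler to emit code
// for exactly this type in this translation unit.
DECL_DEMO_CASE(DEMO_TYPE_B);
```

Splitting instantiations across small generated files like this lets the build compile each type in parallel and keeps the shared template code in one header.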