llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-02-05 16:10:42 +01:00

Rémy Oudompheng 66ee4f297c	2025-01-29 18:29:39 +01:00
vulkan: implement initial support for IQ2 and IQ3 quantizations (#11360)

* vulkan: initial support for IQ3_S
* vulkan: initial support for IQ3_XXS
* vulkan: initial support for IQ2_XXS
* vulkan: initial support for IQ2_XS
* vulkan: optimize Q3_K by removing branches
* vulkan: implement dequantize variants for coopmat2
* vulkan: initial support for IQ2_S
* vulkan: vertically realign code
* port failing dequant callbacks from mul_mm
* Fix array length mismatches
* vulkan: avoid using workgroup size before it is referenced
* tests: increase timeout for Vulkan llvmpipe backend

---------

Co-authored-by: Jeff Bolz <jbolz@nvidia.com>
cmake	cmake: add ggml find package (#11369)	2025-01-26 12:07:48 -04:00
include	rpc : early register backend devices (#11262)	2025-01-17 10:57:09 +02:00
src	vulkan: implement initial support for IQ2 and IQ3 quantizations (#11360)	2025-01-29 18:29:39 +01:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	cmake: add ggml find package (#11369)	2025-01-26 12:07:48 -04:00