mirror of https://github.com/ggerganov/llama.cpp.git, synced 2024-12-30 16:07:17 +01:00
3df784b305
* Vulkan: Implement VK_KHR_cooperative_matrix support in the matrix matrix multiplication shader
* Improve performance with better q4_k and q5_k dequant and store unrolling
* Add Vulkan MUL_MAT and MUL_MAT_ID accumulator precision selection
* Rework mulmat shader selection and compilation logic, avoid compiling shaders that won't get used by device
* Vulkan: Implement accumulator switch for specific mul mat mat shaders
* Vulkan: Unroll more loops for more mul mat mat performance
* Vulkan: Add VK_AMD_shader_core_properties2 support to read Compute Unit count for split_k logic
* Disable coopmat support on AMD proprietary driver
* Remove redundant checks
* Add environment variable GGML_VK_DISABLE_COOPMAT to disable VK_KHR_cooperative_matrix support
* Fix rebase typo
* Fix coopmat2 MUL_MAT_ID pipeline selection
ggml-blas
ggml-cann
ggml-cpu
ggml-cuda
ggml-hip
ggml-kompute
ggml-metal
ggml-musa
ggml-rpc
ggml-sycl
ggml-vulkan
CMakeLists.txt
ggml-aarch64.c
ggml-aarch64.h
ggml-alloc.c
ggml-backend-impl.h
ggml-backend-reg.cpp
ggml-backend.cpp
ggml-common.h
ggml-impl.h
ggml-opt.cpp
ggml-quants.c
ggml-quants.h
ggml-threading.cpp
ggml-threading.h
ggml.c