llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-02-04 15:43:53 +01:00

Eve 64ae065511 vulkan: small mul_mat_vec optimizations (#10665)	2024-12-13 09:42:04 +01:00
* double the number of rows per workgroup
* Update ggml-vulkan.cpp
* Vulkan: Add VK_EXT_subgroup_size_control support to ensure full subgroups for coopmats
* only increase the number of rows for amd and subgroup size 64
* fix missing NUM_ROWS for mul_mat_vec_iq4_nl_f16_f32, untested
* use subgroup min and max to check for gcn (requires https://github.com/ggerganov/llama.cpp/pull/10721)
* manual merge ggml-vulkan.cpp
* set min and max subgroup size in any case
* Also double the number of rows for Intel GPUs
include	ggml: load all backends from a user-provided search path (#10699)	2024-12-11 01:47:21 +01:00
src	vulkan: small mul_mat_vec optimizations (#10665)	2024-12-13 09:42:04 +01:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797)	2024-12-12 19:02:49 +01:00