Mirror of https://github.com/ggerganov/llama.cpp.git (synced 2024-12-27 06:39:25 +01:00)
Commit 64ae065511:

* Double the number of rows per workgroup
* Update ggml-vulkan.cpp
* Vulkan: Add VK_EXT_subgroup_size_control support to ensure full subgroups for coopmats (see the sketch after this list)
* Only increase the number of rows for AMD and subgroup size 64
* Fix missing NUM_ROWS for mul_mat_vec_iq4_nl_f16_f32 (untested)
* Use subgroup min and max to check for GCN (requires https://github.com/ggerganov/llama.cpp/pull/10721)
* Manual merge of ggml-vulkan.cpp
* Set min and max subgroup size in any case
* Also double the number of rows for Intel GPUs
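The commit message only names the two Vulkan mechanisms it relies on: querying the device's minimum and maximum supported subgroup sizes (used here as a heuristic for detecting GCN hardware), and chaining a required subgroup size into the shader stage so the coopmat pipelines run with full subgroups. The following is a minimal C++ sketch of both, assuming a Vulkan 1.1+ device with VK_EXT_subgroup_size_control enabled; the helper names (`query_subgroup_info`, `make_compute_stage`) and the exact GCN check are illustrative, not the actual ggml-vulkan.cpp code.

```cpp
#include <vulkan/vulkan.h>

struct SubgroupInfo {
    uint32_t min_size;
    uint32_t max_size;
    bool     is_gcn; // heuristic, see below
};

// Query the subgroup size limits exposed by VK_EXT_subgroup_size_control.
SubgroupInfo query_subgroup_info(VkPhysicalDevice phys_dev) {
    VkPhysicalDeviceSubgroupSizeControlPropertiesEXT sg_props = {};
    sg_props.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SUBGROUP_SIZE_CONTROL_PROPERTIES_EXT;

    VkPhysicalDeviceProperties2 props2 = {};
    props2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2;
    props2.pNext = &sg_props;
    vkGetPhysicalDeviceProperties2(phys_dev, &props2);

    SubgroupInfo info = {};
    info.min_size = sg_props.minSubgroupSize;
    info.max_size = sg_props.maxSubgroupSize;
    // Heuristic implied by the commit message: GCN only supports wave64,
    // while RDNA exposes both 32 and 64, so min == max == 64 on an AMD
    // device (PCI vendor ID 0x1002) suggests GCN. This is an assumption
    // for illustration, not a guaranteed detection method.
    info.is_gcn = props2.properties.vendorID == 0x1002 &&
                  info.min_size == 64 && info.max_size == 64;
    return info;
}

// Build a compute shader stage; if a required subgroup size is supplied,
// chain it in and ask the driver to launch full subgroups only.
VkPipelineShaderStageCreateInfo make_compute_stage(
        VkShaderModule module,
        const VkPipelineShaderStageRequiredSubgroupSizeCreateInfoEXT *required) {
    VkPipelineShaderStageCreateInfo stage = {};
    stage.sType  = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
    stage.pNext  = required; // may be NULL when no size is forced
    stage.stage  = VK_SHADER_STAGE_COMPUTE_BIT;
    stage.module = module;
    stage.pName  = "main";
    if (required != NULL) {
        // Guarantees the workgroup is made up of full subgroups, which
        // coopmat shaders need for correctness.
        stage.flags |= VK_PIPELINE_SHADER_STAGE_CREATE_REQUIRE_FULL_SUBGROUPS_BIT_EXT;
    }
    return stage;
}
```

With information like this, a backend could pick a larger row count per workgroup when `is_gcn` is true, and fill in a `VkPipelineShaderStageRequiredSubgroupSizeCreateInfoEXT` with `requiredSubgroupSize = 64` when creating the coopmat pipelines. Note that requiring full subgroups obliges the workgroup's X dimension to be a multiple of the chosen subgroup size.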
Directory listing:

* include
* src
* .gitignore
* CMakeLists.txt