llama.cpp/ggml/src/ggml-vulkan/vulkan-shaders
Jeff Bolz bd38ddea01
vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (#11166)
* vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl

Shaders are based on cpy.cu.

* vulkan: support copy from q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl to f32

* ggml: copy q->f32 assumes some contiguity in the destination
2025-01-16 22:47:10 +01:00
acc.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
add.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
argsort.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
clamp.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
CMakeLists.txt fix: ggml: fix vulkan-shaders-gen build (#10448) 2025-01-15 14:17:42 +01:00
concat.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
contig_copy.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
copy_from_quant.comp vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (#11166) 2025-01-16 22:47:10 +01:00
copy_to_quant.comp vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (#11166) 2025-01-16 22:47:10 +01:00
copy.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
cos.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
dequant_f32.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_funcs_cm2.comp vulkan: optimize coopmat2 q4_k/q5_k dequant functions. (#11206) 2025-01-16 22:23:49 +01:00
dequant_funcs.comp vulkan: small mul_mat_vec optimizations (#10665) 2024-12-13 09:42:04 +01:00
dequant_head.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_iq4_nl.comp vulkan: copy iq4_nl LUT into shared memory (#10409) 2024-11-20 08:40:18 +01:00
dequant_q2_k.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q3_k.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q4_0.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q4_1.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q4_k.comp Vulkan: Use improved q4_k and q5_k dequant code in dequant shaders (#10798) 2024-12-12 18:36:00 +01:00
dequant_q5_0.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q5_1.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q5_k.comp Vulkan: Use improved q4_k and q5_k dequant code in dequant shaders (#10798) 2024-12-12 18:36:00 +01:00
dequant_q6_k.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q8_0.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
diag_mask_inf.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
div.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
flash_attn_cm2.comp vulkan: Add VK_NV_cooperative_matrix2 support for mul_mat and flash attention (#10206) 2024-12-05 20:15:05 +01:00
gelu_quick.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
gelu.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
generic_binary_head.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
generic_head.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
generic_unary_head.comp vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (#11166) 2025-01-16 22:47:10 +01:00
get_rows_quant.comp vulkan: small mul_mat_vec optimizations (#10665) 2024-12-13 09:42:04 +01:00
get_rows.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
group_norm.comp vulkan: fix group_norm (#10496) 2024-11-26 16:45:05 +01:00
im2col.comp vulkan: im2col and matmul optimizations for stable diffusion (#10942) 2024-12-29 10:16:34 +01:00
leaky_relu.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
mul_mat_split_k_reduce.comp vulkan: optimize and reenable split_k (#10637) 2024-12-03 20:29:54 +01:00
mul_mat_vec_base.comp vulkan: optimize mul_mat for small values of N (#10991) 2024-12-30 18:27:11 +01:00
mul_mat_vec_nc.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
mul_mat_vec_p021.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
mul_mat_vec_q2_k.comp vulkan: scale caching for k quants + misc fixes (#11081) 2025-01-15 19:50:13 +00:00
mul_mat_vec_q3_k.comp vulkan: scale caching for k quants + misc fixes (#11081) 2025-01-15 19:50:13 +00:00
mul_mat_vec_q4_k.comp vulkan: scale caching for k quants + misc fixes (#11081) 2025-01-15 19:50:13 +00:00
mul_mat_vec_q5_k.comp vulkan: scale caching for k quants + misc fixes (#11081) 2025-01-15 19:50:13 +00:00
mul_mat_vec_q6_k.comp vulkan: scale caching for k quants + misc fixes (#11081) 2025-01-15 19:50:13 +00:00
mul_mat_vec.comp Vulkan: Fix float16 use on devices without float16 support + fix subgroup_size_control validation error (#11161) 2025-01-10 06:39:33 +01:00
mul_mm_cm2.comp vulkan: Add VK_NV_cooperative_matrix2 support for mul_mat and flash attention (#10206) 2024-12-05 20:15:05 +01:00
mul_mm.comp Vulkan: VK_KHR_cooperative_matrix support to speed up prompt processing (#10597) 2024-12-07 10:24:15 +01:00
mul.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
norm.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
pad.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
pool2d.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
relu.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
repeat.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
rms_norm.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
rope_head.comp vulkan: request round-to-even for fp16 in im2col/rope_head (#10767) 2024-12-10 21:23:17 +01:00
rope_neox.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
rope_norm.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
scale.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
silu.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
sin.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
soft_max.comp Vulkan: Fix float16 use on devices without float16 support + fix subgroup_size_control validation error (#11161) 2025-01-10 06:39:33 +01:00
square.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
sum_rows.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
tanh.comp Vulkan: fix NaN in tanh.comp with AMD proprietary driver on Windows (#10723) 2024-12-08 19:19:19 +01:00
test_coopmat2_support.comp vulkan: compile a test shader in cmake to check for coopmat2 support (#10713) 2024-12-08 09:05:55 +01:00
test_coopmat_support.comp Disable GL_KHR_cooperative_matrix Vulkan extension if not available. (#11117) 2025-01-08 09:18:13 +01:00
timestep_embedding.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
types.comp vulkan: optimize coopmat2 q4_k/q5_k dequant functions. (#11206) 2025-01-16 22:23:49 +01:00
upscale.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
vulkan-shaders-gen.cpp vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (#11166) 2025-01-16 22:47:10 +01:00
wkv6.comp rwkv6: add wkv6 support for Vulkan backend (#10829) 2024-12-16 22:00:46 +01:00