llama.cpp

Mirror of https://github.com/ggerganov/llama.cpp.git, synced 2025-01-10 12:30:50 +01:00

vulkan: optimize mul_mat for small values of N (#10991)

Make the mul_mat_vec shaders support N>1 (as a spec constant, NUM_COLS) where
the batch_strides are overloaded to hold the row strides. Put the loads from the
B matrix in the innermost loop because it should cache better.

Share some code for reducing the result values to memory in mul_mat_vec_base.
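
The loop ordering described in this commit message can be illustrated with a minimal CPU sketch. This is not the Vulkan shader code: the function name `mul_mat_small_n`, the row-major layout of A, and the `b_stride`/`d_stride` parameters are assumptions made for illustration. Only `NUM_COLS` (a compile-time constant standing in for the spec constant) and the idea of placing the B loads in the innermost loop come from the message.

```cpp
#include <cstddef>

// Hypothetical CPU analogue of a mat-vec kernel generalized to NUM_COLS columns.
// a: M x K matrix, row-major (assumed layout).
// b: NUM_COLS columns of length K; column n starts at b + n * b_stride.
// d: NUM_COLS columns of length M; column n starts at d + n * d_stride.
template <std::size_t NUM_COLS>
void mul_mat_small_n(const float* a, const float* b, float* d,
                     std::size_t M, std::size_t K,
                     std::size_t b_stride, std::size_t d_stride) {
    for (std::size_t row = 0; row < M; ++row) {
        float acc[NUM_COLS] = {};  // one accumulator per output column

        for (std::size_t k = 0; k < K; ++k) {
            // Each element of A is loaded once and reused for every column.
            const float a_val = a[row * K + k];

            // Loads from B sit in the innermost loop, as in the commit message.
            for (std::size_t n = 0; n < NUM_COLS; ++n) {
                acc[n] += a_val * b[n * b_stride + k];
            }
        }

        // Write one result per column for this row.
        for (std::size_t n = 0; n < NUM_COLS; ++n) {
            d[n * d_stride + row] = acc[n];
        }
    }
}
```

With `NUM_COLS` equal to 1 this reduces to an ordinary matrix-vector product. For small N, each value of A is read once instead of once per column, which is the kind of reuse the commit appears to target.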

2024-12-30 18:27:11 +01:00

include

tts : add OuteTTS support (#10784)

2024-12-18 19:27:21 +02:00

src

vulkan: optimize mul_mat for small values of N (#10991)

2024-12-30 18:27:11 +01:00

.gitignore

vulkan : cmake integration (#8119)

2024-07-13 18:12:39 +02:00

CMakeLists.txt

ggml : fix arm build (#10890)

2024-12-18 23:21:42 +01:00