llama.cpp

Mirror of https://github.com/ggerganov/llama.cpp.git, last synced 2025-01-10 20:40:24 +01:00


Use vector loads when possible in mul_mat_split_k_reduce. Use split_k
when there aren't enough workgroups to fill the shaders.
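
For readers unfamiliar with the technique, the following is a minimal sketch of the general split-K idea the commit message refers to, written in plain Python rather than as a Vulkan shader. It is not the code from this commit. The names `matmul_split_k` and `should_split_k`, the tile sizes, and the core-count threshold are all hypothetical and chosen only for illustration. The sketch assumes the usual formulation: the shared K dimension is cut into slices, each slice yields a partial M x N result (the work one group of workgroups would do), and a separate reduce step sums the partials.

```python
def matmul_split_k(a, b, split_k):
    """Multiply a (M x K) by b (K x N), splitting K into split_k slices."""
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if split_k < 1:
        raise ValueError("split_k must be at least 1")

    chunk = (k + split_k - 1) // split_k  # ceiling division

    # Stage 1: every slice of K produces its own partial M x N result.
    # On a GPU these slices could run as independent workgroups.
    partials = []
    for s in range(split_k):
        k0, k1 = s * chunk, min((s + 1) * chunk, k)
        partials.append([
            [sum(a[i][kk] * b[kk][j] for kk in range(k0, k1)) for j in range(n)]
            for i in range(m)
        ])

    # Stage 2: the reduce step sums the partial results element-wise.
    return [[sum(p[i][j] for p in partials) for j in range(n)] for i in range(m)]


def should_split_k(m, n, tile_m, tile_n, num_cores):
    """Hypothetical heuristic: split K when output tiles cannot occupy all cores."""
    workgroups = -(-m // tile_m) * -(-n // tile_n)  # ceil(m/tile_m) * ceil(n/tile_n)
    return workgroups < num_cores
```

Under these assumptions, a small output matrix with a long K dimension produces few output tiles, so splitting K creates more independent work at the cost of an extra reduction pass. The real heuristic and thresholds used in the repository may differ.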

2024-12-03 20:29:54 +01:00

2024-11-29 21:54:58 +01:00

2024-12-03 20:29:54 +01:00

.gitignore

2024-07-13 18:12:39 +02:00

CMakeLists.txt

2024-12-01 16:12:41 +01:00