llama.cpp

Mirror of https://github.com/ggerganov/llama.cpp.git, synced 2025-02-05 16:10:42 +01:00

Johannes Gäßler 864a0b67a6 CUDA: use mma PTX instructions for FlashAttention (#11583)		2025-02-02 19:31:09 +01:00
* CUDA: use mma PTX instructions for FlashAttention
* __shfl_sync workaround for movmatrix
* add __shfl_sync to HIP
Co-authored-by: Diego Devesa <slarengh@gmail.com>
cmake	cmake: add ggml find package (#11369)	2025-01-26 12:07:48 -04:00
include	CUDA: use mma PTX instructions for FlashAttention (#11583)	2025-02-02 19:31:09 +01:00
src	CUDA: use mma PTX instructions for FlashAttention (#11583)	2025-02-02 19:31:09 +01:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	cmake: add ggml find package (#11369)	2025-01-26 12:07:48 -04:00