llama.cpp/ggml
a3sh 8faa1d4dd4
CUDA: faster non-contiguous concat (#10760)
* faster uncontiguous concat

* Use a lambda to avoid code duplication

Co-authored-by: Diego Devesa <slarengh@gmail.com>

* Update ggml/src/ggml-cuda/concat.cu

* add constexpr  and static assert

---------

Co-authored-by: Diego Devesa <slarengh@gmail.com>
2024-12-12 19:09:50 +01:00
..
include ggml: load all backends from a user-provided search path (#10699) 2024-12-11 01:47:21 +01:00
src CUDA: faster non-contiguous concat (#10760) 2024-12-12 19:09:50 +01:00
.gitignore vulkan : cmake integration (#8119) 2024-07-13 18:12:39 +02:00
CMakeLists.txt remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797) 2024-12-12 19:02:49 +01:00