llama.cpp/ggml/src
Jeff Bolz a91a41364b
vulkan: optimize coopmat2 dequant functions (#10855)
Change the code to do 16b loads when possible and extract the appropriate
component late, so the code is effectively decoding a pair of elements and
then selecting one. This can allow more commoning to happen in the compiler
when neighboring elements are loaded.
2024-12-21 08:04:45 +01:00
..
ggml-blas ggml : add support for dynamic loading of backends (#10469) 2024-11-25 15:13:39 +01:00
ggml-cann llama : add Qwen2VL support + multimodal RoPE (#10361) 2024-12-14 14:43:46 +02:00
ggml-cpu ggml-cpu: replace NEON asm with intrinsics in ggml_gemv_q4_0_4x8_q8_0() (#10874) 2024-12-21 00:33:37 +01:00
ggml-cuda llama : add Qwen2VL support + multimodal RoPE (#10361) 2024-12-14 14:43:46 +02:00
ggml-hip ggml : add support for dynamic loading of backends (#10469) 2024-11-25 15:13:39 +01:00
ggml-kompute llama : add Qwen2VL support + multimodal RoPE (#10361) 2024-12-14 14:43:46 +02:00
ggml-metal llama : add Qwen2VL support + multimodal RoPE (#10361) 2024-12-14 14:43:46 +02:00
ggml-musa mtgpu: Add MUSA_DOCKER_ARCH in Dockerfiles && update cmake and make (#10516) 2024-11-26 17:00:41 +01:00
ggml-opencl Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (#10693) 2024-12-13 12:23:52 -08:00
ggml-rpc ggml : add support for dynamic loading of backends (#10469) 2024-11-25 15:13:39 +01:00
ggml-sycl SYCL: Migrate away from deprecated ggml_tensor->backend (#10840) 2024-12-20 23:31:28 +08:00
ggml-vulkan vulkan: optimize coopmat2 dequant functions (#10855) 2024-12-21 08:04:45 +01:00
CMakeLists.txt Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (#10693) 2024-12-13 12:23:52 -08:00
ggml-alloc.c ggml : remove return from ggml_gallocr_allocate_node (ggml/1048) 2024-12-17 18:35:49 +02:00
ggml-backend-impl.h ggml : automatic selection of best CPU backend (#10606) 2024-12-01 16:12:41 +01:00
ggml-backend-reg.cpp Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (#10693) 2024-12-13 12:23:52 -08:00
ggml-backend.cpp ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
ggml-common.h CUDA: rename macros to avoid conflicts with WinAPI (#10736) 2024-12-10 18:23:24 +01:00
ggml-impl.h tests: add tests for GGUF (#10830) 2024-12-17 19:09:35 +01:00
ggml-opt.cpp ggml-opt: fix data corruption (ggml/1022) 2024-11-21 09:22:02 +02:00
ggml-quants.c ggml : refactor online repacking (#10446) 2024-12-07 14:37:50 +02:00
ggml-quants.h ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-threading.cpp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-threading.h remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797) 2024-12-12 19:02:49 +01:00
ggml.c tts : add OuteTTS support (#10784) 2024-12-18 19:27:21 +02:00