llama.cpp/ggml/src
Nicolò Scipione 40c6d79fb5
SYCL : Move to compile time oneMKL interface backend selection for NVIDIA backend (#10584)
* [SYCL] Move to Compile Time backend selection on oneMKL Interface for NVIDIA backend

Select the backend at compile time to avoid latency at run time.
Apply this to all MKL gemm calls, and only for the NVIDIA backend.

Signed-off-by: nscipione <nicolo.scipione@codeplay.com>

* Formatting

* Address PR comments to increase readability

2024-12-04 09:29:20 +08:00
ggml-blas ggml : add support for dynamic loading of backends (#10469) 2024-11-25 15:13:39 +01:00
ggml-cann CANN: RoPE operator optimization (#10563) 2024-11-29 14:46:55 +08:00
ggml-cpu ggml : automatic selection of best CPU backend (#10606) 2024-12-01 16:12:41 +01:00
ggml-cuda CUDA: remove unnecessary warp reduce in FA (ggml/1032) 2024-12-03 20:04:49 +02:00
ggml-hip ggml : add support for dynamic loading of backends (#10469) 2024-11-25 15:13:39 +01:00
ggml-kompute kompute : improve backend to pass test_backend_ops (#10542) 2024-11-28 12:51:38 +01:00
ggml-metal feat: add GGML_UNARY_OP_ARGMAX Metal kernel (ggml/1019) 2024-12-03 20:04:49 +02:00
ggml-musa mtgpu: Add MUSA_DOCKER_ARCH in Dockerfiles && update cmake and make (#10516) 2024-11-26 17:00:41 +01:00
ggml-rpc ggml : add support for dynamic loading of backends (#10469) 2024-11-25 15:13:39 +01:00
ggml-sycl SYCL : Move to compile time oneMKL interface backend selection for NVIDIA backend (#10584) 2024-12-04 09:29:20 +08:00
ggml-vulkan vulkan: optimize and reenable split_k (#10637) 2024-12-03 20:29:54 +01:00
CMakeLists.txt ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
ggml-aarch64.c ggml : optimize Q4_0 into Q4_0_X_Y repack (#10324) 2024-11-16 01:53:37 +01:00
ggml-aarch64.h ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-alloc.c ggml: new optimization interface (ggml/988) 2024-11-17 08:30:29 +02:00
ggml-backend-impl.h ggml : automatic selection of best CPU backend (#10606) 2024-12-01 16:12:41 +01:00
ggml-backend-reg.cpp ggml : automatic selection of best CPU backend (#10606) 2024-12-01 16:12:41 +01:00
ggml-backend.cpp ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
ggml-common.h ggml-cpu: support IQ4_NL_4_4 by runtime repack (#10541) 2024-11-28 13:52:03 +01:00
ggml-impl.h Avoid using __fp16 on ARM with old nvcc (#10616) 2024-12-04 01:41:37 +01:00
ggml-opt.cpp ggml-opt: fix data corruption (ggml/1022) 2024-11-21 09:22:02 +02:00
ggml-quants.c ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-quants.h ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-threading.cpp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-threading.h ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml.c ggml-cpu: support IQ4_NL_4_4 by runtime repack (#10541) 2024-11-28 13:52:03 +01:00