llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-27 20:43:07 +01:00

Nicolò Scipione 40c6d79fb5 SYCL : Move to compile time oneMKL interface backend selection for NVIDIA backend (#10584)	2024-12-04 09:29:20 +08:00

* [SYCL] Move to Compile Time backend selection on oneMKL Interface for NVIDIA backend
  Move to compile time selection to backend to avoid latency at run time. Add it to all mkl gemm calls and only for NVIDIA backend.
  Signed-off-by: nscipione <nicolo.scipione@codeplay.com>
* Formatting
* Address PR comments to increase readability

Signed-off-by: nscipione <nicolo.scipione@codeplay.com>

ggml-blas	ggml : add support for dynamic loading of backends (#10469)	2024-11-25 15:13:39 +01:00
ggml-cann	CANN: RoPE operator optimization (#10563)	2024-11-29 14:46:55 +08:00
ggml-cpu	ggml : automatic selection of best CPU backend (#10606)	2024-12-01 16:12:41 +01:00
ggml-cuda	CUDA: remove unnecessary warp reduce in FA (ggml/1032)	2024-12-03 20:04:49 +02:00
ggml-hip	ggml : add support for dynamic loading of backends (#10469)	2024-11-25 15:13:39 +01:00
ggml-kompute	kompute : improve backend to pass test_backend_ops (#10542)	2024-11-28 12:51:38 +01:00
ggml-metal	feat: add `GGML_UNARY_OP_ARGMAX` Metal kernel (ggml/1019)	2024-12-03 20:04:49 +02:00
ggml-musa	mtgpu: Add MUSA_DOCKER_ARCH in Dockerfiles && update cmake and make (#10516)	2024-11-26 17:00:41 +01:00
ggml-rpc	ggml : add support for dynamic loading of backends (#10469)	2024-11-25 15:13:39 +01:00
ggml-sycl	SYCL : Move to compile time oneMKL interface backend selection for NVIDIA backend (#10584)	2024-12-04 09:29:20 +08:00
ggml-vulkan	vulkan: optimize and reenable split_k (#10637)	2024-12-03 20:29:54 +01:00
CMakeLists.txt	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
ggml-aarch64.c	ggml : optimize Q4_0 into Q4_0_X_Y repack (#10324)	2024-11-16 01:53:37 +01:00
ggml-aarch64.h	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml-alloc.c	ggml: new optimization interface (ggml/988)	2024-11-17 08:30:29 +02:00
ggml-backend-impl.h	ggml : automatic selection of best CPU backend (#10606)	2024-12-01 16:12:41 +01:00
ggml-backend-reg.cpp	ggml : automatic selection of best CPU backend (#10606)	2024-12-01 16:12:41 +01:00
ggml-backend.cpp	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
ggml-common.h	ggml-cpu: support IQ4_NL_4_4 by runtime repack (#10541)	2024-11-28 13:52:03 +01:00
ggml-impl.h	Avoid using __fp16 on ARM with old nvcc (#10616)	2024-12-04 01:41:37 +01:00
ggml-opt.cpp	ggml-opt: fix data corruption (ggml/1022)	2024-11-21 09:22:02 +02:00
ggml-quants.c	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml-quants.h	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml-threading.cpp	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml-threading.h	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml.c	ggml-cpu: support IQ4_NL_4_4 by runtime repack (#10541)	2024-11-28 13:52:03 +01:00