llama.cpp

Mirror of https://github.com/ggerganov/llama.cpp.git, synced 2025-02-06 08:30:33 +01:00

Jeff Bolz	a91a41364b	vulkan: optimize coopmat2 dequant functions (#10855)	2024-12-21 08:04:45 +01:00

Change the code to do 16-bit loads when possible and extract the appropriate component late, so the code is effectively decoding a pair of elements and then selecting one. This can allow more commoning to happen in the compiler when neighboring elements are loaded.
ggml-blas	ggml : add support for dynamic loading of backends (#10469)	2024-11-25 15:13:39 +01:00
ggml-cann	llama : add Qwen2VL support + multimodal RoPE (#10361)	2024-12-14 14:43:46 +02:00
ggml-cpu	ggml-cpu: replace NEON asm with intrinsics in ggml_gemv_q4_0_4x8_q8_0() (#10874)	2024-12-21 00:33:37 +01:00
ggml-cuda	llama : add Qwen2VL support + multimodal RoPE (#10361)	2024-12-14 14:43:46 +02:00
ggml-hip	ggml : add support for dynamic loading of backends (#10469)	2024-11-25 15:13:39 +01:00
ggml-kompute	llama : add Qwen2VL support + multimodal RoPE (#10361)	2024-12-14 14:43:46 +02:00
ggml-metal	llama : add Qwen2VL support + multimodal RoPE (#10361)	2024-12-14 14:43:46 +02:00
ggml-musa	mtgpu: Add MUSA_DOCKER_ARCH in Dockerfiles && update cmake and make (#10516)	2024-11-26 17:00:41 +01:00
ggml-opencl	Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (#10693)	2024-12-13 12:23:52 -08:00
ggml-rpc	ggml : add support for dynamic loading of backends (#10469)	2024-11-25 15:13:39 +01:00
ggml-sycl	SYCL: Migrate away from deprecated ggml_tensor->backend (#10840)	2024-12-20 23:31:28 +08:00
ggml-vulkan	vulkan: optimize coopmat2 dequant functions (#10855)	2024-12-21 08:04:45 +01:00
CMakeLists.txt	Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (#10693)	2024-12-13 12:23:52 -08:00
ggml-alloc.c	ggml : remove return from ggml_gallocr_allocate_node (ggml/1048)	2024-12-17 18:35:49 +02:00
ggml-backend-impl.h	ggml : automatic selection of best CPU backend (#10606)	2024-12-01 16:12:41 +01:00
ggml-backend-reg.cpp	Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (#10693)	2024-12-13 12:23:52 -08:00
ggml-backend.cpp	ggml : move AMX to the CPU backend (#10570)	2024-11-29 21:54:58 +01:00
ggml-common.h	CUDA: rename macros to avoid conflicts with WinAPI (#10736)	2024-12-10 18:23:24 +01:00
ggml-impl.h	tests: add tests for GGUF (#10830)	2024-12-17 19:09:35 +01:00
ggml-opt.cpp	ggml-opt: fix data corruption (ggml/1022)	2024-11-21 09:22:02 +02:00
ggml-quants.c	ggml : refactor online repacking (#10446)	2024-12-07 14:37:50 +02:00
ggml-quants.h	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml-threading.cpp	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml-threading.h	remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797)	2024-12-12 19:02:49 +01:00
ggml.c	tts : add OuteTTS support (#10784)	2024-12-18 19:27:21 +02:00