llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-27 12:33:06 +01:00

Changyeon Kim 8f275a7c45 ggml: Add POOL2D OP for GPU acceleration to the Vulkan backend in the MobileVLM model. (#9763)	2024-10-29 09:52:56 +01:00

* ggml: Add POOL2D OP for GPU ACC to the Vulkan.

  - The MobileVLM model now supports inference acceleration through GPU by utilizing the Vulkan backend.
  - A GGML_OP_POOL_2D shader has been added. (Pooling)
  - The encoding performance of the CLIP model improved from 2.8s on the CPU to 0.7s on the GPU.

  Signed-off-by: Changyeon Kim <cyzero.kim@samsung.com>

* [fix] Correct the incorrect order of the parameters. fix casting to int.

  Signed-off-by: Changyeon Kim <cyzero.kim@samsung.com>

---------

Signed-off-by: Changyeon Kim <cyzero.kim@samsung.com>
cmake	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
include	[CANN] Adapt to dynamically loadable backends mechanism (#9970)	2024-10-22 16:16:01 +08:00
src	ggml: Add POOL2D OP for GPU acceleration to the Vulkan backend in the MobileVLM model. (#9763)	2024-10-29 09:52:56 +01:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	add amx kernel for gemm (#8998)	2024-10-18 13:34:36 +08:00