llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-10 12:30:50 +01:00

History

Basic Vulkan Multi-GPU implementation (#5321 )

* Initial Vulkan multi-gpu implementation

Move most global variables into backend context

* Add names to backend device functions

* Add further missing cleanup code

* Reduce code duplication in tensor split layer assignment

* generalize LLAMA_SPLIT_LAYER for all backends, do not expose device count and memory in llama.h

* Only do device info print in the beginning and initialize one backend for cpu assist

Add missing cleanup code

* Rework backend memory management to make sure devices and buffers get properly allocated and freed

* Rename cpu assist free function

---------

Co-authored-by: slaren <slarengh@gmail.com>

2024-02-07 07:54:50 +01:00

base64.hpp

llava : expose as a shared library for downstream projects (#3613 )

2023-11-07 00:36:23 +03:00

build-info.cpp.in

build : link against build info instead of compiling against it (#3879 )

2023-11-02 08:50:16 +02:00

CMakeLists.txt

cmake : fix ld warning duplicate libraries libllama.a (#4671 )

2023-12-29 16:39:15 +02:00

common.cpp

Basic Vulkan Multi-GPU implementation (#5321 )