mirror of
https://github.com/ggerganov/llama.cpp.git
synced 2024-12-25 13:58:46 +01:00
ee1628bdfe
* Initial Vulkan multi-gpu implementation Move most global variables into backend context * Add names to backend device functions * Add further missing cleanup code * Reduce code duplication in tensor split layer assignment * generalize LLAMA_SPLIT_LAYER for all backends, do not expose device count and memory in llama.h * Only do device info print in the beginning and initialize one backend for cpu assist Add missing cleanup code * Rework backend memory management to make sure devices and buffers get properly allocated and freed * Rename cpu assist free function --------- Co-authored-by: slaren <slarengh@gmail.com> |
||
---|---|---|
.. | ||
base64.hpp | ||
build-info.cpp.in | ||
CMakeLists.txt | ||
common.cpp | ||
common.h | ||
console.cpp | ||
console.h | ||
grammar-parser.cpp | ||
grammar-parser.h | ||
log.h | ||
sampling.cpp | ||
sampling.h | ||
stb_image.h | ||
train.cpp | ||
train.h |