llama.cpp/ggml/src/ggml-vulkan
Jeff Bolz f095a649ec
vulkan: get the first command buffer submitted sooner (#10499)
This is an incremental improvement over #9118 to get work to the GPU a bit
sooner. The first part is to start with a smaller number of nodes before
the first submit, and ramp it up to the current 100 nodes/submit. The
second part is to reduce the dryrun overhead for all the nodes that just
need to request descriptor space.

With these changes I get a speedup of around 1-2% on an RTX 4070 combined
with my old Haswell-era CPU.
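
The ramp-up described in the first part of the commit message can be illustrated with a minimal sketch. This is not the code from ggml-vulkan.cpp: the function names `nodes_for_submit` and `plan_submit_sizes` are hypothetical, and the intermediate batch sizes (20 and 50) are assumptions chosen for illustration. Only the 100 nodes/submit ceiling comes from the message itself.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: how many graph nodes to record before the submit
// with the given index. Only the 100-node ceiling is taken from the commit
// message; the smaller values 20 and 50 are illustrative assumptions.
int nodes_for_submit(int submit_index) {
    switch (submit_index) {
        case 0:  return 20;   // small first batch so the GPU starts early
        case 1:  return 50;   // intermediate step of the ramp
        default: return 100;  // steady-state batch size
    }
}

// Hypothetical helper: split total_nodes into per-submit batch sizes,
// following the ramp above. The last batch takes whatever is left.
std::vector<int> plan_submit_sizes(int total_nodes) {
    std::vector<int> sizes;
    int remaining = total_nodes;
    int submit_index = 0;
    while (remaining > 0) {
        const int batch = std::min(remaining, nodes_for_submit(submit_index));
        sizes.push_back(batch);
        remaining -= batch;
        ++submit_index;
    }
    return sizes;
}
```

Under these assumed values, a graph of 250 nodes would be split into submits of 20, 50, 100, and 80 nodes. The point of the small first batch is that the GPU can begin executing while the CPU is still recording the later, larger command buffers.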
2024-11-29 07:18:02 +01:00
vulkan-shaders vulkan: define all quant data structures in types.comp (#10440) 2024-11-27 08:32:54 +01:00
CMakeLists.txt ggml : add support for dynamic loading of backends (#10469) 2024-11-25 15:13:39 +01:00
ggml-vulkan.cpp vulkan: get the first command buffer submitted sooner (#10499) 2024-11-29 07:18:02 +01:00