llama.cpp/examples/parallel
Kerfuffle 5df7d06c42
llama : allow exporting a view of the KV cache (#4180)
* Allow exporting a view of the KV cache

* Allow dumping the sequences per cell in common

* Track max contiguous cells value and position as well

* Fix max contiguous empty cells index calculation

Make dump functions deal with lengths or sequences counts > 10 better

* Fix off by one error in dump_kv_cache_view

* Add doc comments for KV cache view functions

Eliminate cell sequence struct; use llama_seq_id directly

Minor cleanups
2023-11-23 18:31:20 +02:00
CMakeLists.txt build : link against build info instead of compiling against it (#3879) 2023-11-02 08:50:16 +02:00
parallel.cpp llama : allow exporting a view of the KV cache (#4180) 2023-11-23 18:31:20 +02:00
README.md Fix some documentation typos/grammar mistakes (#4032) 2023-11-11 23:04:58 -07:00

llama.cpp/examples/parallel

Simplified simulation of serving incoming requests in parallel
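
To make the idea concrete, here is a minimal sketch of what "serving incoming requests in parallel" could look like as a toy simulation. It is not the code in parallel.cpp and does not call into llama.cpp at all: `Request`, `Slot`, and `simulate` are hypothetical names invented for this illustration, and each loop iteration merely stands in for one batched decode step in which every busy slot produces one token.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical request: an id plus the number of tokens still to generate.
struct Request {
    int id;
    int n_tokens; // assumed to be >= 1
};

// Hypothetical client slot; request_id < 0 means the slot is idle.
struct Slot {
    int request_id = -1;
    int remaining  = 0;
};

// Serve all pending requests with n_slots parallel slots.
// Returns the number of simulated decode steps that were needed.
int simulate(std::deque<Request> pending, int n_slots) {
    assert(n_slots >= 0);
    std::vector<Slot> slots(n_slots);
    int steps = 0;

    for (;;) {
        // Hand waiting requests to idle slots, first come first served.
        for (auto & s : slots) {
            if (s.request_id < 0 && !pending.empty()) {
                s.request_id = pending.front().id;
                s.remaining  = pending.front().n_tokens;
                pending.pop_front();
            }
        }

        // Stop once no slot has work left.
        bool any_busy = false;
        for (const auto & s : slots) {
            if (s.request_id >= 0) {
                any_busy = true;
            }
        }
        if (!any_busy) {
            break;
        }

        // One simulated batched step: every busy slot emits one token.
        for (auto & s : slots) {
            if (s.request_id < 0) {
                continue;
            }
            if (--s.remaining <= 0) {
                s.request_id = -1; // request finished, slot becomes idle
            }
        }
        ++steps;
    }
    return steps;
}
```

With three requests of 3, 1, and 2 tokens, two slots finish in 3 steps because the third request reuses the slot freed after the first step, whereas a single slot needs 6 steps.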