llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-15 06:40:45 +01:00


Jan Boon beea6e1b16 llama : save and restore kv cache for single seq id (#6341)	2024-04-08 15:43:30 +03:00

* llama : save and restore kv cache for single seq id
* remove trailing whitespace
* respond error in case there's no space in the kv cache
* add kv seq save restore to test case
* add --slot-save-path arg to enable save restore and restrict save location
* Returning 0 for some cases, instead of asserting.
* cleanup error cases
* rename sequence state functions
* rename state get set functions
* add previous function names back in with DEPRECATED notice
* update doc
* adjust endpoints to preferred style
* fix restoring zero cell count
* handle seq rm return value
* unused param
* keep in the size check
* fix return types
* add server test case for slot save restore
* cleanup
* add cake
* cleanup style
* add special
* removing a whole sequence never fails
* move sequence state file functionality from server to llama to match session api and add version tags
* catch exceptions on save as well
* error log messages
* check types for stricter restore
* update server doc
* readme : update API changes date
* strict filename validation
* move include, reject bom as well
* also reject empty filename
* reject whitespace and trailing dot

---------

Co-authored-by: Martin Evans <martindevans@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
steps	llama : save and restore kv cache for single seq id (#6341)	2024-04-08 15:43:30 +03:00
embeddings.feature	common: llama_load_model_from_url using --model-url (#6098)	2024-03-17 19:12:37 +01:00
environment.py	server tests : more pythonic process management; fix bare `except:` (#6146)	2024-03-20 06:33:49 +01:00
issues.feature	server: tests: passkey challenge / self-extend with context shift demo (#5832)	2024-03-02 22:00:14 +01:00
parallel.feature	common: llama_load_model_from_url split support (#6192)	2024-03-23 18:07:00 +01:00
passkey.feature	server: tests: passkey challenge / self-extend with context shift demo (#5832)	2024-03-02 22:00:14 +01:00
security.feature	json-schema-to-grammar improvements (+ added to server) (#5978)	2024-03-21 11:50:43 +00:00
server.feature	common: llama_load_model_from_url split support (#6192)	2024-03-23 18:07:00 +01:00
slotsave.feature	llama : save and restore kv cache for single seq id (#6341)	2024-04-08 15:43:30 +03:00
wrong_usages.feature	server: tests: passkey challenge / self-extend with context shift demo (#5832)	2024-03-02 22:00:14 +01:00