llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-10 12:30:50 +01:00

Georgi Gerganov ac2219fef3

llama : fix session saving/loading (#3400)

* llama : fix session saving/loading

* llama : temp fix for clearing "future" tokens from the KV cache

* llama : fix handling of "future" tokens when loading sessions

* llama : fix comments for llama_kv_cache API

2023-10-03 21:04:01 +03:00

CMakeLists.txt

speculative : PoC for speeding-up inference via speculative sampling (#2926)

2023-09-03 15:12:08 +03:00

speculative.cpp

llama : fix session saving/loading (#3400)

2023-10-03 21:04:01 +03:00