slaren
5fb5e24811
llama : minor sampling refactor (2) (#9386)
2024-09-09 17:10:46 +02:00
slaren
19f4a7b296
llama : refactor samplers internal implementation (#9370)
2024-09-08 15:52:07 +02:00
Georgi Gerganov
f12295b8a9
llama : fix empty ring buffer push (#9358)
2024-09-08 00:33:33 +03:00
Georgi Gerganov
df270ef745
llama : refactor sampling v2 (#9294)
- Add `struct llama_sampler` and `struct llama_sampler_i`
- Add `llama_sampler_` API
- Add `llama_sampler_chain_` API for chaining multiple samplers
- Remove `LLAMA_API_INTERNAL`
- Add `llama_perf_` API and remove old `llama_print_timings` and `llama_reset_timings`
2024-09-07 15:16:19 +03:00
Liu Jia
2589292cde
Fix a spelling mistake (#9001)
2024-08-12 11:46:03 +02:00
Georgi Gerganov
938943cdbf
llama : move vocab, grammar and sampling into separate files (#8508)
* llama : move sampling code into llama-sampling
ggml-ci
* llama : move grammar code into llama-grammar
ggml-ci
* cont
ggml-ci
* cont : pre-fetch rules
* cont
ggml-ci
* llama : deprecate llama_sample_grammar
* llama : move tokenizers into llama-vocab
ggml-ci
* make : update llama.cpp deps [no ci]
* llama : redirect external API to internal APIs
ggml-ci
* llama : suffix the internal APIs with "_impl"
ggml-ci
* llama : clean-up
2024-07-23 13:10:17 +03:00