llama.cpp/examples/server/tests/unit
Latest commit: cde3833239 by Olivier Chafik (2025-02-03 23:49:27 +00:00)
tool-call: allow --chat-template chatml w/ --jinja, default to chatml upon parsing issue, avoid double bos (#11616)

* tool-call: allow `--jinja --chat-template chatml`
* fix double bos issue (drop bos/eos tokens from jinja template)
* add missing try catch around jinja parsing to default to chatml
* Simplify default chatml logic
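The flag combination this commit enables can be exercised directly against a running server. Below is a minimal sketch, not taken from the repository's tests, that launches `llama-server` with `--jinja --chat-template chatml` and sends one chat completion; the binary location, model path, and port are placeholder assumptions.

```python
# Hypothetical sketch: start llama-server with --jinja together with
# --chat-template chatml (the combination this commit allows) and send one
# chat completion request. Binary path, model path, and port are placeholders.
import subprocess
import time

import requests

BASE = "http://localhost:8080"  # assumed port, see --port below

server = subprocess.Popen([
    "./llama-server",             # assumed binary location
    "-m", "models/model.gguf",    # placeholder model path
    "--jinja",                    # enable jinja template rendering
    "--chat-template", "chatml",  # accepted alongside --jinja per this commit
    "--port", "8080",
])
try:
    # simple readiness polling against the server's /health endpoint
    for _ in range(60):
        try:
            if requests.get(f"{BASE}/health").status_code == 200:
                break
        except requests.ConnectionError:
            pass
        time.sleep(1)

    res = requests.post(
        f"{BASE}/v1/chat/completions",
        json={"messages": [{"role": "user", "content": "Hello"}], "max_tokens": 8},
    )
    print(res.json()["choices"][0]["message"]["content"])
finally:
    server.terminate()
```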
| File | Last commit | Date |
|------|-------------|------|
| test_basic.py | server : add flag to disable the web-ui (#10762) (#10751) | 2024-12-10 18:22:34 +01:00 |
| test_chat_completion.py | tool-call: allow --chat-template chatml w/ --jinja, default to chatml upon parsing issue, avoid double bos (#11616) | 2025-02-03 23:49:27 +00:00 |
| test_completion.py | server : Fixed wrong function name in llamacpp server unit test (#11473) | 2025-01-29 00:03:42 +01:00 |
| test_ctx_shift.py | server : replace behave with pytest (#10416) | 2024-11-26 16:20:18 +01:00 |
| test_embedding.py | server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967) | 2024-12-24 21:33:04 +01:00 |
| test_infill.py | server : fix extra BOS in infill endpoint (#11106) | 2025-01-06 15:36:08 +02:00 |
| test_lora.py | server : allow using LoRA adapters per-request (#10994) | 2025-01-02 15:05:18 +01:00 |
| test_rerank.py | server : fill usage info in embeddings and rerank responses (#10852) | 2024-12-17 18:00:24 +02:00 |
| test_security.py | server : replace behave with pytest (#10416) | 2024-11-26 16:20:18 +01:00 |
| test_slot_save.py | server : replace behave with pytest (#10416) | 2024-11-26 16:20:18 +01:00 |
| test_speculative.py | server : allow using LoRA adapters per-request (#10994) | 2025-01-02 15:05:18 +01:00 |
| test_tokenize.py | server : replace behave with pytest (#10416) | 2024-11-26 16:20:18 +01:00 |
| test_tool_call.py | tool-call: allow --chat-template chatml w/ --jinja, default to chatml upon parsing issue, avoid double bos (#11616) | 2025-02-03 23:49:27 +00:00 |
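For orientation, here is a minimal pytest-style sketch in the same spirit as the files above. It does not use the repository's own fixtures or helper utilities; it simply assumes a server instance is already listening on localhost:8080 (a hypothetical setup) and exercises the public /health and /completion HTTP endpoints.

```python
# Minimal pytest-style sketch, not from the repo's test suite. Assumes a
# llama-server instance is already running on localhost:8080.
import requests

BASE_URL = "http://localhost:8080"  # assumed server address


def test_health():
    # /health reports whether the server is up and ready to serve requests
    res = requests.get(f"{BASE_URL}/health")
    assert res.status_code == 200


def test_completion_returns_content():
    # /completion is the server's native text-completion endpoint
    res = requests.post(
        f"{BASE_URL}/completion",
        json={"prompt": "Hello", "n_predict": 8},
    )
    assert res.status_code == 200
    body = res.json()
    assert "content" in body and isinstance(body["content"], str)
```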