llama.cpp/examples/server/tests/unit
Djip007 2cd43f4900
ggml : more performance with llamafile tinyblas on x86_64 (#10714)
* more performance with llamafile tinyblas on x86_64.

- add bf16 support
- change the dispatch strategy (thanks:
https://github.com/ikawrakow/ik_llama.cpp/pull/71 )
- reduce memory bandwidth

simpler tinyblas dispatch, and more cache friendly

* tinyblas dynamic dispatching

* sgemm: add M blocks (the blocking and dispatch ideas are sketched below this entry).

* - git 2.47 uses short commit ids of length 9.
- --show-progress is not part of GNU Wget2

* remove unstable test
2024-12-24 18:54:49 +01:00
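
The commit above combines two standard GEMM tricks: resolving the kernel once at runtime from CPU features (dynamic dispatching) and walking the output in M blocks so the tile being accumulated stays cache-resident. Below is a minimal C++ sketch of both ideas; it is not the actual tinyblas/sgemm code from this repository, and every name in it (sgemm_blocked, pick_sgemm, the 64x64 tile sizes) is illustrative.

```cpp
// Illustrative sketch only: not the actual llama.cpp/tinyblas code.
#include <algorithm>
#include <cstdio>
#include <vector>

// Portable reference kernel: C[MxN] = A[MxK] * B[KxN], row-major.
static void sgemm_naive(int M, int N, int K,
                        const float *A, const float *B, float *C) {
    for (int i = 0; i < M; i++) {
        for (int j = 0; j < N; j++) {
            float acc = 0.0f;
            for (int k = 0; k < K; k++) {
                acc += A[i*K + k] * B[k*N + j];
            }
            C[i*N + j] = acc;
        }
    }
}

// Blocked kernel: process C in BM x BN tiles so the rows of A and the
// C tile being accumulated stay hot in cache while K is streamed.
static void sgemm_blocked(int M, int N, int K,
                          const float *A, const float *B, float *C) {
    const int BM = 64, BN = 64; // tile sizes; real kernels tune these per ISA
    for (int i0 = 0; i0 < M; i0 += BM) {
        const int iend = std::min(i0 + BM, M);
        for (int j0 = 0; j0 < N; j0 += BN) {
            const int jend = std::min(j0 + BN, N);
            for (int i = i0; i < iend; i++) {
                for (int j = j0; j < jend; j++) {
                    float acc = 0.0f;
                    for (int k = 0; k < K; k++) {
                        acc += A[i*K + k] * B[k*N + j];
                    }
                    C[i*N + j] = acc;
                }
            }
        }
    }
}

// Dynamic dispatch: resolve the kernel once at startup from CPU features.
// __builtin_cpu_supports is a GCC/Clang builtin; AVX2 is an arbitrary example.
using sgemm_fn = void (*)(int, int, int, const float *, const float *, float *);

static sgemm_fn pick_sgemm() {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    if (__builtin_cpu_supports("avx2")) {
        return sgemm_blocked; // stand-in for the AVX2-tuned path
    }
#endif
    return sgemm_naive; // portable fallback
}

int main() {
    const int M = 128, N = 64, K = 32;
    std::vector<float> A(M*K, 1.0f), B(K*N, 1.0f), C(M*N, 0.0f);
    const sgemm_fn sgemm = pick_sgemm(); // detect once, reuse everywhere
    sgemm(M, N, K, A.data(), B.data(), C.data());
    std::printf("C[0] = %.1f (expected %d)\n", C[0], K);
    return 0;
}
```

The real kernels also specialize per data type (including the new bf16 path), but the shape is the same: detect CPU features once, store a function pointer, and tile the loops.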
test_basic.py server : add flag to disable the web-ui (#10762) (#10751) 2024-12-10 18:22:34 +01:00
test_chat_completion.py server : add system_fingerprint to chat/completion (#10917) 2024-12-23 12:02:44 +01:00
test_completion.py ggml : more performance with llamafile tinyblas on x86_64 (#10714) 2024-12-24 18:54:49 +01:00
test_ctx_shift.py server : replace behave with pytest (#10416) 2024-11-26 16:20:18 +01:00
test_embedding.py server : fix logprobs, make it OAI-compatible (#10783) 2024-12-19 15:40:08 +01:00
test_infill.py server : fix format_infill (#10724) 2024-12-08 23:04:29 +01:00
test_lora.py server : replace behave with pytest (#10416) 2024-11-26 16:20:18 +01:00
test_rerank.py server : fill usage info in embeddings and rerank responses (#10852) 2024-12-17 18:00:24 +02:00
test_security.py server : replace behave with pytest (#10416) 2024-11-26 16:20:18 +01:00
test_slot_save.py server : replace behave with pytest (#10416) 2024-11-26 16:20:18 +01:00
test_speculative.py server : fix speculative decoding with context shift (#10641) 2024-12-04 22:38:20 +02:00
test_tokenize.py server : replace behave with pytest (#10416) 2024-11-26 16:20:18 +01:00