Olivier Chafik
6171c9d258
Add Jinja template support (#11016)
* Copy minja from 58f0ca6dd7
* Add --jinja and --chat-template-file flags
* Add missing <optional> include
* Avoid print in get_hf_chat_template.py
* No designated initializers yet
* Try and work around msvc++ non-macro max resolution quirk
* Update test_chat_completion.py
* Wire LLM_KV_TOKENIZER_CHAT_TEMPLATE_N in llama_model_chat_template
* Refactor test-chat-template
* Test templates w/ minja
* Fix deprecation
* Add --jinja to llama-run
* Update common_chat_format_example to use minja template wrapper
* Test chat_template in e2e test
* Update utils.py
* Update test_chat_completion.py
* Update run.cpp
* Update arg.cpp
* Refactor common_chat_* functions to accept minja template + use_jinja option
* Attempt to fix linkage of LLAMA_CHATML_TEMPLATE
* Revert LLAMA_CHATML_TEMPLATE refactor
* Normalize newlines in test-chat-templates for windows tests
* Forward decl minja::chat_template to avoid eager json dep
* Flush stdout in chat template before potential crash
* Fix copy elision warning
* Rm unused optional include
* Add missing optional include to server.cpp
* Disable jinja test that has a cryptic windows failure
* minja: fix vigogne (https://github.com/google/minja/pull/22)
* Apply suggestions from code review
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Finish suggested renamings
* Move chat_templates inside server_context + remove mutex
* Update --chat-template-file w/ recent change to --chat-template
* Refactor chat template validation
* Guard against missing eos/bos tokens (null token otherwise throws in llama_vocab::impl::token_get_attr)
* Warn against missing eos / bos tokens when jinja template references them
* rename: common_chat_template[s]
* reinstate assert on chat_templates.template_default
* Update minja to b8437df626
* Update minja to https://github.com/google/minja/pull/25
* Update minja from https://github.com/google/minja/pull/27
* rm unused optional header
---------
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-01-21 13:18:51 +00:00
Georgi Gerganov
f26c874179
scripts : restore hf.sh (#11288)
ggml-ci
2025-01-18 13:18:32 +02:00
Georgi Gerganov
f11cfdfd7f
ci : use -no-cnv in gguf-split tests (#11254)
* ci : use -no-cnv in gguf-split tests
ggml-ci
* ci : use -no-cnv in requantize tests
ggml-ci
* scripts : fix [no ci]
2025-01-15 18:28:35 +02:00
Georgi Gerganov
44d1e796d0
sync : ggml
2025-01-14 10:39:42 +02:00
Georgi Gerganov
a4f3f5d8e6
scripts : sync gguf (cont)
2025-01-14 09:40:52 +02:00
Georgi Gerganov
48e1ae0e61
scripts : sync gguf
2025-01-14 09:36:58 +02:00
Georgi Gerganov
d00a80e89d
scripts : sync opencl
2025-01-14 09:19:58 +02:00
Georgi Gerganov
99a3755a3c
sync : ggml
2025-01-08 13:40:30 +02:00
Georgi Gerganov
78c6785175
sync : ggml
2025-01-04 16:09:53 +02:00
Djip007
2cd43f4900
ggml : more perfo with llamafile tinyblas on x86_64 (#10714)
* more performance with llamafile tinyblas on x86_64.
- add bf16 support
- change dispatch strategy (thanks: https://github.com/ikawrakow/ik_llama.cpp/pull/71)
- reduce memory bandwidth
simple tinyblas dispatch and more cache friendly
* tinyblas dynamic dispatching
* sgemm: add M blocks.
* - git 2.47 uses short id of len 9.
- show-progress is not part of GNU Wget2
* remove unstable test
2024-12-24 18:54:49 +01:00
Georgi Gerganov
5437d4aaf5
sync : ggml
2024-12-17 18:36:02 +02:00
Georgi Gerganov
87cf323cef
scripts : change build path to "build-bench" for compare-commits.sh (#10836)
2024-12-15 18:44:47 +02:00
Georgi Gerganov
0cd182ebcc
sync : ggml
2024-12-05 13:27:42 +02:00
Diego Devesa
59f4db1088
ggml : add predefined list of CPU backend variants to build (#10626)
* ggml : add predefined list of CPU backend variants to build
* update CPU dockerfiles
2024-12-04 14:45:40 +01:00
Georgi Gerganov
1cd3df46bd
scripts : remove amx sync
ggml-ci
2024-12-03 20:04:49 +02:00
Georgi Gerganov
c505471857
sync : ggml
2024-12-03 20:04:49 +02:00
Georgi Gerganov
8648c52101
make : deprecate (#10514)
* make : deprecate
ggml-ci
* ci : disable Makefile builds
ggml-ci
* docs : remove make references [no ci]
* ci : disable swift build
ggml-ci
* docs : remove obsolete make references, scripts, examples
ggml-ci
* basic fix for compare-commits.sh
* update build.md
* more build.md updates
* more build.md updates
* more build.md updates
* Update Makefile
Co-authored-by: Diego Devesa <slarengh@gmail.com>
---------
Co-authored-by: slaren <slarengh@gmail.com>
2024-12-02 21:22:53 +02:00
Diego Devesa
3420909dff
ggml : automatic selection of best CPU backend (#10606)
* ggml : automatic selection of best CPU backend
* amx : minor opt
* add GGML_AVX_VNNI to enable avx-vnni, fix checks
2024-12-01 16:12:41 +01:00
Georgi Gerganov
fee824a1a1
sync : ggml
2024-11-27 11:10:42 +02:00
Georgi Gerganov
87a533be57
sync : ggml
2024-11-21 09:22:11 +02:00
Georgi Gerganov
9fe0fb0626
sync : ggml
2024-11-19 20:03:21 +02:00
Georgi Gerganov
5c9a8b22b1
scripts : update sync
2024-11-17 08:30:29 +02:00
Johannes Gäßler
4e54be0ec6
llama/ex: remove --logdir argument (#10339)
2024-11-16 23:00:41 +01:00
Georgi Gerganov
f245cc28d4
scripts : fix missing key in compare-llama-bench.py (#10332)
2024-11-16 10:32:50 +02:00
Johannes Gäßler
4047be74da
scripts: update compare-llama-bench.py (#10319)
2024-11-15 21:19:03 +01:00
Georgi Gerganov
cbf5541a82
sync : ggml
2024-11-15 15:44:06 +02:00
Georgi Gerganov
4802ad350b
scripts : fix regex in sync [no ci]
2024-11-15 08:38:43 +02:00
Georgi Gerganov
5ea926dad7
sync : ggml
2024-11-13 18:11:54 +02:00
Georgi Gerganov
eec4d71737
scripts : add amx to sync-ggml.sh [no ci]
2024-11-07 23:11:36 +02:00
Georgi Gerganov
3b08828674
sync : ggml
2024-11-07 23:08:24 +02:00
Georgi Gerganov
a2c6fd747c
scripts : sync update
2024-11-07 23:07:55 +02:00
Georgi Gerganov
ce027adfb3
sync : ggml
2024-11-04 10:33:37 +02:00
Georgi Gerganov
815fe72adc
sync : ggml
2024-11-01 10:28:24 +02:00
Diego Devesa
c5b0f4b5d9
llama : refactor model loader with backend registry (#10026)
2024-10-30 02:01:23 +01:00
Georgi Gerganov
8d8ff71536
llama : remove Tail-Free sampling (#10071)
ggml-ci
2024-10-29 10:42:05 +02:00
Georgi Gerganov
cc2983d375
sync : ggml
2024-10-26 10:34:08 +03:00
Georgi Gerganov
9e4a2563ea
scripts : fix amx sync [no ci]
2024-10-26 10:33:31 +03:00
Georgi Gerganov
190a37d797
sync : ggml
2024-10-23 17:23:55 +03:00
Georgi Gerganov
17bb928080
readme : remove --memory-f32 references (#9925)
2024-10-17 23:43:05 +03:00
Georgi Gerganov
0e41b300ed
sync : ggml
2024-10-16 11:28:14 +03:00
standby24x7
fa42aa6d89
scripts : fix spelling typo in messages and comments (#9782)
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
2024-10-08 09:19:53 +03:00
Georgi Gerganov
b6d6c5289f
sync : llama.cpp
2024-10-06 12:53:28 +03:00
Georgi Gerganov
58b16695e1
sync : ggml
2024-10-05 15:53:49 +03:00
Georgi Gerganov
17880771ad
sync : ggml
2024-10-04 18:50:25 +03:00
Georgi Gerganov
1bb8a64ebf
sync : ggml
2024-10-03 21:17:49 +03:00
Diego Devesa
c83ad6d01e
ggml-backend : add device and backend reg interfaces (#9707)
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2024-10-03 01:49:47 +02:00
Georgi Gerganov
f1b8c42711
sync : ggml
2024-10-01 16:09:42 +03:00
Georgi Gerganov
d0b1d663e4
sync : ggml
2024-09-29 21:16:07 +03:00
Georgi Gerganov
bb5f819975
sync : ggml
2024-09-24 11:01:18 +03:00
Georgi Gerganov
4301535326
sync : ggml
ggml-ci
2024-09-20 21:15:05 +03:00