Olivier Chafik
e474ef1df4
update llama-rpc-server bin name + doc
2024-06-11 14:42:03 +01:00
Olivier Chafik
ee3a086fdf
Merge pull request #2 from HanClinto/bins-nits-2
...
Bins nits again
2024-06-11 02:36:25 +01:00
ochafik
2a9c4cd7ba
Merge remote-tracking branch 'origin/master' into bins
2024-06-11 02:35:01 +01:00
Olivier Chafik
b61eb9644d
json: refine constraint for whitespace to avoid runaways yet allow pretty print ( #7866 )
2024-06-11 02:22:57 +01:00
Olivier Chafik
396b18dfec
json: document schema conversion in GBNF readme, align manual grammar examples & converters ( #7841 )
...
* json: fix char pattern in grammar converters
* json: prevent number precision & whitespace runaways in example grammars
* json: add doc to grammar readme
2024-06-11 01:00:30 +01:00
HanClinto
1f5ec2c0b4
Updating two small `main` references missed earlier in the finetune docs.
2024-06-10 16:12:50 -07:00
HanClinto
70de0debab
Updating documentation references for lookup-merge and export-lora
2024-06-10 15:32:21 -07:00
HanClinto
72660c357c
Updating run-with-preset.py to use new binary names.
...
Updating docs around `perplexity` binary rename.
2024-06-10 15:23:32 -07:00
HanClinto
2fd66b2ce2
Updating a few lingering doc references for rename of main to llama-cli
2024-06-10 14:53:23 -07:00
HanClinto
e7e03733b2
Updating docs for eval-callback binary to use new llama- prefix.
2024-06-10 14:44:46 -07:00
Olivier Chafik
f9cfd04bd4
address gbnf-validator unused fread warning (switched to C++ / ifstream)
2024-06-10 17:38:36 +01:00
Olivier Chafik
b8436395b4
rename: llama-cli-cmake-pkg(.exe)
2024-06-10 16:23:45 +01:00
Olivier Chafik
4881a94bee
fix test-eval-callback
2024-06-10 16:21:14 +01:00
Olivier Chafik
b8cb44e812
more llama-cli(.exe)
2024-06-10 16:08:06 +01:00
Olivier Chafik
daeaeb1222
Merge remote-tracking branch 'origin/master' into bins
2024-06-10 15:38:41 +01:00
Olivier Chafik
5265c15d4c
rename llama|main -> llama-cli; consistent RPM bin prefixes
2024-06-10 15:34:14 +01:00
Georgi Gerganov
c28a83902c
examples : remove --instruct remnants ( #7846 )
2024-06-10 15:00:15 +03:00
Georgi Gerganov
d9da0e4986
server : improve "prompt" handling ( #7847 )
2024-06-10 14:59:55 +03:00
Georgi Gerganov
e95beeb1fc
imatrix : handle partial entries ( #7833 )
2024-06-09 20:19:35 +03:00
mgroeber9110
3e2ee44315
server: do not remove whitespace at the start of a completion chunk ( #7830 )
2024-06-09 20:50:35 +10:00
slaren
fe1e3917cf
Revert "[SYCL] Update rpc-server.cpp to include SYCL backend ( #7682 )" ( #7808 )
...
This reverts commit 9422c5e34b.
2024-06-09 01:43:39 +02:00
Olivier Chafik
b0eb3b88e9
rm bin files
2024-06-08 14:16:32 +01:00
Olivier Chafik
eef922e02e
sort cmake example subdirs
2024-06-08 14:09:28 +01:00
Olivier Chafik
b648243496
add/fix gbnf-validator subfolder to cmake
2024-06-08 14:07:56 +01:00
Olivier Chafik
81222f02db
prefix more cmake targets w/ llama-
2024-06-08 14:05:34 +01:00
Olivier Chafik
10650b692d
rename {main->llama}-cmake-pkg binary
2024-06-08 13:57:06 +01:00
Olivier Chafik
78bca8cb07
fix main refs
2024-06-08 13:52:03 +01:00
Olivier Chafik
ab5efbb3b6
Prefix all example bins w/ llama-
2024-06-08 13:42:01 +01:00
Olivier Chafik
23d0df5bd5
main: target name -> llama-cli
2024-06-08 12:50:35 +01:00
Olivier Chafik
fe93cc96cc
Merge remote-tracking branch 'origin/master' into bins
2024-06-08 12:04:52 +01:00
sasha0552
7a16ce7db2
server : smart slot selection using Longest Common Prefix ( #7728 )
...
* server : Smart selection of available slot using Longest Common Substring
* add usage
* remove trailing whitespaces
* Use Longest Common Prefix (LCP) instead of LCS
* Rename argument
2024-06-08 10:50:31 +03:00
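Note on 7a16ce7db2: a minimal sketch of slot selection by longest common prefix, assuming each slot keeps the token list of its cached prompt. The names `common_prefix_len` and `pick_slot` are hypothetical and are not taken from the server code; the real implementation may weigh other factors.

```python
from typing import Optional, Sequence


def common_prefix_len(a: Sequence[int], b: Sequence[int]) -> int:
    """Return the number of leading tokens that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def pick_slot(slots: Sequence[Sequence[int]], prompt: Sequence[int]) -> Optional[int]:
    """Return the index of the slot whose cached tokens share the longest
    prefix with prompt, or None when no slot shares any leading token."""
    best_idx, best_len = None, 0
    for i, cached in enumerate(slots):
        n = common_prefix_len(cached, prompt)
        if n > best_len:  # strict comparison keeps the earliest slot on ties
            best_idx, best_len = i, n
    return best_idx
```

A prefix match, unlike a substring match, corresponds to cached state that can be reused from position 0, which is presumably why the commit moved from LCS to LCP.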
Christian Zhou-Zheng
c00fad71e5
gguf-split : change binary multi-byte units to decimal ( #7803 )
2024-06-07 15:56:01 +03:00
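Note on c00fad71e5: binary units are powers of 1024 and decimal units are powers of 1000. The sketch below only illustrates that difference; `parse_size` and the suffix tables are hypothetical and do not reproduce gguf-split's argument parsing.

```python
# Multipliers for single-letter size suffixes (hypothetical tables).
DECIMAL = {"K": 1000, "M": 1000**2, "G": 1000**3}
BINARY = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text: str, units: dict = DECIMAL) -> int:
    """Convert a string such as '2G' to a byte count using the given unit table."""
    suffix = text[-1].upper()
    if suffix in units:
        return int(text[:-1]) * units[suffix]
    return int(text)  # no suffix: the value is already in bytes
```

With these tables, `parse_size("2G")` yields 2,000,000,000 bytes, while `parse_size("2G", BINARY)` yields 2,147,483,648 bytes.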
Johannes Gäßler
7027b27d76
server: update cache_prompt documentation [no ci] ( #7745 )
2024-06-07 11:15:49 +02:00
ochafik
7fbe6006c9
update straggling refs
2024-06-07 09:42:21 +01:00
woodx
a5cabd7649
server : do not get prompt in infill mode ( #7286 )
...
* avoid to get prompt in infill mode and embedding mode
* remove embedding mode
* refactor format
---------
Co-authored-by: wudexiang <wudexiang@bytedance.com>
2024-06-07 10:09:45 +03:00
slaren
c9ee7118d5
check for nans in imatrix and quantize ( #7807 )
...
* imatrix : detect nan/inf values
* quantize : check imatrix for nan/inf values
2024-06-07 09:01:29 +03:00
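Note on c9ee7118d5: a minimal sketch of a nan/inf check over a sequence of floats, assuming the values are available as plain Python numbers. `first_non_finite` is a hypothetical helper, not the check used by imatrix or quantize.

```python
import math
from typing import Iterable, Optional


def first_non_finite(values: Iterable[float]) -> Optional[int]:
    """Return the index of the first NaN or infinite value, or None if all are finite."""
    for i, v in enumerate(values):
        if not math.isfinite(v):
            return i
    return None
```

Returning the index rather than a boolean lets the caller report which entry was bad.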
ochafik
8695baebc0
update more names
2024-06-07 00:21:01 +01:00
Olivier Chafik
9a03341094
main/server: fix targets
2024-06-06 15:53:25 +01:00
Olivier Chafik
8b7c734473
main: update refs -> llama
...
fix examples/main ref
2024-06-06 15:44:51 +01:00
Olivier Chafik
f298cc63d2
server: update refs -> llama-server
...
gitignore llama-server
2024-06-06 15:44:40 +01:00
Olivier Chafik
849842916d
main/server: rename to llama / llama-server for consistency w/ homebrew
2024-06-06 15:28:27 +01:00
Georgi Gerganov
f83351f9a6
imatrix : migrate to gpt_params ( #7771 )
...
* imatrix : migrate to gpt_params
ggml-ci
* imatrix : add --save-frequency cli arg
* common : fix --no-ppl
2024-06-06 16:30:58 +03:00
Olivier Chafik
55b2d0849d
grammars: x{min,max} repetition operator ( #6640 )
...
* grammars: x{min,max} repetition operator + tweak +/*/? to avoid duplication of original over alternates
* grammars: handle `x{n}` and fix `x{n,n}`
* grammars: document new repetition operators
* grammars: uniform use of int for min & max
* grammars: refactor parser test
* grammar: parsing tests w/ natural pretty print of updated expectations
* grammars: much prettier print of expectations (+ TEST_GRAMMAR_PARSER_PRINT_ALL=1 to force all)
* grammars: improve test pretty print again
* grammars: pretty print rules and chars
* grammars: fix copy rule skipping
* grammars: disallow `a{,}` (not allowed in regexps)
* Update common/grammar-parser.cpp
Co-authored-by: Clint Herron <hanclinto@gmail.com>
* grammars: fix copy rule skipping (again) & display of expectations
* grammars: more test cases
* grammars: update reps parsing to bring ? / * / + closer to before
* json: use new GBNF repetitions{m,n} syntax
* grammars: update performance gotchas w/ repetition advice
* Update examples/json_schema_to_grammar.py
Co-authored-by: Clint Herron <hanclinto@gmail.com>
* Update examples/server/public/json-schema-to-grammar.mjs
Co-authored-by: Clint Herron <hanclinto@gmail.com>
* grammars: comment on rule repetitions
* grammars: ensure unambiguous number alternatives
* grammar: nit typo switched error msgs
* grammar: nit numbering in comment
* json: update numeric rule to be unambiguous
* Apply suggestions from code review
Co-authored-by: Clint Herron <hanclinto@gmail.com>
* Update examples/server/public/json-schema-to-grammar.mjs
Co-authored-by: Clint Herron <hanclinto@gmail.com>
* json: fix integral-part
* grammar: add repetition tests
---------
Co-authored-by: Clint Herron <hanclinto@gmail.com>
2024-06-06 10:07:06 +01:00
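Note on 55b2d0849d: one way to read `x{min,max}` is as `min` required copies of `x` followed by up to `max - min` optional ones. The sketch below rewrites a bounded repetition using only sequencing, `?` and `*`. `expand_repetition` is a hypothetical helper for illustration and does not claim to mirror how the grammar parser expands rules internally.

```python
from typing import Optional


def expand_repetition(item: str, min_n: int, max_n: Optional[int]) -> str:
    """Rewrite item{min_n,max_n} without the {m,n} operator.

    max_n=None means "no upper bound", as in x{min,}.
    """
    if min_n < 0 or (max_n is not None and max_n < min_n):
        raise ValueError("invalid repetition bounds")
    parts = [item] * min_n  # required copies
    if max_n is None:
        parts.append(f"{item}*")  # unbounded tail
    else:
        # Nest the optional copies so each count has exactly one parse:
        # (x (x (x)?)?)? rather than x? x? x?
        opt = ""
        for _ in range(max_n - min_n):
            opt = f"({item} {opt})?" if opt else f"({item})?"
        if opt:
            parts.append(opt)
    return " ".join(parts)
```

For example, `expand_repetition("x", 2, 4)` produces `x x (x (x)?)?`. The nested form is chosen here because a flat run of optionals can match the same input in several ways.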
Georgi Gerganov
2b3389677a
ggml : refactor rope norm/neox ( #7634 )
...
* ggml : unify rope norm/neox (CPU)
* ggml : fix compile warning
* ggml : remove GLM rope mode
ggml-ci
* metal : better rope implementation
ggml-ci
* cuda : better rope implementation
ggml-ci
* naming : n_orig_ctx -> n_ctx_orig
ggml-ci
* dev : add reminders to update backends
ggml-ci
* vulkan : fix ggml_rope_ext() usage
* cuda : fix array size + indents
ggml-ci
2024-06-05 11:29:20 +03:00
arch-btw
9973e81c5c
readme : remove -ins ( #7759 )
...
-ins and --instruct were moved in https://github.com/ggerganov/llama.cpp/pull/7675
I have adjusted the README accordingly.
There was no trace of --chatml in the README.
2024-06-05 09:40:49 +03:00
Georgi Gerganov
1442677f92
common : refactor cli arg parsing ( #7675 )
...
* common : gpt_params_parse do not print usage
* common : rework usage print (wip)
* common : valign
* common : rework print_usage
* infill : remove cfg support
* common : reorder args
* server : deduplicate parameters
ggml-ci
* common : add missing header
ggml-ci
* common : remove --random-prompt usages
ggml-ci
* examples : migrate to gpt_params
ggml-ci
* batched-bench : migrate to gpt_params
* retrieval : migrate to gpt_params
* common : change defaults for escape and n_ctx
* common : remove chatml and instruct params
ggml-ci
* common : passkey use gpt_params
2024-06-04 21:23:39 +03:00
Georgi Gerganov
554c247caf
ggml : remove OpenCL ( #7735 )
...
ggml-ci
2024-06-04 21:23:20 +03:00
Georgi Gerganov
0cd6bd3483
llama : remove beam search ( #7736 )
2024-06-04 21:23:05 +03:00
slaren
adc9ff3841
llama-bench : allow using a different printer for stderr with -oe ( #7722 )
...
compare-commits.sh : hide stdout, use -oe to print markdown
2024-06-04 14:32:42 +02:00
nickp27
9422c5e34b
[SYCL] Update rpc-server.cpp to include SYCL backend ( #7682 )
...
* Update rpc-server.cpp to include SYCL backend
Draft PR to address inclusion of SYCL backend for RPC server
* Update rpc-server.cpp
2024-06-02 12:13:54 +03:00