* llama : std::move llm_bigram_bpe from work_queue
This commit updates the retrieval of llm_bigram_bpe objects from
work_queue.top() to use std::move.
The motivation for this is to avoid copying the std::string
`text` member of the llm_bigram_bpe struct.
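As a sketch of the pattern (with a simplified, hypothetical llm_bigram_bpe; the real struct has more members): std::priority_queue::top() only returns a const reference, so moving out of it needs a const_cast here, which is what the follow-up commits below replace with a dedicated queue type.

```cpp
#include <queue>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for llm_bigram_bpe.
struct llm_bigram_bpe {
    std::string text;  // the member whose copy we want to avoid
    int rank;
    bool operator<(const llm_bigram_bpe & other) const { return rank < other.rank; }
};

int main() {
    std::priority_queue<llm_bigram_bpe> work_queue;
    work_queue.push({"ab", 1});

    // top() is const, so a const_cast is required to actually move; this
    // stays safe here only because pop() immediately removes the
    // moved-from element, and the comparator reads only `rank`.
    auto bigram = std::move(const_cast<llm_bigram_bpe &>(work_queue.top()));
    work_queue.pop();
    // bigram.text now owns the string without a copy.
}
```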
* squash! llama : std::move llm_bigram_bpe from work_queue
Introduced a MovablePriorityQueue class to allow moving elements
out of the priority queue for llm_bigram_bpe (see the sketch after
the rename commits below).
* squash! llama : std::move llm_bigram_bpe from work_queue
Rename MovablePriorityQueue to lama_priority_queue.
* squash! llama : std::move llm_bigram_bpe from work_queue
Rename lama_priority_queue -> llama_priority_queue.
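A minimal sketch of the idea behind the final llama_priority_queue, assuming it derives from std::priority_queue and reaches into the protected container `c` (a sketch of the technique, not a quote of the actual code):

```cpp
#include <algorithm>
#include <queue>
#include <utility>
#include <vector>

template <typename T, typename Container = std::vector<T>,
          typename Compare = std::less<typename Container::value_type>>
class llama_priority_queue : public std::priority_queue<T, Container, Compare> {
public:
    using std::priority_queue<T, Container, Compare>::priority_queue;

    // Move the highest-priority element out of the queue instead of
    // copying it through the const top().
    T pop_move() {
        T item = std::move(this->c.front());
        std::pop_heap(this->c.begin(), this->c.end(), this->comp);
        this->c.pop_back();
        return item;
    }

    // Force callers to use pop_move() instead of a top()/pop() pair.
    void pop() = delete;
};
```

Call sites can then read `auto bigram = work_queue.pop_move();`, removing both the copy and the const_cast.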
* gguf-py, llama : add constants and methods related to Llama-3.1 <|eom_id|> token
* llama : find Llama-3.1 <|eom_id|> token id during vocab loading
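A hedged sketch of that lookup: the map shape and the `<|eom_id|>` literal follow the commit message, while the names and the -1 sentinel are assumptions for illustration.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

using llama_token = int32_t;

// Hypothetical helper: find the Llama-3.1 <|eom_id|> token id once the
// tokenizer's token-to-id map has been loaded; -1 means "not present".
llama_token find_eom_id(const std::unordered_map<std::string, llama_token> & token_to_id) {
    const auto it = token_to_id.find("<|eom_id|>");
    return it == token_to_id.end() ? -1 : it->second;
}
```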
* llama-vocab : add Llama-3.1 <|eom_id|> token to the set of tokens stopping the generation
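And a sketch of how the stop check might fold in the new token; the special_* parameters are assumptions standing in for whatever the vocab actually stores.

```cpp
#include <cstdint>

using llama_token = int32_t;

// Hypothetical end-of-generation check: <|eom_id|> is treated like the
// existing EOS/EOT stop tokens (-1 marks an absent special token).
bool token_is_eog(llama_token token,
                  llama_token special_eos_id,
                  llama_token special_eot_id,
                  llama_token special_eom_id) {
    return token != -1 && (token == special_eos_id ||
                           token == special_eot_id ||
                           token == special_eom_id);
}
```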
---------
Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>