llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-15 06:40:45 +01:00


fairydreaming 9394bbd484 llama : Add support for DeepSeek V3 (#11049)	2025-01-04 21:06:11 +01:00
llama-cpp.h	llama : refactor `src/llama.cpp` (#10902)	2025-01-03 10:18:53 +02:00
llama.h	llama : Add support for DeepSeek V3 (#11049)	2025-01-04 21:06:11 +01:00

llama : Add support for DeepSeek V3 (#11049)

* convert : extend DEEPSEEK2 model architecture to support DeepseekV3ForCausalLM by adding EXPERT_WEIGHTS_NORM and EXPERT_GATING_FUNC model parameters and FFN_EXP_PROBS_B tensor type

* vocab : add DeepSeek V3 pre-tokenizer regexes

* unicode : handle ACCENT_MARK and SYMBOL categories in regex

* llama : add DeepSeek V3 chat template, handle new model parameters and tensor types

---------

Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>

