llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-22 09:39:08 +01:00

Daniel Bevenius 06943a69f6 ggml : move rope type enum to ggml.h (#8949)	2024-08-13 21:13:15 +02:00

* ggml : move rope type enum to ggml.h
  This commit moves the `llama_rope_type` enum from `llama.h` to `ggml.h` and changes its name to `ggml_rope_type`. The motivation for this change is to address the TODO in `llama.h` and use the enum in ggml.
  Note: This commit does not change the `mode` parameter to be of type `enum ggml_rope_type`. The name `mode` and its usage suggest that it might be more generic and possibly used as a bit field for multiple flags. Further investigation/discussion may be needed to determine if `mode` should be restricted to RoPE types.
* squash! ggml : move rope type enum to ggml.h
  This commit removes GGML_ROPE_TYPE_NONE and GGML_ROPE_TYPE_GLM from ggml.h, and brings back the llama_rope_type enum. I've kept the assert for GGML_ROPE_TYPE_GLM as I'm not sure if it is safe to remove it yet.
* squash! ggml : move rope type enum to ggml.h
  This commit removes the enum ggml_rope_type from ggml.h and replaces it with a define (GGML_ROPE_TYPE_NEOX). This define is used in the code to check if the mode is set to GPT-NeoX. The enum llama_rope_type has also been updated to reflect this change.
* squash! ggml : move rope type enum to ggml.h
  This commit contains a suggestion to enable the GGML_ROPE_TYPE_NEOX macro/define to be passed to the shader compiler.
* squash! ggml : move rope type enum to ggml.h
  This commit fixes the editorconfig-checker warnings.
* squash! ggml : move rope type enum to ggml.h
  Update comment for ggml_rope function.
* Revert "squash! ggml : move rope type enum to ggml.h"
  This reverts commit `6261222bd0`.
* squash! ggml : move rope type enum to ggml.h
  Add GGML_ROPE_TYPE_NEOX to rope_common.comp.
* remove extra line

Co-authored-by: slaren <slarengh@gmail.com>

dpct	[SYCL] Updated SYCL device filtering (#8901)	2024-08-07 11:25:36 +01:00
backend.hpp	[SYCL] Add `TIMESTEP_EMBEDDING` OP (#8707)	2024-07-30 14:56:51 +08:00
common.cpp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
common.hpp	ggml : reduce hash table reset cost (#8698)	2024-07-27 04:41:55 +02:00
concat.cpp	[SYCL] add concat through dim 1/2 (#8483)	2024-07-15 19:32:15 +08:00
concat.hpp	[SYCL] add concat through dim 1/2 (#8483)	2024-07-15 19:32:15 +08:00
conv.cpp	[SYCL] add conv support (#8688)	2024-07-29 10:50:27 +08:00
conv.hpp	[SYCL] add conv support (#8688)	2024-07-29 10:50:27 +08:00
convert.cpp	[SYCL] Use multi_ptr to clean up deprecated warnings (#8256)	2024-07-10 16:10:49 +01:00
convert.hpp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
dequantize.hpp	Dequant improvements rebase (#8255)	2024-07-03 09:55:34 +08:00
dmmv.cpp	ggml : reduce hash table reset cost (#8698)	2024-07-27 04:41:55 +02:00
dmmv.hpp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
mmq.cpp	ggml : reduce hash table reset cost (#8698)	2024-07-27 04:41:55 +02:00
mmq.hpp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
mmvq.cpp	[SYCL] Fixing wrong VDR iq4nl value (#8812)	2024-08-02 08:55:17 +08:00
mmvq.hpp	llama : reorganize source code + improve CMake (#8006)	2024-06-26 18:33:02 +03:00
norm.cpp	ggml : add epsilon as a parameter for group_norm (#8818)	2024-08-06 10:26:46 +03:00
norm.hpp	[SYCL] Fix the sub group size of Intel (#8106)	2024-07-02 10:16:00 +08:00
presets.hpp	[SYCL] Add `TIMESTEP_EMBEDDING` OP (#8707)	2024-07-30 14:56:51 +08:00
rope.cpp	ggml : move rope type enum to ggml.h (#8949)	2024-08-13 21:13:15 +02:00
rope.hpp	[SYCL] Update SYCL-Rope op and Refactor (#8157)	2024-07-01 19:39:06 +08:00
softmax.cpp	[SYCL] fix scratch size of softmax (#8642)	2024-07-23 15:43:28 +08:00
softmax.hpp	[SYCL] Fix WARP_SIZE=16 bug of Intel GPU (#8266)	2024-07-05 13:06:13 +08:00
tsembd.cpp	[SYCL] Add `TIMESTEP_EMBEDDING` OP (#8707)	2024-07-30 14:56:51 +08:00
tsembd.hpp	[SYCL] Add `TIMESTEP_EMBEDDING` OP (#8707)	2024-07-30 14:56:51 +08:00
vecdotq.hpp	CUDA: refactor and optimize IQ MMVQ (#8215)	2024-07-01 20:39:06 +02:00