mirror of
https://github.com/ggerganov/llama.cpp.git
synced 2024-12-25 05:48:47 +01:00
15606309a0
* New Feature: 1. Sum_Rows: fix cuda kernel overflow fix block shape error when nrows too big 2. Im2Col: Support Batch in cuda Support f32 to f32 both in cpu && cuda 3. DepthWiseConv: Support by Im2Col && MulMat 4. Pool_2d: Supoort avg pooling in cuda 5. HardSigmoid: Imp in cuda 6. HardSwish: Imp in cuda * fix tabs instead of spaces * code clean * CUDA POOL2D * ADD POOL2D test case in test-backend-ops.cpp * code clean * fix pool2d_kernel nits * fix bug in pool2d kernel * fix avg pooling, count_include_pad nits * test-backend-ops : add more pool_2d tests * cuda : fix warnings and formatting * ggml : check types in release builds too in pool_2d * test-backend-ops : remove f16 pool_2d tests * cuda : more style fixes * Add assert in ggml_cuda_op_pool2d * pool2d float padding fallback * test-backend-ops : add dst_type to im2col --------- Co-authored-by: slaren <slarengh@gmail.com> |
||
---|---|---|
.. | ||
.gitignore | ||
CMakeLists.txt | ||
get-model.cpp | ||
get-model.h | ||
test-autorelease.cpp | ||
test-backend-ops.cpp | ||
test-c.c | ||
test-double-float.cpp | ||
test-grad0.cpp | ||
test-grammar-parser.cpp | ||
test-llama-grammar.cpp | ||
test-model-load-cancel.cpp | ||
test-opt.cpp | ||
test-quantize-fns.cpp | ||
test-quantize-perf.cpp | ||
test-rope.cpp | ||
test-sampling.cpp | ||
test-tokenizer-0-falcon.cpp | ||
test-tokenizer-0-falcon.py | ||
test-tokenizer-0-llama.cpp | ||
test-tokenizer-0-llama.py | ||
test-tokenizer-1-bpe.cpp | ||
test-tokenizer-1-llama.cpp |