llama.cpp/conv-transpose-1d.cuh at 13dca2a54a394757d56fdd652b9f0df08f44ea22

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-04 01:57:53 +01:00

John Balis fde13b3bb9 feat: cuda implementation for ggml_conv_transpose_1d (ggml/854)

* conv transpose 1d passing test for 1d input and kernel

* working for different input and output channel counts, added test for variable stride

* initial draft appears to work with stride other than 1

* working with all old and new conv1d tests

* added a test for large tensors

* removed use cuda hardcoding

* restored test-conv-transpose.c

* removed unused arguments, and fixed bug where test failure would cause subsequent tests to fail

* fixed accumulator bug

* added test to test-backend-ops

* fixed mistake

* addressed review

* fixed includes

* removed blank lines

* style and warning fixes

* return failure when test fails

* fix supports_op

---------

Co-authored-by: slaren <slarengh@gmail.com>

2024-07-08 12:23:00 +03:00

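#include "common.cuh"
// CUDA block size (threads per block) used by the conv_transpose_1d op.
#define CUDA_CONV_TRANPOSE_1D_BLOCK_SIZE 256
// Computes the ggml conv_transpose_1d op for dst on the CUDA backend; the operands are read from dst->src.
void ggml_cuda_op_conv_transpose_1d(ggml_backend_cuda_context & ctx, ggml_tensor * dst);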
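
The header only declares the op, so the following is a minimal CPU reference sketch of what a 1D transposed convolution computes. It is not the ggml kernel. The function names (`conv_transpose_1d_ref`, `conv_transpose_1d_out_len`, `num_blocks_for`) and the row-major tensor layouts are hypothetical, chosen for illustration, and the sketch assumes zero padding and dilation 1. It covers the cases the commit message mentions: differing input and output channel counts, and strides other than 1. The last helper shows the usual ceiling division for covering `n` output elements with blocks of a fixed size such as 256.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: output length of a 1D transposed convolution,
// assuming zero padding and dilation 1.
inline int conv_transpose_1d_out_len(int in_len, int kernel_len, int stride) {
    return (in_len - 1) * stride + kernel_len;
}

// Hypothetical helper: number of fixed-size blocks needed to cover n elements
// (ceiling division), e.g. with block_size = 256.
inline int num_blocks_for(int n, int block_size) {
    return (n + block_size - 1) / block_size;
}

// CPU reference for a 1D transposed convolution.
// Assumed row-major layouts (illustrative, not necessarily ggml's):
//   input  [in_ch][in_len]
//   kernel [in_ch][out_ch][kernel_len]
//   output [out_ch][out_len]
std::vector<float> conv_transpose_1d_ref(const std::vector<float> & input,
                                         const std::vector<float> & kernel,
                                         int in_ch, int out_ch,
                                         int in_len, int kernel_len, int stride) {
    assert(stride >= 1 && in_len >= 1 && kernel_len >= 1);
    assert(input.size()  == (std::size_t) in_ch * in_len);
    assert(kernel.size() == (std::size_t) in_ch * out_ch * kernel_len);

    const int out_len = conv_transpose_1d_out_len(in_len, kernel_len, stride);
    std::vector<float> output((std::size_t) out_ch * out_len, 0.0f);

    // Each input sample scatters a scaled copy of the kernel into the output,
    // starting at offset i * stride; overlapping contributions accumulate.
    for (int ic = 0; ic < in_ch; ++ic) {
        for (int oc = 0; oc < out_ch; ++oc) {
            for (int i = 0; i < in_len; ++i) {
                const float x = input[(std::size_t) ic * in_len + i];
                for (int k = 0; k < kernel_len; ++k) {
                    const float w = kernel[((std::size_t) ic * out_ch + oc) * kernel_len + k];
                    output[(std::size_t) oc * out_len + i * stride + k] += x * w;
                }
            }
        }
    }
    return output;
}
```

For example, a single-channel input `{1, 2}` with kernel `{1, 2, 3}` and stride 2 gives an output of length 5, `{1, 2, 5, 4, 6}`: the two scaled kernel copies overlap only at index 2. A reference like this can serve as a check when comparing results from a GPU implementation.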