Johannes Gäßler d50f8897a7
CUDA: stream-k decomposition for MMQ (#8018)
* CUDA: stream-k decomposition for MMQ

* fix undefined memory reads for small matrices
2024-06-20 14:39:21 +02:00