slaren
0d56246f4b
ggml : group all experts in a single ggml_mul_mat_id ( #6505 )
* ggml : group all experts in a single ggml_mul_mat_id
cuda : improve mmid row copy
* cuda : fix bin bcast with non-cont src0
* test-backend-ops : only run all mul mat tests for base types
* llama : disable moe offloading with SYCL
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-04-18 15:18:48 +02:00