slaren
0d56246f4b
ggml : group all experts in a single ggml_mul_mat_id (#6505)
* ggml : group all experts in a single ggml_mul_mat_id
cuda : improve mmid row copy
* cuda : fix bin bcast with non-cont src0
* test-backend-ops : only run all mul mat tests for base types
* llama : disable moe offloading with SYCL
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-04-18 15:18:48 +02:00