Commit Graph

  • 7cf1ae4afb
    llama : remove unicode.h from llama-model.cpp Georgi Gerganov 2025-01-08 15:02:35 +02:00
  • c1d6ae9bd8
    Revert "ninja multi-config -> ninja" slaren 2025-01-07 20:28:57 +01:00
  • d3cbd43cc6
    test slaren 2025-01-07 20:21:29 +01:00
  • dfdc4d786a
    ninja multi-config -> ninja slaren 2025-01-07 18:41:09 +01:00
  • 6860c4bef3
    test Georgi Gerganov 2025-01-07 17:22:07 +02:00
  • 5c8d759a3f
    llama : fix llm_type enum names Georgi Gerganov 2025-01-07 16:04:56 +02:00
  • a15e22537f
    llama : pimpl llama_model Georgi Gerganov 2025-01-07 15:36:39 +02:00
  • a16daa9552
    llama : move load tensors to llama_model Georgi Gerganov 2025-01-06 17:00:16 +02:00
  • 662dd05016
    llama : add llama_model methods Georgi Gerganov 2025-01-06 16:13:01 +02:00
  • fd2672b952
    squash! convert : add --print-supported-models option Daniel Bevenius 2025-01-10 08:14:39 +01:00
  • beae79455b
    convert : add --print-supported-models option Daniel Bevenius 2025-01-10 08:06:47 +01:00
  • 0fd5889412
    Merge 5d7bb10ee5a9815a01718d142c59719d22453064 into c3f9d25706ac84297067aeaa662c1f1af42ed443 蕭澧邦 2025-01-10 01:06:27 -06:00
  • fc82f9215f
    Merge e32da6f163308b8b23f93bb56de0d1e13a902509 into c3f9d25706ac84297067aeaa662c1f1af42ed443 jiahao su 2025-01-10 14:07:15 +08:00
  • 91ab9ed858
    update test case Te993 2025-01-10 13:56:29 +08:00
  • 61777707ca
    add swift test cases Te993 2025-01-10 13:55:59 +08:00
  • d3eeeae218
    support omnivlm for ios Te993 2025-01-10 13:53:23 +08:00
  • c3f9d25706
    Vulkan: Fix float16 use on devices without float16 support + fix subgroup_size_control validation error (#11161) b4458 0cc4m 2025-01-10 06:39:33 +01:00
  • fdd4ea314f
    Merge b790a7ff290f64f865e0c66602a98ba70a901578 into ee7136c6d1e0ba7633294dad137b1573048031ec Don Mahurin 2025-01-10 00:31:01 -05:00
  • 305dc66649
    vulkan: support copy from q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl to f32 Jeff Bolz 2025-01-09 22:20:32 -06:00
  • 7e766580df
    Merge 845d572b877a94c91b756e8532787f7b9507458f into ee7136c6d1e0ba7633294dad137b1573048031ec Eve 2025-01-10 02:58:40 +00:00
  • 845d572b87
    little stuff Eve 2025-01-09 21:58:26 -05:00
  • 6145fc79e5
    q2_k separate out Eve 2025-01-09 21:41:50 -05:00
  • 973bc4069f
    q3_k separate out calculation Eve 2025-01-09 21:06:05 -05:00
  • ee7136c6d1
    llama: add support for QRWKV6 model architecture (#11001) b4457 Molly Sophia 2025-01-10 09:58:08 +08:00
  • 324afba5cc
    better sanity check skipping for QRWKV6 in llama-quant Molly Sophia 2025-01-10 09:42:46 +08:00
  • d8a304c2ef
    Fix fused lerp weights loading with RWKV6 Molly Sophia 2025-01-10 08:41:32 +08:00
  • c6860cc734
    SYCL: Refactor ggml_sycl_compute_forward (#11121) b4456 Akarshan Biswas 2025-01-10 05:43:03 +05:30
  • 8837b4c07c
    examples/server/public/index.html.gz: npm run build Tim Janik 2025-01-09 00:23:01 +01:00
  • 96d40e1fd0
    examples/server/webui/src/main.js: populate textarea from query string Tim Janik 2024-12-10 17:32:40 +01:00
  • 4981d4bd06
    examples/server/webui/index.html: assign id="msg-send" to the "Send" button Tim Janik 2024-12-10 17:32:40 +01:00
  • 51b5ac507d
    make the caches happy Eve 2025-01-09 17:06:54 -05:00
  • 924bccc214
    vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl Jeff Bolz 2025-01-09 14:46:19 -06:00
  • d9b07a1690
    vocab : more pimpl Georgi Gerganov 2025-01-09 21:21:14 +02:00
  • 5b87db0802
    Merge branch 'ggerganov:master' into master Jianlin Shi 2025-01-09 12:14:13 -07:00
  • c8844c1457
    Merge 244811d856fe6199f1c137abe1e85e0ac0c374dc into 1204f9727005974587d6fc1dcd4d4f0ead87c856 Dmitry Wolf 2025-01-09 13:48:23 -05:00
  • 543fd01eb9
    hparams : remove n_vocab_types Georgi Gerganov 2025-01-09 16:53:17 +02:00
  • c2008b568f
    hparams : remove n_vocab Georgi Gerganov 2025-01-09 16:44:49 +02:00
  • 0f0229736c
    model : avoid hardcoded chat template constant Georgi Gerganov 2025-01-09 20:02:45 +02:00
  • 983aa09b5c
    Merge branch 'master' into compilade/cuda-tq2_0 Francis Couture-Harpin 2025-01-09 13:02:09 -05:00
  • fb43d5e8b5
    ggml-cuda : cleanup TQ2_0 Francis Couture-Harpin 2025-01-09 12:16:02 -05:00
  • 914a82da4d
    Fix validation error about subgroup_size_control extension 0cc4m 2025-01-09 16:24:00 +01:00
  • 5a392192c1
    Vulkan: Remove float16 use in shaders 0cc4m 2025-01-09 15:57:14 +01:00
  • 1929d27954
    SYCL: Some device info print refactoring and add details of XMX availability Akarshan Biswas 2025-01-09 12:07:40 +05:30
  • c2a0a2a02d
    SYCL: add function name to noop debug Akarshan Biswas 2025-01-07 17:04:51 +05:30
  • a28663fcb2
    SYCL: add back GGML_USED(dst) to ggml_sycl_cpy Akarshan Biswas 2025-01-07 16:58:09 +05:30
  • 31f3626b16
    SYCL: refactor ggml_sycl_compute_forward Akarshan Biswas 2025-01-07 16:50:29 +05:30
  • d8931a701c
    llama.android : update to new API Georgi Gerganov 2025-01-09 16:03:09 +02:00
  • 330bd07b82
    llama : llama_n_vocab() now uses struct llama_vocab Georgi Gerganov 2025-01-09 15:57:57 +02:00
  • 68db76595e
    llama : update llama_chat_apply_template Georgi Gerganov 2025-01-09 15:47:13 +02:00
  • f2df367e09
    squash! examples : add README.md to tts example [no ci] Daniel Bevenius 2025-01-09 14:36:27 +01:00
  • 22b31cd16d
    llama : expose llama_vocab in the API Georgi Gerganov 2025-01-09 15:28:52 +02:00
  • 98d4e55f5a
    Style: Adds missing newline Andreas Kieslinger 2025-01-09 13:09:38 +00:00
  • dd95edfcfb
    Refactor: Removes code permanently excluded from compilation to increase readability. Andreas Kieslinger 2025-01-09 12:44:14 +00:00
  • aefcffabb1
    model : fix Phi MoE conflicts Georgi Gerganov 2025-01-09 14:33:32 +02:00
  • ad1923a0ce
    llama : vocab cleanup Georgi Gerganov 2025-01-08 22:17:15 +02:00
  • f784700c31
    llama : vocab pimpl cont Georgi Gerganov 2025-01-08 21:42:47 +02:00
  • 0f14663a4a
    llama : vocabl private charsmap Georgi Gerganov 2025-01-08 20:55:30 +02:00
  • c949316ca4
    llama : vocab load Georgi Gerganov 2025-01-08 20:49:26 +02:00
  • 40df96e009
    llama : vocab fix names Georgi Gerganov 2025-01-08 20:03:07 +02:00
  • 190f371001
    llama : vocab pimpl Georgi Gerganov 2025-01-08 19:47:51 +02:00
  • 2b150e0a6c
    llama : vocab Georgi Gerganov 2025-01-08 16:00:34 +02:00
  • 6fa9007059
    llama : remove unicode.h from llama-model.cpp Georgi Gerganov 2025-01-08 15:02:35 +02:00
  • f0db5ce0af
    Revert "ninja multi-config -> ninja" slaren 2025-01-07 20:28:57 +01:00
  • 141c40cd0b
    test slaren 2025-01-07 20:21:29 +01:00
  • 5db92f2e82
    ninja multi-config -> ninja slaren 2025-01-07 18:41:09 +01:00
  • e696addb4e
    test Georgi Gerganov 2025-01-07 17:22:07 +02:00
  • a48412f92b
    llama : fix llm_type enum names Georgi Gerganov 2025-01-07 16:04:56 +02:00
  • fffa6b15c4
    llama : pimpl llama_model Georgi Gerganov 2025-01-07 15:36:39 +02:00
  • c2a3fd648e
    llama : move load tensors to llama_model Georgi Gerganov 2025-01-06 17:00:16 +02:00
  • 13ba6f1136
    Merge 4fc8673d09e6a213cfaf3b098e8582c20780a886 into 1204f9727005974587d6fc1dcd4d4f0ead87c856 Diego Devesa 2025-01-09 13:30:46 +01:00
  • e188b476e6
    llama : add llama_model methods Georgi Gerganov 2025-01-06 16:13:01 +02:00
  • 0cdc133919
    Refactor: Moves node graph checks and copy ops into individual function for improved readability. Andreas Kieslinger 2025-01-09 12:16:37 +00:00
  • 9091993a5e
    squash! examples : add README.md to tts example [no ci] Daniel Bevenius 2025-01-09 12:49:13 +01:00
  • 1204f97270
    doc: add cuda guide for fedora (#11135) Tei Home 2025-01-09 19:32:06 +08:00
  • da356c8c08
    doc: add cuda guide for fedora Tei Home 2025-01-09 19:21:31 +08:00
  • 3b258c3f4b
    Merge 7006dd784c1facc919cd5cbe649308a7a45a0bf7 into 8eceb888d7b7f5e93d20a4f85ca6511022b87040 Sumandora 2025-01-09 07:59:14 -03:00
  • 0b87a2dd10
    examples : add README.md to tts example [no ci] Daniel Bevenius 2025-01-09 11:41:19 +01:00
  • 8eceb888d7
    server : add tooltips to settings and themes btn (#11154) Daniel Bevenius 2025-01-09 11:28:29 +01:00
  • f8feb4b01a
    model: Add support for PhiMoE arch (#11003) b4453 Pierrick Hymbert 2025-01-09 11:21:41 +01:00
  • 3db9cea0ad
    rm tooltip for 3 dots button Xuan Son Nguyen 2025-01-09 11:20:46 +01:00
  • c0dd28d16a
    doc: add phimoe as supported model Pierrick HYMBERT 2024-12-29 14:57:35 +01:00
  • 3199b2f301
    doc: minor Pierrick Hymbert 2024-12-28 19:38:08 +01:00
  • e0e23b5c37
    doc: minor Pierrick Hymbert 2024-12-28 19:37:59 +01:00
  • 7385f7d7e2
    python linter Pierrick HYMBERT 2024-12-28 15:49:47 +01:00
  • 4be934c453
    model: support phimoe Pierrick HYMBERT 2024-12-28 15:11:12 +01:00
  • be0e950c91
    media : remove old img [no ci] Georgi Gerganov 2025-01-09 11:15:15 +02:00
  • 37518b7dda
    Refactor: Improves structure and abstractions by moving CUDA graph evaluation and capture to its own function. Andreas Kieslinger 2025-01-09 09:15:12 +00:00
  • d9feae1c06
    llama-chat : add phi 4 template (#11148) b4451 Xuan Son Nguyen 2025-01-09 10:07:33 +01:00
  • 0817bc9bdc
    squash! server : add tooltips to settings and themes btn Daniel Bevenius 2025-01-09 08:37:56 +01:00
  • 8e221289ce
    server : add tooltips to settings and themes btn Daniel Bevenius 2025-01-09 08:00:31 +01:00
  • d7e9e30a2f
    Merge b83cae088c2e1d069a83560a290b13a836828cd7 into 8d59d911711b8f1ba9ec57c4b192ccd2628af033 Georgi Gerganov 2025-01-09 06:32:40 +00:00
  • c9463641af
    Merge https://github.com/ggerganov/llama.cpp into vulkan Eve 2025-01-08 21:59:37 -05:00
  • 923e9a8377
    q3_k use hmask simd from cpu avx version Eve 2025-01-08 21:13:09 -05:00
  • fe71a8c4a1
    q3_k optimizations Eve 2025-01-07 22:05:51 -05:00
  • cc28742ca3
    q2_k better dequant Eve 2025-01-07 21:20:33 -05:00
  • 91f1d9ce99
    better q6_k with separate paths for all threads and partial threads in use, plus some more optimizations Eve 2025-01-07 19:57:55 -05:00
  • 6f5d62b098
    q5_k Eve 2025-01-06 17:13:23 -05:00
  • cdf70cf27f
    better q4_k scales Eve 2025-01-05 22:43:12 -05:00
  • b4ae7005e6
    unpack should be u16, add vim swap to gitignore (about time) Eve 2025-01-05 21:59:43 -05:00
  • 173077180f
    Revert "try precalculating products of a and q2_k scales" Eve 2025-01-05 17:01:34 -05:00