Commit Graph

  • e6468f95c1 whitespace wbpxre150 2023-04-14 12:38:56 +0800
  • aa6bca453f Fix prints. wbpxre150 2023-04-14 12:36:20 +0800
  • 241065eccd New conversion script (#545) comex 2023-04-13 21:06:30 -0700
  • e524ce99fe add macos headers jon-chuang 2023-04-14 10:22:52 +0800
  • be87b6ed20 perplexity : add support for batch size to --perplexity (#407) master-be87b6e Gary Linscott 2023-04-13 14:50:42 -0700
  • 3f93a00d9d quantize-stats : fix test + add it to Makefile default Georgi Gerganov 2023-04-13 23:33:35 +0300
  • a520b33b3a ggml : add Q8_0 quantization for intermediate results Georgi Gerganov 2023-04-13 23:03:27 +0300
  • 3fa8837068 improve jon-chuang 2023-04-14 04:00:41 +0800
  • 02b0fe86f2 improve jon-chuang 2023-04-14 03:55:33 +0800
  • b17d54eda3 Merge branch 'master' of https://github.com/ggerganov/llama.cpp into jon/use-hardware-cores jon-chuang 2023-04-14 03:07:49 +0800
  • e0325353be apply code review jon-chuang 2023-04-14 03:07:45 +0800
  • 315e69fd7f fix code indentation. wbpxre150 2023-04-14 01:58:38 +0800
  • fa651909bb refector code into function. wbpxre150 2023-04-14 01:24:32 +0800
  • 0e07e6a839 common : remove unnecessary includes (#947) master-0e07e6a CRD716 2023-04-13 10:39:25 -0500
  • a3a2a0eda8 ggml : add GGML_DEFAULT_N_THREADS master-a3a2a0e Georgi Gerganov 2023-04-13 18:36:40 +0300
  • d990e3fffc ggml : speed-up ggml_vec_dot_q4_1() ARM_NEON + 32-bit ARM support (#900) master-d990e3f Georgi Gerganov 2023-04-13 18:32:36 +0300
  • 99e7c9b9e6 ggml : try to use correct ifdef Georgi Gerganov 2023-04-13 18:31:15 +0300
  • 63f7ecf47c ggml : fix comment Georgi Gerganov 2023-04-13 18:28:18 +0300
  • 1b59a07380 ggml : implement vzip when missing Georgi Gerganov 2023-04-13 18:26:44 +0300
  • 23fd782d35 Update batch size for efficiency Gary Linscott 2023-04-13 08:20:54 -0700
  • be21d538e6 ggml : implement vminvq and vmaxvq when missing Georgi Gerganov 2023-04-13 18:19:20 +0300
  • 14a0b207bc ggml : implement vaddvq when missing Georgi Gerganov 2023-04-13 18:16:35 +0300
  • fbcecd59a9 Merge remote-tracking branch 'origin/master' into batch_perplexity Gary Linscott 2023-04-13 08:13:09 -0700
  • decac7b124 remove unnecessary includes CRD716 2023-04-13 10:08:29 -0500
  • 2ae3164d29 ggml : speed-up q4_1 ARM_NEON by ~5% Georgi Gerganov 2023-04-11 20:41:15 +0300
  • 9190e8eac8 llama : merge llama_internal.h into llama.h master-9190e8e Georgi Gerganov 2023-04-13 18:04:45 +0300
  • c85980acd0 gitignore : benchmark Georgi Gerganov 2023-04-13 18:01:22 +0300
  • 6232f2d7fd ggml : optimize non-SIMD Q4_0 vector dot product (#703) master-6232f2d Stephan Walter 2023-04-13 14:59:50 +0000
  • b7f38eec58 Optimize non-SIMD Q4_0 vector dot product Stephan Walter 2023-04-01 19:05:14 +0200
  • 6c248707f5 ggml : introduce GGML_ALIGNED_MALLOC/GGML_ALIGNED_FREE macros (#884) master-6c24870 Pavol Rusnak 2023-04-13 16:08:32 +0200
  • bac39666cd Introduce GGML_ALIGNED_MALLOC/GGML_ALIGNED_FREE macros Pavol Rusnak 2023-04-10 22:53:16 +0200
  • 8cda5c981d fix whitespace (#944) master-8cda5c9 CRD716 2023-04-13 09:03:57 -0500
  • ec29272175 readme : remove python 3.10 warning (#929) CRD716 2023-04-13 08:59:53 -0500
  • 7e941b95eb readme : llama node binding (#911) Genkagaku.GPT 2023-04-13 21:54:27 +0800
  • c729ff730a flake.nix: add all binaries from bin (#848) Pavol Rusnak 2023-04-13 15:49:05 +0200
  • 4579af95e8 zig : update build.zig (#872) Judd 2023-04-13 21:43:22 +0800
  • edc9ee9c0b Update benchmark-q4_0-matmult.c CRD716 2023-04-13 08:31:37 -0500
  • b7db959a01 Update Makefile CRD716 2023-04-13 08:29:34 -0500
  • 8ce1126b4e Update benchmark-q4_0-matmult.c CRD716 2023-04-13 08:28:39 -0500
  • 8c3ffc2f04 ggml : update cblas_sgemm columns var to be more reasonable (#838) master-8c3ffc2 Vladimir 2023-04-13 15:24:30 +0200
  • 107980d970 examples : add -n to alpaca and gpt4all scripts (#706) niansa/tuxifan 2023-04-13 15:03:39 +0200
  • 585d91a156 cmake : add explicit F16C option (x86) (#576) master-585d91a anzz1 2023-04-13 15:48:21 +0300
  • 95ea26f6e9 benchmark : add tool for timing q4_0 matrix multiplication (#653) master-95ea26f SebastianApel 2023-04-13 14:46:23 +0200
  • d21e188c6a Merge branch 'master' into master Georgi Gerganov 2023-04-13 15:45:33 +0300
  • 902075752a Add sentencepiece processor aeslampanah 2023-04-13 07:58:45 -0400
  • 7c8ee5aec5 Updated tokenconvert.py script to add support for SentencePiece and WordPiece tokenizers, updated arguments aeslampanah 2023-04-13 07:05:29 -0400
  • ed70ea9595 Merge pull request #1 from ggerganov/master qwopqwop200 2023-04-13 19:09:27 +0900
  • 97d7ac7565 POC: Measure rmse of 8 bit quantization Iwan Kawrakow 2023-04-13 12:00:24 +0200
  • acbab12a89 replaced use of auto with exact type to avoid using -std=c++14 Arik Poznanski 2023-04-13 12:37:05 +0300
  • 82d146df9b do not force the prompt file to end with a new line (#908) Pavol Rusnak 2023-04-13 11:33:16 +0200
  • 9b52fc3aa6 do not force the prompt file to end with a new line Pavol Rusnak 2023-04-12 08:33:13 +0200
  • ca297c190f up version Concedo 2023-04-13 14:38:38 +0800
  • c1b75f38d0 try to fix noavx2 for really old devices by Concedo 2023-04-13 14:36:00 +0800
  • 6f34961559 POC: Q4_1 for groups of 16 weight Iwan Kawrakow 2023-04-13 08:31:21 +0200
  • db4b29301c Add files via upload qwopqwop200 2023-04-13 15:30:14 +0900
  • 8b9316be70 fix tab qwopqwop200 2023-04-13 15:16:39 +0900
  • e01b2d04c9 fix tab qwopqwop200 2023-04-13 15:07:02 +0900
  • 75b39c4b26 fix tab qwopqwop200 2023-04-13 15:05:34 +0900
  • b0c6171cd7 fix tab qwopqwop200 2023-04-13 15:03:29 +0900
  • d405209c45 add Q4_2 qwopqwop200 2023-04-13 14:55:16 +0900
  • ff0efc747d add Q4_2 qwopqwop200 2023-04-13 14:54:44 +0900
  • f0b14e8c69 add Q4_2 qwopqwop200 2023-04-13 14:53:23 +0900
  • 716bd8fcfa Add files via upload qwopqwop200 2023-04-13 14:52:49 +0900
  • db77f1b48a add q4_2 qwopqwop200 2023-04-13 14:52:19 +0900
  • 8694318c71 try-catch jon-chuang 2023-04-13 13:14:15 +0800
  • f181c28edd fix jon-chuang 2023-04-13 13:01:18 +0800
  • 1caa4dcf94 commit jon-chuang 2023-04-13 12:55:38 +0800
  • 2ff91b5570 Merge remote-tracking branch 'occam/clblast-1' into concedo Concedo 2023-04-13 11:39:35 +0800
  • 5c22f7e4c4 reduce batch sizes and skip all intrinsic flags except AVX when building in compatibility mode. Concedo 2023-04-13 11:32:05 +0800
  • 5858d410fd Remove python 3.10 warning CRD716 2023-04-12 20:30:25 -0500
  • fe24af09ba Replaced static initialization of complex objects with a initialization on first use. This prevents an undefined behavior on program run, for example, crash in Release build, works in Debug build Arik Poznanski 2023-04-13 01:51:34 +0300
  • a1b4e48ba2 fixO wbpxre150 2023-04-13 05:46:56 +0800
  • f45dec3c2a fixes wbpxre150 2023-04-13 05:32:12 +0800
  • 67d220210f Revert buffer changes, no improvements in benchmarks 0cc4m 2023-04-12 23:09:12 +0200
  • c7e5c4f7b2 Improve ClBlast implementation, avoid recreating buffers, remove redundant transfers 0cc4m 2023-04-11 21:53:50 +0200
  • 0d903962e5 add command line to interactive mode. You can specify diffrent values for everything in params during rumtime. wbpxre150 2023-04-13 04:58:35 +0800
  • 47d809c692 move code to signal handler wbpxre150 2023-04-13 02:03:48 +0800
  • f4257a8eef Merge branch 'master' into concedo Concedo 2023-04-12 23:25:45 +0800
  • 1bd5992da4 clean and refactor handling of flags Concedo 2023-04-12 23:25:31 +0800
  • 679e1cb6c0 POC: Even lower rmse 4-bit Q4_0 quantization Iwan Kawrakow 2023-04-12 17:10:52 +0200
  • 96846dd2ff Remove <alloca.h>. Olaf Seibert 2023-04-12 17:01:30 +0200
  • 7d21d5ebb4 Eliminate alloca from ggml_opt_lbfgs() Olaf Seibert 2023-04-12 16:47:45 +0200
  • 0260aa67fc Eliminate alloca from ggml_graph_compute() Olaf Seibert 2023-04-12 16:35:23 +0200
  • a0f3de5b84 This file doesn't even use alloca. Olaf Seibert 2023-04-12 16:32:47 +0200
  • e7f6997f89 Don't crash on ftype (formerly f16) == 4 (#917) master-e7f6997 Stephan Walter 2023-04-12 15:06:16 +0000
  • 99da5491c5 Don't crash on ftype (formerly f16) == 4 Stephan Walter 2023-04-12 16:47:00 +0200
  • 29b83e5fd6 Various experiments, including 5-bit qunatization Iwan Kawrakow 2023-04-12 16:25:19 +0200
  • 636f8e5a8e updated the quantize files and makefile Concedo 2023-04-12 21:40:25 +0800
  • f76cb3a34d readme : change "GPU support" link to discussion Georgi Gerganov 2023-04-12 14:48:57 +0300
  • 782438070f readme : update hot topics with link to "GPU support" issue Georgi Gerganov 2023-04-12 14:31:12 +0300
  • 4faae0afa9 Merged upstream, fixed OSX compile errors, integrated noavx2 build into main Concedo 2023-04-12 18:08:55 +0800
  • bcd327c221 chore: add nodejs binding hlhr202 2023-04-12 16:27:45 +0800
  • 24f2a6e03d chore: add nodejs binding hlhr202 2023-04-12 16:27:01 +0800
  • 2444a99db5 Fix make compile error in expose.cpp(?) (#44) rabidcopy 2023-04-12 03:19:38 -0500
  • c55eb784cb add exit call to interactive mode. wbpxre150 2023-04-12 15:29:59 +0800
  • 4dbbd40750 readme: link to sha256sums file (#902) Nicolai Weitkemper 2023-04-12 08:46:20 +0200
  • 6bfb00a53b Further improve Q4_0 MSE Iwan Kawrakow 2023-04-12 07:38:42 +0200
  • a2a2b2cf13 readme: link to sha256sums file Nicolai Weitkemper 2023-04-12 01:26:48 +0200
  • c59009a835 apply suggestions Tomáš Pazdiora 2023-04-11 23:19:27 +0200
  • ab73745993 update formatting Tomáš Pazdiora 2023-04-11 23:08:25 +0200