Commit Graph

  • 506cd62638 changed some defaults to hopefully increase compatibility Concedo 2023-03-25 10:40:11 +0800
  • b13a768813 added softprompt endpoint Concedo 2023-03-25 10:12:47 +0800
  • 8347bede58 Add missing struct annotation Doomsdayrs 2023-03-24 20:25:50 -0400
  • 5d909a377c enable sanitizers in linux ci Green Sky 2023-03-25 01:16:14 +0100
  • fe5af95ef8 cmake: make sanitizers link Green Sky 2023-03-23 21:46:04 +0100
  • 743ec9b221 Fix crash for 65B model with pre-allocated memory Chris Kuehl 2023-03-24 19:17:34 -0500
  • 186ecfd8a4 Remove printing of prompt and prompt tokenization at startup Slaren 2023-03-24 23:46:02 +0100
  • 8666c5aa43 Add timings for the prompt evaluation Slaren 2023-03-24 23:12:15 +0100
  • 8520fc310e Disable BLAS altogether - the bug is not just for qunatized mat mul master-8520fc3 Georgi Gerganov 2023-03-24 23:47:06 +0200
  • b3f460e941 Disable BLAS branch in mul_mat - seems there is a bug master-b3f460e Georgi Gerganov 2023-03-24 23:39:17 +0200
  • 3a8e8b7a0f Fix typo Jed Fox 2023-03-24 17:28:34 -0400
  • a8096f3d81 Merge branch 'master' into jed/spm Jed Fox 2023-03-24 17:27:46 -0400
  • ae3d0ff68f Call progress callback more frequently Jed Fox 2023-03-24 17:26:19 -0400
  • 1e3fd898a3 Merge branch 'master' into jed/load-progress Jed Fox 2023-03-24 17:25:38 -0400
  • 04c6f5ed6f Immediately start processing the prompt before user input has been provided (#476) master-7a9b6c3 master-04c6f5e Georgi Gerganov 2023-03-24 23:17:58 +0200
  • 7a9b6c3a8b Reduce memory usage and allocate enough memory for largest context (#473) Georgi Gerganov 2023-03-24 23:17:37 +0200
  • 6feb572b36 Merge branch 'master' into mem-fix Georgi Gerganov 2023-03-24 23:17:19 +0200
  • d0f7519338 Fix KV cache size for F32 Georgi Gerganov 2023-03-24 22:58:00 +0200
  • d26a3994f4 Immediately start processing the prompt before user input has been provided Georgi Gerganov 2023-03-24 22:44:02 +0200
  • 4aeee216fd Regroup q4_1 dot addition for better numerics. q4_1_more_accel Matvey Soloviev 2023-03-23 04:56:21 +0100
  • 580991bbed Squeeze out about 5% more performance in Q4_1 inference Matvey Soloviev 2023-03-21 22:55:35 +0100
  • 0b4e849a24 Fix number of layers in 30B and 65B Georgi Gerganov 2023-03-24 22:15:06 +0200
  • 3634c312bc Reenable BLAS for quantized mul_mat Georgi Gerganov 2023-03-24 22:03:56 +0200
  • ea60d2193a Simpler scratch buffer usage Georgi Gerganov 2023-03-24 21:41:47 +0200
  • 9330ff0f35 Reduce memory usage and allocate enough memory for large contexts Georgi Gerganov 2023-03-24 18:22:48 +0200
  • 8f2b6d222d Add AVX2 implementation of dequantize_row_q4_0 Slaren 2023-03-24 17:13:50 +0100
  • 31572d9665 Temporary bump the memory buffer size - hopefully fix issues from 483bab2e master-31572d9 Georgi Gerganov 2023-03-24 18:23:56 +0200
  • f4f5362edb Update README.md (#444) master-863f65e Gary Mulder 2023-03-24 15:23:09 +0000
  • 863f65e2e3 fix instruct mode (#445) rabidcopy 2023-03-24 10:22:39 -0500
  • afd220d9c6 Properly free llama_context on failure master-afd220d master-563cdc3 master-481044d Georgi Gerganov 2023-03-24 17:21:01 +0200
  • 481044d50c additional optimizations for POWER9 (#454) Cameron Kaiser 2023-03-24 08:19:26 -0700
  • 563cdc391d Support calling mlock() on loaded model data on Linux and macOS (#453) comex 2023-03-24 08:19:05 -0700
  • 53a941c1e5 Update llama.cpp Georgi Gerganov 2023-03-24 17:17:56 +0200
  • a65f23342d Merge branch 'master' into mlock Georgi Gerganov 2023-03-24 17:15:24 +0200
  • 8d4a855c24 Add embedding mode with arg flag. Currently working (#282) master-8d4a855 Luciano 2023-03-24 08:05:13 -0700
  • 8e383f1895 gitignore Concedo 2023-03-24 23:02:25 +0800
  • 1c78ffb964 Update README.md LostRuins 2023-03-24 22:45:54 +0800
  • e791827973 added a GUI for selection of models if none was passed in through command line. Concedo 2023-03-24 22:03:57 +0800
  • c6c60332a4 Optimizations Concedo 2023-03-24 21:33:53 +0800
  • 3879d84400 Merge branch 'master' into concedo Concedo 2023-03-24 19:28:27 +0800
  • 706e19e9b4 added ability to fast forward in time through partially duplicated prompts Concedo 2023-03-24 18:50:16 +0800
  • 8b4b1e1fb3 Merge branch 'ggerganov:master' into fix-instruct rabidcopy 2023-03-24 03:09:53 -0500
  • b6b268d441 Add link to Roadmap discussion Georgi Gerganov 2023-03-24 09:13:35 +0200
  • 3cd8dde0d1 Revert "Fix memory allocation issues and seg faults" master-3cd8dde Georgi Gerganov 2023-03-24 06:22:28 +0200
  • a34ba06b38 Prevent users from using the instruct mode and interactive mode at the same time. mmyjona 2023-03-24 12:19:37 +0800
  • 2a6daccc40 additional optimizations for POWER9 Cameron Kaiser 2023-03-23 20:23:45 -0700
  • 34e8e4feef Support calling mlock() on loaded model data on Linux and macOS comex 2023-03-23 20:08:13 -0700
  • 57dc4dc68a Revert "Fix memory allocation issues and seg faults" Gary Linscott 2023-03-23 18:44:48 -0700
  • acc36eb0b5 Add AVX2 implementation of ggml_compute_forward_rms_norm_f32 Slaren 2023-03-24 01:10:46 +0100
  • 9179d089a2 Merge remote-tracking branch 'origin/master' into batch_perplexity Gary Linscott 2023-03-23 18:35:22 -0700
  • 6041736d6b Update README.md Kevin Kwok 2023-03-23 16:00:10 -0700
  • b64067704e fix instruct mode rabidcopy 2023-03-23 17:56:16 -0500
  • 3e481d05f0 Update README.md Kevin Kwok 2023-03-23 15:51:16 -0700
  • f7de57fd3a Update README.md Gary Mulder 2023-03-23 22:29:52 +0000
  • 4870e455b3 Fix memory allocation issues and seg faults master-4870e45 Georgi Gerganov 2023-03-24 00:11:53 +0200
  • 483bab2e3d Avoid the transposed X branch in the Z = X * Y matrix multiplication (#439) master-483bab2 Georgi Gerganov 2023-03-23 23:22:01 +0200
  • 5dd94f70b2 cmake: make sanitizers link Green Sky 2023-03-23 21:46:04 +0100
  • 2d262ea9f0 fix perplexity - it's memory needs dont grow, so we skip it Green Sky 2023-03-23 20:50:09 +0100
  • 404e1da38e Fix quantize script not finding models in parent directory (#428) Jed Fox 2023-03-23 16:42:52 -0400
  • 4cc053b6d5 Remove oboslete command from Docker script Georgi Gerganov 2023-03-23 22:39:44 +0200
  • 0ba5a3a9a5 Obsolete Georgi Gerganov 2023-03-23 22:32:02 +0200
  • d782609307 Delete download-pth.py Jed Fox 2023-03-23 16:31:49 -0400
  • 2e17dfd80a Replace EOS with newline to prevent context/memory being flushed by EOS in interactive mode (#333) master-2e17dfd rabidcopy 2023-03-23 15:22:47 -0500
  • 4a4718e8ab More correct load progress Jed Fox 2023-03-23 16:18:37 -0400
  • 23035f9ba8 Use seekg to find file size instead Jed Fox 2023-03-23 16:18:29 -0400
  • 20a1a4e09c Fix GPTQ converter (#423) master-ad072fc Timmy Knight 2023-03-23 10:18:13 -1000
  • ad072fc5ad Generate library with CMake (#430) nusu-github 2023-03-24 05:16:48 +0900
  • 1f9592baf3 Renames Jed Fox 2023-03-23 16:15:55 -0400
  • 347592b365 Fix comment Georgi Gerganov 2023-03-23 22:13:54 +0200
  • 8a3c34bb54 Embeddings extraction support Georgi Gerganov 2023-03-23 22:02:14 +0200
  • 424281a4fb dynamic estimate of required memory usage Green Sky 2023-03-23 19:21:18 +0100
  • 55b899b8f2 Update main.cpp rabidcopy 2023-03-23 13:47:18 -0500
  • d90112a007 Avoid the transposed X branch in the Z = X * Y matrix multiplication Georgi Gerganov 2023-03-23 20:40:16 +0200
  • d6aa749ccf Swap from exclusions to allowlist Jed Fox 2023-03-23 13:58:47 -0400
  • ea10d3ded2 Command line args bounds checking (#424) master-ea10d3d anzz1 2023-03-23 19:54:28 +0200
  • ab02a2441c Move llama_progress_handler into llama_context_params Jed Fox 2023-03-23 13:36:43 -0400
  • e47924fd4b File load progress reporting Jed Fox 2023-03-22 13:13:29 -0400
  • 927bc26e03 Add a Package.swift for SwiftPM support Jed Fox 2023-03-22 10:05:33 -0400
  • a18c19259a Fix Nix build Ben Siraphob 2023-03-22 00:37:02 -0500
  • af5ec1ba63 Fix Nix build Ben Siraphob 2023-03-22 00:37:02 -0500
  • 1166fda943 Merge branch 'master' into concedo Concedo 2023-03-23 23:51:07 +0800
  • bfcb4e7c92 Turn ON PIC when BUILD_SHARED_LIBS is ON nusu-github 2023-03-24 00:23:54 +0900
  • a50e39c6fe Revert "Delete SHA256SUMS for now" (#429) Stephan Walter 2023-03-23 14:15:48 +0000
  • e60c31af70 Generate library with CMake nusu-github 2023-03-23 23:12:49 +0900
  • 632a3257e1 Add also model/tokenizer.model to SHA256SUMS + update README Pavol Rusnak 2023-03-23 15:10:32 +0100
  • d442e0210c Remove alpaca json Stephan Walter 2023-03-23 14:58:23 +0100
  • 2580d75522 Remove ggml files until the can be verified Stephan Walter 2023-03-23 14:55:55 +0100
  • e0607ae91a Revert "Delete SHA256SUMS for now (#416)" Stephan Walter 2023-03-23 14:54:20 +0100
  • 128c503392 Fix quantize script not finding models in parent directory Jed Fox 2023-03-23 09:03:26 -0400
  • f7dda362f2 Merge branch 'ggerganov:master' into patch-1 RSereno 2023-03-23 12:51:42 +0000
  • 2eb9d043d3 fix comment anzz1 2023-03-23 14:20:44 +0200
  • 8f0c8bcc8e unknown and invalid param exit codes 0 -> 1 anzz1 2023-03-23 14:09:49 +0200
  • c96a80a3c6 feat: '--in-prefix STRING' option anzz1 2023-03-23 13:59:09 +0200
  • 2d01e60bc8 command line args bounds checking anzz1 2023-03-23 13:49:27 +0200
  • a140219e81 Fix Makefile echo escape codes (by removing them). (#418) master-a140219 Kerfuffle 2023-03-23 05:41:32 -0600
  • 8a3e5ef801 Move model section from issue template to README.md (#421) Gary Mulder 2023-03-23 11:30:40 +0000
  • 76e82a815b Fix GPTQ converter Timmy Knight 2023-03-23 01:19:36 -1000
  • f58154abe0 Fix Makefile echo escape codes (by removing them). KerfuffleV2 2023-03-23 01:58:43 -0600
  • 8eea5ae0e5 Delete SHA256SUMS for now (#416) anzz1 2023-03-23 12:26:19 +0200
  • dbb0683293 Updates to README.md model section Gary Mulder 2023-03-23 09:34:50 +0000