Commit Graph

  • a25f830fe1 Default streaming to false if it's not set in the request body. digiwombat 2023-05-30 20:17:18 -0400
  • 38eaf2b7f7 Removed testing fprintf calls. digiwombat 2023-05-30 19:48:43 -0400
  • 3292f057dc Changed to single API endpoint for streaming and non. digiwombat 2023-05-30 19:44:16 -0400
  • d6fff56e22 add streaming via server-sent events anon 2023-05-30 19:33:33 -0300
  • 7f172c1070 replace auto parameters in lambda function xaedes 2023-05-31 00:25:37 +0200
  • b2fd06c6aa mtl : working mul_mat q4 Georgi Gerganov 2023-05-30 23:06:49 +0300
  • 03ea8f013a Fix for the regen issue. digiwombat 2023-05-30 15:48:55 -0400
  • 29bec00ba0 mtl : another mul_mat Q4 (still does not work) Georgi Gerganov 2023-05-30 22:31:07 +0300
  • 96d005225f mtl : mul_mat fixes (still wrong) Georgi Gerganov 2023-05-30 22:13:43 +0300
  • 2a24994bad mtl : initial mul_mat Q4 kernel (wrong results) Georgi Gerganov 2023-05-30 22:02:54 +0300
  • ffb06a345e OpenLLaMA 3B support (#1588) master-ffb06a3 Henri Vasserman 2023-05-30 21:24:22 +0300
  • ac6b49ed45 Reduce queueing overhead for contiguous tensors by using single mul kernel call 0cc4m 2023-05-30 18:49:53 +0200
  • 64afc0b53a mtl : add mul kernel + confirm working Georgi Gerganov 2023-05-30 19:15:38 +0300
  • 72256ebd2b mtl : add rms_norm kernel + confirm working Georgi Gerganov 2023-05-30 19:03:04 +0300
  • 794704e409 mtl : confirmed get_rows_q4_0 is working correctly Georgi Gerganov 2023-05-30 18:41:21 +0300
  • 8fd8599f61 rename baby-llama-text to train-text-from-scratch xaedes 2023-05-30 17:07:03 +0200
  • 21b11b55d4 remove python bindings xaedes 2023-05-30 17:03:09 +0200
  • a5317498c2 Merge branch 'master' into text-from-scratch xaedes 2023-05-30 16:57:17 +0200
  • 56456797f4 Merge branch 'master' into concedo_experimental Concedo 2023-05-30 22:15:58 +0800
  • 1074a81e81 add train params to specify memory size xaedes 2023-05-30 16:06:20 +0200
  • ad966da955 remove unnecessary comments xaedes 2023-05-30 15:58:22 +0200
  • ec8e262d1d add train_params and command line option parser xaedes 2023-05-30 15:53:55 +0200
  • 2e84ad53ca remove convert.py Henri Vasserman 2023-05-30 16:42:11 +0300
  • fcbc4457d6 add option to train with flash attention and move options to the top of the main function xaedes 2023-05-30 13:17:58 +0200
  • 70c08318af test flash attention backward pass xaedes 2023-05-29 23:51:40 +0200
  • 38560b6d51 bugfixes for backward pass of flash attention xaedes 2023-05-29 23:45:58 +0200
  • a8fd9dc128 mtl : initial get_rows_q4_0 kernel Georgi Gerganov 2023-05-29 23:12:19 +0300
  • 22a7279ffb implement backward pass of flash attention xaedes 2023-05-29 22:00:40 +0200
  • 248a8c3379 mtl : move MSL code into separate file for easy editing Georgi Gerganov 2023-05-29 22:26:40 +0300
  • 897d6d8e8f mtl : export just a small part of the graph for now to make it easier Georgi Gerganov 2023-05-29 21:40:05 +0300
  • a792cbd0fc mtl : no need for mtl-export tool, add cli arg for main instead Georgi Gerganov 2023-05-29 21:28:59 +0300
  • b23fe8c9c7 mtl : adapt the MNIST example as starter Georgi Gerganov 2023-05-29 21:09:47 +0300
  • 98c267fc77 ci : disable temporary Georgi Gerganov 2023-05-29 20:57:24 +0300
  • f85020b19a mtl : export the LLaMA computation graph Georgi Gerganov 2023-05-29 20:49:24 +0300
  • 062dc6c747 Replacing call to convert-pth-to-ggml.py with convert.py Jiri Podivin 2023-05-29 19:02:45 +0200
  • 7552ac5863 ggml : sync cgraph import / export API master-7552ac5 Georgi Gerganov 2023-05-29 19:31:44 +0300
  • 5d1830b99d ggml : fix bug in ggml_alibi Georgi Gerganov 2023-05-29 19:30:49 +0300
  • 7a55593af4 main: add the possibility to open the prompt cache read-only Willy Tarreau 2023-05-29 14:53:00 +0200
  • ea336bfa33 rwkv eos Concedo 2023-05-29 22:40:27 +0800
  • 6b3373cb81 revert bad fix Concedo 2023-05-29 22:06:12 +0800
  • 248367605e Work around for recalculating logits in cached prompts (Fixes #1585) (#1609) master-2483676 DannyDaemonic 2023-05-29 05:13:40 -0700
  • ef16d09a51 fix for older gcc, updated lite Concedo 2023-05-29 18:54:15 +0800
  • 44c83c6eba Merge remote-tracking branch 'upstream/master' into cached-logits-bandaid Danny Daemonic 2023-05-29 02:57:57 -0700
  • 3a73ebe8d2 Merge branch 'master' into concedo_experimental Concedo 2023-05-29 16:47:32 +0800
  • 254a9ff12c Merge commit 'ebc5d0651a1af44a2aecf503c1ceecede1ef99c4' into concedo_experimental Concedo 2023-05-29 16:26:24 +0800
  • 30ff1133f5 allow users to rename models for use in horde Concedo 2023-05-29 16:01:05 +0800
  • 97b39f875c fixed fstat64 build error on mac Concedo 2023-05-29 15:50:07 +0800
  • 0773028d52 1) make gpt_params_parse can jump over some predefined unknown args so we can reuse the gpt_params_parse function 2) fixed the grpc server error Liu Ming 2023-05-29 14:07:13 +0800
  • 0e730dd23b Adding git in container package dependencies (#1621) Jiří Podivín 2023-05-29 06:45:50 +0200
  • 96165b1201 pick from master changhz 2023-05-28 23:47:42 -0400
  • 530eb57fe4 fix the error of no ending Liu Ming 2023-05-29 08:37:34 +0800
  • 56895e28f6 get vocabulary for exporting training checkpoint to llama compatible model file xaedes 2023-05-29 02:25:18 +0200
  • 4b81c32d5b add export of training checkpoint to llama compatible model file xaedes 2023-05-29 01:27:09 +0200
  • 2da5c8cf24 set default model.type for unknown models with few layers xaedes 2023-05-29 01:20:55 +0200
  • bf4d9b3b81 add llama_get_vocab to get the vocabulary as output parameters xaedes 2023-05-29 01:20:26 +0200
  • 42cf4d8433 Merge branch 'master' into master Henri Vasserman 2023-05-29 01:05:19 +0300
  • 33b6957177 Fixed failing to return result on stopping token. digiwombat 2023-05-28 16:45:05 -0400
  • 89475fb320 slightly improve how cross entropy loss is compute xaedes 2023-05-28 22:40:58 +0200
  • 5f5aa20078 remove trailing whitespace xaedes 2023-05-28 22:00:56 +0200
  • 1fbd19abe1 use ggml_cross_entropy_loss in text training example xaedes 2023-05-28 22:00:26 +0200
  • f056a04a80 add tests for cross_entropy_loss backward pass xaedes 2023-05-28 21:59:17 +0200
  • 71aaf8dedf add ggml_cross_entropy_loss with backward pass for faster training xaedes 2023-05-28 21:57:38 +0200
  • 3b126f654f LLAMA_DEBUG adds debug symbols (#1617) master-3b126f6 Johannes Gäßler 2023-05-28 21:01:02 +0200
  • 6c58f64a3b --ctx_size flag to --ctx-size to match common.cpp digiwombat 2023-05-28 14:17:36 -0400
  • b38d41ef52 --memory_f32 flag to --memory-f32 to match common.cpp digiwombat 2023-05-28 13:58:25 -0400
  • 655899db89 Add ignore_eos option to generation settings. digiwombat 2023-05-28 13:49:45 -0400
  • 1b78ed2081 Only show -ngl option when relevant + other doc/arg handling updates (#1625) master-1b78ed2 Kerfuffle 2023-05-28 11:48:57 -0600
  • 337aea1139 examples : add --alias option to gpt_params to set use friendly model name (#1614) master-337aea1 Vladimir Zorin 2023-05-28 20:14:24 +0300
  • bb051d9723 opencl : no need to allocate cl_mem on heap (#1612) master-bb051d9 Howard Su 2023-05-29 01:13:36 +0800
  • ca74884f66 opencl : use strstr to check if fp16 supported (#1611) master-ca74884 Howard Su 2023-05-29 01:09:56 +0800
  • 2c9ee7a052 Apply suggestions from code review Randall Fitzgerald 2023-05-28 09:34:11 -0700
  • 74c6f36bf1 Editorconfig suggested fixes Henri Vasserman 2023-05-28 19:19:34 +0300
  • 05cb629c8e replace inefficient repeat backward pass with dedicated repeat_back operation xaedes 2023-05-28 18:00:17 +0200
  • c47df09842 simplify backward pass for SQRT xaedes 2023-05-28 17:32:01 +0200
  • 15ddc4903b Merge remote-tracking branch 'slyecho/server_refactor' digiwombat 2023-05-28 11:09:32 -0400
  • 36758b1009 Setting the ftype argument of the script as optional Jiri Podivin 2023-05-28 16:39:51 +0200
  • 7186d655a1 seed and gen params Henri Vasserman 2023-05-28 17:03:01 +0300
  • 7740301db9 Set unspecified generation settings back to default. (Notes below) digiwombat 2023-05-28 09:18:47 -0400
  • dda915cac4 Added capturing the stopping word and sending it along with the final JSON. digiwombat 2023-05-28 08:43:38 -0400
  • 2e5c5ee224 Changed JSON names to match the parameter name rather than the variable name. digiwombat 2023-05-28 08:12:48 -0400
  • 23928f2887 Added generation_settings to final json object. digiwombat 2023-05-28 08:04:05 -0400
  • 5eacb84223 Display a warning if -ngl is supplied without support. KerfuffleV2 2023-05-28 05:48:36 -0600
  • e8efd75492 Initial timeout code and expanded json return on completion. digiwombat 2023-05-28 07:44:31 -0400
  • 28f1196f65 adjust default rep pen range Concedo 2023-05-28 19:36:21 +0800
  • 177868e68a Changed to params/args digiwombat 2023-05-28 06:29:11 -0400
  • a70095e961 Fix derp in ngl ifdef KerfuffleV2 2023-05-28 04:17:44 -0600
  • f40f6e8252 Documentation and arg help/handling updates KerfuffleV2 2023-05-28 03:58:03 -0600
  • 764a21ce0f Only show -ngl option when relevant + add warning for --memory-f32 option KerfuffleV2 2023-05-28 03:36:52 -0600
  • 549291fe61 keep processed from the beginning Henri Vasserman 2023-05-28 12:08:37 +0300
  • df0e0d094c Forgot to remove some testing code. Randall Fitzgerald 2023-05-23 06:22:30 -0700
  • f93fe36c5b Add all generation parameters to server.cpp and allow resetting context Randall Fitzgerald 2023-05-23 06:16:54 -0700
  • 51e09944ce server rewrite Henri Vasserman 2023-05-28 02:42:18 +0300
  • c01c7d2caf Adding git in container package dependencies Jiri Podivin 2023-05-28 09:33:33 +0200
  • 7d159bacd7 updated kobold lite Concedo 2023-05-28 11:23:20 +0800
  • 0d308e2ef2 remove excessive codes and prints liang 2023-05-28 08:45:51 +0800
  • 1f40a789e6 Didn't see the already defined top_k var. Randall Fitzgerald 2023-05-27 17:10:09 -0700
  • e84b802161 Change top_k type. Randall Fitzgerald 2023-05-27 17:07:45 -0700
  • fdce8951ac Merge branch 'ggerganov:master' into master Randall Fitzgerald 2023-05-27 19:57:37 -0400
  • d20f36b93c Removed unnecessary last_prompt_token set Randall Fitzgerald 2023-05-27 16:46:05 -0700
  • 36c86d794d Automate Context resetting and minor fixes Randall Fitzgerald 2023-05-27 16:43:08 -0700