Commit Graph

660 Commits

Author SHA1 Message Date
oobabooga
b53ed70a70 Make llamacpp_HF 6x faster 2023-08-01 13:18:20 -07:00
oobabooga
959feba602 When saving model settings, only save the settings for the current loader 2023-08-01 06:10:09 -07:00
oobabooga
ebb4f22028 Change a comment 2023-07-31 20:06:10 -07:00
oobabooga
8e2217a029 Minor changes to the Parameters tab 2023-07-31 19:55:11 -07:00
oobabooga
b2207f123b Update docs 2023-07-31 19:20:48 -07:00
oobabooga
84297d05c4 Add a "Filter by loader" menu to the Parameters tab 2023-07-31 19:09:02 -07:00
oobabooga
e6be25ea11 Fix a regression 2023-07-30 18:12:30 -07:00
oobabooga
5ca37765d3 Only replace {{user}} and {{char}} at generation time 2023-07-30 11:42:30 -07:00
oobabooga
6e16af34fd Save uploaded characters as yaml
Also allow yaml characters to be uploaded directly
2023-07-30 11:25:38 -07:00
oobabooga
ed80a2e7db Reorder llama.cpp params 2023-07-25 20:45:20 -07:00
oobabooga
0e8782df03 Set instruction template when switching from default/notebook to chat 2023-07-25 20:37:01 -07:00
oobabooga
1b89c304ad Update README 2023-07-25 15:46:12 -07:00
oobabooga
75c2dd38cf Remove flexgen support 2023-07-25 15:15:29 -07:00
Shouyi
031fe7225e
Add tensor split support for llama.cpp (#3171) 2023-07-25 18:59:26 -03:00
oobabooga
7bc408b472 Change rms_norm_eps to 5e-6 for llama-2-70b ggml
Based on https://github.com/ggerganov/llama.cpp/pull/2384
2023-07-25 14:54:57 -07:00
oobabooga
08c622df2e Autodetect rms_norm_eps and n_gqa for llama-2-70b 2023-07-24 15:27:34 -07:00
oobabooga
a07d070b6c
Add llama-2-70b GGML support (#3285) 2023-07-24 16:37:03 -03:00
jllllll
d7a14174a2
Remove auto-loading when only one model is available (#3187) 2023-07-18 11:39:08 -03:00
oobabooga
f83fdb9270 Don't reset LoRA menu when loading a model 2023-07-17 12:50:25 -07:00
oobabooga
2de0cedce3 Fix reload screen color 2023-07-15 22:39:39 -07:00
oobabooga
27a84b4e04 Make AutoGPTQ the default again
Purely for compatibility with more models.
You should still use ExLlama_HF for LLaMA models.
2023-07-15 22:29:23 -07:00
oobabooga
5e3f7e00a9
Create llamacpp_HF loader (#3062) 2023-07-16 02:21:13 -03:00
Panchovix
7c4d4fc7d3
Increase alpha value limit for NTK RoPE scaling for exllama/exllama_HF (#3149) 2023-07-16 01:56:04 -03:00
oobabooga
b284f2407d Make ExLlama_HF the new default for GPTQ 2023-07-14 14:03:56 -07:00
oobabooga
22341e948d Merge branch 'main' into dev 2023-07-12 14:19:49 -07:00
oobabooga
0e6295886d Fix lora download folder 2023-07-12 14:19:33 -07:00
oobabooga
eb823fce96 Fix typo 2023-07-12 13:55:19 -07:00
oobabooga
d0a626f32f Change reload screen color 2023-07-12 13:54:43 -07:00
oobabooga
c592a9b740 Fix #3117 2023-07-12 13:33:44 -07:00
Gabriel Pena
eedb3bf023
Add low vram mode on llama cpp (#3076) 2023-07-12 11:05:13 -03:00
Axiom Wolf
d986c17c52
Chat history download creates more detailed file names (#3051) 2023-07-12 00:10:36 -03:00
Salvador E. Tropea
324e45b848
[Fixed] wbits and groupsize values from model not shown (#2977) 2023-07-11 23:27:38 -03:00
oobabooga
bfafd07f44 Change a message 2023-07-11 18:29:20 -07:00
micsthepick
3708de2b1f
respect model dir for downloads (#3077) (#3079) 2023-07-11 18:55:46 -03:00
oobabooga
9aee1064a3 Block a cloudfare request 2023-07-06 22:24:52 -07:00
oobabooga
40c5722499
Fix #2998 2023-07-04 11:35:25 -03:00
oobabooga
55457549cd Add information about presets to the UI 2023-07-03 22:39:01 -07:00
Panchovix
10c8c197bf
Add Support for Static NTK RoPE scaling for exllama/exllama_hf (#2955) 2023-07-04 01:13:16 -03:00
FartyPants
eb6112d5a2
Update server.py - clear LORA after reload (#2952) 2023-07-04 00:13:38 -03:00
oobabooga
4b1804a438
Implement sessions + add basic multi-user support (#2991) 2023-07-04 00:03:30 -03:00
missionfloyd
ac0f96e785
Some more character import tweaks. (#2921) 2023-06-29 14:56:25 -03:00
oobabooga
5d2a8b31be Improve Parameters tab UI 2023-06-29 14:33:47 -03:00
oobabooga
3443219cbc
Add repetition penalty range parameter to transformers (#2916) 2023-06-29 13:40:13 -03:00
oobabooga
22d455b072 Add LoRA support to ExLlama_HF 2023-06-26 00:10:33 -03:00
oobabooga
b7c627f9a0 Set UI defaults 2023-06-25 22:55:43 -03:00
oobabooga
c52290de50
ExLlama with long context (#2875) 2023-06-25 22:49:26 -03:00
oobabooga
f0fcd1f697 Sort some imports 2023-06-25 01:44:36 -03:00
oobabooga
e6e5f546b8 Reorganize Chat settings tab 2023-06-25 01:10:20 -03:00
jllllll
bef67af23c
Use pre-compiled python module for ExLlama (#2770) 2023-06-24 20:24:17 -03:00
missionfloyd
51a388fa34
Organize chat history/character import menu (#2845)
* Organize character import menu

* Move Chat history upload/download labels
2023-06-24 09:55:02 -03:00
oobabooga
3ae9af01aa Add --no_use_cuda_fp16 param for AutoGPTQ 2023-06-23 12:22:56 -03:00
LarryVRH
580c1ee748
Implement a demo HF wrapper for exllama to utilize existing HF transformers decoding. (#2777) 2023-06-21 15:31:42 -03:00
Morgan Schweers
447569e31a
Add a download progress bar to the web UI. (#2472)
* Show download progress on the model screen.

* In case of error, mark as done to clear progress bar.

* Increase the iteration block size to reduce overhead.
2023-06-20 22:59:14 -03:00
oobabooga
09c781b16f Add modules/block_requests.py
This has become unnecessary, but it could be useful in the future
for other libraries.
2023-06-18 16:31:14 -03:00
oobabooga
44f28830d1 Chat CSS: fix ul, li, pre styles + remove redefinitions 2023-06-18 15:20:51 -03:00
oobabooga
239b11c94b Minor bug fixes 2023-06-17 17:57:56 -03:00
oobabooga
1e400218e9 Fix a typo 2023-06-16 21:01:57 -03:00
oobabooga
5f392122fd Add gpu_split param to ExLlama
Adapted from code created by Ph0rk0z. Thank you Ph0rk0z.
2023-06-16 20:49:36 -03:00
oobabooga
83be8eacf0 Minor fix 2023-06-16 20:38:32 -03:00
oobabooga
9f40032d32
Add ExLlama support (#2444) 2023-06-16 20:35:38 -03:00
oobabooga
dea43685b0 Add some clarifications 2023-06-16 19:10:53 -03:00
oobabooga
7ef6a50e84
Reorganize model loading UI completely (#2720) 2023-06-16 19:00:37 -03:00
Tom Jobbins
646b0c889f
AutoGPTQ: Add UI and command line support for disabling fused attention and fused MLP (#2648) 2023-06-15 23:59:54 -03:00
oobabooga
474dc7355a Allow API requests to use parameter presets 2023-06-14 11:32:20 -03:00
FartyPants
9f150aedc3
A small UI change in Models menu (#2640) 2023-06-12 01:24:44 -03:00
oobabooga
da5d9a28d8 Fix tabbed extensions showing up at the bottom of the UI 2023-06-11 21:20:51 -03:00
oobabooga
ae5e2b3470 Reorganize a bit 2023-06-11 19:50:20 -03:00
oobabooga
f4defde752 Add a menu for installing extensions 2023-06-11 17:11:06 -03:00
oobabooga
8e73806b20 Improve "Interface mode" appearance 2023-06-11 15:29:45 -03:00
oobabooga
ac122832f7 Make dropdown menus more similar to automatic1111 2023-06-11 14:20:16 -03:00
oobabooga
6133675e0f
Add menus for saving presets/characters/instruction templates/prompts (#2621) 2023-06-11 12:19:18 -03:00
brandonj60
b04e18d10c
Add Mirostat v2 sampling to transformer models (#2571) 2023-06-09 21:26:31 -03:00
oobabooga
eb2601a8c3 Reorganize Parameters tab 2023-06-06 14:51:02 -03:00
oobabooga
f06a1387f0 Reorganize Models tab 2023-06-06 07:58:07 -03:00
oobabooga
d49d299b67 Change a message 2023-06-06 07:54:56 -03:00
oobabooga
7ed1e35fbf Reorganize Parameters tab in chat mode 2023-06-06 07:46:25 -03:00
oobabooga
00b94847da Remove softprompt support 2023-06-06 07:42:23 -03:00
oobabooga
f276d88546 Use AutoGPTQ by default for GPTQ models 2023-06-05 15:41:48 -03:00
oobabooga
6a75bda419 Assign some 4096 seq lengths 2023-06-05 12:07:52 -03:00
oobabooga
19f78684e6 Add "Start reply with" feature to chat mode 2023-06-02 13:58:08 -03:00
oobabooga
28198bc15c Change some headers 2023-06-02 11:28:43 -03:00
oobabooga
5177cdf634 Change AutoGPTQ info 2023-06-02 11:19:44 -03:00
oobabooga
8e98633efd Add a description for chat_prompt_size 2023-06-02 11:13:22 -03:00
oobabooga
5a8162a46d Reorganize models tab 2023-06-02 02:24:15 -03:00
oobabooga
2f6631195a Add desc_act checkbox to the UI 2023-06-02 01:45:46 -03:00
Morgan Schweers
1aed2b9e52
Make it possible to download protected HF models from the command line. (#2408) 2023-06-01 00:11:21 -03:00
oobabooga
486ddd62df Add tfs and top_a to the API examples 2023-05-31 23:44:38 -03:00
oobabooga
3209440b7c
Rearrange chat buttons 2023-05-30 00:17:31 -03:00
Luis Lopez
9e7204bef4
Add tail-free and top-a sampling (#2357) 2023-05-29 21:40:01 -03:00
oobabooga
1394f44e14 Add triton checkbox for AutoGPTQ 2023-05-29 15:32:45 -03:00
Honkware
204731952a
Falcon support (trust-remote-code and autogptq checkboxes) (#2367)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-05-29 10:20:18 -03:00
oobabooga
f27135bdd3 Add Eta Sampling preset
Also remove some presets that I do not consider relevant
2023-05-28 22:44:35 -03:00
oobabooga
00ebea0b2a Use YAML for presets and settings 2023-05-28 22:34:12 -03:00
oobabooga
fc33216477 Small fix for n_ctx in llama.cpp 2023-05-25 13:55:51 -03:00
oobabooga
37d4ad012b Add a button for rendering markdown for any model 2023-05-25 11:59:27 -03:00
DGdev91
cf088566f8
Make llama.cpp read prompt size and seed from settings (#2299) 2023-05-25 10:29:31 -03:00
oobabooga
361451ba60
Add --load-in-4bit parameter (#2320) 2023-05-25 01:14:13 -03:00
Gabriel Terrien
fc116711b0
FIX save_model_settings function to also update shared.model_config (#2282) 2023-05-24 10:01:07 -03:00
flurb18
d37a28730d
Beginning of multi-user support (#2262)
Adds a lock to generate_reply
2023-05-24 09:38:20 -03:00
Gabriel Terrien
7aed53559a
Support of the --gradio-auth flag (#2283) 2023-05-23 20:39:26 -03:00