1392 Commits

Author SHA1 Message Date
oobabooga
68059d7c23 llama.cpp: minor log change & lint 2023-11-27 10:44:55 -08:00
tsukanov-as
9f7ae6bb2e
fix detection of stopping strings when HTML escaping is used (#4728) 2023-11-27 15:42:08 -03:00
oobabooga
0589ff5b12
Bump llama-cpp-python to 0.2.19 & add min_p and typical_p parameters to llama.cpp loader (#4701) 2023-11-21 20:59:39 -03:00
oobabooga
2769a1fa25 Hide deprecated args from Session tab 2023-11-21 15:15:16 -08:00
oobabooga
a2e6d00128 Use convert_ids_to_tokens instead of decode in logits endpoint
This preserves the llama tokenizer spaces.
2023-11-19 09:22:08 -08:00
oobabooga
9da7bb203d Minor LoRA bug fix 2023-11-19 07:59:29 -08:00
oobabooga
a6f1e1bcc5 Fix PEFT LoRA unloading 2023-11-19 07:55:25 -08:00
oobabooga
ab94f0d9bf Minor style change 2023-11-18 21:11:04 -08:00
oobabooga
5fcee696ea
New feature: enlarge character pictures on click (#4654) 2023-11-19 02:05:17 -03:00
oobabooga
ef6feedeb2
Add --nowebui flag for pure API mode (#4651) 2023-11-18 23:38:39 -03:00
oobabooga
0fa1af296c
Add /v1/internal/logits endpoint (#4650) 2023-11-18 23:19:31 -03:00
oobabooga
8f4f4daf8b
Add --admin-key flag for API (#4649) 2023-11-18 22:33:27 -03:00
Jordan Tucker
baab894759
fix: use system message in chat-instruct mode (#4648) 2023-11-18 20:20:13 -03:00
oobabooga
47d9e2618b Refresh the Preset menu after saving a preset 2023-11-18 14:03:42 -08:00
oobabooga
83b64e7fc1
New feature: "random preset" button (#4647) 2023-11-18 18:31:41 -03:00
oobabooga
e0ca49ed9c
Bump llama-cpp-python to 0.2.18 (2nd attempt) (#4637)
* Update requirements*.txt

* Add back seed
2023-11-18 00:31:27 -03:00
oobabooga
9d6f79db74 Revert "Bump llama-cpp-python to 0.2.18 (#4611)"
This reverts commit 923c8e25fb.
2023-11-17 05:14:25 -08:00
oobabooga
13dc3b61da Update README 2023-11-16 19:57:55 -08:00
oobabooga
8b66d83aa9 Set use_fast=True by default, create --no_use_fast flag
This increases tokens/second for HF loaders.
2023-11-16 19:55:28 -08:00
oobabooga
6525707a7f Fix "send instruction template to..." buttons (closes #4625) 2023-11-16 18:16:42 -08:00
oobabooga
510a01ef46 Lint 2023-11-16 18:03:06 -08:00
oobabooga
923c8e25fb
Bump llama-cpp-python to 0.2.18 (#4611) 2023-11-16 22:55:14 -03:00
oobabooga
58c6001be9 Add missing exllamav2 samplers 2023-11-16 07:09:40 -08:00
oobabooga
cd41f8912b Warn users about n_ctx / max_seq_len 2023-11-15 18:56:42 -08:00
oobabooga
9be48e83a9 Start API when "api" checkbox is checked 2023-11-15 16:35:47 -08:00
oobabooga
a85ce5f055 Add more info messages for truncation / instruction template 2023-11-15 16:20:31 -08:00
oobabooga
883701bc40 Alternative solution to 025da386a0
Fixes an error.
2023-11-15 16:04:02 -08:00
oobabooga
8ac942813c Revert "Fix CPU memory limit error (issue #3763) (#4597)"
This reverts commit 025da386a0.
2023-11-15 16:01:54 -08:00
oobabooga
e6f44d6d19 Print context length / instruction template to terminal when loading models 2023-11-15 16:00:51 -08:00
oobabooga
e05d8fd441 Style changes 2023-11-15 15:51:37 -08:00
Andy Bao
025da386a0
Fix CPU memory limit error (issue #3763) (#4597)
get_max_memory_dict() was not properly formatting shared.args.cpu_memory

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-11-15 20:27:20 -03:00
oobabooga
4aabff3728 Remove old API, launch OpenAI API with --api 2023-11-10 06:39:08 -08:00
oobabooga
2af7e382b1 Revert "Bump llama-cpp-python to 0.2.14"
This reverts commit 5c3eb22ce6.

The new version has issues:

https://github.com/oobabooga/text-generation-webui/issues/4540
https://github.com/abetlen/llama-cpp-python/issues/893
2023-11-09 10:02:13 -08:00
oobabooga
21ed9a260e Document the new "Custom system message" field 2023-11-08 17:54:10 -08:00
oobabooga
2358706453 Add /v1/internal/model/load endpoint (tentative) 2023-11-07 20:58:06 -08:00
oobabooga
43c53a7820 Refactor the /v1/models endpoint 2023-11-07 19:59:27 -08:00
oobabooga
1b69694fe9 Add types to the encode/decode/token-count endpoints 2023-11-07 19:32:14 -08:00
oobabooga
6e2e0317af
Separate context and system message in instruction formats (#4499) 2023-11-07 20:02:58 -03:00
oobabooga
5c0559da69 Training: fix .txt files now showing in dropdowns 2023-11-07 14:41:11 -08:00
oobabooga
af3d25a503 Disable logits_all in llamacpp_HF (makes processing 3x faster) 2023-11-07 14:35:48 -08:00
oobabooga
5c3eb22ce6 Bump llama-cpp-python to 0.2.14 2023-11-07 14:20:43 -08:00
oobabooga
ec17a5d2b7
Make OpenAI API the default API (#4430) 2023-11-06 02:38:29 -03:00
feng lui
4766a57352
transformers: add use_flash_attention_2 option (#4373) 2023-11-04 13:59:33 -03:00
wouter van der plas
add359379e
fixed two links in the ui (#4452) 2023-11-04 13:41:42 -03:00
oobabooga
aa5d671579
Add temperature_last parameter (#4472) 2023-11-04 13:09:07 -03:00
oobabooga
1ab8700d94 Change frequency/presence penalty ranges 2023-11-03 17:38:19 -07:00
oobabooga
45fcb60e7a Make truncation_length_max apply to max_seq_len/n_ctx 2023-11-03 11:29:31 -07:00
oobabooga
7f9c1cbb30 Change min_p default to 0.0 2023-11-03 08:25:22 -07:00
oobabooga
4537853e2c Change min_p default to 1.0 2023-11-03 08:13:50 -07:00
kalomaze
367e5e6e43
Implement Min P as a sampler option in HF loaders (#4449) 2023-11-02 16:32:51 -03:00
oobabooga
fcb7017b7a Remove a checkbox 2023-11-02 12:24:09 -07:00
Julien Chaumond
fdcaa955e3
transformers: Add a flag to force load from safetensors (#4450) 2023-11-02 16:20:54 -03:00
oobabooga
c0655475ae Add cache_8bit option 2023-11-02 11:23:04 -07:00
oobabooga
42f816312d Merge remote-tracking branch 'refs/remotes/origin/dev' into dev 2023-11-02 11:09:26 -07:00
oobabooga
77abd9b69b Add no_flash_attn option 2023-11-02 11:08:53 -07:00
Julien Chaumond
a56ef2a942
make torch.load a bit safer (#4448) 2023-11-02 14:07:08 -03:00
Mehran Ziadloo
aaf726dbfb
Updating the shared settings object when loading a model (#4425) 2023-11-01 01:29:57 -03:00
oobabooga
9bd0724d85 Change frequency/presence penalty ranges 2023-10-31 20:57:56 -07:00
Meheret
0707ed7677
updated wiki link (#4415) 2023-10-31 19:09:05 -03:00
oobabooga
262f8ae5bb Use default gr.Dataframe for evaluation table 2023-10-27 06:49:14 -07:00
oobabooga
839a87bac8 Fix is_ccl_available & is_xpu_available imports 2023-10-26 20:27:04 -07:00
Abhilash Majumder
778a010df8
Intel Gpu support initialization (#4340) 2023-10-26 23:39:51 -03:00
oobabooga
92b2f57095 Minor metadata bug fix (second attempt) 2023-10-26 18:57:32 -07:00
tdrussell
72f6fc6923
Rename additive_repetition_penalty to presence_penalty, add frequency_penalty (#4376) 2023-10-25 12:10:28 -03:00
oobabooga
ef1489cd4d Remove unused parameter in AutoAWQ 2023-10-23 20:45:43 -07:00
oobabooga
1edf321362 Lint 2023-10-23 13:09:03 -07:00
oobabooga
280ae720d7 Organize 2023-10-23 13:07:17 -07:00
oobabooga
49e5eecce4 Merge remote-tracking branch 'refs/remotes/origin/main' 2023-10-23 12:54:05 -07:00
oobabooga
306d764ff6 Minor metadata bug fix 2023-10-23 12:46:24 -07:00
adrianfiedler
4bc411332f
Fix broken links (#4367)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-10-23 14:09:57 -03:00
oobabooga
92691ee626 Disable trust_remote_code by default 2023-10-23 09:57:44 -07:00
tdrussell
4440f87722
Add additive_repetition_penalty sampler setting. (#3627) 2023-10-23 02:28:07 -03:00
oobabooga
df90d03e0b Replace --mul_mat_q with --no_mul_mat_q 2023-10-22 12:23:03 -07:00
Googulator
d0c3b407b3
transformers loader: multi-LoRAs support (#3120) 2023-10-22 16:06:22 -03:00
omo
4405513ca5
Option to select/target additional linear modules/layers in LORA training (#4178) 2023-10-22 15:57:19 -03:00
oobabooga
2d1b3332e4 Ignore warnings on Colab 2023-10-21 21:45:25 -07:00
oobabooga
09f807af83 Use ExLlama_HF for GPTQ models by default 2023-10-21 20:45:38 -07:00
oobabooga
506d05aede Organize command-line arguments 2023-10-21 18:52:59 -07:00
oobabooga
fbac6d21ca Add missing exception 2023-10-20 23:53:24 -07:00
Brian Dashore
3345da2ea4
Add flash-attention 2 for windows (#4235) 2023-10-21 03:46:23 -03:00
Johan
1d5a015ce7
Enable special token support for exllamav2 (#4314) 2023-10-21 01:54:06 -03:00
turboderp
ae8cd449ae
ExLlamav2_HF: Convert logits to FP32 (#4310) 2023-10-18 23:16:05 -03:00
oobabooga
f17f7a6913 Increase the evaluation table height 2023-10-16 12:55:35 -07:00
oobabooga
8ea554bc19 Check for torch.xpu.is_available() 2023-10-16 12:53:40 -07:00
oobabooga
188d20e9e5 Reduce the evaluation table height 2023-10-16 10:53:42 -07:00
oobabooga
2d44adbb76 Clear the torch cache while evaluating 2023-10-16 10:52:50 -07:00
oobabooga
71cac7a1b2 Increase the height of the evaluation table 2023-10-15 21:56:40 -07:00
oobabooga
e14bde4946 Minor improvements to evaluation logs 2023-10-15 20:51:43 -07:00
oobabooga
b88b2b74a6 Experimental Intel Arc transformers support (untested) 2023-10-15 20:51:11 -07:00
Forkoz
8cce1f1126
Exllamav2 lora support (#4229)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-10-14 16:12:41 -03:00
oobabooga
773c17faec Fix a warning 2023-10-10 20:53:38 -07:00
oobabooga
f63361568c Fix safetensors kwarg usage in AutoAWQ 2023-10-10 19:03:09 -07:00
oobabooga
39f16ff83d Fix default/notebook tabs css 2023-10-10 18:45:12 -07:00
oobabooga
fae8062d39
Bump to latest gradio (3.47) (#4258) 2023-10-10 22:20:49 -03:00
oobabooga
9fab9a1ca6 Minor fix 2023-10-10 14:08:11 -07:00
oobabooga
a49cc69a4a Ignore rope_freq_base if value is 10000 2023-10-10 13:57:40 -07:00
oobabooga
3a9d90c3a1 Download models with 4 threads by default 2023-10-10 13:52:10 -07:00
Forkoz
35695e18c7
Remove import. (#4247)
For real this time.
2023-10-09 18:06:11 -03:00
Forkoz
2e471071af
Update llama_attn_hijack.py (#4231) 2023-10-08 15:16:48 -03:00
Brian Dashore
98fa73a974
Text Generation: stop if EOS token is reached (#4213) 2023-10-07 19:46:42 -03:00
Brian Dashore
7743b5e9de
Llamacpp_HF: Fix CFG cache init (#4219)
Documentation says that model.context_params should be sent when
a new context is created. The current code uses model.params which
doesn't exist.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-10-07 19:38:29 -03:00
turboderp
8a98646a21
Bump ExLlamaV2 to 0.0.5 (#4186) 2023-10-05 19:12:22 -03:00
oobabooga
7ffb424c7b Add AutoAWQ to README 2023-10-05 09:22:37 -07:00
cal066
cc632c3f33
AutoAWQ: initial support (#3999) 2023-10-05 13:19:18 -03:00
tdrussell
cb26163a20
Fix off-by-one error in exllama_hf caching logic (#4145) 2023-10-05 12:20:56 -03:00
oobabooga
ae4ba3007f
Add grammar to transformers and _HF loaders (#4091) 2023-10-05 10:01:36 -03:00
oobabooga
b6fe6acf88 Add threads_batch parameter 2023-10-01 21:28:00 -07:00
jllllll
41a2de96e5
Bump llama-cpp-python to 0.2.11 2023-10-01 18:08:10 -05:00
oobabooga
f2d82f731a Add recommended NTKv1 alpha values 2023-09-29 13:48:38 -07:00
oobabooga
abe99cddeb Extend evaluation slider bounds 2023-09-29 13:06:26 -07:00
oobabooga
96da2e1c0d Read more metadata (config.json & quantize_config.json) 2023-09-29 06:14:16 -07:00
oobabooga
56b5a4af74 exllamav2 typical_p 2023-09-28 20:10:12 -07:00
oobabooga
f8e9733412 Minor syntax change 2023-09-28 19:32:35 -07:00
oobabooga
f931184b53 Increase truncation limits to 32768 2023-09-28 19:28:22 -07:00
oobabooga
1dd13e4643 Read Transformers config.json metadata 2023-09-28 19:19:47 -07:00
StoyanStAtanasov
7e6ff8d1f0
Enable NUMA feature for llama_cpp_python (#4040) 2023-09-26 22:05:00 -03:00
oobabooga
87ea2d96fd Add a note about RWKV loader 2023-09-26 17:43:39 -07:00
oobabooga
0c89180966 Another minor fix 2023-09-26 06:54:21 -07:00
oobabooga
365335e1ae Minor fix 2023-09-26 06:47:19 -07:00
oobabooga
1ca54faaf0 Improve --multi-user mode 2023-09-26 06:42:33 -07:00
oobabooga
019371c0b6 Lint 2023-09-25 20:31:11 -07:00
oobabooga
814520fed1 Extension install improvements 2023-09-25 20:27:06 -07:00
oobabooga
7f1460af29 Change a warning 2023-09-25 20:22:27 -07:00
oobabooga
862b45b1c7 Extension install improvements 2023-09-25 19:48:30 -07:00
oobabooga
c8952cce55 Move documentation from UI to docs/ 2023-09-25 12:28:28 -07:00
oobabooga
d0d221df49 Add --use_fast option (closes #3741) 2023-09-25 12:19:43 -07:00
oobabooga
b973b91d73 Automatically filter by loader (closes #4072) 2023-09-25 10:28:35 -07:00
oobabooga
63de9eb24f Clean up the transformers loader 2023-09-24 20:26:26 -07:00
oobabooga
36c38d7561 Add disable_exllama to Transformers loader (for GPTQ LoRA training) 2023-09-24 20:03:11 -07:00
oobabooga
55a685d999 Minor fixes 2023-09-24 14:15:10 -07:00
oobabooga
08cf150c0c
Add a grammar editor to the UI (#4061) 2023-09-24 18:05:24 -03:00
oobabooga
eb0b7c1053 Fix a minor UI bug 2023-09-24 07:17:33 -07:00
oobabooga
3edac43426 Remove print statement 2023-09-24 07:13:00 -07:00
oobabooga
b227e65d86 Add grammar to llama.cpp loader (closes #4019) 2023-09-24 07:10:45 -07:00
oobabooga
2e7b6b0014
Create alternative requirements.txt with AMD and Metal wheels (#4052) 2023-09-24 09:58:29 -03:00
oobabooga
7a3ca2c68f Better detect EXL2 models 2023-09-23 13:05:55 -07:00
oobabooga
b1467bd064
Move one-click-installers into the repository (#4028 from oobabooga/one-click) 2023-09-22 17:43:07 -03:00
oobabooga
c075969875 Add instructions 2023-09-22 13:10:03 -07:00
oobabooga
8ab3eca9ec Add a warning for outdated installations 2023-09-22 09:35:19 -07:00
oobabooga
95976a9d4f Fix a bug while deleting characters 2023-09-22 06:02:34 -07:00
oobabooga
d5330406fa Add a rename menu for chat histories 2023-09-21 19:16:51 -07:00
oobabooga
00ab450c13
Multiple histories for each character (#4022) 2023-09-21 17:19:32 -03:00
oobabooga
029da9563f Avoid redundant function call in llamacpp_hf 2023-09-19 14:14:40 -07:00
oobabooga
869f47fff9 Lint 2023-09-19 13:51:57 -07:00
oobabooga
13ac55fa18 Reorder some functions 2023-09-19 13:51:57 -07:00
oobabooga
03dc69edc5 ExLlama_HF (v1 and v2) prefix matching 2023-09-19 13:12:19 -07:00
oobabooga
5075087461 Fix command-line arguments being ignored 2023-09-19 13:11:46 -07:00
oobabooga
ff5d3d2d09 Add missing import 2023-09-18 16:26:54 -07:00
oobabooga
605ec3c9f2 Add a warning about ExLlamaV2 without flash-attn 2023-09-18 12:26:35 -07:00
oobabooga
f0ef971edb Remove obsolete warning 2023-09-18 12:25:10 -07:00
oobabooga
745807dc03 Faster llamacpp_HF prefix matching 2023-09-18 11:02:45 -07:00
BadisG
893a72a1c5
Stop generation immediately when using "Maximum tokens/second" (#3952)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-09-18 14:27:06 -03:00
Cebtenzzre
8466cf229a
llama.cpp: fix ban_eos_token (#3987) 2023-09-18 12:15:02 -03:00
oobabooga
0ede2965d5 Remove an error message 2023-09-17 18:46:08 -07:00
missionfloyd
cc8eda298a
Move hover menu shortcuts to right side (#3951) 2023-09-17 22:33:00 -03:00
oobabooga
280cca9f66 Merge remote-tracking branch 'refs/remotes/origin/main' 2023-09-17 18:01:27 -07:00
oobabooga
b062d50c45 Remove exllama import that causes problems 2023-09-17 18:00:32 -07:00
James Braza
fee38e0601
Simplified ExLlama cloning instructions and failure message (#3972) 2023-09-17 19:26:05 -03:00
Lu Guanghua
9858acee7b
Fix unexpected extensions load after gradio restart (#3965) 2023-09-17 17:35:43 -03:00
oobabooga
d9b0f2c9c3 Fix llama.cpp double decoding 2023-09-17 13:07:48 -07:00
oobabooga
d71465708c llamacpp_HF prefix matching 2023-09-17 11:51:01 -07:00
oobabooga
37e2980e05 Recommend mul_mat_q for llama.cpp 2023-09-17 08:27:11 -07:00
oobabooga
a069f3904c Undo part of ad8ac545a5 2023-09-17 08:12:23 -07:00
oobabooga
ad8ac545a5 Tokenization improvements 2023-09-17 07:02:00 -07:00
saltacc
cd08eb0753
token probs for non HF loaders (#3957) 2023-09-17 10:42:32 -03:00
kalomaze
7c9664ed35
Allow full model URL to be used for download (#3919)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-09-16 10:06:13 -03:00
saltacc
ed6b6411fb
Fix exllama tokenizers (#3954)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-09-16 09:42:38 -03:00
missionfloyd
2ad6ca8874
Add back chat buttons with --chat-buttons (#3947) 2023-09-16 00:39:37 -03:00
oobabooga
ef04138bc0 Improve the UI tokenizer 2023-09-15 19:30:44 -07:00
oobabooga
c3e4c9fdc2 Add a simple tokenizer to the UI 2023-09-15 19:09:03 -07:00
saltacc
f01b9aa71f
Add customizable ban tokens (#3899) 2023-09-15 18:27:27 -03:00
oobabooga
5b117590ad Add some scrollbars to Parameters tab 2023-09-15 09:17:37 -07:00
Johan
fdcee0c215
Allow custom tokenizer for llamacpp_HF loader (#3941) 2023-09-15 12:38:38 -03:00
oobabooga
fd7257c7f8 Prevent code blocks from flickering while streaming 2023-09-15 07:46:26 -07:00
oobabooga
a3ecf3bb65 Add cai-chat-square chat style 2023-09-14 16:15:08 -07:00
oobabooga
3d1c0f173d User config precedence over GGUF metadata 2023-09-14 12:15:52 -07:00
oobabooga
94dc64f870 Add a border 2023-09-14 07:20:36 -07:00
oobabooga
70aafa34dc Fix blockquote markdown rendering 2023-09-14 05:57:04 -07:00
oobabooga
644a9b8765 Change the chat generate button 2023-09-14 05:16:44 -07:00
oobabooga
ecc90f9f62 Continue on Alt + Enter 2023-09-14 03:59:12 -07:00
oobabooga
1ce3c93600 Allow "Your name" field to be saved 2023-09-14 03:44:35 -07:00
oobabooga
27dbcc59f5
Make the chat input expand upwards (#3920) 2023-09-14 07:06:42 -03:00
oobabooga
6b6af74e14 Keyboard shortcuts without conflicts (hopefully) 2023-09-14 02:33:52 -07:00
oobabooga
fc11d1eff0 Add chat keyboard shortcuts 2023-09-13 19:22:40 -07:00
oobabooga
9f199c7a4c Use Noto Sans font
Copied from 6c8bd06308/public/webfonts/NotoSans
2023-09-13 13:48:05 -07:00
oobabooga
8ce94b735c Show progress on impersonate 2023-09-13 11:22:53 -07:00
oobabooga
7cd437e05c Properly close the hover menu on mobile 2023-09-13 11:10:46 -07:00
oobabooga
1b47b5c676 Change the Generate/Stop buttons 2023-09-13 09:25:26 -07:00
oobabooga
8ea28cbfe0 Reorder chat buttons 2023-09-13 08:49:11 -07:00
oobabooga
5e3d2f7d44
Reorganize chat buttons (#3892) 2023-09-13 02:36:12 -03:00
Panchovix
34dc7306b8
Fix NTK (alpha) and RoPE scaling for exllamav2 and exllamav2_HF (#3897) 2023-09-13 02:35:09 -03:00
oobabooga
b7adf290fc Fix ExLlama-v2 path issue 2023-09-12 17:42:22 -07:00
oobabooga
b190676893 Merge remote-tracking branch 'refs/remotes/origin/main' 2023-09-12 15:06:33 -07:00
oobabooga
2f935547c8 Minor changes 2023-09-12 15:05:21 -07:00
oobabooga
18e6b275f3 Add alpha_value/compress_pos_emb to ExLlama-v2 2023-09-12 15:02:47 -07:00
Gennadij
460c40d8ab
Read more GGUF metadata (scale_linear and freq_base) (#3877) 2023-09-12 17:02:42 -03:00
oobabooga
16e1696071 Minor qol change 2023-09-12 10:44:26 -07:00
oobabooga
c2a309f56e
Add ExLlamaV2 and ExLlamav2_HF loaders (#3881) 2023-09-12 14:33:07 -03:00
oobabooga
df123a20fc Prevent extra keys from being saved to settings.yaml 2023-09-11 20:13:10 -07:00
oobabooga
dae428a967 Revamp cai-chat theme, make it default 2023-09-11 19:30:40 -07:00
oobabooga
78811dd89a Fix GGUF metadata reading for falcon 2023-09-11 15:49:50 -07:00
oobabooga
9331ab4798
Read GGUF metadata (#3873) 2023-09-11 18:49:30 -03:00
oobabooga
df52dab67b Lint 2023-09-11 07:57:38 -07:00
oobabooga
ed86878f02 Remove GGML support 2023-09-11 07:44:00 -07:00
John Smith
cc7b7ba153
fix lora training with alpaca_lora_4bit (#3853) 2023-09-11 01:22:20 -03:00
Forkoz
15e9b8c915
Exllama new rope settings (#3852) 2023-09-11 01:14:36 -03:00
oobabooga
4affa08821 Do not impose instruct mode while loading models 2023-09-02 11:31:33 -07:00
oobabooga
47e490c7b4 Set use_cache=True by default for all models 2023-08-30 13:26:27 -07:00
missionfloyd
787219267c
Allow downloading single file from UI (#3737) 2023-08-29 23:32:36 -03:00
oobabooga
cec8db52e5
Add max_tokens_second param (#3533) 2023-08-29 17:44:31 -03:00
oobabooga
2b58a89f6a Clear instruction template before loading new one 2023-08-29 13:11:32 -07:00
oobabooga
36864cb3e8 Use Alpaca as the default instruction template 2023-08-29 13:06:25 -07:00
oobabooga
9a202f7fb2 Prevent <ul> lists from flickering during streaming 2023-08-28 20:45:07 -07:00
oobabooga
439dd0faab Fix stopping strings in the chat API 2023-08-28 19:40:11 -07:00
oobabooga
c75f98a6d6 Autoscroll Notebook/Default textareas during streaming 2023-08-28 18:22:03 -07:00
oobabooga
558e918fd6 Add a typing dots (...) animation to chat tab 2023-08-28 13:50:36 -07:00
oobabooga
57e9ded00c
Make it possible to scroll during streaming (#3721) 2023-08-28 16:03:20 -03:00
Cebtenzzre
2f5d769a8d
accept floating-point alpha value on the command line (#3712) 2023-08-27 18:54:43 -03:00
oobabooga
b2296dcda0 Ctrl+S to show/hide chat controls 2023-08-27 13:14:33 -07:00
Ravindra Marella
e4c3e1bdd2
Fix ctransformers model unload (#3711)
Add missing comma in model types list

Fixes marella/ctransformers#111
2023-08-27 10:53:48 -03:00
oobabooga
0c9e818bb8 Update truncation length based on max_seq_len/n_ctx 2023-08-26 23:10:45 -07:00
oobabooga
3361728da1 Change some comments 2023-08-26 22:24:44 -07:00
oobabooga
8aeae3b3f4 Fix llamacpp_HF loading 2023-08-26 22:15:06 -07:00
oobabooga
7f5370a272 Minor fixes/cosmetics 2023-08-26 22:11:07 -07:00
jllllll
4d61a7d9da
Account for deprecated GGML parameters 2023-08-26 14:07:46 -05:00
jllllll
4a999e3bcd
Use separate llama-cpp-python packages for GGML support 2023-08-26 10:40:08 -05:00
oobabooga
83640d6f43 Replace ggml occurences with gguf 2023-08-26 01:06:59 -07:00
jllllll
db42b365c9
Fix ctransformers threads auto-detection (#3688) 2023-08-25 14:37:02 -03:00
cal066
960980247f
ctransformers: gguf support (#3685) 2023-08-25 11:33:04 -03:00
oobabooga
21058c37f7 Add missing file 2023-08-25 07:10:26 -07:00
oobabooga
f4f04c8c32 Fix a typo 2023-08-25 07:08:38 -07:00
oobabooga
5c7d8bfdfd Detect CodeLlama settings 2023-08-25 07:06:57 -07:00
oobabooga
52ab2a6b9e Add rope_freq_base parameter for CodeLlama 2023-08-25 06:55:15 -07:00
oobabooga
feecd8190f Unescape inline code blocks 2023-08-24 21:01:09 -07:00
oobabooga
3320accfdc
Add CFG to llamacpp_HF (second attempt) (#3678) 2023-08-24 20:32:21 -03:00
oobabooga
d6934bc7bc
Implement CFG for ExLlama_HF (#3666) 2023-08-24 16:27:36 -03:00
oobabooga
87442c6d18 Fix Notebook Logits tab 2023-08-22 21:00:12 -07:00
oobabooga
c0b119c3a3 Improve logit viewer format 2023-08-22 20:35:12 -07:00
oobabooga
8545052c9d Add the option to use samplers in the logit viewer 2023-08-22 20:18:16 -07:00
oobabooga
25e5eaa6a6 Remove outdated training warning 2023-08-22 13:16:44 -07:00
oobabooga
335c49cc7e Bump peft and transformers 2023-08-22 13:14:59 -07:00
cal066
e042bf8624
ctransformers: add mlock and no-mmap options (#3649) 2023-08-22 16:51:34 -03:00
oobabooga
6cca8b8028 Only update notebook token counter on input
For performance during streaming
2023-08-21 05:39:55 -07:00
oobabooga
2cb07065ec Fix an escaping bug 2023-08-20 21:50:42 -07:00
oobabooga
a74dd9003f Fix HTML escaping for perplexity_colors extension 2023-08-20 21:40:22 -07:00
oobabooga
57036abc76 Add "send to default/notebook" buttons to chat tab 2023-08-20 19:54:59 -07:00
oobabooga
429cacd715 Add a token counter similar to automatic1111
It can now be found in the Default and Notebook tabs
2023-08-20 19:37:33 -07:00
oobabooga
120fb86c6a
Add a simple logit viewer (#3636) 2023-08-20 20:49:21 -03:00
oobabooga
ef17da70af Fix ExLlama truncation 2023-08-20 08:53:26 -07:00
oobabooga
ee964bcce9 Update a comment about RoPE scaling 2023-08-20 07:01:43 -07:00
missionfloyd
1cae784761
Unescape last message (#3623) 2023-08-19 09:29:08 -03:00
Cebtenzzre
942ad6067d
llama.cpp: make Stop button work with streaming disabled (#3620) 2023-08-19 00:17:27 -03:00
oobabooga
f6724a1a01 Return the visible history with "Copy last reply" 2023-08-18 13:04:45 -07:00
oobabooga
b96fd22a81
Refactor the training tab (#3619) 2023-08-18 16:58:38 -03:00
oobabooga
c4733000d7 Return the visible history with "Remove last" 2023-08-18 09:25:51 -07:00
oobabooga
7cba000421
Bump llama-cpp-python, +tensor_split by @shouyiwang, +mul_mat_q (#3610) 2023-08-18 12:03:34 -03:00
oobabooga
bdb6eb5734 Restyle the chat input box + several CSS improvements
- Remove extra spacing below the last chat message
- Change the background color of code blocks in dark mode
- Remove border radius from selected header bar elements
- Make the chat scrollbar more discrete
2023-08-17 11:10:38 -07:00
oobabooga
cebe07f29c Unescape HTML inside code blocks 2023-08-16 21:08:26 -07:00
oobabooga
a4e903e932 Escape HTML in chat messages 2023-08-16 09:25:52 -07:00
oobabooga
73d9befb65 Make "Show controls" customizable through settings.yaml 2023-08-16 07:04:18 -07:00
oobabooga
2a29208224
Add a "Show controls" button to chat UI (#3590) 2023-08-16 02:39:58 -03:00
cal066
991bb57e43
ctransformers: Fix up model_type name consistency (#3567) 2023-08-14 15:17:24 -03:00
oobabooga
ccfc02a28d
Add the --disable_exllama option for AutoGPTQ (#3545 from clefever/disable-exllama) 2023-08-14 15:15:55 -03:00
oobabooga
7e57b35b5e Clean up old code 2023-08-14 10:10:39 -07:00
oobabooga
4d067e9b52 Add back a variable to keep old extensions working 2023-08-14 09:39:06 -07:00
oobabooga
d8a82d34ed Improve a warning 2023-08-14 08:46:05 -07:00
oobabooga
3e0a9f9cdb Refresh the character dropdown when saving/deleting a character 2023-08-14 08:23:41 -07:00
oobabooga
890b4abdad Fix session saving 2023-08-14 07:55:52 -07:00
oobabooga
619cb4e78b
Add "save defaults to settings.yaml" button (#3574) 2023-08-14 11:46:07 -03:00
oobabooga
a95e6f02cb Add a placeholder for custom stopping strings 2023-08-13 21:17:20 -07:00
oobabooga
ff9b5861c8 Fix impersonate when some text is present (closes #3564) 2023-08-13 21:10:47 -07:00
oobabooga
cc7e6ef645 Fix a CSS conflict 2023-08-13 19:24:09 -07:00
Eve
66c04c304d
Various ctransformers fixes (#3556)
---------

Co-authored-by: cal066 <cal066@users.noreply.github.com>
2023-08-13 23:09:03 -03:00
oobabooga
4a05aa92cb Add "send to" buttons for instruction templates
- Remove instruction templates from prompt dropdowns (default/notebook)
- Add 3 buttons to Parameters > Instruction template as a replacement
- Increase the number of lines of 'negative prompt' field to 3, and add a scrollbar
- When uploading a character, switch to the Character tab
- When uploading chat history, switch to the Chat tab
2023-08-13 18:35:45 -07:00
oobabooga
f6db2c78d1 Fix ctransformers seed 2023-08-13 05:48:53 -07:00
oobabooga
a1a9ec895d
Unify the 3 interface modes (#3554) 2023-08-13 01:12:15 -03:00
cal066
bf70c19603
ctransformers: move thread and seed parameters (#3543) 2023-08-13 00:04:03 -03:00
Chris Lefever
0230fa4e9c Add the --disable_exllama option for AutoGPTQ 2023-08-12 02:26:58 -04:00
oobabooga
0e05818266 Style changes 2023-08-11 16:35:57 -07:00
oobabooga
2f918ccf7c Remove unused parameter 2023-08-11 11:15:22 -07:00
oobabooga
28c8df337b Add repetition_penalty_range to ctransformers 2023-08-11 11:04:19 -07:00
cal066
7a4fcee069
Add ctransformers support (#3313)
---------

Co-authored-by: cal066 <cal066@users.noreply.github.com>
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
Co-authored-by: randoentity <137087500+randoentity@users.noreply.github.com>
2023-08-11 14:41:33 -03:00
oobabooga
8dbaa20ca8 Don't replace last reply with an empty message 2023-08-10 13:14:48 -07:00
oobabooga
0789554f65 Allow --lora to use an absolute path 2023-08-10 10:03:12 -07:00
oobabooga
3929971b66 Don't show oobabooga_llama-tokenizer in the model dropdown 2023-08-10 10:02:48 -07:00
oobabooga
c7f52bbdc1 Revert "Remove GPTQ-for-LLaMa monkey patch support"
This reverts commit e3d3565b2a.
2023-08-10 08:39:41 -07:00
jllllll
d6765bebc4
Update installation documentation 2023-08-10 00:53:48 -05:00
jllllll
d7ee4c2386
Remove unused import 2023-08-10 00:10:14 -05:00
jllllll
e3d3565b2a
Remove GPTQ-for-LLaMa monkey patch support
AutoGPTQ will be the preferred GPTQ LoRa loader in the future.
2023-08-09 23:59:04 -05:00
jllllll
bee73cedbd
Streamline GPTQ-for-LLaMa support 2023-08-09 23:42:34 -05:00
oobabooga
6c6a52aaad Change the filenames for caches and histories 2023-08-09 07:47:19 -07:00
oobabooga
d8fb506aff Add RoPE scaling support for transformers (including dynamic NTK)
https://github.com/huggingface/transformers/pull/24653
2023-08-08 21:25:48 -07:00
Friedemann Lipphardt
901b028d55
Add option for named cloudflare tunnels (#3364) 2023-08-08 22:20:27 -03:00
oobabooga
bf08b16b32 Fix disappearing profile picture bug 2023-08-08 14:09:01 -07:00
Gennadij
0e78f3b4d4
Fixed a typo in "rms_norm_eps", incorrectly set as n_gqa (#3494) 2023-08-08 00:31:11 -03:00
oobabooga
37fb719452
Increase the Context/Greeting boxes sizes 2023-08-08 00:09:00 -03:00
oobabooga
584dd33424
Fix missing example_dialogue when uploading characters 2023-08-07 23:44:59 -03:00
oobabooga
412f6ff9d3 Change alpha_value maximum and step 2023-08-07 06:08:51 -07:00
oobabooga
a373c96d59 Fix a bug in modules/shared.py 2023-08-06 20:36:35 -07:00
oobabooga
3d48933f27 Remove ancient deprecation warnings 2023-08-06 18:58:59 -07:00
oobabooga
c237ce607e Move characters/instruction-following to instruction-templates 2023-08-06 17:50:32 -07:00
oobabooga
65aa11890f
Refactor everything (#3481) 2023-08-06 21:49:27 -03:00
oobabooga
d4b851bdc8 Credit turboderp 2023-08-06 13:43:15 -07:00
oobabooga
0af10ab49b
Add Classifier Free Guidance (CFG) for Transformers/ExLlama (#3325) 2023-08-06 17:22:48 -03:00
missionfloyd
5134878344
Fix chat message order (#3461) 2023-08-05 13:53:54 -03:00
jllllll
44f31731af
Create logs dir if missing when saving history (#3462) 2023-08-05 13:47:16 -03:00
Forkoz
9dcb37e8d4
Fix: Mirostat fails on models split across multiple GPUs 2023-08-05 13:45:47 -03:00
oobabooga
8df3cdfd51
Add SSL certificate support (#3453) 2023-08-04 13:57:31 -03:00
missionfloyd
2336b75d92
Remove unnecessary chat.js (#3445) 2023-08-04 01:58:37 -03:00
oobabooga
4b3384e353 Handle unfinished lists during markdown streaming 2023-08-03 17:15:18 -07:00
Pete
f4005164f4
Fix llama.cpp truncation (#3400)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-08-03 20:01:15 -03:00
oobabooga
87dab03dc0
Add the --cpu option for llama.cpp to prevent CUDA from being used (#3432) 2023-08-03 11:00:36 -03:00
oobabooga
3e70bce576 Properly format exceptions in the UI 2023-08-03 06:57:21 -07:00
oobabooga
32c564509e Fix loading session in chat mode 2023-08-02 21:13:16 -07:00
oobabooga
0e8f9354b5 Add direct download for session/chat history JSONs 2023-08-02 19:43:39 -07:00
oobabooga
32a2bbee4a Implement auto_max_new_tokens for ExLlama 2023-08-02 11:03:56 -07:00
oobabooga
e931844fe2
Add auto_max_new_tokens parameter (#3419) 2023-08-02 14:52:20 -03:00
Pete
6afc1a193b
Add a scrollbar to notebook/default, improve chat scrollbar style (#3403)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-08-02 12:02:36 -03:00
oobabooga
b53ed70a70 Make llamacpp_HF 6x faster 2023-08-01 13:18:20 -07:00
oobabooga
8d46a8c50a Change the default chat style and the default preset 2023-08-01 09:35:17 -07:00
oobabooga
959feba602 When saving model settings, only save the settings for the current loader 2023-08-01 06:10:09 -07:00
oobabooga
f094330df0 When saving a preset, only save params that differ from the defaults 2023-07-31 19:13:29 -07:00
oobabooga
84297d05c4 Add a "Filter by loader" menu to the Parameters tab 2023-07-31 19:09:02 -07:00
oobabooga
7de7b3d495 Fix newlines in exported character yamls 2023-07-31 10:46:02 -07:00
oobabooga
5ca37765d3 Only replace {{user}} and {{char}} at generation time 2023-07-30 11:42:30 -07:00
oobabooga
6e16af34fd Save uploaded characters as yaml
Also allow yaml characters to be uploaded directly
2023-07-30 11:25:38 -07:00
oobabooga
b31321c779 Define visible_text before applying chat_input extensions 2023-07-26 07:27:14 -07:00
oobabooga
b17893a58f Revert "Add tensor split support for llama.cpp (#3171)"
This reverts commit 031fe7225e.
2023-07-26 07:06:01 -07:00
oobabooga
28779cd959 Use dark theme by default 2023-07-25 20:11:57 -07:00
oobabooga
c2e0d46616 Add credits 2023-07-25 15:49:04 -07:00
oobabooga
77d2e9f060 Remove flexgen 2 2023-07-25 15:18:25 -07:00
oobabooga
75c2dd38cf Remove flexgen support 2023-07-25 15:15:29 -07:00
Foxtr0t1337
85b3a26e25
Ignore values which are not string in training.py (#3287) 2023-07-25 19:00:25 -03:00
Shouyi
031fe7225e
Add tensor split support for llama.cpp (#3171) 2023-07-25 18:59:26 -03:00
Eve
f653546484
README updates and improvements (#3198) 2023-07-25 18:58:13 -03:00
oobabooga
ef8637e32d
Add extension example, replace input_hijack with chat_input_modifier (#3307) 2023-07-25 18:49:56 -03:00
oobabooga
a07d070b6c
Add llama-2-70b GGML support (#3285) 2023-07-24 16:37:03 -03:00
jllllll
1141987a0d
Add checks for ROCm and unsupported architectures to llama_cpp_cuda loading (#3225) 2023-07-24 11:25:36 -03:00
Ikko Eltociear Ashimine
b2d5433409
Fix typo in deepspeed_parameters.py (#3222)
configration -> configuration
2023-07-24 11:17:28 -03:00
oobabooga
4b19b74e6c Add CUDA wheels for llama-cpp-python by jllllll 2023-07-19 19:33:43 -07:00
oobabooga
913e060348 Change the default preset to Divine Intellect
It seems to reduce hallucination while using instruction-tuned models.
2023-07-19 08:24:37 -07:00
randoentity
a69955377a
[GGML] Support for customizable RoPE (#3083)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-07-17 22:32:37 -03:00
appe233
89e0d15cf5
Use 'torch.backends.mps.is_available' to check if mps is supported (#3164) 2023-07-17 21:27:18 -03:00
oobabooga
8c1c2e0fae Increase max_new_tokens upper limit 2023-07-17 17:08:22 -07:00
oobabooga
b1a6ea68dd Disable "autoload the model" by default 2023-07-17 07:40:56 -07:00
oobabooga
a199f21799 Optimize llamacpp_hf a bit 2023-07-16 20:49:48 -07:00
oobabooga
6a3edb0542 Clean up llamacpp_hf.py 2023-07-15 22:40:55 -07:00
oobabooga
27a84b4e04 Make AutoGPTQ the default again
Purely for compatibility with more models.
You should still use ExLlama_HF for LLaMA models.
2023-07-15 22:29:23 -07:00
oobabooga
5e3f7e00a9
Create llamacpp_HF loader (#3062) 2023-07-16 02:21:13 -03:00
oobabooga
94dfcec237
Make it possible to evaluate exllama perplexity (#3138) 2023-07-16 01:52:55 -03:00