draff
804486214b
Re-implement --load-in-4bit and update --llama-bits arg description
2023-03-10 23:21:01 +00:00
ItsLogic
9ba8156a70
remove unnecessary Path()
2023-03-10 22:33:58 +00:00
draff
e6c631aea4
Replace --load-in-4bit with --llama-bits
Replaces --load-in-4bit with a more flexible --llama-bits arg to allow for 2-bit and 3-bit models as well. This commit also fixes a loading issue with .pt files that are not in the root of the models folder.
2023-03-10 21:36:45 +00:00
oobabooga
026d60bd34
Remove default preset that didn't do anything
2023-03-10 14:01:02 -03:00
oobabooga
e9dbdafb14
Merge branch 'main' into pt-path-changes
2023-03-10 11:03:42 -03:00
oobabooga
706a03b2cb
Minor changes
2023-03-10 11:02:25 -03:00
oobabooga
de7dd8b6aa
Add comments
2023-03-10 10:54:08 -03:00
oobabooga
e461c0b7a0
Move the import to the top
2023-03-10 10:51:12 -03:00
deepdiffuser
9fbd60bf22
add no_split_module_classes to prevent tensor split error
2023-03-10 05:30:47 -08:00
deepdiffuser
ab47044459
add multi-gpu support for 4bit gptq LLaMA
2023-03-10 04:52:45 -08:00
rohvani
2ac2913747
fix reference issue
2023-03-09 20:13:23 -08:00
rohvani
826e297b0e
add llama-65b-4bit support & multiple pt paths
2023-03-09 18:31:32 -08:00
oobabooga
9849aac0f1
Don't show .pt models in the list
2023-03-09 21:54:50 -03:00
oobabooga
74102d5ee4
Insert to the path instead of appending
2023-03-09 20:51:22 -03:00
oobabooga
2965aa1625
Check if the .pt file exists
2023-03-09 20:48:51 -03:00
oobabooga
828a524f9a
Add LLaMA 4-bit support
2023-03-09 15:50:26 -03:00
oobabooga
59b5f7a4b7
Improve usage of stopping_criteria
2023-03-08 12:13:40 -03:00
oobabooga
add9330e5e
Bug fixes
2023-03-08 11:26:29 -03:00
Xan
5648a41a27
Merge branch 'main' of https://github.com/xanthousm/text-generation-webui
2023-03-08 22:08:54 +11:00
Xan
ad6b699503
Better TTS with autoplay
- Adds "still_streaming" to the shared module so extensions can tell whether generation is complete
- Changes the TTS extension, adding new options:
  - Show text under the audio widget
  - Automatically play the audio once text generation finishes
  - Manage the generated wav files (only keep files for finished generations, optional max file limit)
  - [wip] Ability to change voice pitch and speed
- Adds 'tensorboard' to requirements, since Python reported "tensorboard not found" errors after a fresh installation.
2023-03-08 22:02:17 +11:00
oobabooga
33fb6aed74
Minor bug fix
2023-03-08 03:08:16 -03:00
oobabooga
ad2970374a
Readability improvements
2023-03-08 03:00:06 -03:00
oobabooga
72d539dbff
Better separate the FlexGen case
2023-03-08 02:54:47 -03:00
oobabooga
0e16c0bacb
Remove redeclaration of a function
2023-03-08 02:50:49 -03:00
oobabooga
ab50f80542
New text streaming method (much faster)
2023-03-08 02:46:35 -03:00
oobabooga
8e89bc596b
Fix encode() for RWKV
2023-03-07 23:15:46 -03:00
oobabooga
19a34941ed
Add proper streaming to RWKV
2023-03-07 18:17:56 -03:00
oobabooga
8660227e1b
Add top_k to RWKV
2023-03-07 17:24:28 -03:00
oobabooga
153dfeb4dd
Add --rwkv-cuda-on parameter, bump rwkv version
2023-03-06 20:12:54 -03:00
oobabooga
6904a507c6
Change some parameters
2023-03-06 16:29:43 -03:00
oobabooga
20bd645f6a
Fix bug in multigpu setups (attempt 3)
2023-03-06 15:58:18 -03:00
oobabooga
09a7c36e1b
Minor improvement while running custom models
2023-03-06 15:36:35 -03:00
oobabooga
24c4c20391
Fix bug in multigpu setups (attempt #2)
2023-03-06 15:23:29 -03:00
oobabooga
d88b7836c6
Fix bug in multigpu setups
2023-03-06 14:58:30 -03:00
oobabooga
5bed607b77
Increase repetition frequency/penalty for RWKV
2023-03-06 14:25:48 -03:00
oobabooga
bf56b6c1fb
Load settings.json without the need for --settings settings.json
This is for setting UI defaults
2023-03-06 10:57:45 -03:00
oobabooga
e91f4bc25a
Add RWKV tokenizer
2023-03-06 08:45:49 -03:00
oobabooga
c855b828fe
Better handle <USER>
2023-03-05 17:01:47 -03:00
oobabooga
2af66a4d4c
Fix <USER> in pygmalion replies
2023-03-05 16:08:50 -03:00
oobabooga
a54b91af77
Improve readability
2023-03-05 10:21:15 -03:00
oobabooga
8e706df20e
Fix a memory leak when text streaming is on
2023-03-05 10:12:43 -03:00
oobabooga
c33715ad5b
Move towards HF LLaMA implementation
2023-03-05 01:20:31 -03:00
oobabooga
bd8aac8fa4
Add LLaMA 8-bit support
2023-03-04 13:28:42 -03:00
oobabooga
c93f1fa99b
Count the tokens more conservatively
2023-03-04 03:10:21 -03:00
oobabooga
ed8b35efd2
Add --pin-weight parameter for FlexGen
2023-03-04 01:04:02 -03:00
oobabooga
05e703b4a4
Print the performance information more reliably
2023-03-03 21:24:32 -03:00
oobabooga
5a79863df3
Increase the sequence length, decrease batch size
I have no idea what I am doing
2023-03-03 15:54:13 -03:00
oobabooga
a345a2acd2
Add a tokenizer placeholder
2023-03-03 15:16:55 -03:00
oobabooga
5b354817f6
Make chat minimally work with LLaMA
2023-03-03 15:04:41 -03:00
oobabooga
ea5c5eb3da
Add LLaMA support
2023-03-03 14:39:14 -03:00
oobabooga
2bff646130
Stop chat from flashing dark when processing
2023-03-03 13:19:13 -03:00
oobabooga
169209805d
Model-aware prompts and presets
2023-03-02 11:25:04 -03:00
oobabooga
7bbe32f618
Don't return a value in an iterator function
2023-03-02 00:48:46 -03:00
oobabooga
ff9f649c0c
Remove some unused imports
2023-03-02 00:36:20 -03:00
oobabooga
1a05860ca3
Ensure proper no-streaming with generation_attempts > 1
2023-03-02 00:10:10 -03:00
oobabooga
a2a3e8f797
Add --rwkv-strategy parameter
2023-03-01 20:02:48 -03:00
oobabooga
449116a510
Fix RWKV paths on Windows (attempt)
2023-03-01 19:17:16 -03:00
oobabooga
955cf431e8
Minor consistency fix
2023-03-01 19:11:26 -03:00
oobabooga
f3da6dcc8f
Merge pull request #149 from oobabooga/RWKV
Add RWKV support
2023-03-01 16:57:45 -03:00
oobabooga
831ac7ed3f
Add top_p
2023-03-01 16:45:48 -03:00
oobabooga
7c4d5ca8cc
Improve the text generation call a bit
2023-03-01 16:40:25 -03:00
oobabooga
2f16ce309a
Rename a variable
2023-03-01 12:33:09 -03:00
oobabooga
9e9cfc4b31
Parameters
2023-03-01 12:19:37 -03:00
oobabooga
0f6708c471
Sort the imports
2023-03-01 12:18:17 -03:00
oobabooga
e735806c51
Add a generate() function for RWKV
2023-03-01 12:16:11 -03:00
oobabooga
659bb76722
Add RWKVModel class
2023-03-01 12:08:55 -03:00
oobabooga
9c86a1cd4a
Add RWKV pip package
2023-03-01 11:42:49 -03:00
oobabooga
6837d4d72a
Load the model by name
2023-02-28 02:52:29 -03:00
oobabooga
a1429d1607
Add default extensions to the settings
2023-02-28 02:20:11 -03:00
oobabooga
19ccb2aaf5
Handle <USER> and <BOT>
2023-02-28 01:05:43 -03:00
oobabooga
626da6c731
Handle {{user}} and {{char}} in example dialogue
2023-02-28 00:59:05 -03:00
oobabooga
e861e68e38
Move the chat example dialogue to the prompt
2023-02-28 00:50:46 -03:00
oobabooga
f871971de1
Trying to get the chat to work
2023-02-28 00:25:30 -03:00
oobabooga
67ee7bead7
Add cpu, bf16 options
2023-02-28 00:09:11 -03:00
oobabooga
ebd698905c
Add streaming to RWKV
2023-02-28 00:04:04 -03:00
oobabooga
70e522732c
Move RWKV loader into a separate file
2023-02-27 23:50:16 -03:00
oobabooga
ebc64a408c
RWKV support prototype
2023-02-27 23:03:35 -03:00
oobabooga
021bd55886
Better format the prompt when generation attempts > 1
2023-02-27 21:37:03 -03:00
oobabooga
43b6ab8673
Store thumbnails as files instead of base64 strings
This improves the UI responsiveness for large histories.
2023-02-27 13:41:00 -03:00
oobabooga
f24b6e78a3
Fix clear history
2023-02-26 23:58:04 -03:00
oobabooga
8e3e8a070f
Make FlexGen work with the newest API
2023-02-26 16:53:41 -03:00
oobabooga
3333f94c30
Make the gallery extension work on colab
2023-02-26 12:37:26 -03:00
oobabooga
633a2b6be2
Don't regenerate/remove last message if the chat is empty
2023-02-26 00:43:12 -03:00
oobabooga
6e843a11d6
Fix FlexGen in chat mode
2023-02-26 00:36:04 -03:00
oobabooga
4548227fb5
Downgrade gradio version (file uploads are broken in 3.19.1)
2023-02-25 22:59:02 -03:00
oobabooga
9456c1d6ed
Prevent streaming with no_stream + generation attempts > 1
2023-02-25 17:45:03 -03:00
oobabooga
32f40f3b42
Bump gradio version to 3.19.1
2023-02-25 17:20:03 -03:00
oobabooga
fa58fd5559
Proper way to free the cuda cache
2023-02-25 15:50:29 -03:00
oobabooga
b585e382c0
Rename the custom prompt generator function
2023-02-25 15:13:14 -03:00
oobabooga
700311ce40
Empty the cuda cache at model.generate()
2023-02-25 14:39:13 -03:00
oobabooga
1878acd9f3
Minor bug fix in chat
2023-02-25 09:30:59 -03:00
oobabooga
e71ff959f5
Clean up some unused code
2023-02-25 09:23:02 -03:00
oobabooga
91f5852245
Move bot_picture.py inside the extension
2023-02-25 03:00:19 -03:00
oobabooga
5ac24b019e
Minor fix in the extensions implementation
2023-02-25 02:53:18 -03:00
oobabooga
85f914b9b9
Disable the hijack after using it
2023-02-25 02:36:01 -03:00
oobabooga
7e9f13e29f
Rename a variable
2023-02-25 01:55:32 -03:00
oobabooga
1741c36092
Minor fix
2023-02-25 01:47:25 -03:00
oobabooga
7c2babfe39
Rename greed to "generation attempts"
2023-02-25 01:42:19 -03:00
oobabooga
2dfb999bf1
Add greed parameter
2023-02-25 01:31:01 -03:00
oobabooga
13f2688134
Better way to generate custom prompts
2023-02-25 01:08:17 -03:00