oobabooga
c7aa51faa6
Use a list of eos_tokens instead of just a number
...
This might be the cause of LLaMA ramblings that some people have experienced.
2023-03-12 14:54:58 -03:00
Xan
b3e10e47c0
Fix merge conflict in text_generation
...
- Need to update `shared.still_streaming = False` before the final `yield formatted_outputs`, shifted the position of some yields.
2023-03-12 18:56:35 +11:00
oobabooga
341e135036
Various fixes in chat mode
2023-03-12 02:53:08 -03:00
oobabooga
b0e8cb8c88
Various fixes in chat mode
2023-03-12 02:31:45 -03:00
oobabooga
0bd5430988
Use 'with' statement to better handle streaming memory
2023-03-12 02:04:28 -03:00
oobabooga
37f0166b2d
Fix memory leak in new streaming (second attempt)
2023-03-11 23:14:49 -03:00
oobabooga
59b5f7a4b7
Improve usage of stopping_criteria
2023-03-08 12:13:40 -03:00
oobabooga
add9330e5e
Bug fixes
2023-03-08 11:26:29 -03:00
Xan
5648a41a27
Merge branch 'main' of https://github.com/xanthousm/text-generation-webui
2023-03-08 22:08:54 +11:00
Xan
ad6b699503
Better TTS with autoplay
...
- Adds "still_streaming" to shared module for extensions to know if generation is complete
- Changed TTS extension with new options:
- Show text under the audio widget
- Automatically play the audio once text generation finishes
- manage the generated wav files (only keep files for finished generations, optional max file limit)
- [wip] ability to change voice pitch and speed
- added 'tensorboard' to requirements, since python sent "tensorboard not found" errors after a fresh installation.
2023-03-08 22:02:17 +11:00
oobabooga
33fb6aed74
Minor bug fix
2023-03-08 03:08:16 -03:00
oobabooga
ad2970374a
Readability improvements
2023-03-08 03:00:06 -03:00
oobabooga
72d539dbff
Better separate the FlexGen case
2023-03-08 02:54:47 -03:00
oobabooga
ab50f80542
New text streaming method (much faster)
2023-03-08 02:46:35 -03:00
oobabooga
8e89bc596b
Fix encode() for RWKV
2023-03-07 23:15:46 -03:00
oobabooga
19a34941ed
Add proper streaming to RWKV
2023-03-07 18:17:56 -03:00
oobabooga
8660227e1b
Add top_k to RWKV
2023-03-07 17:24:28 -03:00
oobabooga
20bd645f6a
Fix bug in multigpu setups (attempt 3)
2023-03-06 15:58:18 -03:00
oobabooga
09a7c36e1b
Minor improvement while running custom models
2023-03-06 15:36:35 -03:00
oobabooga
24c4c20391
Fix bug in multigpu setups (attempt #2 )
2023-03-06 15:23:29 -03:00
oobabooga
d88b7836c6
Fix bug in multigpu setups
2023-03-06 14:58:30 -03:00
oobabooga
e91f4bc25a
Add RWKV tokenizer
2023-03-06 08:45:49 -03:00
oobabooga
a54b91af77
Improve readability
2023-03-05 10:21:15 -03:00
oobabooga
8e706df20e
Fix a memory leak when text streaming is on
2023-03-05 10:12:43 -03:00
oobabooga
c33715ad5b
Move towards HF LLaMA implementation
2023-03-05 01:20:31 -03:00
oobabooga
c93f1fa99b
Count the tokens more conservatively
2023-03-04 03:10:21 -03:00
oobabooga
05e703b4a4
Print the performance information more reliably
2023-03-03 21:24:32 -03:00
oobabooga
a345a2acd2
Add a tokenizer placeholder
2023-03-03 15:16:55 -03:00
oobabooga
5b354817f6
Make chat minimally work with LLaMA
2023-03-03 15:04:41 -03:00
oobabooga
ea5c5eb3da
Add LLaMA support
2023-03-03 14:39:14 -03:00
oobabooga
7bbe32f618
Don't return a value in an iterator function
2023-03-02 00:48:46 -03:00
oobabooga
ff9f649c0c
Remove some unused imports
2023-03-02 00:36:20 -03:00
oobabooga
955cf431e8
Minor consistency fix
2023-03-01 19:11:26 -03:00
oobabooga
831ac7ed3f
Add top_p
2023-03-01 16:45:48 -03:00
oobabooga
7c4d5ca8cc
Improve the text generation call a bit
2023-03-01 16:40:25 -03:00
oobabooga
0f6708c471
Sort the imports
2023-03-01 12:18:17 -03:00
oobabooga
e735806c51
Add a generate() function for RWKV
2023-03-01 12:16:11 -03:00
oobabooga
f871971de1
Trying to get the chat to work
2023-02-28 00:25:30 -03:00
oobabooga
ebd698905c
Add streaming to RWKV
2023-02-28 00:04:04 -03:00
oobabooga
70e522732c
Move RWKV loader into a separate file
2023-02-27 23:50:16 -03:00
oobabooga
ebc64a408c
RWKV support prototype
2023-02-27 23:03:35 -03:00
oobabooga
6e843a11d6
Fix FlexGen in chat mode
2023-02-26 00:36:04 -03:00
oobabooga
fa58fd5559
Proper way to free the cuda cache
2023-02-25 15:50:29 -03:00
oobabooga
700311ce40
Empty the cuda cache at model.generate()
2023-02-25 14:39:13 -03:00
oobabooga
78ad55641b
Remove duplicate max_new_tokens parameter
2023-02-24 17:19:42 -03:00
oobabooga
65326b545a
Move all gradio elements to shared (so that extensions can use them)
2023-02-24 16:46:50 -03:00
oobabooga
9ae063e42b
Fix softprompts when deepspeed is active ( #112 )
2023-02-23 20:22:47 -03:00
oobabooga
7224343a70
Improve the imports
2023-02-23 14:41:42 -03:00
oobabooga
1dacd34165
Further refactor
2023-02-23 13:28:30 -03:00
oobabooga
ce7feb3641
Further refactor
2023-02-23 13:03:52 -03:00