Reworked remove_surrounded_chars() to use a regular expression (https://regexr.com/7alb5) instead of repeated string concatenation in elevenlabs_tts, silero_tts, and sd_api_pictures. This should be both faster and more robust in handling asterisks.
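A minimal sketch of what a regex-based version could look like. The pattern below is an assumption made for illustration and is not necessarily the expression saved at the regexr link: it removes each span that starts at an asterisk and runs to the next asterisk, or to the end of the string when the closing asterisk is missing.

```python
import re


def remove_surrounded_chars(text: str) -> str:
    # Assumed pattern (illustrative only): match '*', then lazily any
    # non-asterisk characters, then either a closing '*' or end of string.
    return re.sub(r"\*[^*]*?(\*|$)", "", text)


if __name__ == "__main__":
    print(repr(remove_surrounded_chars("Hello *waves* there")))
    print(repr(remove_surrounded_chars("Hi *unclosed action")))
```

Treating an unclosed asterisk as running to the end of the string is a design choice in this sketch; it avoids sending a half-finished action to TTS or to the image prompt.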
Reduced the memory footprint of send_pictures and sd_api_pictures by scaling the images shown in the chat to at most 300 pixels on the longer side. (The user already has the original of a sent picture, and there is an option to save the SD generation.)
This should keep the history from growing annoyingly large when multiple pictures are present.
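The scaling rule can be sketched as a small size calculation. The function name and the default of 300 are illustrative assumptions, not the extensions' actual code; in practice an imaging library call such as Pillow's Image.thumbnail((300, 300)) would do the resizing itself.

```python
def scaled_size(width: int, height: int, max_side: int = 300) -> tuple[int, int]:
    """Return (width, height) shrunk so the longer side is at most max_side.

    Hypothetical helper: the aspect ratio is preserved and images that are
    already small enough are returned unchanged (no upscaling).
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Guard against a side rounding down to zero for extreme aspect ratios.
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(scaled_size(1200, 600))
    print(scaled_size(512, 768))
```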
This should only be used with --no-stream.
The shared.still_streaming implementation was faulty by design:
output_modifier should never be called when streaming is already over.
- Change wav naming to be completely unique by using a timestamp instead of the message ID. This stops the browser from using cached audio when new audio is created with the same file name (e.g. after regenerate or clear history).
- Make the autoplay setting actually disable autoplay.
- Make the Settings panel a bit more compact.
- Hide HTML errors when an audio file referenced in the chat history is missing.
- Add button to permanently convert TTS history to normal text messages.
- Changed the "show message text" toggle to affect the chat history.
- New autoplay using an HTML tag; the tag is removed from the old message when new input is provided
- Add voice pitch and speed control
- Group settings together
- Use name + conversation history to match wavs to messages, to minimize problems when changing characters
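A minimal sketch of the timestamp-based wav naming described in the list above. The function name, the directory argument, and the exact file name layout are assumptions for illustration; the point is only that a time-derived suffix gives each generation a fresh file name, so the browser cannot serve a cached file under the same name.

```python
import time
from pathlib import Path


def unique_wav_path(output_dir: str, character: str) -> Path:
    # Hypothetical helper: a nanosecond timestamp replaces the message ID,
    # so regenerating a message yields a new file name.
    timestamp = time.time_ns()
    return Path(output_dir) / f"{character}_{timestamp}.wav"


if __name__ == "__main__":
    print(unique_wav_path("outputs", "Example"))
```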
Current minor bugs:
- Gradio seems to cache the audio files, so using "clear history" and generating new messages will play the old audio (the new messages are saving correctly). Gradio will clear cache and use correct audio after a few messages or after a page refresh.
- Switching characters does not immediately update the message ID used for the audio. The ID is updated after the first new message, but that message will use the wrong ID.
- Keeping simpleaudio until the audio block's "autoplay" no longer plays previous messages
- Only generate audio for finished messages
- Better name for autoplay, clean up comments
- Set the default to unlimited wav files. There are still a few bugs when the wav ID resets
Co-Authored-By: Christoph Hess <9931495+ChristophHess@users.noreply.github.com>
- Adds "still_streaming" to shared module for extensions to know if generation is complete
- Changed TTS extension with new options:
- Show text under the audio widget
- Automatically play the audio once text generation finishes
- manage the generated wav files (only keep files for finished generations, optional max file limit)
- [wip] ability to change voice pitch and speed
- Added 'tensorboard' to requirements, since Python reported "tensorboard not found" errors after a fresh installation.
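The optional max file limit for generated wav files, mentioned in the TTS options above, could be sketched roughly as follows. The function name and the convention that 0 means "unlimited" are assumptions made for this example and are not taken from the extension code.

```python
from pathlib import Path


def prune_wav_files(directory: str, max_files: int) -> list[Path]:
    """Delete the oldest .wav files so that at most max_files remain.

    Hypothetical helper: max_files <= 0 is treated as "unlimited".
    Returns the list of deleted paths.
    """
    if max_files <= 0:
        return []
    # Sort oldest first by modification time.
    wavs = sorted(Path(directory).glob("*.wav"), key=lambda p: p.stat().st_mtime)
    excess = wavs[:-max_files] if len(wavs) > max_files else []
    for path in excess:
        path.unlink()
    return excess
```

Calling such a helper once after each finished generation would keep the output folder bounded without touching the newest files.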
As per your suggestion at https://github.com/oobabooga/text-generation-webui/issues/159, here's my attempt.
I'm brand new to Python and GitHub. It's completely different from Unreal + visual coding, so forgive my amateurish code. This essentially adds support for Eleven Labs TTS. I tested it without major issues, and I believe it's functional (hopefully).
Extra requirements: elevenlabslib (https://github.com/lugia19/elevenlabslib), sounddevice 0.4.6, and soundfile
Folder structure is the same as the SileroTTS Extension.