text-generation-webui

mirror of https://github.com/oobabooga/text-generation-webui.git synced 2024-11-25 01:09:22 +01:00

Author	SHA1	Message	Date
oobabooga	0a16224451	Update GPTQ_loader.py	2023-03-24 19:54:36 -03:00
oobabooga	a80aa65986	Update models.py	2023-03-24 19:53:20 -03:00
jllllll	817e6c681e	Update install.bat Added `cd /D "%~dp0"` in case the script is run as admin.	2023-03-24 17:51:13 -05:00
jllllll	a80a5465f2	Update install.bat Updated Conda packages and channels to install cuda-toolkit and override 12.0 cuda packages requested by pytorch with their 11.7 equivalent. Removed Conda installation since we can use the downloaded Micromamba.exe for the same purpose with a smaller footprint. Removed redundant PATH changes. Changed %gpuchoice% comparisons to be case-insensitive. Added additional error handling and removed the use of .tmp files. Added missing extension requirements. Added GPTQ installation. Will attempt to compile locally and, if failed, will download and install a precompiled wheel. Incorporated fixes from one-click-bandaid. Fixed and expanded first sed command from one-click-bandaid. libbitsandbytes_cudaall.dll is used here as the cuda116.dll used by one-click-bandaid does not work on my 1080ti. This can be changed if needed.	2023-03-24 17:27:29 -05:00
oobabooga	507db0929d	Do not use empty user messages in chat mode This allows the bot to send messages by clicking on Generate with empty inputs.	2023-03-24 17:22:22 -03:00
oobabooga	6e1b16c2aa	Update html_generator.py	2023-03-24 17:18:27 -03:00
oobabooga	ffb0187e83	Update chat.py	2023-03-24 17:17:29 -03:00
oobabooga	c14e598f14	Merge pull request #433 from mayaeary/fix/api-reload Fix api extension duplicating	2023-03-24 16:56:10 -03:00
oobabooga	bfe960731f	Merge branch 'main' into fix/api-reload	2023-03-24 16:54:41 -03:00
oobabooga	4a724ed22f	Reorder imports	2023-03-24 16:53:56 -03:00
oobabooga	8fad84abc2	Update extensions.py	2023-03-24 16:51:27 -03:00
oobabooga	d8e950d6bd	Don't load the model twice when using --lora	2023-03-24 16:30:32 -03:00
oobabooga	fd99995b01	Make the Stop button more consistent in chat mode	2023-03-24 15:59:27 -03:00
Forkoz	b740c5b284	Add display of context when input was generated Not sure if I did this right but it does move with the conversation and seems to match value.	2023-03-24 08:56:07 -05:00
oobabooga	4f5c2ce785	Fix chat_generation_attempts	2023-03-24 02:03:30 -03:00
oobabooga	04417b658b	Update README.md	2023-03-24 01:40:43 -03:00
oobabooga	bb4cb22453	Download .pt files using download-model.py (for 4-bit models)	2023-03-24 00:49:04 -03:00
oobabooga	143b5b5edf	Mention one-click-bandaid in the README	2023-03-23 23:28:50 -03:00
EyeDeck	dcfd866402	Allow loading of .safetensors through GPTQ-for-LLaMa	2023-03-23 21:31:34 -04:00
oobabooga	8747c74339	Another missing import	2023-03-23 22:19:01 -03:00
oobabooga	7078d168c3	Missing import	2023-03-23 22:16:08 -03:00
oobabooga	d1327f99f9	Fix broken callbacks.py	2023-03-23 22:12:24 -03:00
oobabooga	9bdb3c784d	Minor fix	2023-03-23 22:02:40 -03:00
oobabooga	b0abb327d8	Update LoRA.py	2023-03-23 22:02:09 -03:00
oobabooga	bf22d16ebc	Clear cache while switching LoRAs	2023-03-23 21:56:26 -03:00
oobabooga	4578e88ffd	Stop the bot from talking for you in chat mode	2023-03-23 21:38:20 -03:00
oobabooga	9bf6ecf9e2	Fix LoRA device map (attempt)	2023-03-23 16:49:41 -03:00
oobabooga	c5ebcc5f7e	Change the default names (#518) * Update shared.py * Update settings-template.json	2023-03-23 13:36:00 -03:00
Φφ	483d173d23	Code reuse + indication Now shows the message in the console when unloading weights. Also reload_model() calls unload_model() first to free the memory so that multiple reloads won't overfill it.	2023-03-23 07:06:26 +03:00
Φφ	1917b15275	Unload and reload models on request	2023-03-23 07:06:26 +03:00
oobabooga	29bd41d453	Fix LoRA in CPU mode	2023-03-23 01:05:13 -03:00
oobabooga	eac27f4f55	Make LoRAs work in 16-bit mode	2023-03-23 00:55:33 -03:00
oobabooga	bfa81e105e	Fix FlexGen streaming	2023-03-23 00:22:14 -03:00
oobabooga	7b6f85d327	Fix markdown headers in light mode	2023-03-23 00:13:34 -03:00
oobabooga	de6a09dc7f	Properly separate the original prompt from the reply	2023-03-23 00:12:40 -03:00
oobabooga	d5fc1bead7	Merge pull request #489 from Brawlence/ext-fixes Extensions performance & memory optimisations	2023-03-22 16:10:59 -03:00
oobabooga	bfb1be2820	Minor fix	2023-03-22 16:09:48 -03:00
oobabooga	0abff499e2	Use image.thumbnail	2023-03-22 16:03:05 -03:00
oobabooga	104212529f	Minor changes	2023-03-22 15:55:03 -03:00
wywywywy	61346b88ea	Add "seed" menu in the Parameters tab	2023-03-22 15:40:20 -03:00
Φφ	5389fce8e1	Extensions performance & memory optimisations Reworked remove_surrounded_chars() to use regular expression ( https://regexr.com/7alb5 ) instead of repeated string concatenations for elevenlab_tts, silero_tts, sd_api_pictures. This should be both faster and more robust in handling asterisks. Reduced the memory footprint of send_pictures and sd_api_pictures by scaling the images in the chat to 300 pixels max-side wise. (The user already has the original in case of the sent picture and there's an option to save the SD generation). This should fix history growing annoyingly large with multiple pictures present	2023-03-22 11:51:00 +03:00
oobabooga	45b7e53565	Only catch proper Exceptions in the text generation function	2023-03-20 20:36:02 -03:00
oobabooga	6872ffd976	Update README.md	2023-03-20 16:53:14 -03:00
oobabooga	db4219a340	Update comments	2023-03-20 16:40:08 -03:00
oobabooga	7618f3fe8c	Add -gptq-preload for 4-bit offloading (#460) This works in a 4GB card now: `python server.py --model llama-7b-hf --gptq-bits 4 --gptq-pre-layer 20`	2023-03-20 16:30:56 -03:00
Vladimir Belitskiy	e96687b1d6	Do not send empty user input as part of the prompt. However, if extensions modify the empty prompt to be non-empty, it'll still work as before.	2023-03-20 14:27:39 -04:00
oobabooga	9a3bed50c3	Attempt at fixing 4-bit with CPU offload	2023-03-20 15:11:56 -03:00
oobabooga	536d0a4d93	Add an import	2023-03-20 14:00:40 -03:00
Vladimir Belitskiy	ca47e016b4	Do not display empty user messages in chat mode. There doesn't seem to be much value to them - they just take up space while also making it seem like there's still some sort of pseudo-dialogue going on, instead of a monologue by the bot.	2023-03-20 12:55:57 -04:00
oobabooga	75a7a84ef2	Exception handling (#454) * Update text_generation.py * Update extensions.py	2023-03-20 13:36:52 -03:00

3236 Commits