text-generation-webui

mirror of https://github.com/oobabooga/text-generation-webui.git synced 2024-12-27 22:59:32 +01:00

Author	SHA1	Message	Date
oobabooga	f3b00dd165	Merge pull request #224 from ItsLogic/llama-bits Allow users to load 2, 3 and 4 bit llama models	2023-03-12 11:23:50 -03:00
oobabooga	65dda28c9d	Rename --llama-bits to --gptq-bits	2023-03-12 11:19:07 -03:00
oobabooga	fed3617f07	Move LLaMA 4-bit into a separate file	2023-03-12 11:12:34 -03:00
oobabooga	0ac562bdba	Add a default prompt for OpenAssistant oasst-sft-1-pythia-12b #253	2023-03-12 10:46:16 -03:00
oobabooga	78901d522b	Remove unused imports	2023-03-12 08:59:05 -03:00
Xan	b3e10e47c0	Fix merge conflict in text_generation - Need to update `shared.still_streaming = False` before the final `yield formatted_outputs`, shifted the position of some yields.	2023-03-12 18:56:35 +11:00
oobabooga	ad14f0e499	Fix regenerate (provisory way)	2023-03-12 03:42:29 -03:00
oobabooga	6e12068ba2	Merge pull request #258 from lxe/lxe/utf8 Load and save character files and chat history in UTF-8	2023-03-12 03:28:49 -03:00
oobabooga	e2da6b9685	Fix You You You appearing in chat mode	2023-03-12 03:25:56 -03:00
oobabooga	bcf0075278	Merge pull request #235 from xanthousm/Quality_of_life-main --auto-launch and "Is typing..."	2023-03-12 03:12:56 -03:00
Aleksey Smolenchuk	3f7c3d6559	No need to set encoding on binary read	2023-03-11 22:10:57 -08:00
oobabooga	341e135036	Various fixes in chat mode	2023-03-12 02:53:08 -03:00
Aleksey Smolenchuk	3baf5fc700	Load and save chat history in utf-8	2023-03-11 21:40:01 -08:00
oobabooga	b0e8cb8c88	Various fixes in chat mode	2023-03-12 02:31:45 -03:00
unknown	433f6350bc	Load and save character files in UTF-8	2023-03-11 21:23:05 -08:00
oobabooga	0bd5430988	Use 'with' statement to better handle streaming memory	2023-03-12 02:04:28 -03:00
oobabooga	37f0166b2d	Fix memory leak in new streaming (second attempt)	2023-03-11 23:14:49 -03:00
oobabooga	92fe947721	Merge branch 'main' into new-streaming	2023-03-11 19:59:45 -03:00
oobabooga	2743dd736a	Add Is typing... to impersonate as well	2023-03-11 10:50:18 -03:00
Xan	96c51973f9	--auto-launch and "Is typing..." - Added `--auto-launch` arg to open web UI in the default browser when ready. - Changed chat.py to display user input immediately and "Is typing..." as a temporary reply while generating text. Most noticeable when using `--no-stream`.	2023-03-11 22:50:59 +11:00
Xan	33df4bd91f	Merge remote-tracking branch 'upstream/main'	2023-03-11 22:40:47 +11:00
draff	28fd4fc970	Change wording to be consistent with other args	2023-03-10 23:34:13 +00:00
draff	001e638b47	Make it actually work	2023-03-10 23:28:19 +00:00
draff	804486214b	Re-implement --load-in-4bit and update --llama-bits arg description	2023-03-10 23:21:01 +00:00
ItsLogic	9ba8156a70	remove unnecessary Path()	2023-03-10 22:33:58 +00:00
draff	e6c631aea4	Replace --load-in-4bit with --llama-bits Replaces --load-in-4bit with a more flexible --llama-bits arg to allow for 2 and 3 bit models as well. This commit also fixes a loading issue with .pt files which are not in the root of the models folder	2023-03-10 21:36:45 +00:00
oobabooga	026d60bd34	Remove default preset that didn't do anything	2023-03-10 14:01:02 -03:00
oobabooga	e9dbdafb14	Merge branch 'main' into pt-path-changes	2023-03-10 11:03:42 -03:00
oobabooga	706a03b2cb	Minor changes	2023-03-10 11:02:25 -03:00
oobabooga	de7dd8b6aa	Add comments	2023-03-10 10:54:08 -03:00
oobabooga	e461c0b7a0	Move the import to the top	2023-03-10 10:51:12 -03:00
deepdiffuser	9fbd60bf22	add no_split_module_classes to prevent tensor split error	2023-03-10 05:30:47 -08:00
deepdiffuser	ab47044459	add multi-gpu support for 4bit gptq LLaMA	2023-03-10 04:52:45 -08:00
rohvani	2ac2913747	fix reference issue	2023-03-09 20:13:23 -08:00
rohvani	826e297b0e	add llama-65b-4bit support & multiple pt paths	2023-03-09 18:31:32 -08:00
oobabooga	9849aac0f1	Don't show .pt models in the list	2023-03-09 21:54:50 -03:00
oobabooga	74102d5ee4	Insert to the path instead of appending	2023-03-09 20:51:22 -03:00
oobabooga	2965aa1625	Check if the .pt file exists	2023-03-09 20:48:51 -03:00
oobabooga	828a524f9a	Add LLaMA 4-bit support	2023-03-09 15:50:26 -03:00
oobabooga	59b5f7a4b7	Improve usage of stopping_criteria	2023-03-08 12:13:40 -03:00
oobabooga	add9330e5e	Bug fixes	2023-03-08 11:26:29 -03:00
Xan	5648a41a27	Merge branch 'main' of https://github.com/xanthousm/text-generation-webui	2023-03-08 22:08:54 +11:00
Xan	ad6b699503	Better TTS with autoplay - Adds "still_streaming" to shared module for extensions to know if generation is complete - Changed TTS extension with new options: - Show text under the audio widget - Automatically play the audio once text generation finishes - manage the generated wav files (only keep files for finished generations, optional max file limit) - [wip] ability to change voice pitch and speed - added 'tensorboard' to requirements, since python sent "tensorboard not found" errors after a fresh installation.	2023-03-08 22:02:17 +11:00
oobabooga	33fb6aed74	Minor bug fix	2023-03-08 03:08:16 -03:00
oobabooga	ad2970374a	Readability improvements	2023-03-08 03:00:06 -03:00
oobabooga	72d539dbff	Better separate the FlexGen case	2023-03-08 02:54:47 -03:00
oobabooga	0e16c0bacb	Remove redeclaration of a function	2023-03-08 02:50:49 -03:00
oobabooga	ab50f80542	New text streaming method (much faster)	2023-03-08 02:46:35 -03:00
oobabooga	8e89bc596b	Fix encode() for RWKV	2023-03-07 23:15:46 -03:00
oobabooga	19a34941ed	Add proper streaming to RWKV	2023-03-07 18:17:56 -03:00
oobabooga	8660227e1b	Add top_k to RWKV	2023-03-07 17:24:28 -03:00
oobabooga	153dfeb4dd	Add --rwkv-cuda-on parameter, bump rwkv version	2023-03-06 20:12:54 -03:00
oobabooga	6904a507c6	Change some parameters	2023-03-06 16:29:43 -03:00
oobabooga	20bd645f6a	Fix bug in multigpu setups (attempt 3)	2023-03-06 15:58:18 -03:00
oobabooga	09a7c36e1b	Minor improvement while running custom models	2023-03-06 15:36:35 -03:00
oobabooga	24c4c20391	Fix bug in multigpu setups (attempt #2)	2023-03-06 15:23:29 -03:00
oobabooga	d88b7836c6	Fix bug in multigpu setups	2023-03-06 14:58:30 -03:00
oobabooga	5bed607b77	Increase repetition frequency/penalty for RWKV	2023-03-06 14:25:48 -03:00
oobabooga	bf56b6c1fb	Load settings.json without the need for --settings settings.json This is for setting UI defaults	2023-03-06 10:57:45 -03:00
oobabooga	e91f4bc25a	Add RWKV tokenizer	2023-03-06 08:45:49 -03:00
oobabooga	c855b828fe	Better handle <USER>	2023-03-05 17:01:47 -03:00
oobabooga	2af66a4d4c	Fix <USER> in pygmalion replies	2023-03-05 16:08:50 -03:00
oobabooga	a54b91af77	Improve readability	2023-03-05 10:21:15 -03:00
oobabooga	8e706df20e	Fix a memory leak when text streaming is on	2023-03-05 10:12:43 -03:00
oobabooga	c33715ad5b	Move towards HF LLaMA implementation	2023-03-05 01:20:31 -03:00
oobabooga	bd8aac8fa4	Add LLaMA 8-bit support	2023-03-04 13:28:42 -03:00
oobabooga	c93f1fa99b	Count the tokens more conservatively	2023-03-04 03:10:21 -03:00
oobabooga	ed8b35efd2	Add --pin-weight parameter for FlexGen	2023-03-04 01:04:02 -03:00
oobabooga	05e703b4a4	Print the performance information more reliably	2023-03-03 21:24:32 -03:00
oobabooga	5a79863df3	Increase the sequence length, decrease batch size I have no idea what I am doing	2023-03-03 15:54:13 -03:00
oobabooga	a345a2acd2	Add a tokenizer placeholder	2023-03-03 15:16:55 -03:00
oobabooga	5b354817f6	Make chat minimally work with LLaMA	2023-03-03 15:04:41 -03:00
oobabooga	ea5c5eb3da	Add LLaMA support	2023-03-03 14:39:14 -03:00
oobabooga	2bff646130	Stop chat from flashing dark when processing	2023-03-03 13:19:13 -03:00
oobabooga	169209805d	Model-aware prompts and presets	2023-03-02 11:25:04 -03:00
oobabooga	7bbe32f618	Don't return a value in an iterator function	2023-03-02 00:48:46 -03:00
oobabooga	ff9f649c0c	Remove some unused imports	2023-03-02 00:36:20 -03:00
oobabooga	1a05860ca3	Ensure proper no-streaming with generation_attempts > 1	2023-03-02 00:10:10 -03:00
oobabooga	a2a3e8f797	Add --rwkv-strategy parameter	2023-03-01 20:02:48 -03:00
oobabooga	449116a510	Fix RWKV paths on Windows (attempt)	2023-03-01 19:17:16 -03:00
oobabooga	955cf431e8	Minor consistency fix	2023-03-01 19:11:26 -03:00
oobabooga	f3da6dcc8f	Merge pull request #149 from oobabooga/RWKV Add RWKV support	2023-03-01 16:57:45 -03:00
oobabooga	831ac7ed3f	Add top_p	2023-03-01 16:45:48 -03:00
oobabooga	7c4d5ca8cc	Improve the text generation call a bit	2023-03-01 16:40:25 -03:00
oobabooga	2f16ce309a	Rename a variable	2023-03-01 12:33:09 -03:00
oobabooga	9e9cfc4b31	Parameters	2023-03-01 12:19:37 -03:00
oobabooga	0f6708c471	Sort the imports	2023-03-01 12:18:17 -03:00
oobabooga	e735806c51	Add a generate() function for RWKV	2023-03-01 12:16:11 -03:00
oobabooga	659bb76722	Add RWKVModel class	2023-03-01 12:08:55 -03:00
oobabooga	9c86a1cd4a	Add RWKV pip package	2023-03-01 11:42:49 -03:00
oobabooga	6837d4d72a	Load the model by name	2023-02-28 02:52:29 -03:00
oobabooga	a1429d1607	Add default extensions to the settings	2023-02-28 02:20:11 -03:00
oobabooga	19ccb2aaf5	Handle <USER> and <BOT>	2023-02-28 01:05:43 -03:00
oobabooga	626da6c731	Handle {{user}} and {{char}} in example dialogue	2023-02-28 00:59:05 -03:00
oobabooga	e861e68e38	Move the chat example dialogue to the prompt	2023-02-28 00:50:46 -03:00
oobabooga	f871971de1	Trying to get the chat to work	2023-02-28 00:25:30 -03:00
oobabooga	67ee7bead7	Add cpu, bf16 options	2023-02-28 00:09:11 -03:00
oobabooga	ebd698905c	Add streaming to RWKV	2023-02-28 00:04:04 -03:00
oobabooga	70e522732c	Move RWKV loader into a separate file	2023-02-27 23:50:16 -03:00
oobabooga	ebc64a408c	RWKV support prototype	2023-02-27 23:03:35 -03:00
oobabooga	021bd55886	Better format the prompt when generation attempts > 1	2023-02-27 21:37:03 -03:00
oobabooga	43b6ab8673	Store thumbnails as files instead of base64 strings This improves the UI responsiveness for large histories.	2023-02-27 13:41:00 -03:00
oobabooga	f24b6e78a3	Fix clear history	2023-02-26 23:58:04 -03:00
oobabooga	8e3e8a070f	Make FlexGen work with the newest API	2023-02-26 16:53:41 -03:00
oobabooga	3333f94c30	Make the gallery extension work on colab	2023-02-26 12:37:26 -03:00
oobabooga	633a2b6be2	Don't regenerate/remove last message if the chat is empty	2023-02-26 00:43:12 -03:00
oobabooga	6e843a11d6	Fix FlexGen in chat mode	2023-02-26 00:36:04 -03:00
oobabooga	4548227fb5	Downgrade gradio version (file uploads are broken in 3.19.1)	2023-02-25 22:59:02 -03:00
oobabooga	9456c1d6ed	Prevent streaming with no_stream + generation attempts > 1	2023-02-25 17:45:03 -03:00
oobabooga	32f40f3b42	Bump gradio version to 3.19.1	2023-02-25 17:20:03 -03:00
oobabooga	fa58fd5559	Proper way to free the cuda cache	2023-02-25 15:50:29 -03:00
oobabooga	b585e382c0	Rename the custom prompt generator function	2023-02-25 15:13:14 -03:00
oobabooga	700311ce40	Empty the cuda cache at model.generate()	2023-02-25 14:39:13 -03:00
oobabooga	1878acd9f3	Minor bug fix in chat	2023-02-25 09:30:59 -03:00
oobabooga	e71ff959f5	Clean up some unused code	2023-02-25 09:23:02 -03:00
oobabooga	91f5852245	Move bot_picture.py inside the extension	2023-02-25 03:00:19 -03:00
oobabooga	5ac24b019e	Minor fix in the extensions implementation	2023-02-25 02:53:18 -03:00
oobabooga	85f914b9b9	Disable the hijack after using it	2023-02-25 02:36:01 -03:00
oobabooga	7e9f13e29f	Rename a variable	2023-02-25 01:55:32 -03:00
oobabooga	1741c36092	Minor fix	2023-02-25 01:47:25 -03:00
oobabooga	7c2babfe39	Rename greed to "generation attempts"	2023-02-25 01:42:19 -03:00
oobabooga	2dfb999bf1	Add greed parameter	2023-02-25 01:31:01 -03:00
oobabooga	13f2688134	Better way to generate custom prompts	2023-02-25 01:08:17 -03:00
oobabooga	67623a52b7	Allow for permanent hijacking	2023-02-25 00:55:19 -03:00
oobabooga	111b5d42e7	Add prompt hijack option for extensions	2023-02-25 00:49:18 -03:00
oobabooga	7a527a5581	Move "send picture" into an extension I am not proud of how I did it for now.	2023-02-25 00:23:51 -03:00
oobabooga	e51ece21c0	Add ui() function to extensions	2023-02-24 19:00:11 -03:00
oobabooga	78ad55641b	Remove duplicate max_new_tokens parameter	2023-02-24 17:19:42 -03:00
oobabooga	65326b545a	Move all gradio elements to shared (so that extensions can use them)	2023-02-24 16:46:50 -03:00
oobabooga	0817fe1beb	Move code back into the chatbot wrapper	2023-02-24 14:10:32 -03:00
oobabooga	8a7563ae84	Reorder the imports	2023-02-24 12:42:43 -03:00
oobabooga	ace74a557a	Add some comments	2023-02-24 12:41:27 -03:00
oobabooga	fe5057f932	Simplify the extensions implementation	2023-02-24 10:01:21 -03:00
oobabooga	2fb6ae6970	Move chat preprocessing into a separate function	2023-02-24 09:40:48 -03:00
oobabooga	f6f792363b	Separate command-line params by spaces instead of commas	2023-02-24 08:55:09 -03:00
oobabooga	e260e84e5a	Merge branch 'max_memory' of https://github.com/elwolf6/text-generation-webui into elwolf6-max_memory	2023-02-24 08:47:01 -03:00
oobabooga	146f786c57	Reorganize a bit	2023-02-24 08:44:54 -03:00
oobabooga	c2f4c395b9	Clean up some chat functions	2023-02-24 08:31:30 -03:00
luis	5abdc99a7c	gpu-memory arg change	2023-02-23 18:43:55 -05:00
oobabooga	9ae063e42b	Fix softprompts when deepspeed is active (#112)	2023-02-23 20:22:47 -03:00
oobabooga	dac6fe0ff4	Reset the history if no default history exists on reload	2023-02-23 19:53:50 -03:00
oobabooga	3b8cecbab7	Reload the default chat on page refresh	2023-02-23 19:50:23 -03:00
oobabooga	f1914115d3	Fix minor issue with chat logs	2023-02-23 16:04:47 -03:00
oobabooga	b78561fba6	Minor bug fix	2023-02-23 15:26:41 -03:00
oobabooga	2e86a1ec04	Move chat history into shared module	2023-02-23 15:11:18 -03:00
oobabooga	c87800341c	Move function to extensions module	2023-02-23 14:55:21 -03:00
oobabooga	2048b403a5	Reorder functions	2023-02-23 14:49:02 -03:00
oobabooga	7224343a70	Improve the imports	2023-02-23 14:41:42 -03:00
oobabooga	e46c43afa6	Move some stuff from server.py to modules	2023-02-23 13:42:23 -03:00
oobabooga	1dacd34165	Further refactor	2023-02-23 13:28:30 -03:00

286 Commits