text-generation-webui

mirror of https://github.com/oobabooga/text-generation-webui.git synced 2024-12-01 20:04:04 +01:00

Author	SHA1	Message	Date
FartyPants	33f56fd41d	Update models.py to clear LORA names after unload (#2951)	2023-07-03 17:39:06 -03:00
oobabooga	f0fcd1f697	Sort some imports	2023-06-25 01:44:36 -03:00
Panchovix	5646690769	Fix some models not loading on exllama_hf (#2835)	2023-06-23 11:31:02 -03:00
LarryVRH	580c1ee748	Implement a demo HF wrapper for exllama to utilize existing HF transformers decoding. (#2777)	2023-06-21 15:31:42 -03:00
ThisIsPIRI	def3b69002	Fix loading condition for universal llama tokenizer (#2753)	2023-06-18 18:14:06 -03:00
oobabooga	9f40032d32	Add ExLlama support (#2444)	2023-06-16 20:35:38 -03:00
oobabooga	7ef6a50e84	Reorganize model loading UI completely (#2720)	2023-06-16 19:00:37 -03:00
oobabooga	00b94847da	Remove softprompt support	2023-06-06 07:42:23 -03:00
oobabooga	f276d88546	Use AutoGPTQ by default for GPTQ models	2023-06-05 15:41:48 -03:00
oobabooga	3578dd3611	Change a warning message	2023-05-29 22:40:54 -03:00
Luis Lopez	9e7204bef4	Add tail-free and top-a sampling (#2357)	2023-05-29 21:40:01 -03:00
Forkoz	60ae80cf28	Fix hang in tokenizer for AutoGPTQ llama models. (#2399)	2023-05-28 23:10:10 -03:00
oobabooga	361451ba60	Add --load-in-4bit parameter (#2320)	2023-05-25 01:14:13 -03:00
oobabooga	cd3618d7fb	Add support for RWKV in Hugging Face format	2023-05-23 02:07:28 -03:00
oobabooga	e116d31180	Prevent unwanted log messages from modules	2023-05-21 22:42:34 -03:00
oobabooga	05593a7834	Minor bug fix	2023-05-20 23:22:36 -03:00
oobabooga	9d5025f531	Improve error handling while loading GPTQ models	2023-05-19 11:20:08 -03:00
oobabooga	ef10ffc6b4	Add various checks to model loading functions	2023-05-17 16:14:54 -03:00
oobabooga	abd361b3a0	Minor change	2023-05-17 11:33:43 -03:00
oobabooga	21ecc3701e	Avoid a name conflict	2023-05-17 11:23:13 -03:00
oobabooga	1a8151a2b6	Add AutoGPTQ support (basic) (#2132)	2023-05-17 11:12:12 -03:00
oobabooga	7584d46c29	Refactor models.py (#2113)	2023-05-16 19:52:22 -03:00
oobabooga	4e66f68115	Create get_max_memory_dict() function	2023-05-15 19:38:27 -03:00
oobabooga	2eeb27659d	Fix bug in --cpu-memory	2023-05-12 06:17:07 -03:00
oobabooga	3316e33d14	Remove unused code	2023-05-10 11:59:59 -03:00
oobabooga	3913155c1f	Style improvements (#1957)	2023-05-09 22:49:39 -03:00
Wesley Pyburn	a2b25322f0	Fix trust_remote_code in wrong location (#1953)	2023-05-09 19:22:10 -03:00
EgrorBs	d3ea70f453	More trust_remote_code=trust_remote_code (#1899)	2023-05-07 23:48:20 -03:00
oobabooga	97a6a50d98	Use oasst tokenizer instead of universal tokenizer	2023-05-04 15:55:39 -03:00
Mylo	bd531c2dc2	Make --trust-remote-code work for all models (#1772)	2023-05-04 02:01:28 -03:00
oobabooga	9c77ab4fc2	Improve some warnings	2023-05-03 22:06:46 -03:00
oobabooga	95d04d6a8d	Better warning messages	2023-05-03 21:43:17 -03:00
Ahmed Said	fbcd32988e	added no_mmap & mlock parameters to llama.cpp and removed llamacpp_model_alternative (#1649) Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>	2023-05-02 18:25:28 -03:00
oobabooga	9c2e7c0fab	Fix path on models.py	2023-04-26 03:29:09 -03:00
oobabooga	a8409426d7	Fix bug in models.py	2023-04-26 01:55:40 -03:00
oobabooga	f642135517	Make universal tokenizer, xformers, sdp-attention apply to monkey patch	2023-04-25 23:18:11 -03:00
Vincent Brouwers	92cdb4f22b	Seq2Seq support (including FLAN-T5) (#1535) Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>	2023-04-25 22:39:04 -03:00
Wojtab	12212cf6be	LLaVA support (#1487)	2023-04-23 20:32:22 -03:00
oobabooga	c0b5c09860	Minor change	2023-04-22 15:15:31 -03:00
oobabooga	fcb594b90e	Don't require llama.cpp models to be placed in subfolders	2023-04-22 14:56:48 -03:00
oobabooga	c4f4f41389	Add an "Evaluate" tab to calculate the perplexities of models (#1322)	2023-04-21 00:20:33 -03:00
oobabooga	7bb9036ac9	Add universal LLaMA tokenizer support	2023-04-19 21:23:51 -03:00
catalpaaa	07de7d0426	Load llamacpp before quantized model (#1307)	2023-04-17 10:47:26 -03:00
oobabooga	39099663a0	Add 4-bit LoRA support (#1200)	2023-04-16 23:26:52 -03:00
Forkoz	c6fe1ced01	Add ChatGLM support (#1256) Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>	2023-04-16 19:15:03 -03:00
oobabooga	ac189011cb	Add "Save current settings for this model" button	2023-04-15 12:54:02 -03:00
oobabooga	cacbcda208	Two new options: truncation length and ban eos token	2023-04-11 18:46:06 -03:00
oobabooga	1911504f82	Minor bug fix	2023-04-09 23:45:41 -03:00
oobabooga	dba2000d2b	Do things that I am not proud of	2023-04-09 23:40:49 -03:00
MarkovInequality	992663fa20	Added xformers support to Llama (#950)	2023-04-09 23:08:40 -03:00
oobabooga	a3085dba07	Fix LlamaTokenizer eos_token (attempt)	2023-04-09 21:19:39 -03:00
oobabooga	0b458bf82d	Simplify a function	2023-04-07 21:37:41 -03:00
Φφ	ffd102e5c0	SD Api Pics extension, v.1.1 (#596)	2023-04-07 21:36:04 -03:00
oobabooga	ea6e77df72	Make the code more like PEP8 for readability (#862)	2023-04-07 00:15:45 -03:00
oobabooga	113f94b61e	Bump transformers (16-bit llama must be reconverted/redownloaded)	2023-04-06 16:04:03 -03:00
oobabooga	03cb44fc8c	Add new llama.cpp library (2048 context, temperature, etc now work)	2023-04-06 13:12:14 -03:00
catalpaaa	4ab679480e	allow quantized model to be loaded from model dir (#760)	2023-04-04 23:19:38 -03:00
oobabooga	3a47a602a3	Detect ggml*.bin files automatically	2023-03-31 17:18:21 -03:00
oobabooga	4c27562157	Minor changes	2023-03-31 14:33:46 -03:00
Thomas Antony	79fa2b6d7e	Add support for alpaca	2023-03-30 11:23:04 +01:00
Thomas Antony	7745faa7bb	Add llamacpp to models.py	2023-03-30 11:22:37 +01:00
oobabooga	1cb9246160	Adapt to the new model names	2023-03-29 21:47:36 -03:00
oobabooga	53da672315	Fix FlexGen	2023-03-27 23:44:21 -03:00
oobabooga	ee95e55df6	Fix RWKV tokenizer	2023-03-27 23:42:29 -03:00
oobabooga	fde92048af	Merge branch 'main' into catalpaaa-lora-and-model-dir	2023-03-27 23:16:44 -03:00
oobabooga	49c10c5570	Add support for the latest GPTQ models with group-size (#530). Warning: old 4-bit weights will not work anymore! See here how to get up-to-date weights: https://github.com/oobabooga/text-generation-webui/wiki/LLaMA-model#step-2-get-the-pre-converted-weights	2023-03-26 00:11:33 -03:00
catalpaaa	b37c54edcf	lora-dir, model-dir and login auth. Added lora-dir, model-dir, and a login auth argument that points to a file containing usernames and passwords in the format "u:pw,u:pw,..."	2023-03-24 17:30:18 -07:00
oobabooga	a6bf54739c	Revert models.py (accident)	2023-03-24 19:56:45 -03:00
oobabooga	a80aa65986	Update models.py	2023-03-24 19:53:20 -03:00
oobabooga	ddb62470e9	--no-cache and --gpu-memory in MiB for fine VRAM control	2023-03-19 19:21:41 -03:00
oobabooga	e26763a510	Minor changes	2023-03-17 22:56:46 -03:00
Wojtek Kowaluk	7994b580d5	clean up duplicated code	2023-03-18 02:27:26 +01:00
Wojtek Kowaluk	30939e2aee	add mps support on apple silicon	2023-03-18 00:56:23 +01:00
oobabooga	ee164d1821	Don't split the layers in 8-bit mode by default	2023-03-16 18:22:16 -03:00
oobabooga	e085cb4333	Small changes	2023-03-16 13:34:23 -03:00
awoo	83cb20aad8	Add support for --gpu-memory with --load-in-8bit	2023-03-16 18:42:53 +03:00
oobabooga	1c378965e1	Remove unused imports	2023-03-16 10:18:34 -03:00
oobabooga	66256ac1dd	Make the "no GPU has been detected" message more descriptive	2023-03-15 19:31:27 -03:00
oobabooga	265ba384b7	Rename a file, add deprecation warning for --load-in-4bit	2023-03-14 07:56:31 -03:00
Ayanami Rei	8778b756e6	use updated load_quantized	2023-03-13 22:11:40 +03:00
Ayanami Rei	e1c952c41c	make argument non case-sensitive	2023-03-13 20:22:38 +03:00
Ayanami Rei	3c9afd5ca3	rename method	2023-03-13 20:14:40 +03:00
Ayanami Rei	edbc61139f	use new quant loader	2023-03-13 20:00:38 +03:00
oobabooga	65dda28c9d	Rename --llama-bits to --gptq-bits	2023-03-12 11:19:07 -03:00
oobabooga	fed3617f07	Move LLaMA 4-bit into a separate file	2023-03-12 11:12:34 -03:00
draff	001e638b47	Make it actually work	2023-03-10 23:28:19 +00:00
draff	804486214b	Re-implement --load-in-4bit and update --llama-bits arg description	2023-03-10 23:21:01 +00:00
ItsLogic	9ba8156a70	remove unnecessary Path()	2023-03-10 22:33:58 +00:00
draff	e6c631aea4	Replace --load-in-4bit with --llama-bits. Replaces --load-in-4bit with a more flexible --llama-bits arg to allow for 2- and 3-bit models as well. This commit also fixes a loading issue with .pt files that are not in the root of the models folder	2023-03-10 21:36:45 +00:00
oobabooga	e9dbdafb14	Merge branch 'main' into pt-path-changes	2023-03-10 11:03:42 -03:00
oobabooga	706a03b2cb	Minor changes	2023-03-10 11:02:25 -03:00
oobabooga	de7dd8b6aa	Add comments	2023-03-10 10:54:08 -03:00
oobabooga	e461c0b7a0	Move the import to the top	2023-03-10 10:51:12 -03:00
deepdiffuser	9fbd60bf22	add no_split_module_classes to prevent tensor split error	2023-03-10 05:30:47 -08:00
deepdiffuser	ab47044459	add multi-gpu support for 4bit gptq LLaMA	2023-03-10 04:52:45 -08:00
rohvani	2ac2913747	fix reference issue	2023-03-09 20:13:23 -08:00
rohvani	826e297b0e	add llama-65b-4bit support & multiple pt paths	2023-03-09 18:31:32 -08:00
oobabooga	9849aac0f1	Don't show .pt models in the list	2023-03-09 21:54:50 -03:00
oobabooga	74102d5ee4	Insert to the path instead of appending	2023-03-09 20:51:22 -03:00
oobabooga	2965aa1625	Check if the .pt file exists	2023-03-09 20:48:51 -03:00

167 Commits