77d6ccf12b  oobabooga  2023-11-30 12:00:32 -08:00
    Add a LOADER debug message while loading models

8b66d83aa9  oobabooga  2023-11-16 19:55:28 -08:00
    Set use_fast=True by default, create --no_use_fast flag
    This increases tokens/second for HF loaders.
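As a hedged illustration of what this default flip means in practice: the flag name comes from the commit title, but the surrounding code is a hypothetical sketch, not the repository's own loader code.

```python
import argparse

# Hypothetical sketch of the flag flip this commit describes: fast (Rust-based)
# Hugging Face tokenizers are now the default, and --no_use_fast opts out.
parser = argparse.ArgumentParser()
parser.add_argument("--no_use_fast", action="store_true",
                    help="Disable the fast Hugging Face tokenizer.")

def tokenizer_kwargs(argv):
    args = parser.parse_args(argv)
    # use_fast defaults to True; the flag turns it off.
    return {"use_fast": not args.no_use_fast}

print(tokenizer_kwargs([]))                 # {'use_fast': True}
print(tokenizer_kwargs(["--no_use_fast"]))  # {'use_fast': False}
```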
a85ce5f055  oobabooga  2023-11-15 16:20:31 -08:00
    Add more info messages for truncation / instruction template

883701bc40  oobabooga  2023-11-15 16:04:02 -08:00
    Alternative solution to 025da386a0
    Fixes an error.

8ac942813c  oobabooga  2023-11-15 16:01:54 -08:00
    Revert "Fix CPU memory limit error (issue #3763) (#4597)"
    This reverts commit 025da386a0.

e6f44d6d19  oobabooga  2023-11-15 16:00:51 -08:00
    Print context length / instruction template to terminal when loading models

025da386a0  Andy Bao  2023-11-15 20:27:20 -03:00
    Fix CPU memory limit error (issue #3763) (#4597)
    get_max_memory_dict() was not properly formatting shared.args.cpu_memory
    Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
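For context on the formatting bug this commit fixed: transformers' `from_pretrained(max_memory=...)` expects per-device entries such as `{0: "10GiB", "cpu": "8GiB"}`. The helper below is a hypothetical sketch of that contract, not the project's `get_max_memory_dict()`.

```python
# Illustrative sketch only: build a max_memory dict in the shape that
# transformers/accelerate can parse. Ints are taken to mean GiB.
def format_memory(value):
    """Return a memory-limit string; integers are interpreted as GiB."""
    return f"{value}GiB" if isinstance(value, int) else str(value)

def max_memory_dict(gpu_memory, cpu_memory):
    # Device 0 is the first GPU; the "cpu" key caps CPU offload memory.
    return {0: format_memory(gpu_memory), "cpu": format_memory(cpu_memory)}

print(max_memory_dict(10, 8))  # {0: '10GiB', 'cpu': '8GiB'}
```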
2358706453  oobabooga  2023-11-07 20:58:06 -08:00
    Add /v1/internal/model/load endpoint (tentative)

ec17a5d2b7  oobabooga  2023-11-06 02:38:29 -03:00
    Make OpenAI API the default API (#4430)

4766a57352  feng lui  2023-11-04 13:59:33 -03:00
    transformers: add use_flash_attention_2 option (#4373)
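A sketch of how a loader might thread this option into the `from_pretrained` kwargs; `use_flash_attention_2` was the transformers kwarg at the time (newer versions use `attn_implementation="flash_attention_2"` instead), and the function below is illustrative, not the project's code.

```python
# Hypothetical helper: collect kwargs for AutoModelForCausalLM.from_pretrained.
def build_model_kwargs(use_flash_attention_2=False):
    params = {"low_cpu_mem_usage": True}
    if use_flash_attention_2:
        # Requires the flash-attn package and a supported GPU.
        params["use_flash_attention_2"] = True
    return params

print(build_model_kwargs(True))
# {'low_cpu_mem_usage': True, 'use_flash_attention_2': True}
```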
fdcaa955e3  Julien Chaumond  2023-11-02 16:20:54 -03:00
    transformers: Add a flag to force load from safetensors (#4450)

839a87bac8  oobabooga  2023-10-26 20:27:04 -07:00
    Fix is_ccl_available & is_xpu_available imports

778a010df8  Abhilash Majumder  2023-10-26 23:39:51 -03:00
    Intel Gpu support initialization (#4340)

ef1489cd4d  oobabooga  2023-10-23 20:45:43 -07:00
    Remove unused parameter in AutoAWQ

8ea554bc19  oobabooga  2023-10-16 12:53:40 -07:00
    Check for torch.xpu.is_available()

b88b2b74a6  oobabooga  2023-10-15 20:51:11 -07:00
    Experimental Intel Arc transformers support (untested)

f63361568c  oobabooga  2023-10-10 19:03:09 -07:00
    Fix safetensors kwarg usage in AutoAWQ

fae8062d39  oobabooga  2023-10-10 22:20:49 -03:00
    Bump to latest gradio (3.47) (#4258)

cc632c3f33  cal066  2023-10-05 13:19:18 -03:00
    AutoAWQ: initial support (#3999)

87ea2d96fd  oobabooga  2023-09-26 17:43:39 -07:00
    Add a note about RWKV loader

d0d221df49  oobabooga  2023-09-25 12:19:43 -07:00
    Add --use_fast option (closes #3741)

63de9eb24f  oobabooga  2023-09-24 20:26:26 -07:00
    Clean up the transformers loader

36c38d7561  oobabooga  2023-09-24 20:03:11 -07:00
    Add disable_exllama to Transformers loader (for GPTQ LoRA training)

13ac55fa18  oobabooga  2023-09-19 13:51:57 -07:00
    Reorder some functions

f0ef971edb  oobabooga  2023-09-18 12:25:10 -07:00
    Remove obsolete warning

fdcee0c215  Johan  2023-09-15 12:38:38 -03:00
    Allow custom tokenizer for llamacpp_HF loader (#3941)

c2a309f56e  oobabooga  2023-09-12 14:33:07 -03:00
    Add ExLlamaV2 and ExLlamav2_HF loaders (#3881)

9331ab4798  oobabooga  2023-09-11 18:49:30 -03:00
    Read GGUF metadata (#3873)

ed86878f02  oobabooga  2023-09-11 07:44:00 -07:00
    Remove GGML support

4a999e3bcd  jllllll  2023-08-26 10:40:08 -05:00
    Use separate llama-cpp-python packages for GGML support

83640d6f43  oobabooga  2023-08-26 01:06:59 -07:00
    Replace ggml occurences with gguf

960980247f  cal066  2023-08-25 11:33:04 -03:00
    ctransformers: gguf support (#3685)

52ab2a6b9e  oobabooga  2023-08-25 06:55:15 -07:00
    Add rope_freq_base parameter for CodeLlama

7a4fcee069  cal066  2023-08-11 14:41:33 -03:00
    Add ctransformers support (#3313)
    Co-authored-by: cal066 <cal066@users.noreply.github.com>
    Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
    Co-authored-by: randoentity <137087500+randoentity@users.noreply.github.com>

d8fb506aff  oobabooga  2023-08-08 21:25:48 -07:00
    Add RoPE scaling support for transformers (including dynamic NTK)
    https://github.com/huggingface/transformers/pull/24653
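A sketch of the mechanism behind this commit, under the assumption (per the linked transformers PR) that RoPE scaling is passed as a `rope_scaling` dict on the model config; the helper and parameter names below are hypothetical, not the project's code.

```python
# Hypothetical helper: map user-facing scaling options to the rope_scaling
# dict transformers accepts on a model config.
def rope_scaling_config(alpha=None, compress=None):
    # "dynamic" NTK scaling grows the RoPE base as the sequence lengthens;
    # "linear" scaling compresses position indices by a fixed factor.
    if alpha is not None:
        return {"type": "dynamic", "factor": float(alpha)}
    if compress is not None:
        return {"type": "linear", "factor": float(compress)}
    return None  # no scaling requested

print(rope_scaling_config(alpha=2.5))  # {'type': 'dynamic', 'factor': 2.5}
```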
65aa11890f  oobabooga  2023-08-06 21:49:27 -03:00
    Refactor everything (#3481)

75c2dd38cf  oobabooga  2023-07-25 15:15:29 -07:00
    Remove flexgen support

89e0d15cf5  appe233  2023-07-17 21:27:18 -03:00
    Use 'torch.backends.mps.is_available' to check if mps is supported (#3164)

5e3f7e00a9  oobabooga  2023-07-16 02:21:13 -03:00
    Create llamacpp_HF loader (#3062)

e202190c4f  oobabooga  2023-07-12 11:33:25 -07:00
    lint

9b55d3a9f9  FartyPants  2023-07-12 15:29:43 -03:00
    More robust and error prone training (#3058)

5ac4e4da8b  oobabooga  2023-07-08 10:22:54 -07:00
    Make --model work with argument like models/folder_name

ff45317032  Xiaojian "JJ" Deng  2023-07-05 21:40:43 -03:00
    Update models.py (#3020)
    Hopefully fixed error with "ValueError: Tokenizer class GPTNeoXTokenizer does not exist or is not currently imported."

8705eba830  oobabooga  2023-07-04 19:43:19 -07:00
    Remove universal llama tokenizer support
    Instead replace it with a warning if the tokenizer files look off

33f56fd41d  FartyPants  2023-07-03 17:39:06 -03:00
    Update models.py to clear LORA names after unload (#2951)

f0fcd1f697  oobabooga  2023-06-25 01:44:36 -03:00
    Sort some imports

5646690769  Panchovix  2023-06-23 11:31:02 -03:00
    Fix some models not loading on exllama_hf (#2835)

580c1ee748  LarryVRH  2023-06-21 15:31:42 -03:00
    Implement a demo HF wrapper for exllama to utilize existing HF transformers decoding. (#2777)

def3b69002  ThisIsPIRI  2023-06-18 18:14:06 -03:00
    Fix loading condition for universal llama tokenizer (#2753)

9f40032d32  oobabooga  2023-06-16 20:35:38 -03:00
    Add ExLlama support (#2444)