oobabooga
|
d423021a48
|
Remove CTransformers support (#5807)
|
2024-04-04 20:23:58 -03:00 |
|
oobabooga
|
13fe38eb27
|
Remove specialized code for gpt-4chan
|
2024-04-04 16:11:47 -07:00 |
|
oobabooga
|
4039999be5
|
Autodetect llamacpp_HF loader when tokenizer exists
|
2024-02-16 09:29:26 -08:00 |
|
oobabooga
|
b2b74c83a6
|
Fix Qwen1.5 in llamacpp_HF
|
2024-02-15 19:04:19 -08:00 |
|
oobabooga
|
d47182d9d1
|
llamacpp_HF: do not use oobabooga/llama-tokenizer (#5499)
|
2024-02-14 00:28:51 -03:00 |
|
oobabooga
|
4e34ae0587
|
Minor logging improvements
|
2024-02-06 08:22:08 -08:00 |
|
oobabooga
|
8ee3cea7cb
|
Improve some log messages
|
2024-02-06 06:31:27 -08:00 |
|
oobabooga
|
2a1063eff5
|
Revert "Remove non-HF ExLlamaV2 loader (#5431)"
This reverts commit cde000d478 .
|
2024-02-06 06:21:36 -08:00 |
|
oobabooga
|
cde000d478
|
Remove non-HF ExLlamaV2 loader (#5431)
|
2024-02-04 01:15:51 -03:00 |
|
sam-ngu
|
c0bdcee646
|
added trust_remote_code to deepspeed init loaderClass (#5237)
|
2024-01-26 11:10:57 -03:00 |
|
oobabooga
|
89e7e107fc
|
Lint
|
2024-01-09 16:27:50 -08:00 |
|
oobabooga
|
94afa0f9cf
|
Minor style changes
|
2024-01-01 16:00:22 -08:00 |
|
oobabooga
|
2734ce3e4c
|
Remove RWKV loader (#5130)
|
2023-12-31 02:01:40 -03:00 |
|
oobabooga
|
0e54a09bcb
|
Remove exllamav1 loaders (#5128)
|
2023-12-31 01:57:06 -03:00 |
|
oobabooga
|
8e397915c9
|
Remove --sdp-attention, --xformers flags (#5126)
|
2023-12-31 01:36:51 -03:00 |
|
Yiximail
|
afc91edcb2
|
Reset the model_name after unloading the model (#5051)
|
2023-12-22 22:18:24 -03:00 |
|
oobabooga
|
f0f6d9bdf9
|
Add HQQ back & update version
This reverts commit 2289e9031e .
|
2023-12-20 07:46:09 -08:00 |
|
oobabooga
|
fadb295d4d
|
Lint
|
2023-12-19 21:36:57 -08:00 |
|
oobabooga
|
fb8ee9f7ff
|
Add a specific error if HQQ is missing
|
2023-12-19 21:32:58 -08:00 |
|
oobabooga
|
9992f7d8c0
|
Improve several log messages
|
2023-12-19 20:54:32 -08:00 |
|
Water
|
674be9a09a
|
Add HQQ quant loader (#4888)
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
|
2023-12-18 21:23:16 -03:00 |
|
oobabooga
|
3bbf6c601d
|
AutoGPTQ: Add --disable_exllamav2 flag (Mixtral CPU offloading needs this)
|
2023-12-15 06:46:13 -08:00 |
|
oobabooga
|
39d2fe1ed9
|
Jinja templates for Instruct and Chat (#4874)
|
2023-12-12 17:23:14 -03:00 |
|
oobabooga
|
2a335b8aa7
|
Cleanup: set shared.model_name only once
|
2023-12-08 06:35:23 -08:00 |
|
oobabooga
|
98361af4d5
|
Add QuIP# support (#4803)
It has to be installed manually for now.
|
2023-12-06 00:01:01 -03:00 |
|
oobabooga
|
77d6ccf12b
|
Add a LOADER debug message while loading models
|
2023-11-30 12:00:32 -08:00 |
|
oobabooga
|
8b66d83aa9
|
Set use_fast=True by default, create --no_use_fast flag
This increases tokens/second for HF loaders.
|
2023-11-16 19:55:28 -08:00 |
|
oobabooga
|
a85ce5f055
|
Add more info messages for truncation / instruction template
|
2023-11-15 16:20:31 -08:00 |
|
oobabooga
|
883701bc40
|
Alternative solution to 025da386a0
Fixes an error.
|
2023-11-15 16:04:02 -08:00 |
|
oobabooga
|
8ac942813c
|
Revert "Fix CPU memory limit error (issue #3763) (#4597)"
This reverts commit 025da386a0 .
|
2023-11-15 16:01:54 -08:00 |
|
oobabooga
|
e6f44d6d19
|
Print context length / instruction template to terminal when loading models
|
2023-11-15 16:00:51 -08:00 |
|
Andy Bao
|
025da386a0
|
Fix CPU memory limit error (issue #3763) (#4597)
get_max_memory_dict() was not properly formatting shared.args.cpu_memory
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
|
2023-11-15 20:27:20 -03:00 |
|
oobabooga
|
2358706453
|
Add /v1/internal/model/load endpoint (tentative)
|
2023-11-07 20:58:06 -08:00 |
|
oobabooga
|
ec17a5d2b7
|
Make OpenAI API the default API (#4430)
|
2023-11-06 02:38:29 -03:00 |
|
feng lui
|
4766a57352
|
transformers: add use_flash_attention_2 option (#4373)
|
2023-11-04 13:59:33 -03:00 |
|
Julien Chaumond
|
fdcaa955e3
|
transformers: Add a flag to force load from safetensors (#4450)
|
2023-11-02 16:20:54 -03:00 |
|
oobabooga
|
839a87bac8
|
Fix is_ccl_available & is_xpu_available imports
|
2023-10-26 20:27:04 -07:00 |
|
Abhilash Majumder
|
778a010df8
|
Intel Gpu support initialization (#4340)
|
2023-10-26 23:39:51 -03:00 |
|
oobabooga
|
ef1489cd4d
|
Remove unused parameter in AutoAWQ
|
2023-10-23 20:45:43 -07:00 |
|
oobabooga
|
8ea554bc19
|
Check for torch.xpu.is_available()
|
2023-10-16 12:53:40 -07:00 |
|
oobabooga
|
b88b2b74a6
|
Experimental Intel Arc transformers support (untested)
|
2023-10-15 20:51:11 -07:00 |
|
oobabooga
|
f63361568c
|
Fix safetensors kwarg usage in AutoAWQ
|
2023-10-10 19:03:09 -07:00 |
|
oobabooga
|
fae8062d39
|
Bump to latest gradio (3.47) (#4258)
|
2023-10-10 22:20:49 -03:00 |
|
cal066
|
cc632c3f33
|
AutoAWQ: initial support (#3999)
|
2023-10-05 13:19:18 -03:00 |
|
oobabooga
|
87ea2d96fd
|
Add a note about RWKV loader
|
2023-09-26 17:43:39 -07:00 |
|
oobabooga
|
d0d221df49
|
Add --use_fast option (closes #3741)
|
2023-09-25 12:19:43 -07:00 |
|
oobabooga
|
63de9eb24f
|
Clean up the transformers loader
|
2023-09-24 20:26:26 -07:00 |
|
oobabooga
|
36c38d7561
|
Add disable_exllama to Transformers loader (for GPTQ LoRA training)
|
2023-09-24 20:03:11 -07:00 |
|
oobabooga
|
13ac55fa18
|
Reorder some functions
|
2023-09-19 13:51:57 -07:00 |
|
oobabooga
|
f0ef971edb
|
Remove obsolete warning
|
2023-09-18 12:25:10 -07:00 |
|