Commit Graph

516 Commits

Author SHA1 Message Date
Brian Dashore
3345da2ea4
Add flash-attention 2 for windows (#4235) 2023-10-21 03:46:23 -03:00
mjbogusz
8f6405d2fa
Python 3.11, 3.9, 3.8 support (#4233)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-10-20 21:13:33 -03:00
oobabooga
43be1be598 Manually install CUDA runtime libraries 2023-10-12 21:02:44 -07:00
oobabooga
2e8b5f7c80
Update ROCm command 2023-10-08 10:12:13 -03:00
oobabooga
00187d641a
Note about pytorch 2.1 breaking change 2023-10-08 10:10:38 -03:00
oobabooga
1c6e57dd68
Note about pytorch 2.1 breaking change 2023-10-08 10:09:22 -03:00
oobabooga
d33facc9fe
Bump to pytorch 11.8 (#4209) 2023-10-07 00:23:49 -03:00
oobabooga
7ffb424c7b Add AutoAWQ to README 2023-10-05 09:22:37 -07:00
oobabooga
b6fe6acf88 Add threads_batch parameter 2023-10-01 21:28:00 -07:00
StoyanStAtanasov
7e6ff8d1f0
Enable NUMA feature for llama_cpp_python (#4040) 2023-09-26 22:05:00 -03:00
oobabooga
44438c60e5 Add INSTALL_EXTENSIONS environment variable 2023-09-25 13:12:35 -07:00
oobabooga
d0d221df49 Add --use_fast option (closes #3741) 2023-09-25 12:19:43 -07:00
oobabooga
2e7b6b0014
Create alternative requirements.txt with AMD and Metal wheels (#4052) 2023-09-24 09:58:29 -03:00
oobabooga
895ec9dadb
Update README.md 2023-09-23 15:37:39 -03:00
oobabooga
299d285ff0
Update README.md 2023-09-23 15:36:09 -03:00
oobabooga
4b4d283a4c
Update README.md 2023-09-23 00:09:59 -03:00
oobabooga
0581f1094b
Update README.md 2023-09-22 23:31:32 -03:00
oobabooga
968f98a57f
Update README.md 2023-09-22 23:23:16 -03:00
oobabooga
72b4ab4c82 Update README 2023-09-22 15:20:09 -07:00
oobabooga
589ee9f623
Update README.md 2023-09-22 16:21:48 -03:00
oobabooga
c33a94e381 Rename doc file 2023-09-22 12:17:47 -07:00
oobabooga
6c5f81f002 Rename webui.py to one_click.py 2023-09-22 12:00:06 -07:00
oobabooga
fe2acdf45f
Update README.md 2023-09-22 15:52:20 -03:00
oobabooga
193fe18c8c Resolve conflicts 2023-09-21 17:45:11 -07:00
oobabooga
df39f455ad Merge remote-tracking branch 'second-repo/main' into merge-second-repo 2023-09-21 17:39:54 -07:00
James Braza
fee38e0601
Simplified ExLlama cloning instructions and failure message (#3972) 2023-09-17 19:26:05 -03:00
oobabooga
e75489c252 Update README 2023-09-15 21:04:51 -07:00
missionfloyd
2ad6ca8874
Add back chat buttons with --chat-buttons (#3947) 2023-09-16 00:39:37 -03:00
oobabooga
fb864dad7b Update README 2023-09-15 13:00:46 -07:00
oobabooga
2f935547c8 Minor changes 2023-09-12 15:05:21 -07:00
oobabooga
04a74b3774 Update README 2023-09-12 10:46:27 -07:00
Eve
92f3cd624c
Improve instructions for CPUs without AVX2 (#3786) 2023-09-11 11:54:04 -03:00
oobabooga
ed86878f02 Remove GGML support 2023-09-11 07:44:00 -07:00
oobabooga
40ffc3d687
Update README.md 2023-08-30 18:19:04 -03:00
oobabooga
5190e153ed
Update README.md 2023-08-30 14:06:29 -03:00
oobabooga
bc4023230b Improved instructions for AMD/Metal/Intel Arc/CPUs without AVX2 2023-08-30 09:40:00 -07:00
missionfloyd
787219267c
Allow downloading single file from UI (#3737) 2023-08-29 23:32:36 -03:00
oobabooga
3361728da1 Change some comments 2023-08-26 22:24:44 -07:00
oobabooga
7f5370a272 Minor fixes/cosmetics 2023-08-26 22:11:07 -07:00
oobabooga
83640d6f43 Replace ggml occurrences with gguf 2023-08-26 01:06:59 -07:00
oobabooga
f4f04c8c32 Fix a typo 2023-08-25 07:08:38 -07:00
oobabooga
52ab2a6b9e Add rope_freq_base parameter for CodeLlama 2023-08-25 06:55:15 -07:00
oobabooga
3320accfdc
Add CFG to llamacpp_HF (second attempt) (#3678) 2023-08-24 20:32:21 -03:00
oobabooga
d6934bc7bc
Implement CFG for ExLlama_HF (#3666) 2023-08-24 16:27:36 -03:00
oobabooga
1b419f656f Acknowledge a16z support 2023-08-21 11:57:51 -07:00
oobabooga
54df0bfad1 Update README.md 2023-08-18 09:43:15 -07:00
oobabooga
f50f534b0f Add note about AMD/Metal to README 2023-08-18 09:37:20 -07:00
oobabooga
7cba000421
Bump llama-cpp-python, +tensor_split by @shouyiwang, +mul_mat_q (#3610) 2023-08-18 12:03:34 -03:00
oobabooga
32ff3da941
Update ancient screenshots 2023-08-15 17:16:24 -03:00
oobabooga
87dd85b719 Update README 2023-08-15 12:21:50 -07:00
oobabooga
a03a70bed6 Update README 2023-08-15 12:20:59 -07:00
oobabooga
7089b2a48f Update README 2023-08-15 12:16:21 -07:00
oobabooga
155862a4a0 Update README 2023-08-15 12:11:12 -07:00
cal066
991bb57e43
ctransformers: Fix up model_type name consistency (#3567) 2023-08-14 15:17:24 -03:00
oobabooga
ccfc02a28d
Add the --disable_exllama option for AutoGPTQ (#3545 from clefever/disable-exllama) 2023-08-14 15:15:55 -03:00
oobabooga
619cb4e78b
Add "save defaults to settings.yaml" button (#3574) 2023-08-14 11:46:07 -03:00
Eve
66c04c304d
Various ctransformers fixes (#3556)
---------

Co-authored-by: cal066 <cal066@users.noreply.github.com>
2023-08-13 23:09:03 -03:00
oobabooga
a1a9ec895d
Unify the 3 interface modes (#3554) 2023-08-13 01:12:15 -03:00
Chris Lefever
0230fa4e9c Add the --disable_exllama option for AutoGPTQ 2023-08-12 02:26:58 -04:00
oobabooga
4c450e6b70
Update README.md 2023-08-11 15:50:16 -03:00
cal066
7a4fcee069
Add ctransformers support (#3313)
---------

Co-authored-by: cal066 <cal066@users.noreply.github.com>
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
Co-authored-by: randoentity <137087500+randoentity@users.noreply.github.com>
2023-08-11 14:41:33 -03:00
oobabooga
949c92d7df
Create README.md 2023-08-10 14:32:40 -03:00
oobabooga
c7f52bbdc1 Revert "Remove GPTQ-for-LLaMa monkey patch support"
This reverts commit e3d3565b2a.
2023-08-10 08:39:41 -07:00
jllllll
e3d3565b2a
Remove GPTQ-for-LLaMa monkey patch support
AutoGPTQ will be the preferred GPTQ LoRa loader in the future.
2023-08-09 23:59:04 -05:00
jllllll
bee73cedbd
Streamline GPTQ-for-LLaMa support 2023-08-09 23:42:34 -05:00
oobabooga
2255349f19 Update README 2023-08-09 05:46:25 -07:00
oobabooga
d8fb506aff Add RoPE scaling support for transformers (including dynamic NTK)
https://github.com/huggingface/transformers/pull/24653
2023-08-08 21:25:48 -07:00
Friedemann Lipphardt
901b028d55
Add option for named cloudflare tunnels (#3364) 2023-08-08 22:20:27 -03:00
oobabooga
8df3cdfd51
Add SSL certificate support (#3453) 2023-08-04 13:57:31 -03:00
oobabooga
4e6dc6d99d Add Contributing guidelines 2023-08-03 14:40:28 -07:00
oobabooga
87dab03dc0
Add the --cpu option for llama.cpp to prevent CUDA from being used (#3432) 2023-08-03 11:00:36 -03:00
oobabooga
b17893a58f Revert "Add tensor split support for llama.cpp (#3171)"
This reverts commit 031fe7225e.
2023-07-26 07:06:01 -07:00
oobabooga
69f8b35bc9 Revert changes to README 2023-07-25 20:51:19 -07:00
oobabooga
1b89c304ad Update README 2023-07-25 15:46:12 -07:00
oobabooga
77d2e9f060 Remove flexgen 2 2023-07-25 15:18:25 -07:00
oobabooga
5134d5b1c6 Update README 2023-07-25 15:13:07 -07:00
Shouyi
031fe7225e
Add tensor split support for llama.cpp (#3171) 2023-07-25 18:59:26 -03:00
Eve
f653546484
README updates and improvements (#3198) 2023-07-25 18:58:13 -03:00
Ikko Eltociear Ashimine
b09e4f10fd
Fix typo in README.md (#3286)
tranformers -> transformers
2023-07-25 18:56:25 -03:00
oobabooga
a07d070b6c
Add llama-2-70b GGML support (#3285) 2023-07-24 16:37:03 -03:00
oobabooga
6415cc68a2 Remove obsolete information from README 2023-07-19 21:20:40 -07:00
Panchovix
10c8c197bf
Add Support for Static NTK RoPE scaling for exllama/exllama_hf (#2955) 2023-07-04 01:13:16 -03:00
oobabooga
4b1804a438
Implement sessions + add basic multi-user support (#2991) 2023-07-04 00:03:30 -03:00
oobabooga
7611978f7b
Add Community section to README 2023-06-27 13:56:14 -03:00
oobabooga
c52290de50
ExLlama with long context (#2875) 2023-06-25 22:49:26 -03:00
oobabooga
8bb3bb39b3
Implement stopping string search in string space (#2847) 2023-06-24 09:43:00 -03:00
oobabooga
0f9088f730 Update README 2023-06-23 12:24:43 -03:00
oobabooga
383c50f05b
Replace old presets with the results of Preset Arena (#2830) 2023-06-23 01:48:29 -03:00
LarryVRH
580c1ee748
Implement a demo HF wrapper for exllama to utilize existing HF transformers decoding. (#2777) 2023-06-21 15:31:42 -03:00
oobabooga
a1cac88c19
Update README.md 2023-06-19 01:28:23 -03:00
oobabooga
5f392122fd Add gpu_split param to ExLlama
Adapted from code created by Ph0rk0z. Thank you Ph0rk0z.
2023-06-16 20:49:36 -03:00
oobabooga
9f40032d32
Add ExLlama support (#2444) 2023-06-16 20:35:38 -03:00
oobabooga
7ef6a50e84
Reorganize model loading UI completely (#2720) 2023-06-16 19:00:37 -03:00
oobabooga
57be2eecdf
Update README.md 2023-06-16 15:04:16 -03:00
Tom Jobbins
646b0c889f
AutoGPTQ: Add UI and command line support for disabling fused attention and fused MLP (#2648) 2023-06-15 23:59:54 -03:00
oobabooga
8936160e54
Add WSL installer to README (thanks jllllll) 2023-06-13 00:07:34 -03:00
oobabooga
eda224c92d Update README 2023-06-05 17:04:09 -03:00
oobabooga
bef94b9ebb Update README 2023-06-05 17:01:13 -03:00
oobabooga
f276d88546 Use AutoGPTQ by default for GPTQ models 2023-06-05 15:41:48 -03:00
oobabooga
632571a009 Update README 2023-06-05 15:16:06 -03:00