| oobabooga | 65e5864084 | Update README | 2024-09-28 20:25:26 -07:00 |
| oobabooga | 85994e3ef0 | Bump pytorch to 2.4.1 | 2024-09-28 09:44:08 -07:00 |
| oobabooga | ca5a2dba72 | Bump rocm to 6.1.2 | 2024-09-28 09:39:53 -07:00 |
| oobabooga | 1124f71cf3 | Update README.md | 2024-08-20 11:19:46 -03:00 |
| oobabooga | d9a031fcad | Update README.md | 2024-08-20 01:52:30 -03:00 |
| oobabooga | 9d99156ca3 | Update README.md | 2024-08-20 01:27:02 -03:00 |
| oobabooga | 406995f722 | Update README | 2024-08-19 21:24:01 -07:00 |
| oobabooga | 1b1518aa6a | Update README.md | 2024-08-20 00:36:18 -03:00 |
| oobabooga | 8bac1a9382 | Update README.md | 2024-08-19 23:10:04 -03:00 |
| oobabooga | bb987ffe66 | Update README.md | 2024-08-19 23:06:52 -03:00 |
| oobabooga | 5c5e7264ec | Update README | 2024-07-22 18:20:01 -07:00 |
| oobabooga | 05676caf70 | Update README | 2024-07-11 16:25:52 -07:00 |
| oobabooga | f5599656b4 | Update README | 2024-07-11 16:22:00 -07:00 |
| oobabooga | a30ec2e7db | Update README | 2024-07-11 16:20:44 -07:00 |
| oobabooga | 53fbd2f245 | Add TensorRT-LLM to the README | 2024-06-25 14:45:37 -07:00 |
| oobabooga | 9420973b62 | Downgrade PyTorch to 2.2.2 (#6124) | 2024-06-14 16:42:03 -03:00 |
| oobabooga | 8930bfc5f4 | Bump PyTorch, ExLlamaV2, flash-attention (#6122) | 2024-06-13 20:38:31 -03:00 |
| oobabooga | bd7cc4234d | Backend cleanup (#6025) | 2024-05-21 13:32:02 -03:00 |
| oobabooga | 6a1682aa95 | README: update command-line flags with raw --help output. This helps me keep this up-to-date more easily. | 2024-05-19 20:28:46 -07:00 |
| oobabooga | 7a728a38eb | Update README | 2024-05-07 02:59:36 -07:00 |
| oobabooga | e61055253c | Bump llama-cpp-python to 0.2.69, add --flash-attn option | 2024-05-03 04:31:22 -07:00 |
| oobabooga | 1eba888af6 | Update FUNDING.yml | 2024-05-01 05:54:21 -07:00 |
| oobabooga | 51fb766bea | Add back my llama-cpp-python wheels, bump to 0.2.65 (#5964) | 2024-04-30 09:11:31 -03:00 |
| oobabooga | c9b0df16ee | Lint | 2024-04-24 09:55:00 -07:00 |
| oobabooga | 4094813f8d | Lint | 2024-04-24 09:53:41 -07:00 |
| oobabooga | 9b623b8a78 | Bump llama-cpp-python to 0.2.64, use official wheels (#5921) | 2024-04-23 23:17:05 -03:00 |
| oobabooga | d423021a48 | Remove CTransformers support (#5807) | 2024-04-04 20:23:58 -03:00 |
| oobabooga | 056717923f | Document StreamingLLM | 2024-03-10 19:15:23 -07:00 |
| Bartowski | 104573f7d4 | Update cache_4bit documentation (#5649). Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com> | 2024-03-07 13:08:21 -03:00 |
| oobabooga | 2ec1d96c91 | Add cache_4bit option for ExLlamaV2 (#5645) | 2024-03-06 23:02:25 -03:00 |
| oobabooga | fa0e68cefd | Installer: add back INSTALL_EXTENSIONS environment variable (for docker) | 2024-03-06 11:31:06 -08:00 |
| oobabooga | 97dc3602fc | Create an update wizard (#5623) | 2024-03-04 15:52:24 -03:00 |
| oobabooga | 527ba98105 | Do not install extensions requirements by default (#5621) | 2024-03-04 04:46:39 -03:00 |
| oobabooga | 8bd4960d05 | Update PyTorch to 2.2 (also update flash-attn to 2.5.6) (#5618) | 2024-03-03 19:40:32 -03:00 |
| oobabooga | 7342afaf19 | Update the PyTorch installation instructions | 2024-02-08 20:36:11 -08:00 |
| smCloudInTheSky | b1463df0a1 | docker: add options for CPU only, Intel GPU, AMD GPU (#5380) | 2024-01-28 11:18:14 -03:00 |
| oobabooga | c1470870bb | Update README | 2024-01-26 05:58:40 -08:00 |
| oobabooga | 6ada77cf5a | Update README.md | 2024-01-22 03:17:15 -08:00 |
| oobabooga | cc6505df14 | Update README.md | 2024-01-22 03:14:56 -08:00 |
| oobabooga | 2734ce3e4c | Remove RWKV loader (#5130) | 2023-12-31 02:01:40 -03:00 |
| oobabooga | 0e54a09bcb | Remove exllamav1 loaders (#5128) | 2023-12-31 01:57:06 -03:00 |
| oobabooga | 8e397915c9 | Remove --sdp-attention, --xformers flags (#5126) | 2023-12-31 01:36:51 -03:00 |
| oobabooga | 2706149c65 | Organize the CMD arguments by group (#5027) | 2023-12-21 00:33:55 -03:00 |
| oobabooga | de138b8ba6 | Add llama-cpp-python wheels with tensor cores support (#5003) | 2023-12-19 17:30:53 -03:00 |
| oobabooga | 0a299d5959 | Bump llama-cpp-python to 0.2.24 (#5001) | 2023-12-19 15:22:21 -03:00 |
| Water | 674be9a09a | Add HQQ quant loader (#4888). Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com> | 2023-12-18 21:23:16 -03:00 |
| oobabooga | f1f2c4c3f4 | Add --num_experts_per_token parameter (ExLlamav2) (#4955) | 2023-12-17 12:08:33 -03:00 |
| oobabooga | 41424907b1 | Update README | 2023-12-16 16:35:36 -08:00 |
| oobabooga | 0087dca286 | Update README | 2023-12-16 12:28:51 -08:00 |
| oobabooga | 3bbf6c601d | AutoGPTQ: Add --disable_exllamav2 flag (Mixtral CPU offloading needs this) | 2023-12-15 06:46:13 -08:00 |