Commit Graph

2929 Commits

Author SHA1 Message Date
oobabooga
78c0da4a18 Use the cuda branch of gptq-for-llama
Did I do this right @jllllll? This is because the current default branch (triton) is not compatible with Windows.
2023-03-30 18:04:05 -03:00
oobabooga
d4a9b5ea97 Remove redundant preset (see the plot in #587) 2023-03-30 17:34:44 -03:00
Nikita Skakun
d550c12a3e Fixed the bug with additional bytes.
The issue seems to be with huggingface not reporting the entire size of the model.
Added an error message with instructions if the checksums don't match.
2023-03-30 12:52:16 -07:00
Thomas Antony
7fa5d96c22 Update to use new llamacpp API 2023-03-30 11:23:05 +01:00
Thomas Antony
79fa2b6d7e Add support for alpaca 2023-03-30 11:23:04 +01:00
Thomas Antony
8953a262cb Add llamacpp to requirements.txt 2023-03-30 11:22:38 +01:00
Thomas Antony
a5f5736e74 Add to text_generation.py 2023-03-30 11:22:38 +01:00
Thomas Antony
7745faa7bb Add llamacpp to models.py 2023-03-30 11:22:37 +01:00
Thomas Antony
7a562481fa Initial version of llamacpp_model.py 2023-03-30 11:22:07 +01:00
Thomas Antony
53ab1e285d Update .gitignore 2023-03-30 11:22:07 +01:00
Nikita Skakun
297ac051d9 Added sha256 validation of model files. 2023-03-30 02:34:19 -07:00
Nikita Skakun
8c590c2362 Added a 'clean' flag to not resume download. 2023-03-30 00:42:19 -07:00
Nikita Skakun
e17af59261 Add support for resuming downloads
This commit adds the ability to resume interrupted downloads by adding a new function to the downloader module. The function uses the HTTP Range header to fetch only the remaining part of a file that wasn't downloaded yet.
2023-03-30 00:21:34 -07:00
oobabooga
f0fdab08d3 Increase --chat height 2023-03-30 01:02:11 -03:00
oobabooga
bd65940a48 Increase --chat box height 2023-03-30 00:43:49 -03:00
oobabooga
131753fcf5 Save the sha256sum of downloaded models 2023-03-29 23:28:16 -03:00
oobabooga
a21e580782 Move an import 2023-03-29 22:50:58 -03:00
oobabooga
55755e27b9 Don't hardcode prompts in the settings dict/json 2023-03-29 22:47:01 -03:00
oobabooga
1cb9246160 Adapt to the new model names 2023-03-29 21:47:36 -03:00
oobabooga
0345e04249 Fix "Unknown argument(s): {'verbose': False}" 2023-03-29 21:17:48 -03:00
oobabooga
9104164297 Merge pull request #618 from nikita-skakun/optimize-download-model
Improve download-model.py progress bar with multiple threads
2023-03-29 20:54:19 -03:00
oobabooga
37754164eb Move argparse 2023-03-29 20:47:36 -03:00
oobabooga
6403e72062 Merge branch 'main' into nikita-skakun-optimize-download-model 2023-03-29 20:45:33 -03:00
oobabooga
1445ea86f7 Add --output and better metadata for downloading models 2023-03-29 20:26:44 -03:00
oobabooga
58349f44a0 Handle training exception for unsupported models 2023-03-29 11:55:34 -03:00
oobabooga
a6d0373063 Fix training dataset loading #636 2023-03-29 11:48:17 -03:00
oobabooga
41b58bc47e Update README.md 2023-03-29 11:02:29 -03:00
oobabooga
0de4f24b12 Merge pull request #4 from jllllll/oobabooga-windows
Change Micromamba download link
2023-03-29 09:49:32 -03:00
jllllll
ed0e593161 Change Micromamba download
Changed link to previous version.
This will provide a stable source for Micromamba so that new versions don't cause issues.
2023-03-29 02:47:19 -05:00
oobabooga
3b4447a4fe Update README.md 2023-03-29 02:24:11 -03:00
oobabooga
5d0b83c341 Update README.md 2023-03-29 02:22:19 -03:00
oobabooga
c2a863f87d Mention the updated one-click installer 2023-03-29 02:11:51 -03:00
oobabooga
da3aa8fbda Merge pull request #2 from jllllll/oobabooga-windows
Update one-click-installer for Windows
2023-03-29 01:55:47 -03:00
oobabooga
1edfb96778 Fix loading extensions from within the interface 2023-03-28 23:27:02 -03:00
Nikita Skakun
aaa218a102 Remove unused import. 2023-03-28 18:32:49 -07:00
Nikita Skakun
ff515ec2fe Improve progress bar visual style
This commit reverts the performance improvements of the previous commit for improved visual style of multithreaded progress bars. The style of the progress bar has been modified to take up the same amount of space to align them.
2023-03-28 18:29:20 -07:00
oobabooga
304f812c63 Gracefully handle CUDA out of memory errors with streaming 2023-03-28 19:20:50 -03:00
Nikita Skakun
4d8e101006 Refactor download process to use multiprocessing
The previous implementation used threads to download files in parallel, which could lead to performance issues due to the Global Interpreter Lock (GIL).
This commit refactors the download process to use multiprocessing instead,
which allows for true parallelism across multiple CPUs.
This results in significantly faster downloads, particularly for large models.
2023-03-28 14:24:23 -07:00
oobabooga
b2f356a9ae Generalize GPTQ_loader, support any model (#615 from mayaeary/feature/gpt-j-4bit-v2)
This includes Pygmalion 4bit
2023-03-28 18:00:09 -03:00
oobabooga
010b259dde Update documentation 2023-03-28 17:46:00 -03:00
oobabooga
0bec15ebcd Reorder imports 2023-03-28 17:34:15 -03:00
Maya Eary
41ec682834 Disable kernel threshold for gpt-j 2023-03-28 22:45:38 +03:00
Maya
1ac003d41c Merge branch 'oobabooga:main' into feature/gpt-j-4bit-v2 2023-03-28 22:30:39 +03:00
oobabooga
aebd3cf110 Merge pull request #616 from mayaeary/fix/api-convert-params
Fixes for api server - chat mode and integer temperature
2023-03-28 15:21:58 -03:00
Maya Eary
d1377c37af Fixes for api server - chat mode and integer temperature 2023-03-28 20:57:16 +03:00
Maya Eary
1c075d8d21 Fix typo 2023-03-28 20:43:50 +03:00
Maya Eary
c8207d474f Generalized load_quantized 2023-03-28 20:38:55 +03:00
oobabooga
cac577d99f Fix interface reloading 2023-03-28 13:25:58 -03:00
oobabooga
88ad86249d Remove unnecessary file 2023-03-28 13:19:52 -03:00
oobabooga
91aa5b460e If both .pt and .safetensors are present, download only safetensors 2023-03-28 13:08:38 -03:00