Commit Graph

3919 Commits

Author SHA1 Message Date
Shouyi
031fe7225e
Add tensor split support for llama.cpp (#3171) 2023-07-25 18:59:26 -03:00
Eve
f653546484
README updates and improvements (#3198) 2023-07-25 18:58:13 -03:00
Ikko Eltociear Ashimine
b09e4f10fd
Fix typo in README.md (#3286)
tranformers -> transformers
2023-07-25 18:56:25 -03:00
oobabooga
7bc408b472 Change rms_norm_eps to 5e-6 for llama-2-70b ggml
Based on https://github.com/ggerganov/llama.cpp/pull/2384
2023-07-25 14:54:57 -07:00
oobabooga
ef8637e32d
Add extension example, replace input_hijack with chat_input_modifier (#3307) 2023-07-25 18:49:56 -03:00
oobabooga
08c622df2e Autodetect rms_norm_eps and n_gqa for llama-2-70b 2023-07-24 15:27:34 -07:00
oobabooga
a07d070b6c
Add llama-2-70b GGML support (#3285) 2023-07-24 16:37:03 -03:00
oobabooga
6f4830b4d3 Bump peft commit 2023-07-24 09:49:57 -07:00
matatonic
90a4ab631c
extensions/openai: Fixes for: embeddings, tokens, better errors. +Docs update, +Images, +logit_bias/logprobs, +more. (#3122) 2023-07-24 11:28:12 -03:00
jllllll
1141987a0d
Add checks for ROCm and unsupported architectures to llama_cpp_cuda loading (#3225) 2023-07-24 11:25:36 -03:00
iongpt
74fc5dd873
Add user-agent to download-model.py requests (#3243) 2023-07-24 11:19:13 -03:00
Ikko Eltociear Ashimine
b2d5433409
Fix typo in deepspeed_parameters.py (#3222)
configration -> configuration
2023-07-24 11:17:28 -03:00
jllllll
eb105b0495
Bump llama-cpp-python to 0.1.74 (#3257) 2023-07-24 11:15:42 -03:00
jllllll
152cf1e8ef
Bump bitsandbytes to 0.41.0 (#3258)
e229fbce66...a06a0f6a08
2023-07-24 11:06:18 -03:00
jllllll
8d31d20c9a
Bump exllama module to 0.0.8 (#3256)
39b3541cdd...3f83ebb378
2023-07-24 11:05:54 -03:00
oobabooga
cc2ed46d44
Make chat the default again 2023-07-20 18:55:09 -03:00
jllllll
fcb215fed5
Add check for compute support for GPTQ-for-LLaMa (#104)
Installs from main cuda repo if fork not supported
Also removed cuBLAS llama-cpp-python installation in preparation for 4b19b74e6c
2023-07-20 11:11:00 -03:00
oobabooga
63ece46213 Merge branch 'main' into dev 2023-07-20 07:06:41 -07:00
oobabooga
6415cc68a2 Remove obsolete information from README 2023-07-19 21:20:40 -07:00
oobabooga
4b19b74e6c Add CUDA wheels for llama-cpp-python by jllllll 2023-07-19 19:33:43 -07:00
oobabooga
05f4cc63c8 Merge branch 'main' into dev 2023-07-19 19:22:34 -07:00
jllllll
4df3f72753
Fix GPTQ fail message not being shown on update (#103) 2023-07-19 22:25:09 -03:00
jllllll
87926d033d
Bump exllama module to 0.0.7 (#3211) 2023-07-19 22:24:47 -03:00
oobabooga
913e060348 Change the default preset to Divine Intellect
It seems to reduce hallucination while using instruction-tuned models.
2023-07-19 08:24:37 -07:00
oobabooga
0d7f43225f Merge branch 'dev' 2023-07-19 07:20:13 -07:00
oobabooga
08c23b62c7 Bump llama-cpp-python and transformers 2023-07-19 07:19:12 -07:00
oobabooga
5447e75191 Merge branch 'dev' 2023-07-18 15:36:26 -07:00
oobabooga
8ec225f245 Add EOS/BOS tokens to Llama-2 template
Following this comment:
https://github.com/ggerganov/llama.cpp/issues/2262#issuecomment-1641063329
2023-07-18 15:35:27 -07:00
oobabooga
3ef49397bb
Merge pull request #3195 from oobabooga/dev
v1.3
2023-07-18 17:33:11 -03:00
oobabooga
070a886278 Revert "Prevent lists from flickering in chat mode while streaming"
This reverts commit 5e5d926d2b.
2023-07-18 13:23:29 -07:00
oobabooga
a2918176ea Update LLaMA-v2-model.md (thanks Panchovix) 2023-07-18 13:21:18 -07:00
oobabooga
e0631e309f
Create instruction template for Llama-v2 (#3194) 2023-07-18 17:19:18 -03:00
oobabooga
603c596616 Add LLaMA-v2 conversion instructions 2023-07-18 10:29:56 -07:00
jllllll
c535f14e5f
Bump bitsandbytes Windows wheel to 0.40.2 (#3186) 2023-07-18 11:39:43 -03:00
jllllll
d7a14174a2
Remove auto-loading when only one model is available (#3187) 2023-07-18 11:39:08 -03:00
randoentity
a69955377a
[GGML] Support for customizable RoPE (#3083)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-07-17 22:32:37 -03:00
appe233
89e0d15cf5
Use 'torch.backends.mps.is_available' to check if mps is supported (#3164) 2023-07-17 21:27:18 -03:00
dependabot[bot]
234c58ccd1
Bump bitsandbytes from 0.40.1.post1 to 0.40.2 (#3178) 2023-07-17 21:24:51 -03:00
oobabooga
49a5389bd3
Bump accelerate from 0.20.3 to 0.21.0 2023-07-17 21:23:59 -03:00
oobabooga
8c1c2e0fae Increase max_new_tokens upper limit 2023-07-17 17:08:22 -07:00
oobabooga
5e5d926d2b Prevent lists from flickering in chat mode while streaming 2023-07-17 17:00:49 -07:00
dependabot[bot]
02a5fe6aa2
Bump accelerate from 0.20.3 to 0.21.0
Bumps [accelerate](https://github.com/huggingface/accelerate) from 0.20.3 to 0.21.0.
- [Release notes](https://github.com/huggingface/accelerate/releases)
- [Commits](https://github.com/huggingface/accelerate/compare/v0.20.3...v0.21.0)

---
updated-dependencies:
- dependency-name: accelerate
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-07-17 20:18:31 +00:00
oobabooga
60a3e70242 Update LLaMA links and info 2023-07-17 12:51:01 -07:00
oobabooga
f83fdb9270 Don't reset LoRA menu when loading a model 2023-07-17 12:50:25 -07:00
oobabooga
4ce766414b Bump AutoGPTQ version 2023-07-17 10:02:12 -07:00
oobabooga
b1a6ea68dd Disable "autoload the model" by default 2023-07-17 07:40:56 -07:00
oobabooga
656b457795 Add Airoboros-v1.2 template 2023-07-17 07:27:42 -07:00
oobabooga
a199f21799 Optimize llamacpp_hf a bit 2023-07-16 20:49:48 -07:00
oobabooga
9f08038864
Merge pull request #3163 from oobabooga/dev
v1.2
2023-07-16 02:43:18 -03:00
oobabooga
6a3edb0542 Clean up llamacpp_hf.py 2023-07-15 22:40:55 -07:00