mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-24 10:29:21 +01:00
24 Commits

Author SHA1 Message Date
Erik Scholz
a98b1633d5
nix : add cuda, use a symlinked toolkit for cmake () 2023-09-25 13:48:30 +02:00
kang
80834daecf
flake : Restore default package's buildInputs () 2023-09-20 15:48:22 +02:00
Evgeny Kurnevsky
235f7c193b
flake : use pkg-config instead of pkgconfig ()
pkgconfig is an alias, it got removed from nixpkgs:
295a5e1e2b/pkgs/top-level/aliases.nix (L1408)
2023-09-15 11:10:22 +03:00
jneem
feea179e9f
flake : allow $out/include to already exist () 2023-09-14 21:54:47 +03:00
Asbjørn Olling
cf8238e7f4
flake : include llama.h in nix output () 2023-09-14 20:25:00 +03:00
takov751
ec2a24fedf
flake : add train-text-from-scratch to flake.nix () 2023-09-08 19:06:26 +03:00
Tungsten842
61d1a2895e
flake.nix : add rocm support and cleanup () 2023-08-26 21:19:44 +03:00
Volodymyr Vitvitskyi
f305bad11e
flake : build llama.cpp on Intel with nix ()
Problem
-------
`nix build` fails with missing `Accelerate.h`.

Changes
-------
- Fix build of the llama.cpp with nix for Intel: add the same SDK frameworks as
for ARM
- Add `quantize` app to the output of nix flake
- Extend nix devShell with llama-python so we can use convertScript

Testing
-------
Testing the steps with nix:
1. `nix build`
Get the model and then
2. `nix develop` and then `python convert.py models/llama-2-7b.ggmlv3.q4_0.bin`
3. `nix run llama.cpp#quantize -- open_llama_7b/ggml-model-f16.gguf ./models/ggml-model-q4_0.bin 2`
4. `nix run llama.cpp#llama -- -m models/ggml-model-q4_0.bin -p "What is nix?" -n 400 --temp 0.8 -e -t 8`

Co-authored-by: Volodymyr Vitvitskyi <volodymyrvitvitskyi@SamsungPro.local>
2023-08-26 16:25:39 +03:00
Shouzheng Liu
bf83bff674
metal : matrix-matrix multiplication kernel ()
* metal: matrix-matrix multiplication kernel

This commit removes MPS and uses custom matrix-matrix multiplication
kernels for all quantization types. This commit also adds grouped-query
attention to support llama2 70B.

* metal: fix performance degradation from gqa

Integers are slow on the GPU, and 64-bit divides are extremely slow.
In the context of GQA, we introduce a 64-bit divide that cannot be
optimized out by the compiler, which results in a decrease of ~8% in
inference performance. This commit fixes that issue by calculating a
part of the offset with a 32-bit divide. Naturally, this limits the
size of a single matrix to ~4GB. However, this limit should be
acceptable for the near future.

* metal: fix bugs for GQA and perplexity test.

I mixed up ne02 and nb02 in previous commit.
2023-08-16 23:07:04 +03:00
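The 32-bit trade-off described in the commit above can be modelled with a short sketch. This is not the Metal kernel code: the function names (`offset_64`, `offset_32`) and the parameters (`im` for the batch/head index, `gqa` for the grouping ratio, `stride` for the per-matrix stride) are hypothetical, and Python integers are masked to imitate 32-bit unsigned arithmetic. Under those assumptions, it shows that the two computations agree until the offset reaches 2**32, which is where the ~4GB limit comes from.

```python
U32_MASK = 0xFFFFFFFF  # models wraparound of a 32-bit unsigned integer


def offset_64(im: int, gqa: int, stride: int) -> int:
    """Reference computation with full-width integers (no overflow)."""
    return (im // gqa) * stride


def offset_32(im: int, gqa: int, stride: int) -> int:
    """Same computation restricted to 32-bit unsigned values (hypothetical model)."""
    group = (im & U32_MASK) // (gqa & U32_MASK)  # stands in for the 32-bit divide
    return (group * stride) & U32_MASK           # product wraps at 2**32


# Small offsets: both variants agree.
small = (offset_64(63, 8, 4096), offset_32(63, 8, 4096))

# An offset of exactly 2**32 no longer fits in 32 bits and wraps around.
large = (offset_64(16, 8, 2**31), offset_32(16, 8, 2**31))
```

In this model `small` holds two equal values, while the two entries of `large` differ because the 32-bit product wraps. That is the price paid for avoiding the slow 64-bit divide.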
wzy
bc3ec2cdc9
flake : support nix build '.#opencl' () 2023-07-23 14:57:02 +03:00
wzy
78a3d13424
flake : remove intel mkl from flake.nix due to missing files ()
NixOS's mkl is missing some libraries like mkl-sdl.pc. See 
NixOS currently does not provide the Intel C compilers (icx, icpx). See https://discourse.nixos.org/t/packaging-intel-math-kernel-libraries-mkl/975
So remove mkl from flake.nix

Some minor changes:

- Change pkgs.python310 to pkgs.python3 to keep latest
- Add pkgconfig to devShells.default
- Remove installPhase because we have `cmake --install` from 
2023-07-21 13:26:34 +03:00
wzy
45a1b07e9b
flake : update flake.nix ()
When `isx86_32 || isx86_64`, it will use mkl, else openblas

According to
https://discourse.nixos.org/t/rpath-of-binary-contains-a-forbidden-reference-to-build/12200/3,
add -DCMAKE_SKIP_BUILD_RPATH=ON

Fix , Nix doesn't provide mkl-sdl.pc.
When we build with -DBUILD_SHARED_LIBS=ON, -DLLAMA_BLAS_VENDOR=Intel10_lp64
replace mkl-sdl.pc by mkl-dynamic-lp64-iomp.pc
2023-07-19 10:01:55 +03:00
Dave Della Costa
a6803cab94
flake : add runHook preInstall/postInstall to installPhase so hooks function () 2023-07-14 22:13:38 +03:00
Rowan Hart
fdd1860911
flake : fix ggml-metal.metal path and run nixfmt () 2023-06-24 14:07:08 +03:00
Faez Shakil
fc45a81bc6
exposed modules so that they can be invoked by nix run github:ggerganov/llama.cpp#server etc () 2023-06-17 14:13:05 +02:00
Andrei
303f5809f1
metal : fix issue with ggml-metal.metal path. Closes ()
* Fix issue with ggml-metal.metal path

* Add ggml-metal.metal as a resource for llama target

* Update flake.nix metal kernel substitution
2023-06-10 17:47:34 +03:00
jacobi petrucciani
5b57a5b726
flake : update to support metal on m1/m2 () 2023-06-07 07:15:31 +03:00
Pavol Rusnak
bb98e77be7
nix: use convert.py instead of legacy wrapper convert-pth-to-ggml.py () 2023-04-25 23:19:57 +02:00
Pavol Rusnak
a32f7acc9f
py : cleanup dependencies ()
after  we do not need torch, tqdm and requests in the dependencies
2023-04-14 15:37:11 +02:00
Pavol Rusnak
c729ff730a
flake.nix: add all binaries from bin () 2023-04-13 15:49:05 +02:00
lon
317fb12fbd
Add new binaries to flake.nix () 2023-04-08 12:04:23 +02:00
Ben Siraphob
a18c19259a Fix Nix build 2023-03-23 17:51:26 +01:00
Ben Siraphob
bd4b46d6ba Nix flake: set meta.mainProgram to llama 2023-03-20 22:50:22 +01:00
Niklas Korz
a292747893
Nix flake ()
* Nix flake

* Nix: only add Accelerate framework on macOS

* Nix: development shell, direnv and compatibility

* Nix: use python packages supplied by withPackages

* Nix: remove channel compatibility

* Nix: fix ARM neon dotproduct on macOS

---------

Co-authored-by: Pavol Rusnak <pavol@rusnak.io>
2023-03-17 23:03:48 +01:00