mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-21 17:19:23 +01:00

204 Commits

Author SHA1 Message Date
Eve
017efe899d
cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor ()
* fix LLAMA_NATIVE

* syntax

* alternate implementation

* my eyes must be getting bad...

* set cmake LLAMA_NATIVE=ON by default

* march=native doesn't work for ios/tvos, so disable for those targets. also see what happens if we use it on msvc

* revert 8283237 and only allow LLAMA_NATIVE on x86 like the Makefile

* remove -DLLAMA_MPI=ON

---------

Co-authored-by: netrunnereve <netrunnereve@users.noreply.github.com>
2023-10-03 19:53:15 +03:00
Eve
0512d66670
ci : multithreaded builds ()
* mac and linux threads

* windows

* Update build.yml

* Update build.yml

* Update build.yml

* automatically get thread count

* windows syntax

* try to fix freebsd

* Update build.yml

* Update build.yml

* Update build.yml
2023-09-28 22:31:04 +03:00
Georgi Gerganov
2619109ad5
ci : disable freeBSD builds due to lack of VMs () 2023-09-28 19:36:36 +03:00
Alon
a40f2b656f
CI: FreeBSD fix ()
* - freebsd ci: use qemu
2023-09-20 14:06:36 +02:00
Erik Scholz
7ddf185537
ci : switch cudatoolkit install on windows to networked () 2023-09-18 02:21:47 +02:00
IsaacDynamo
b541b4f0b1
Enable BUILD_SHARED_LIBS=ON on all Windows builds () 2023-09-16 19:35:25 +02:00
Cebtenzzre
69eb67e282
fix build numbers by setting fetch-depth=0 () 2023-09-15 15:18:15 -04:00
Alon
83a53b753a
CI: add FreeBSD & simplify CUDA windows ()
* add freebsd to ci

* bump actions/checkout to v3
* bump cuda 12.1.0 -> 12.2.0
* bump Jimver/cuda-toolkit version

* unify and simplify "Copy and pack Cuda runtime"
* install only necessary cuda sub packages
2023-09-14 19:21:25 +02:00
dylan
980ab41afb
docker : add gpu image CI builds ()
Enables the GPU enabled container images to be built and pushed
alongside the CPU containers.

Co-authored-by: canardleteer <eris.has.a.dad+github@gmail.com>
2023-09-14 19:47:00 +03:00
Jhen-Jie Hong
1b0d09259e
cmake : support build for iOS/tvOS ()
* cmake : support build for iOS/tvOS

* ci : add iOS/tvOS build into macOS-latest-cmake

* ci : split ios/tvos jobs
2023-09-11 19:49:06 +08:00
Alon
afc43d5f82
cov : add Code Coverage and codecov.io integration ()
* update .gitignore

* makefile: add coverage support (lcov, gcovr)

* add code-coverage workflow

* update code coverage workflow

* run on ubuntu 20.04

* use gcc-8

* check why the job hang

* add env vars

* add LLAMA_CODE_COVERAGE=1 again

* - add CODECOV_TOKEN
- add missing make lcov-report

* install lcov

* update make file -pb flag

* remove unused GGML_NITER from workflows

* wrap coverage output files in COV_TARGETS
2023-09-03 11:48:49 +03:00
M. Yusuf Sarıgöz
0d1c706181
gguf : add workflow for Pypi publishing ()
* gguf : add workflow for Pypi publishing

* gguf : add workflow for Pypi publishing

* fix trailing whitespace
2023-08-30 12:47:40 +03:00
alonfaraj
9509294420
make : add test and update CI ()
* build ci: run make test

* makefile:
- add all
- add test

* enable tests/test-tokenizer-0-llama

* fix path to model

* remove gcc-8 from macos build test

* Update Makefile

* Update Makefile
2023-08-30 12:42:51 +03:00
DannyDaemonic
ef955fbd23
Tag release with build number ()
* Modified build.yml to use build number for release

* Add the short hash back into the tag

* Prefix the build number with b
2023-08-24 15:58:02 +02:00
Eve
1fed755b1f
ci : add non-AVX scalar build/test ()
* noavx build and test

* we don't need to remove f16c in windows
2023-07-25 15:16:13 +03:00
Evan Miller
5656d10599
mpi : add support for distributed inference via MPI ()
* MPI support, first cut

* fix warnings, update README

* fixes

* wrap includes

* PR comments

* Update CMakeLists.txt

* Add GH workflow, fix test

* Add info to README

* mpi : trying to move more MPI stuff into ggml-mpi (WIP) ()

* mpi : add names for layer inputs + prep ggml_mpi_graph_compute()

* mpi : move all MPI logic into ggml-mpi

Not tested yet

* mpi : various fixes - communication now works but results are wrong

* mpi : fix output tensor after MPI compute (still not working)

* mpi : fix inference

* mpi : minor

* Add OpenMPI to GH action

* [mpi] continue-on-error: true

* mpi : fix after master merge

* [mpi] Link MPI C++ libraries to fix OpenMPI

* tests : fix new llama_backend API

* [mpi] use MPI_INT32_T

* mpi : factor out recv / send in functions and reuse

* mpi : extend API to allow usage with outer backends (e.g. Metal)

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-10 18:49:56 +03:00
Georgi Gerganov
a7e20edf22
ci : switch threads to 1 () 2023-07-07 21:23:57 +03:00
Qingyou Meng
1d656d6360
ggml : change ggml_graph_compute() API to not require context ()
* ggml_graph_compute: deprecate using ggml_context, try resolve issue 

* rewrite: no longer consider backward compatibility; plan and make_plan

* minor: rename ctx as plan; const

* remove ggml_graph_compute from tests/test-grad0.c, but current change breaks backward

* add static ggml_graph_compute_sugar()

* minor: update comments

* reusable buffers

* ggml : more consistent naming + metal fixes

* ggml : fix docs

* tests : disable grad / opt + minor naming changes

* ggml : add ggml_graph_compute_with_ctx()

- backwards compatible API
- deduplicates a lot of copy-paste

* ci : enable test-grad0

* examples : factor out plan allocation into a helper function

* llama : factor out plan stuff into a helper function

* ci : fix env

* llama : fix duplicate symbols + refactor example benchmark

* ggml : remove obsolete assert + refactor n_tasks section

* ggml : fix indentation in switch

* llama : avoid unnecessary bool

* ggml : remove comments from source file and match order in header

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-07 19:24:01 +03:00
Stephan Walter
1b107b8550
ggml : generalize quantize_fns for simpler FP16 handling ()
* Generalize quantize_fns for simpler FP16 handling

* Remove call to ggml_cuda_mul_mat_get_wsize

* ci : disable FMA for mac os actions

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-05 19:13:06 +03:00
Erik Scholz
698efad5fb
CI: make the brew update temporarily optional. ()
until they decide to fix the brew installation in the macos runners.
see the open issues. eg https://github.com/actions/runner-images/pull/7710
2023-07-04 01:50:12 +02:00
slaren
e4caa8da59
ci : run when changing only the CUDA sources () 2023-06-12 20:12:47 +03:00
Georgi Gerganov
e7fe66e670
ci : disable auto tidy () 2023-06-05 23:05:05 +03:00
Kerfuffle
0df7d63e5b
Include server in releases + other build system cleanups ()
Set `LLAMA_BUILD_SERVER` in workflow so the `server` example gets built. This currently only applies to Windows builds because it seems like only Windows binary artifacts are included in releases.

Add `server` example target to `Makefile` (still uses `LLAMA_BUILD_SERVER` define and does not build by default)

Fix issue where `vdot` binary wasn't removed when running `make clean`.

Fix compile warnings in `server` example.

Add `.hpp` files to trigger workflow (the server example has one).
2023-05-27 11:04:14 -06:00
Henri Vasserman
0ecb1bbbeb
[CI] Fix openblas ()
* Fix OpenBLAS build

* Fix `LLAMA_BLAS_VENDOR` CMake variable that should be a string and not a boolean.
2023-05-27 17:24:06 +03:00
Henri Vasserman
83c54e6da5
[CI] CLBlast: Fix directory name () 2023-05-27 14:18:25 +02:00
Henri Vasserman
ac7876ac20
Update CLBlast to 1.6.0 ()
* Update CLBlast to 1.6.0
2023-05-24 10:30:09 +03:00
Zenix
b8ee340abe
feature : support blis and other blas implementation ()
* feature: add blis support

* feature: allow all BLA_VENDOR to be assigned in cmake arguments. align with whisper.cpp pr 927

* fix: version detection for BLA_SIZEOF_INTEGER, recover min version of cmake

* Fix typo in INTEGER

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Fix: blas changes on ci

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-05-20 17:58:31 +03:00
slaren
553fd4d4b5
Add clang-tidy reviews to CI () 2023-05-12 15:40:53 +02:00
Henri Vasserman
e1295513a4
CI: add Windows CLBlast and OpenBLAS builds ()
* Add OpenCL and CLBlast support

* Add OpenBLAS support

* Remove testing from matrix

* change build name to 'clblast'
2023-05-07 13:20:09 +02:00
Erik Scholz
a3b85b28da
ci : add cublas to windows release () 2023-05-05 22:56:09 +02:00
Stephan Walter
2ec83428de
Fix build for gcc 8 and test in CI () 2023-04-24 15:38:26 +00:00
Stephan Walter
857308d1e8
ci : trigger CI for drafts, but not most PR actions () 2023-04-22 16:12:29 +03:00
Howard Su
7e312f165c
cmake : fix build under Windows when enable BUILD_SHARED_LIBS ()
* Fix build under Windows when enable BUILD_SHARED_LIBS

* Make AVX512 test on Windows to build the shared libs
2023-04-22 11:18:20 +03:00
Ivan Komarov
6a9661ea5a
ci : remove the LLAMA_ACCELERATE matrix dimension from Ubuntu builds in the CI ()
[Accelerate](https://developer.apple.com/documentation/accelerate) is an Apple framework which can only be used on macOS, and the CMake build [ignores](https://github.com/ggerganov/llama.cpp/blob/master/CMakeLists.txt#L102) the `LLAMA_ACCELERATE` variable when run on non-Apple platforms. This implies setting `LLAMA_ACCELERATE` is a no-op on Ubuntu and can be removed.

This will reduce visual noise in CI check results (in addition to reducing the number of checks we have to run for every PR). Right now every sanitized build is duplicated twice for no good reason (e.g., we have `CI / ubuntu-latest-cmake-sanitizer (ADDRESS, Debug, ON)` and `CI / ubuntu-latest-cmake-sanitizer (ADDRESS, Debug, OFF)`).
2023-04-20 18:15:18 +03:00
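The effect described in commit 6a9661ea5a above can be sketched with a small calculation: a CI build matrix runs one job per combination of its dimensions, so dropping a two-valued dimension halves the job count. The sketch below is illustrative only. The dimension values (`SANITIZERS`, `BUILD_TYPES`, `ACCELERATE`) and the helper `expand_matrix` are assumptions made for this example and are not taken from the actual workflow file.

```python
from itertools import product

# Hypothetical matrix values, chosen only for illustration;
# the real workflow file may list different entries.
SANITIZERS = ["ADDRESS", "THREAD", "UNDEFINED"]
BUILD_TYPES = ["Debug", "Release"]
ACCELERATE = ["ON", "OFF"]


def expand_matrix(**dimensions):
    """Return every combination of the given matrix dimensions as a list of dicts."""
    names = list(dimensions)
    return [dict(zip(names, combo)) for combo in product(*dimensions.values())]


# Matrix with the no-op accelerate dimension still present.
before = expand_matrix(
    sanitizer=SANITIZERS, build_type=BUILD_TYPES, accelerate=ACCELERATE
)
# Matrix after the dimension is removed.
after = expand_matrix(sanitizer=SANITIZERS, build_type=BUILD_TYPES)

print(len(before), "jobs with the accelerate dimension")
print(len(after), "jobs without it")
```

Under these assumed values, each sanitizer and build type pair appears twice in `before` (once per `accelerate` value) and once in `after`, which is the duplication the commit message points out.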
Georgi Gerganov
5af8e32238
ci : do not run on drafts 2023-04-18 19:57:06 +03:00
Pavol Rusnak
8b679987cd
Fix whitespace, add .editorconfig, add GitHub workflow () 2023-04-11 19:45:44 +00:00
anzz1
9cbc404ba6
ci : re-enable AVX512 testing (Windows-MSVC) ()
* CI: Re-enable AVX512 testing (Windows-MSVC)

Now with 100% less base64 encoding

* plain __cpuid is enough here
2023-03-29 23:44:39 +03:00
anzz1
f1217055ea
CI: fix subdirectory path globbing ()
- Changes in subdirectories will now be detected properly
- (Windows-MSVC) AVX512 tests temporarily disabled
2023-03-28 22:43:25 +03:00
Georgi Gerganov
96f9c0506f
ci : make ctest verbose, hopefully we see what is wrong with the sanitizer 2023-03-28 20:01:09 +03:00
Erik Scholz
34c1072e49
ci: add debug build to sanitizer build matrix () 2023-03-26 15:48:40 +00:00
Juan Calderon-Perez
8c2ec5e21d
Add support for linux/arm64 platform during Docker Builds ()
* Add support for linux/arm64 platform

* Add platform to versioned builds
2023-03-26 14:48:42 +00:00
anzz1
19726169b3
CI: Run other sanitizer builds even if one fails ()
applies only to sanitizer builds so they won't be cancelled
2023-03-26 00:13:28 +02:00
anzz1
2f7bf7dd7c
CMake / CI additions ()
* CMake: Add AVX512 option

* CI: Add AVX/AVX512 builds (Windows)
(AVX512 tests can only be run when the worker happens to support it, building works anyway)

* CMake: Fix sanitizer linkage ( merged  )

* CI: Add sanitizer builds (Ubuntu)

* CI: Fix release tagging
(change @zendesk/action-create-release to @anzz1/action-create-release until upstream PR Added commitish as input  is merged)
2023-03-25 23:38:11 +02:00
anzz1
e4412b45e3
CI: CMake: Separate build and test steps ()
* CI: Separate Build and Test steps (CMake)

* CI: Make sure build passes before running tests (CMake)

* CI: Standardise step id names
2023-03-23 04:20:34 +02:00
Stephan Walter
69c92298a9
Deduplicate q4 quantization functions ()
* Deduplicate q4 quantization functions

* Use const; add basic test

* Re-enable quantization test

* Disable AVX2 flags in CI

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-22 19:29:06 +02:00
anzz1
e6c9e0986c
Fix bin dir for win ci 2023-03-22 00:01:08 +02:00
Erik Scholz
01a297b099
specify build type for ctest on windows () 2023-03-21 23:34:25 +02:00
Georgi Gerganov
eb34620aec
Add tokenizer test + revert to C++11 ()
* Add test-tokenizer-0 to do a few tokenizations - feel free to expand
* Added option to convert-pth-to-ggml.py script to dump just the vocabulary
* Added ./models/ggml-vocab.bin containing just LLaMA vocab data (used for tests)
* Added utility to load vocabulary file from previous point (temporary implementation)
* Avoid using std::string_view and drop back to C++11 (hope I didn't break something)
* Rename gpt_vocab -> llama_vocab
* All CMake binaries go into ./bin/ now
2023-03-21 17:29:41 +02:00
Bernat Vadell
0f1b21cb90
Docker - Fix publish docker image in GitHub Registry ()
* fix publish permission

* try to fix docker pipeline using as password github_token & username repository_owner
2023-03-20 18:05:20 +01:00
anzz1
b2de7f18df
CI Improvements ()
* CI Improvements

Manual build feature, autoreleases for Windows

* better CI naming convention

use branch name in releases and tags
2023-03-18 09:27:12 +02:00
mmyjona
6b0df5ccf3
add pthread link to fix cmake build under linux ()
* add pthread link to fix cmake build under linux

* add cmake to linux and macos platform

* separate make and cmake workflow

---------

Co-authored-by: Sebastián A <sebastian.aedo29@gmail.com>
2023-03-17 13:38:24 -03:00
Bernat Vadell
2af23d3043
🚀 Dockerize llamacpp ()
* feat: dockerize llamacpp

* feat: split build & runtime stages

* split dockerfile into main & tools

* add quantize into tool docker image

* Update .devops/tools.sh

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* add docker action pipeline

* change CI to publish at github docker registry

* fix name runs-on macOS-latest is macos-latest (lowercase)

* include docker versioned images

* fix github action docker

* fix docker.yml

* feat: include all-in-one command tool & update readme.md

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-17 10:47:06 +01:00
Sebastián A
2f700a2738
Add windows to the CI () 2023-03-13 22:29:10 +02:00
Georgi Gerganov
2d555e5b42
Add CI () 2023-03-12 22:08:24 +02:00