This commit removes the setting of the `used` field of the contexts in
the global state (g_state) in `ggml_init`.
The motivation for this change is that I believe that this additional
initialization might not be required after the changes in Commit
45fc4fed0b9fb5b1af4a8525cbebb95e11208732 ("sync : latest changes from
whisper.cpp"), which changed the initialization of the contexts field
from `{ 0 }` to `{ { 0 } }`:
```console
g_state = (struct ggml_state) {
- /*.contexts =*/ { 0 },
+ /*.contexts =*/ { { 0 } },
};
```
My understanding is that the `{0}` initialization might not have
zero-initialized all the nested fields in every array element because of
compiler differences, and might have been the reason for having the
explicit setting of the `used` fields to false.
* added classic vim support
* fixed ring update, removed blank line
* minor
* minor
* minor doc update
* removed uneeded var
* minor
* minor
* fixed job_start creating new scratch buffers
* fixed job_start creating new scratch buffers
* fixed ghost text indenting when expandtab is on
* removed unused code
* minor
* unified fim_on_exit
* minor
* vim ghost text rendering now uses pos_x and pos_y parameters
* renamed *_hlgroup to hlgroup_*
* renamed *_ghost_text to ghost_text_*, moved nvim/vim detection to llama#init()
* minor
---------
Co-authored-by: Michael Coppola <info@michaeljcoppola.com>
This commit renames the member field batch in llm_build_context to
ubatch, and also the parameter batch in llama_build_graph, and
llama_set_inputs to ubatch.
The motivation for this change is to make the code more readable
(considering there are the structs llama_batch and llama_sbatch), and
consistent with other parts of the code base where parameters/fields of
type llama_ubatch are named ubatch.
* [CANN] Adapt to dynamically loadable backends mechanism
* Fix the Bug: inference running result is garbled in debug running model for LM models who's type is Q4_0 class
* Handle the review comments of this pull request
This commit fixes two typos in the help text for the `--embd-normalize`
and `--embd-separator` arguments. It also updates common.h which contain
the same typo in two comments.
This commit updates the argument value hint for the `--attention`
argument to `non-causal`.
The motivation for this change is that the only values for this argument
are `causal` and `non-causal`.
* llama : deprecate softmax sampler + fix dist sampler
ggml-ci
* tests : replace macros with functions
ggml-ci
* sampling : change temperature sampler logic
For t <= 0.0f, keep the max logit intact and set the rest to -inf
* cont : no need for special "greedy" logic
top-k == 1 is the same
* tests : init prob correctly
* llama : handle temp <= 0.0 in the temp_ext sampler too
ggml-ci
* cont : avoid extra loop in temperature sampler for sub-zero temp
ggml-ci
llama_cpp_canister allows you to run llama.cpp as a Smart Contract on the Internet Computer. The smart contract runs as WebAssembly in a so-called 'canister'.
add intel amx isa detection
add vnni kernel for gemv cases
add vnni and amx kernel support for block_q8_0
code cleanup
fix packing B issue
enable openmp
fine tune amx kernel
switch to aten parallel pattern
add error message for nested parallelism
code cleanup
add f16 support in ggml-amx
add amx kernels for QK_K quant formats: Q4_K, Q5_K, Q6_K and IQ4_XS
update CMakeList
update README
fix some compilation warning
fix compiler warning when amx is not enabled
minor change
ggml-ci
move ggml_amx_init from ggml.c to ggml-amx/mmq.cpp
ggml-ci
update CMakeLists with -mamx-tile, -mamx-int8 and -mamx-bf16
ggml-ci
add amx as an ggml-backend
update header file, the old path for immintrin.h has changed to ggml-cpu-impl.h
minor change
update CMakeLists.txt
minor change
apply weight prepacking in set_tensor method in ggml-backend
fix compile error
ggml-ci
minor change
ggml-ci
update CMakeLists.txt
ggml-ci
add march dependency
minor change
ggml-ci
change ggml_backend_buffer_is_host to return false for amx backend
ggml-ci
fix supports_op
use device reg for AMX backend
ggml-ci
minor change
ggml-ci
minor change
fix rebase
set .buffer_from_host_ptr to be false for AMX backend
* fix: use `vm_allocate` to allocate CPU backend buffer on macOS
* fix: switch to `posix_memalign` to keep existing `free()` usages work
* feat: move `GGML_ALIGNED_MALLOC` to `ggml-backend-impl.h`, add support for `vm_allocate` on macOS
* style: formatting
* fix: move const outside of `#ifndef`
* style: formatting
* fix: unused var
* fix: transform `GGML_ALIGNED_MALLOC` and `GGML_ALIGNED_FREE` into functions and add them to `ggml-impl.h`
* fix: unused var
* fix: page align to `GGUF_DEFAULT_ALIGNMENT`
* fix: page align to `TENSOR_ALIGNMENT`
* fix: convert `TENSOR_ALIGNMENT` to a macro
* fix: increase page size to `32` on iOS
* fix: iOS page size
* fix: `hbw_posix_memalign` alignment
* llama : suppress conversion from 'size_t' to 'int'
This commit updates llm_tokenizer_spm.tokenize to suppress/remove the
following warnings that are generated on Windows when using MSVC:
```console
src\llama-vocab.cpp(211,1): warning C4267: 'argument':
conversion from 'size_t' to 'int', possible loss of data
src\llama-vocab.cpp(517,1): warning C4267: 'argument':
conversion from 'size_t' to 'int', possible loss of data
```
This is done by adding a cast for the size_t returned from
symbols.size(). I believe this is safe as it seems unlikely that
symbols, which stores an entry for each UTF8 character, would become
larger than INT_MAX.
The motivation for this change is to reduce the number of warnings that
are currently generated when building on Windows.
* squash! llama : suppress conversion from 'size_t' to 'int'
Move cast into for loop.
Prior to this commit, using a JSON Schema containing a string
with `pattern` regular expression that uses top-level alternation
(e.g. `"pattern": "^A|B|C|D$"`) would result in invalid JSON
output from the constrained sampling grammar, because it
ended up creating a grammar rule like this for the string:
```
thing ::= "\"" "A" | "B" | "C" | "D" "\"" space
```
Note that this rule will only match a starting quote for the "A" case,
and will only match an ending quote for the "D" case,
so this rule will always produce invalid JSON when used for sampling
(that is, the JSON will always be lacking the starting quote,
the ending quote, or both).
This was fixed in a simple way by adding parentheses to the
generated rule (for all string pattern rules, to keep it simple),
such that the new generated rule looks like this (correct):
```
thing ::= "\"" ("A" | "B" | "C" | "D") "\"" space
```