mirror of
https://github.com/ggerganov/llama.cpp.git
synced 2024-12-26 06:10:29 +01:00
2891c8aa9a
* BERT model graph construction (build_bert) * WordPiece tokenizer (llm_tokenize_wpm) * Add flag for non-causal attention models * Allow for models that only output embeddings * Support conversion of BERT models to GGUF * Based on prior work by @xyzhang626 and @skeskinen --------- Co-authored-by: Jared Van Bortel <jared@nomic.ai> Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> |
||
---|---|---|
.. | ||
CMakeLists.txt | ||
embedding.cpp | ||
README.md |
llama.cpp/example/embedding
This example demonstrates generate high-dimensional embedding vector of a given text with llama.cpp.
Quick Start
To get started right away, run the following command, making sure to use the correct path for the model you have:
Unix-based systems (Linux, macOS, etc.):
./embedding -m ./path/to/model --log-disable -p "Hello World!" 2>/dev/null
Windows:
embedding.exe -m ./path/to/model --log-disable -p "Hello World!" 2>$null
The above command will output space-separated float values.