stduhpf e0324285a5
speculative : threading options (#4959)
* speculative: expose draft threading

* fix usage format

* accept -td and -tbd args

* speculative: revert default behavior when -td is unspecified

* fix trailing whitespace
2024-01-16 13:04:32 +02:00

llama.cpp/examples/speculative

Demonstrates speculative decoding and tree-based speculative decoding.
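As a rough illustration of the basic (non-tree) technique, here is a minimal sketch of greedy speculative decoding with toy models. It is not the code in this example and does not use the llama.cpp API: `Token`, `NextTokenFn`, `SpecStats`, and `speculative_generate` are hypothetical names introduced for illustration, and each "model" is assumed to be a plain function returning its greedy next token. A small draft model proposes a few tokens, the target model checks them, and the longest matching prefix is kept.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Token = int;

// Hypothetical stand-in for a model: maps a token sequence to its greedy next token.
using NextTokenFn = std::function<Token(const std::vector<Token> &)>;

// Counters for how many draft tokens were proposed and how many the target accepted.
struct SpecStats {
    int n_drafted  = 0;
    int n_accepted = 0;
};

// Generates n_predict tokens after ctx using greedy speculative decoding.
// The result is identical to decoding greedily with the target model alone.
std::vector<Token> speculative_generate(
        const NextTokenFn & target,
        const NextTokenFn & draft,
        std::vector<Token>  ctx,
        std::size_t         n_predict,
        int                 n_draft,
        SpecStats *         stats = nullptr) {
    assert(n_draft > 0);
    const std::size_t n_end = ctx.size() + n_predict;

    while (ctx.size() < n_end) {
        // 1. The draft model proposes n_draft tokens autoregressively.
        std::vector<Token> proposal;
        std::vector<Token> draft_ctx = ctx;
        for (int i = 0; i < n_draft; ++i) {
            const Token t = draft(draft_ctx);
            proposal.push_back(t);
            draft_ctx.push_back(t);
        }
        if (stats) { stats->n_drafted += n_draft; }

        // 2. The target model verifies the proposal. A real implementation would
        //    score all proposed positions in one batch; this sketch calls the
        //    target once per position to keep the logic visible.
        for (const Token proposed : proposal) {
            if (ctx.size() >= n_end) { break; }
            const Token expected = target(ctx);
            ctx.push_back(expected);              // the target's token is always the one kept
            if (expected != proposed) { break; }  // first mismatch: discard the rest of the draft
            if (stats) { stats->n_accepted++; }
        }
    }
    return ctx;
}
```

The more draft tokens the target accepts per round, the fewer target evaluations are needed per generated token, which is where the speedup comes from. The tree-based variant, which drafts several candidate continuations at once, is not covered by this sketch.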

More info: