* Initial version of q4_0 matrix multiplication benchmark
* Bugfix: Added dependency to ggml.o to benchmark
* Reviewer requests: added parameter for threads, switched to ggml_time_us()
* Reviewer input: removed rtsc, use epsilon for check
* Review comment: Removed set_locale
* Feature: Param for numer of iterations, Bugfix for use of parameter threads
* Reviewer suggestion: Moved to examples
* Reviewer feedback: Updated clean: and benchmark: sections
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>