* use javascript generators as much cleaner API
Also add ways to access completion as promise and EventSource
* export llama_timings as struct and expose them in server
* update readme, update baked includes
* llama : uniform variable names + struct init
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* expose simple web interface on root domain
* embed index and add --path for choosing static dir
* allow server to multithread
because web browsers send a lot of garbage requests we want the server
to multithread when serving 404s for favicon's etc. To avoid blowing up
llama we just take a mutex when it's invoked.
* let's try this with the xxd tool instead and see if msvc is happier with that
* enable server in Makefiles
* add /completion.js file to make it easy to use the server from js
* slightly nicer css
* rework state management into session, expose historyTemplate to settings
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>