Commit Graph

8 Commits

Author SHA1 Message Date
Xuan Son Nguyen
caa106d4e0
Server: format error to json (#5961)
* server: format error to json

* server: do not crash on grammar error

* fix api key test case

* revert limit max n_predict

* small fix

* correct coding style

* update completion.js

* launch_slot_with_task

* update docs

* update_slots

* update webui

* update readme
2024-03-11 10:56:41 +01:00
Alexey Parfenov
213d1439fa
server : remove model.json endpoint (#5371) 2024-02-06 20:08:38 +02:00
Georgi Gerganov
e0085fdf7c
Revert "server : change deps.sh xxd files to string literals (#5221)"
This reverts commit 4003be0e5f.
2024-01-30 21:19:26 +02:00
JohnnyB
4003be0e5f
server : change deps.sh xxd files to string literals (#5221)
* Changed ugly xxd to literals.

HPP files are much more readable as multiline literals rather than hex arrays.

* Dashes in literal variable names.

Replace . and - with _ in file names -> variable names.

* Comment on removing xxd.

XXD-> string literals

* XXD to string literals.

Replaced these unreadable headers with string literal versions using new deps.sh.
2024-01-30 20:15:05 +02:00
Phil H
0ef3ca2ac6
server : add token counts to html footer (#4738)
* server: add token counts to stats

* server: generate hpp

---------

Co-authored-by: phiharri <ph@got-root.co.uk>
2024-01-02 17:48:49 +02:00
Cebtenzzre
182af739c4
server: regenerate completion.js.hpp (#2515) 2023-08-04 21:00:57 +02:00
Tobias Lütke
31cfbb1013
Expose generation timings from server & update completions.js (#2116)
* use javascript generators as much cleaner API

Also add ways to access completion as promise and EventSource

* export llama_timings as struct and expose them in server

* update readme, update baked includes

* llama : uniform variable names + struct init

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-05 16:51:13 -04:00
Tobias Lütke
7ee76e45af
Simple webchat for server (#1998)
* expose simple web interface on root domain

* embed index and add --path for choosing static dir

* allow server to multithread

because web browsers send a lot of garbage requests we want the server
to multithread when serving 404s for favicon's etc. To avoid blowing up
llama we just take a mutex when it's invoked.


* let's try this with the xxd tool instead and see if msvc is happier with that

* enable server in Makefiles

* add /completion.js file to make it easy to use the server from js

* slightly nicer css

* rework state management into session, expose historyTemplate to settings

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-04 16:05:27 +02:00