Update server README.md
parent 2f84f5dc84
commit af4d68b22d
@@ -234,6 +234,39 @@ node index.js
- **GET** `/props`: Return the assistant name and anti-prompt required to generate the prompt when a system prompt has been specified for all slots.

- **POST** `/v1/chat/completions`: OpenAI-compatible Chat Completions API. Given a ChatML-formatted JSON description in `messages`, it returns the predicted completion. While no strong claims of compatibility with the OpenAI API spec are being made, in our experience it suffices to support many apps. Only ChatML-tuned models, such as Dolphin, OpenOrca, OpenHermes, OpenChat-3.5, etc., can be used with this endpoint. Compared to `api_like_OAI.py`, this API implementation does not require a wrapper to be served.

*Options:*

See the [OpenAI Chat Completions API documentation](https://platform.openai.com/docs/api-reference/chat). While some OpenAI-specific features such as function calling aren't supported, llama.cpp `/completion`-specific features such as `mirostat` are supported.

*Examples:*

You can use either the Python `openai` library with appropriate checkpoints, or raw HTTP requests:
```python
import openai

# Point the openai client at the llama.cpp server instead of api.openai.com
openai.api_base = "http://<Your api-server IP>:port"
```
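For reference, a complete call with the legacy `openai` 0.x interface might look like the sketch below. The host, port, and `no-key` value are placeholders, the `/v1` suffix is included because that client appends `/chat/completions` to `api_base`, and the `model` name is only kept for compatibility with OpenAI clients:

```python
import openai

# Placeholders: adjust host/port to wherever the llama.cpp server is listening.
openai.api_base = "http://localhost:8080/v1"
openai.api_key = "no-key"  # the library requires a key; the server does not validate it

completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # kept for client compatibility; the server uses its loaded model
    messages=[
        {"role": "system", "content": "You are ChatGPT, an AI assistant."},
        {"role": "user", "content": "Write a limerick about python exceptions"},
    ],
)

print(completion.choices[0].message.content)
```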
```shell
curl http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer no-key" \
    -d '{
        "model": "gpt-3.5-turbo",
        "messages": [
            {
                "role": "system",
                "content": "You are ChatGPT, an AI assistant. Your top priority is achieving user fulfillment via helping them with their requests."
            },
            {
                "role": "user",
                "content": "Write a limerick about python exceptions"
            }
        ]
    }'
```
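The same request can also be issued as a raw HTTP call from Python. This is a minimal sketch using the third-party `requests` package, with the payload copied from the curl example above; it assumes the server answers with an OpenAI-style response body (`choices[0].message.content`):

```python
import requests

# Host/port are placeholders; same payload as the curl example above.
response = requests.post(
    "http://localhost:8080/v1/chat/completions",
    headers={"Authorization": "Bearer no-key"},  # the server does not validate the key
    json={
        "model": "gpt-3.5-turbo",
        "messages": [
            {
                "role": "system",
                "content": "You are ChatGPT, an AI assistant. Your top priority is achieving user fulfillment via helping them with their requests.",
            },
            {"role": "user", "content": "Write a limerick about python exceptions"},
        ],
    },
)
response.raise_for_status()

data = response.json()
print(data["choices"][0]["message"]["content"])  # assumes an OpenAI-style response shape
```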
## More examples
### Change system prompt at runtime