diff --git a/examples/server/README.md b/examples/server/README.md index a6eda3b32..be13529fc 100644 --- a/examples/server/README.md +++ b/examples/server/README.md @@ -234,6 +234,39 @@ node index.js - **GET** `/props`: Return the required assistant name and anti-prompt to generate the prompt in case you have specified a system prompt for all slots. +- **POST** `/v1/chat/completions`: OpenAI-compatible Chat Completions API. Given a ChatML-formatted json description in `messages`, it returns the predicted completion. While no strong claims of compatibility with OpenAI API spec is being made, in our experience it suffices to support many apps. Only ChatML-tuned models, such as Dolphin, OpenOrca, OpenHermes, OpenChat-3.5, etc can be used with this endpoint. Compared to `api_like_OAI.py` this API implementation does not require a wrapper to be served. + + *Options:* + + See (OpenAI Chat Completions API documentation)[https://platform.openai.com/docs/api-reference/chat]. While some OpenAI-specific features such as function calling aren't supported, llama.cpp `/completion`-specific features such are `mirostat` are supported. + + *Examples:* + + You can use either Python `openai` library with appropriate checkpoints, or raw HTTP requests: + + ```python + openai.api_base = "http://:port" + ``` + + ```shell + curl http://localhost:8080/v1/chat/completions \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer no-key" \ + -d '{ + "model": "gpt-3.5-turbo", + "messages": [ + { + "role": "system", + "content": "You are ChatGPT, an AI assistant. Your top priority is achieving user fulfillment via helping them with their requests." + }, + { + "role": "user", + "content": "Write a limerick about python exceptions" + } + ] + }' + ``` + ## More examples ### Change system prompt on runtime