Mirror of https://github.com/oobabooga/text-generation-webui.git, synced 2024-11-22 08:07:56 +01:00

add whisper api to openai plugin (#3958)

parent cd534ba46e · commit cc7f345c29
This extension creates an API that works much like the OpenAI API (i.e. api.openai.com).

## Setup & installation

Install the requirements:

```
pip3 install -r requirements.txt
```

It listens on `tcp port 5001` by default. You can use the `OPENEDAI_PORT` environment variable to change this.
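The port lookup can be sketched as a standalone helper, mirroring the `OPENEDAI_PORT` handling described above (default 5001). `resolve_port` is illustrative only, not part of the extension's API:

```python
import os

# Sketch of how the extension resolves its listen port: OPENEDAI_PORT
# overrides the default of 5001 when set.
def resolve_port(environ=None):
    environ = os.environ if environ is None else environ
    return int(environ['OPENEDAI_PORT']) if 'OPENEDAI_PORT' in environ else 5001

print(resolve_port({}))                         # → 5001
print(resolve_port({'OPENEDAI_PORT': '5002'}))  # → 5002
```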

Make sure you enable it in the server launch parameters; they should include:

```
--extensions openai
```

You can also use the `--listen` argument to make the server available on the network, and/or the `--share` argument to enable a public Cloudflare endpoint.

To enable basic image generation support (txt2img), set the environment variable `SD_WEBUI_URL` to point to your Stable Diffusion API ([Automatic1111](https://github.com/AUTOMATIC1111/stable-diffusion-webui)).

For example:

```
SD_WEBUI_URL=http://127.0.0.1:7861
```

## Quick start

1. Install the requirements.txt (pip)
2. Enable the `openai` module (--extensions openai), restart the server.
3. Configure the openai client

Most OpenAI applications can be configured to connect to the API if you set the following environment variables, e.g. `OPENAI_API_BASE=http://0.0.0.0:5001/v1`.

If needed, replace 0.0.0.0 with the IP/port of your server.
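Swapping the 0.0.0.0 placeholder for your server's IP can be done mechanically; `with_host` below is a hypothetical helper for illustration, not part of the extension:

```python
from urllib.parse import urlparse, urlunparse

# Replace the placeholder host in a base URL with a real server IP,
# keeping the port and path intact.
def with_host(base, host):
    u = urlparse(base)
    netloc = f'{host}:{u.port}' if u.port else host
    return urlunparse(u._replace(netloc=netloc))

print(with_host('http://0.0.0.0:5001/v1', '192.168.1.10'))  # → http://192.168.1.10:5001/v1
```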

### Models

This has been successfully tested with Alpaca, Koala, Vicuna, WizardLM and their variants (e.g. gpt4-x-alpaca, GPT4all-snoozy, stable-vicuna, wizard-vicuna, etc.) and many others. Models that have been trained for **Instruction Following** work best. If you test with other models, please let me know how it goes. Less than satisfying results (so far) from: RWKV-4-Raven, llama, mpt-7b-instruct/chat.

For good results with the [ChatCompletions](https://platform.openai.com/docs/api-reference/chat) or [Edits](https://platform.openai.com/docs/api-reference/edits) API endpoints you can use almost any model trained for instruction following. Be sure that the proper instruction template is detected and loaded, or the results will not be good.

For the proper instruction format to be detected you need to have a matching model entry in your `models/config.yaml` file. Be sure to keep this file up to date. A matching instruction template file in the characters/instruction-following/ folder will be loaded and applied to format messages correctly for the model - this is critical for good results.

For example, the Wizard-Vicuna family of models is trained with the Vicuna 1.1 format. In the models/config.yaml file there is this matching entry:

```
instruction_template: 'Vicuna-v1.1'
```

This refers to `characters/instruction-following/Vicuna-v1.1.yaml`, which looks like this:

```
user: "USER:"
```

For most common models this is already set up, but if you are using a new or uncommon model you may need to add a matching entry to the models/config.yaml and possibly create your own instruction-following template for best results.

If you see this in your logs, it probably means that the correct format could not be loaded:

```
Warning: Loaded default instruction-following template for model.
```

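The lookup described above can be sketched roughly like this. This is a simplified illustration: the `config` dict, its regex key, and `find_template` are hypothetical stand-ins, not the webui's actual code.

```python
import re

# Simplified sketch: models/config.yaml maps model-name patterns to
# settings such as instruction_template. Pattern and helper are
# illustrative only.
config = {
    r'.*wizard.*vicuna': {'instruction_template': 'Vicuna-v1.1'},
}

def find_template(model_name):
    for pattern, settings in config.items():
        if re.match(pattern, model_name.lower()):
            return settings['instruction_template']
    return None  # no match: the default template gets loaded instead

print(find_template('Wizard-Vicuna-13B'))  # → Vicuna-v1.1
```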
### Embeddings (alpha)

Embeddings require `sentence-transformers` to be installed, but chat and completions will function without it. The embeddings endpoint currently uses the HuggingFace model `sentence-transformers/all-mpnet-base-v2`. This produces 768-dimensional embeddings (the same as the text-davinci-002 embeddings), which is different from OpenAI's current default `text-embedding-ada-002` model, which produces 1536-dimensional embeddings. The model is small-ish and fast-ish. This model and embedding size may change in the future.

| model name             | dimensions | input max tokens | speed | size | Avg. performance |
| ---------------------- | ---------- | ---------------- | ----- | ---- | ---------------- |
| text-embedding-ada-002 | 1536       | 8192             | -     | -    | -                |
| text-davinci-002       | 768        | 2046             | -     | -    | -                |
| all-mpnet-base-v2      | 768        | 384              | 2800  | 420M | 63.3             |
| all-MiniLM-L6-v2       | 384        | 256              | 14200 | 80M  | 58.8             |

In short, the all-MiniLM-L6-v2 model is 5x faster, uses 5x less RAM, takes 2x less storage, and still offers good quality. Stats from https://www.sbert.net/docs/pretrained_models.html. To change the model from the default you can set the environment variable `OPENEDAI_EMBEDDING_MODEL`, e.g. `OPENEDAI_EMBEDDING_MODEL=all-MiniLM-L6-v2`.

Warning: You cannot mix embeddings from different models even if they have the same dimensions. They are not comparable.

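Embedding vectors are typically compared with cosine similarity, which is only meaningful between vectors produced by the same model. A minimal, stdlib-only sketch for illustration:

```python
import math

# Cosine similarity between two embedding vectors; only meaningful when
# both come from the same embedding model.
def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # same direction → 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal → 0.0
```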
### Client Application Setup

Almost everything you use it with will require you to set a dummy OpenAI API key environment variable.

With the [official python openai client](https://github.com/openai/openai-python), set the `OPENAI_API_BASE` environment variable:

```shell
# Sample .env file:
OPENAI_API_BASE=http://0.0.0.0:5001/v1
```

If needed, replace 0.0.0.0 with the IP/port of your server.

If using .env files to save the `OPENAI_API_BASE` and `OPENAI_API_KEY` variables, make sure the .env file is loaded before the openai module is imported:

```python
from dotenv import load_dotenv
load_dotenv()  # load OPENAI_API_BASE and OPENAI_API_KEY before importing openai
import openai
```

With the [official Node.js openai client](https://github.com/openai/openai-node) it is slightly more complex because the environment variables are not used by default, so small source code changes may be required to use the environment variables, like so:

```js
const openai = OpenAI(
    Configuration({
        apiKey: process.env.OPENAI_API_KEY,
        basePath: process.env.OPENAI_API_BASE
    })
);
```

For apps made with the [chatgpt-api Node.js client library](https://github.com/transitive-bullshit/chatgpt-api):

```js
const api = new ChatGPTAPI({
    apiKey: process.env.OPENAI_API_KEY,
    apiBaseUrl: process.env.OPENAI_API_BASE
});
```

## API Documentation & Examples

## Compatibility & not so compatibility

| API endpoint              | tested with                        | notes                                                                       |
| ------------------------- | ---------------------------------- | --------------------------------------------------------------------------- |
| /v1/chat/completions      | openai.ChatCompletion.create()     | Use it with instruction following models                                    |
| /v1/embeddings            | openai.Embedding.create()          | Using SentenceTransformer embeddings                                        |
| /v1/images/generations    | openai.Image.create()              | Bare bones, no model configuration, response_format='b64_json' only.        |
| /v1/edits                 | openai.Edit.create()               | Deprecated by openai, good with instruction following models                |
| /v1/text_completion       | openai.Completion.create()         | Legacy endpoint, variable quality based on the model                        |
| /v1/completions           | openai api completions.create      | Legacy endpoint (v0.25)                                                     |
| /v1/engines/\*/embeddings | python-openai v0.25                | Legacy endpoint                                                             |
| /v1/engines/\*/generate   | openai engines.generate            | Legacy endpoint                                                             |
| /v1/engines               | openai engines.list                | Legacy, lists models                                                        |
| /v1/engines/{model_name}  | openai engines.get -i {model_name} | You can use this legacy endpoint to load models via the api or command line |
| /v1/images/edits          | openai.Image.create_edit()         | not yet supported                                                           |
| /v1/images/variations     | openai.Image.create_variation()    | not yet supported                                                           |
| /v1/audio/\*              | openai.Audio.\*                    | supported                                                                   |
| /v1/files\*               | openai.Files.\*                    | not yet supported                                                           |
| /v1/fine-tunes\*          | openai.FineTune.\*                 | not yet supported                                                           |
| /v1/search                | openai.search, engines.search      | not yet supported                                                           |
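The newly supported `/v1/audio/transcriptions` endpoint accepts a multipart/form-data POST with a `file` part plus optional `model` and `language` fields, matching the `cgi.FieldStorage` parsing added in this commit. A stdlib-only sketch of building such a request body; the audio bytes below are a placeholder, not a real wav file:

```python
import io
import uuid

# Build a multipart/form-data body for /v1/audio/transcriptions.
# Field names mirror the handler added in this commit: 'file', 'model',
# 'language'.
def build_transcription_body(audio_bytes, model='tiny', language=None):
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()

    def part(name, value, filename=None):
        disposition = f'Content-Disposition: form-data; name="{name}"'
        if filename:
            disposition += f'; filename="{filename}"'
        buf.write(f'--{boundary}\r\n{disposition}\r\n\r\n'.encode())
        buf.write(value if isinstance(value, bytes) else value.encode())
        buf.write(b'\r\n')

    part('file', audio_bytes, filename='audio.wav')
    part('model', model)
    if language is not None:
        part('language', language)
    buf.write(f'--{boundary}--\r\n'.encode())
    return f'multipart/form-data; boundary={boundary}', buf.getvalue()

content_type, body = build_transcription_body(b'...wav bytes...')
```

The returned `content_type` and `body` can then be sent with any HTTP client as the Content-Type header and request body.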

Some hacky mappings:

| OpenAI                  | text-generation-webui      | note                                                                                                                                                                                 |
| ----------------------- | -------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| model                   | -                          | Ignored, the model is not changed                                                                                                                                                    |
| frequency_penalty       | encoder_repetition_penalty | this seems to operate with a different scale and defaults, I tried to scale it based on range & defaults, but the results are terrible. hardcoded to 1.18 until there is a better way |
| presence_penalty        | repetition_penalty         | same issues as frequency_penalty, hardcoded to 1.0                                                                                                                                   |
| user                    | -                          | not supported yet                                                                                                                                                                    |
| functions/function_call | -                          | function calls are not supported yet                                                                                                                                                 |

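The hacky mappings above can be sketched as a small translation table. This is an illustrative simplification (`map_params` is hypothetical; the real extension handles many more parameters), showing the hardcoded penalty values from the table:

```python
# Illustrative sketch of the OpenAI -> text-generation-webui parameter
# mapping from the table above. The penalty values are hardcoded per
# the table; other parameters pass through.
def map_params(request):
    return {
        'temperature': request.get('temperature', 1.0),
        'top_p': request.get('top_p', 1.0),
        'encoder_repetition_penalty': 1.18,  # frequency_penalty: hardcoded (see table)
        'repetition_penalty': 1.0,           # presence_penalty: hardcoded (see table)
    }

print(map_params({'temperature': 0.7})['encoder_repetition_penalty'])  # → 1.18
```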

### Applications

Almost everything needs the `OPENAI_API_KEY` and `OPENAI_API_BASE` environment variables set, but there are some exceptions.

| Compatibility | Application/Library    | Website                                            | Notes                                                                                                                     |
| ------------- | ---------------------- | -------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------- |
| ✅❌          | openai-python (v0.25+) | https://github.com/openai/openai-python            | only the endpoints from above are working. OPENAI_API_BASE=http://127.0.0.1:5001/v1                                       |
| ✅❌          | openai-node            | https://github.com/openai/openai-node              | only the endpoints from above are working. environment variables don't work by default, but can be configured (see above) |
| ✅❌          | chatgpt-api            | https://github.com/transitive-bullshit/chatgpt-api | only the endpoints from above are working. environment variables don't work by default, but can be configured (see above) |
| ❌            | guidance               | https://github.com/microsoft/guidance              | logit_bias and logprobs not yet supported                                                                                 |

## Future plans

- better error handling
- model changing, esp. something for swapping loras or embedding models
- consider switching to FastAPI + starlette for SSE (openai SSE seems non-standard)

## Bugs? Feedback? Comments? Pull requests?

To enable debugging and get copious output you can set the `OPENEDAI_DEBUG=1` environment variable.

All are appreciated; please ping @matatonic and I'll try to get back to you as soon as possible.

```python
import os

from extensions.openai.tokens import token_count, token_decode, token_encode
from extensions.openai.utils import debug_msg
from modules import shared

import cgi
import speech_recognition as sr
from pydub import AudioSegment

params = {
    'port': int(os.environ.get('OPENEDAI_PORT')) if 'OPENEDAI_PORT' in os.environ else 5001,
}
```
```python
class Handler(BaseHTTPRequestHandler):
    ...

    @openai_error_handler
    def do_POST(self):
        if '/v1/audio/transcriptions' in self.path:
            r = sr.Recognizer()

            # Parse the multipart form data
            form = cgi.FieldStorage(
                fp=self.rfile,
                headers=self.headers,
                environ={'REQUEST_METHOD': 'POST', 'CONTENT_TYPE': self.headers['Content-Type']}
            )

            audio_file = form['file'].file
            audio_data = AudioSegment.from_file(audio_file)

            # Convert the AudioSegment to raw data
            raw_data = audio_data.raw_data

            # Create an AudioData object for the recognizer
            audio_data = sr.AudioData(raw_data, audio_data.frame_rate, audio_data.sample_width)
            whisper_language = form.getvalue('language', None)
            whisper_model = form.getvalue('model', 'tiny')  # use the model from the form data if it exists, otherwise default to 'tiny'

            transcription = {"text": ""}

            try:
                transcription["text"] = r.recognize_whisper(audio_data, language=whisper_language, model=whisper_model)
            except sr.UnknownValueError:
                print("Whisper could not understand audio")
                transcription["text"] = "Whisper could not understand audio UnknownValueError"
            except sr.RequestError as e:
                print("Could not request results from Whisper", e)
                transcription["text"] = "Whisper could not understand audio RequestError"

            self.return_json(transcription, no_debug=True)
            return

        debug_msg(self.requestline)
        debug_msg(self.headers)
```