Transcribes audio into the input language.
Bearer authentication header of the form `Bearer <token>`, where `<token>` is your auth token.
"openai/gpt-4o-mini"
"gpt-4o-mini"
Key-value pairs of metadata for the request. Use the waystone key to identify the end user associated with the query. All other fields are passed through as metadata on the OpenAI request and are not used by Waystone.
{
  "waystone": "{\"user\": {\"id\": \"user123\", \"metadata\": {\"email\": \"[email protected]\", \"name\": \"John Doe\"}}}"
}
{
  "waystone": "{\"user\": \"user123\", \"group\": {\"id\": \"group123\", \"metadata\": {\"name\": \"Group Name\"}}}"
}
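The waystone value is itself a JSON-encoded string, which is easy to get wrong by hand. A minimal sketch of building it in Python, assuming the user/group field names shown in the examples above (confirm the exact schema against the Waystone docs):

```python
import json

def build_metadata(user_id, email=None, name=None):
    """Build the request metadata dict, JSON-encoding the waystone value.

    Field names (user, id, metadata, email, name) mirror the examples
    in this section; they are assumptions, not a confirmed schema.
    """
    user = {"id": user_id}
    user_meta = {}
    if email:
        user_meta["email"] = email
    if name:
        user_meta["name"] = name
    if user_meta:
        user["metadata"] = user_meta
    # The outer value must be a string, so the nested object is dumped to JSON.
    return {"waystone": json.dumps({"user": user})}

metadata = build_metadata("user123", email="[email protected]", name="John Doe")
```

Note that `json.dumps` handles the quote escaping that appears in the raw examples above.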
For OpenAI: whether or not to store the output of this request with OpenAI.
The audio file object (not file name) to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
The language of the input audio. Supplying the input language in ISO-639-1 format (e.g. `en`) will improve accuracy and latency.
An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.
The format of the transcript output, in one of these options: json, text, srt, verbose_json, or vtt.
Allowed values: json, text, srt, verbose_json, vtt

If set to true, the model response data will be streamed to the client as it is generated. Streaming availability depends on the model provider.
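Putting the parameters above together, a sketch of the form fields for a transcription request. The field names follow this reference; the endpoint path and base URL in the comment are assumptions, not confirmed here:

```python
# Form fields for a transcription request, per the parameters documented above.
# Multipart form values are sent as strings, so the boolean is stringified.
form = {
    "model": "openai/gpt-4o-mini",
    "language": "en",                    # ISO-639-1 code improves accuracy and latency
    "prompt": "Transcript of a support call.",  # optional style/continuation hint
    "response_format": "verbose_json",   # one of: json, text, srt, verbose_json, vtt
    "stream": "false",                   # streaming availability depends on the provider
}

# The audio is sent as a file object, not a file name, e.g. with requests
# (hypothetical base URL and path):
# files = {"file": ("audio.mp3", open("audio.mp3", "rb"), "audio/mpeg")}
# requests.post(f"{BASE_URL}/audio/transcriptions",
#               headers={"Authorization": f"Bearer {token}"},
#               data=form, files=files)
```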