Chat
POST /chat/completions
Returns one or more chat completions for a given prompt
Request
Query Parameters
nice
Setting this to true signals to the API that you intend to be nice to other users by de-prioritizing your request below concurrent ones.
- application/json
Body required
messages object[]required
model
The ID of the model to query. The requested model must be eligible for chat completions.
frequency_penalty
When specified, a positive value will decrease (and a negative value increase) the likelihood of repeating tokens that already appeared in the completion. The penalty is cumulative: the more often a token is mentioned in the completion, the more its probability will decrease. (A sketch contrasting this with presence_penalty follows the presence_penalty entry below.)
Possible values: >= -2 and <= 2
logit_bias
logprobs
When set to true, the model will return the log probabilities of the sampled tokens in the completion.
top_logprobs
When specified, the model will return the log probabilities of the top n tokens in the completion.
Possible values: <= 20
max_tokens
The maximum number of tokens to generate in the completion. The model will stop generating tokens once it reaches this length. The maximum value for this parameter depends on the specific model and the length of the input prompt. When no value is provided, the highest possible value will be used.
Possible values: >= 1
n
The number of completions to generate for each prompt. The model will generate this many completions and return all of them. When no value is provided, one completion will be returned.
Possible values: >= 1
presence_penalty
When specified, a positive value will decrease (and a negative value increase) the likelihood of repeating tokens that already appeared in the completion. Unlike frequency_penalty, this penalty is not cumulative: mentioning a token more than once will not increase the penalty further.
Possible values: >= -2 and <= 2
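The difference between the two penalties can be pictured as the following logit adjustment before sampling. This is a sketch under the usual OpenAI-style definition; the service's exact formula is not documented on this page.

// Sketch only: how frequency_penalty (cumulative) and presence_penalty (flat)
// would typically be subtracted from a token's logit before sampling.
// `count` is how often the token has already appeared in the completion.
static double AdjustLogit(double logit, int count,
                          double frequencyPenalty, double presencePenalty)
{
    logit -= frequencyPenalty * count;                 // grows with every repetition
    logit -= presencePenalty * (count > 0 ? 1 : 0);    // applied once after the first occurrence
    return logit;
}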
This parameter is unsupported and will be rejected.
This parameter is unsupported and will be ignored.
This parameter is unsupported and will be ignored.
stop object
stream
When set to true, the model will transmit all completion tokens as soon as they become available via the server-sent events protocol (see the streaming sketch after stream_options below).
stream_options object
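When stream is set to true, tokens arrive as server-sent events rather than as a single JSON document. Below is a minimal C# sketch of consuming such a stream; it reuses the placeholder URL and token from the request sample at the bottom of this page, and the "[DONE]" end-of-stream sentinel is an assumption carried over from OpenAI-compatible streaming, so it may differ here.

// Sketch: stream a completion and print each chat.completion.chunk payload.
using var client = new HttpClient();
var request = new HttpRequestMessage(HttpMethod.Post, "https://docs.aleph-alpha.com/chat/completions");
request.Headers.Add("Authorization", "Bearer <token>");
request.Content = new StringContent(
    "{ \"model\": \"string\", \"stream\": true, \"messages\": [ { \"role\": \"user\", \"content\": \"Hello\" } ] }",
    null, "application/json");

// ResponseHeadersRead lets us read the body incrementally as events arrive.
using var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
response.EnsureSuccessStatusCode();
using var reader = new StreamReader(await response.Content.ReadAsStreamAsync());

while (await reader.ReadLineAsync() is { } line)
{
    if (!line.StartsWith("data: ")) continue;   // skip blank lines between events
    var payload = line["data: ".Length..];
    if (payload == "[DONE]") break;             // assumed end-of-stream sentinel
    Console.WriteLine(payload);                 // one chat.completion.chunk JSON object
}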
temperature
Controls the randomness of the model. Lower values will make the model more deterministic and higher values will make it more random. Mathematically, the temperature is used to divide the logits before sampling; a temperature of 0 will always return the most likely token. When no value is provided, the default value of 1 will be used.
Possible values: <= 2
"nucleus" parameter to dynamically adjust the number of choices for each predicted token based on the cumulative probabilities. It specifies a probability threshold, below which all less likely tokens are filtered out.
When no value is provided, the default value of 1 will be used.
Possible values: <= 1
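Under the definitions given above, temperature and top_p can be pictured as the following transformation of the raw logits. This is a sketch only, not the service's actual implementation; a temperature of 0 is treated as greedy (argmax) selection rather than an actual division.

// Sketch: temperature divides the logits before softmax; top_p keeps the
// smallest set of tokens whose cumulative probability reaches the threshold.
static double[] SoftmaxWithTemperature(double[] logits, double temperature)
{
    var scaled = logits.Select(l => l / temperature).ToArray();
    var max = scaled.Max();                                   // for numerical stability
    var exps = scaled.Select(l => Math.Exp(l - max)).ToArray();
    var sum = exps.Sum();
    return exps.Select(e => e / sum).ToArray();
}

static int[] NucleusFilter(double[] probs, double topP)
{
    var ranked = probs.Select((p, i) => (p, i)).OrderByDescending(t => t.p);
    var kept = new List<int>();
    double cumulative = 0;
    foreach (var (p, i) in ranked)
    {
        kept.Add(i);
        cumulative += p;
        if (cumulative >= topP) break;   // everything less likely is filtered out
    }
    return kept.ToArray();               // indices of tokens eligible for sampling
}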
steering_concepts
Specifies how the output of the model should be steered. This steers the output in the direction given by the positive examples associated with the steering concept and away from the negative examples.
Possible values: Value must match regular expression ^_worker/[a-zA-Z0-9-_]{1,64}$
[]
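For illustration, a hypothetical request body fragment with a concept ID that matches the documented pattern; the concept name below is an assumption and must correspond to a steering concept that actually exists for the worker.

// Hypothetical ID; real values must match ^_worker/[a-zA-Z0-9-_]{1,64}$
var body = new
{
    model = "string",
    messages = new[] { new { role = "user", content = "Write a product description." } },
    steering_concepts = new[] { "_worker/my-steering-concept" }
};
var json = System.Text.Json.JsonSerializer.Serialize(body);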
This parameter is unsupported and will be rejected.
This parameter is unsupported and will be rejected.
This parameter is unsupported and will be rejected.
This parameter is unsupported and will be ignored.
Responses
- 200
OK
- application/json
Schema (array of chat completion objects)
id
An ID that is unique throughout the given request. When multiple chunks are returned using server-sent events, this ID will be the same for all of them.
choices object[]
created
The Unix timestamp (in seconds) of when the chat completion was created.
model
The ID of the model that generated the completion.
system_fingerprint
The specific version of the model that generated the completion. This field can be used to track inconsistencies between calls to different deployments of otherwise identical models. When streaming is enabled, the value is only set in the last chunk of a completion and null otherwise.
object
Will be chat.completion by default and chat.completion.chunk when streaming is enabled.
Possible values: [chat.completion, chat.completion.chunk]
usage object
[
{
"id": "string",
"choices": [
{
"finish_reason": "stop",
"index": 0,
"message": {
"role": "assistant",
"content": "string"
},
"delta": {
"role": "assistant",
"content": "string"
},
"logprobs": {
"content": [
{
"token": "string",
"logprob": 0,
"bytes": [
0
],
"top_logprobs": [
{
"token": "string",
"logprob": 0,
"bytes": [
0
]
}
]
}
]
}
}
],
"created": 0,
"model": "string",
"system_fingerprint": "string",
"object": "chat.completion",
"usage": {
"completion_tokens": 0,
"prompt_tokens": 0,
"total_tokens": 0
}
}
]
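A minimal sketch of reading such a response in C# with System.Text.Json; the record types below cover only the fields shown in the example above.

using System.Text.Json;
using System.Text.Json.Serialization;

// responseBody is the raw JSON returned by the request
// (e.g. from the C# sample at the bottom of this page).
string responseBody = "[]";

// The documented 200 response is an array of chat completion objects.
var completions = JsonSerializer.Deserialize<List<ChatCompletion>>(responseBody)!;
foreach (var completion in completions)
{
    Console.WriteLine(completion.Choices[0].Message.Content);
    Console.WriteLine($"total tokens: {completion.Usage?.TotalTokens}");
}

record ChatCompletion(
    [property: JsonPropertyName("id")] string Id,
    [property: JsonPropertyName("choices")] List<Choice> Choices,
    [property: JsonPropertyName("model")] string Model,
    [property: JsonPropertyName("usage")] Usage? Usage);

record Choice(
    [property: JsonPropertyName("finish_reason")] string FinishReason,
    [property: JsonPropertyName("index")] int Index,
    [property: JsonPropertyName("message")] Message Message);

record Message(
    [property: JsonPropertyName("role")] string Role,
    [property: JsonPropertyName("content")] string Content);

record Usage(
    [property: JsonPropertyName("completion_tokens")] int CompletionTokens,
    [property: JsonPropertyName("prompt_tokens")] int PromptTokens,
    [property: JsonPropertyName("total_tokens")] int TotalTokens);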
Authorization: http
name: token
type: http
scheme: bearer
description: Can be generated in your [Aleph Alpha profile](https://app.aleph-alpha.com/profile)
Request sample (C#, HttpClient):
// Build an authenticated POST request to the chat completions endpoint.
var client = new HttpClient();
var request = new HttpRequestMessage(HttpMethod.Post, "https://docs.aleph-alpha.com/chat/completions");
request.Headers.Add("Accept", "application/json");
request.Headers.Add("Authorization", "Bearer <token>");
// JSON request body with placeholder values for every supported parameter.
var content = new StringContent("{\n \"messages\": [\n {\n \"role\": \"system\",\n \"content\": \"string\"\n }\n ],\n \"model\": \"string\",\n \"frequency_penalty\": 0,\n \"logprobs\": true,\n \"top_logprobs\": 0,\n \"max_tokens\": 0,\n \"n\": 0,\n \"presence_penalty\": 0,\n \"stop\": \"string\",\n \"stream\": true,\n \"stream_options\": {\n \"include_usage\": true\n },\n \"temperature\": 0,\n \"top_p\": 0,\n \"steering_concepts\": [\n \"string\"\n ]\n}", null, "application/json");
request.Content = content;
// Send the request and print the JSON response body.
var response = await client.SendAsync(request);
response.EnsureSuccessStatusCode();
Console.WriteLine(await response.Content.ReadAsStringAsync());