The PhariaInference API
The PhariaInference API allows you to access and interact with Aleph Alpha models and PhariaAI functionality.
The API operates across the entire PhariaAI stack, providing programmatic access to the backends of the various Aleph Alpha products.
See also the PhariaOS operations manual.
Authentication
To use the API, you need an authentication token, which you create in PhariaOS. See Managing credentials and tokens.
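For illustration, the following Python sketch shows how a request authenticates with the token. The base URL and the environment variable name are placeholders for your own deployment; the `models_available` path is taken from the endpoint table below.

```python
import os

import requests

# Placeholder base URL; replace with your PhariaAI deployment's inference API.
BASE_URL = "https://pharia.example.com/inference"

# Read the token created in PhariaOS from the environment instead of
# hard-coding it. The variable name is an arbitrary choice.
TOKEN = os.environ["PHARIA_AI_TOKEN"]

# Every request authenticates with a standard bearer token header.
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

# Example: list the models available to this client.
response = requests.get(f"{BASE_URL}/models_available", headers=HEADERS)
response.raise_for_status()
for model in response.json():
    print(model)
```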
Endpoints
The API endpoints for PhariaInference are the following. Example requests for several of these endpoints follow the table.
| Endpoint | Description |
|---|---|
| `models_available` | Use this endpoint to list the models that are available to the client. |
| `complete` | Use this endpoint to complete a prompt using a specific model. |
| `complete` (JSON mode) | Use this endpoint to generate a completion in valid JSON format, even if the prompt does not request this explicitly. JSON completion is currently only available for Luminous workers. |
| `chat/completions` | Use this endpoint to retrieve one or more chat completions for a given prompt. It generates completions in a conversational style and supports multi-turn conversations, that is, conversations that contain follow-up questions to intermediate responses. |
| `embed` | Use this endpoint to embed a text using a specific model. The result is a set of vectors that can be used for downstream tasks (such as semantic similarity) or models (such as classifiers). See also Embedding. |
| `semantic_embed` | Use this endpoint to embed a prompt using a specific model and semantic embedding method. See also Embedding. |
| `batch_semantic_embed` | Use this endpoint to embed multiple prompts using a specific model and semantic embedding method. |
| `instructable_embed` | Use this endpoint to embed the input using an instruction and a specific model. |
| `evaluate` | Use this endpoint to evaluate the probability that the model will produce an expected completion given a prompt. This is useful if you already know the output you expect, or you want to test the probability of a given output. |
| `explain` | Use this endpoint to better understand the source of a completion. The endpoint returns how much the log-probabilities of the generated completion would change if individual parts of the prompt (at a configurable granularity) were suppressed. This reveals how much each section of a prompt impacts each token of the completion. See also Explainability. |
| `tokenize` | Use this endpoint to tokenize a prompt for a specific model. |
| `detokenize` | Use this endpoint to detokenize a list of tokens into a string. |
| `translate` | Use this endpoint to translate input text from one language to a specified target language. For a list of supported languages, see the documentation for your selected model. |
| `transcribe` | Use this endpoint to transcribe an audio file using a specified transcription model. |
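The sketches below continue the example from the Authentication section and reuse its `BASE_URL` and `HEADERS`. First, a minimal `complete` request; the field names follow the public Aleph Alpha completion API, the model name is a placeholder, and you should check your deployment's API reference for the exact schema.

```python
# Minimal completion request; model name and field values are placeholders.
payload = {
    "model": "luminous-base",    # any model reported by models_available
    "prompt": "An apple a day",  # text for the model to continue
    "maximum_tokens": 32,        # upper bound on generated tokens
}
response = requests.post(f"{BASE_URL}/complete", headers=HEADERS, json=payload)
response.raise_for_status()
# Each requested completion arrives as one entry in "completions".
print(response.json()["completions"][0]["completion"])
```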
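A chat completion request takes a list of messages instead of a raw prompt, which is how multi-turn conversations are expressed: append each assistant reply and the follow-up question to the list before the next call. Again a sketch under the same assumptions:

```python
# Multi-turn conversations grow the "messages" list turn by turn.
payload = {
    "model": "pharia-1-llm-7b-control",  # placeholder chat-capable model
    "messages": [
        {"role": "user", "content": "Name three Aleph Alpha products."},
    ],
}
response = requests.post(
    f"{BASE_URL}/chat/completions", headers=HEADERS, json=payload
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```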
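For semantic embeddings, the request additionally names the embedding representation. The `representation` values sketched here (`symmetric` for similarity between comparable texts, `document` and `query` for retrieval) follow the public semantic embedding API and may differ in your deployment:

```python
payload = {
    "model": "luminous-base",
    "prompt": "The quick brown fox jumps over the lazy dog.",
    "representation": "symmetric",  # suited to text-to-text similarity
}
response = requests.post(
    f"{BASE_URL}/semantic_embed", headers=HEADERS, json=payload
)
response.raise_for_status()
vector = response.json()["embedding"]
print(f"embedding dimension: {len(vector)}")
```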
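The `evaluate` endpoint compares a prompt against a completion you already expect, rather than generating new text. A sketch, assuming the `completion_expected` field of the public evaluation API:

```python
payload = {
    "model": "luminous-base",
    "prompt": "The capital of France is",
    "completion_expected": " Paris",  # the output whose likelihood we test
}
response = requests.post(f"{BASE_URL}/evaluate", headers=HEADERS, json=payload)
response.raise_for_status()
# The result reports how likely the model finds the expected completion,
# e.g. as log-probabilities.
print(response.json()["result"])
```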
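Finally, `tokenize` and `detokenize` are inverses of each other, which the following round trip illustrates; the request and response field names are assumptions based on the public tokenization API:

```python
# Tokenize a string for a specific model ...
payload = {
    "model": "luminous-base",
    "prompt": "Hello, world!",
    "tokens": True,     # return the token strings
    "token_ids": True,  # return the numeric token ids
}
response = requests.post(f"{BASE_URL}/tokenize", headers=HEADERS, json=payload)
response.raise_for_status()
token_ids = response.json()["token_ids"]

# ... then feed the ids back through detokenize to recover the text.
payload = {"model": "luminous-base", "token_ids": token_ids}
response = requests.post(f"{BASE_URL}/detokenize", headers=HEADERS, json=payload)
response.raise_for_status()
print(response.json()["result"])  # the reconstructed string
```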