📄️ Introduction
This documentation describes the inference API of PhariaOS.
📄️ Attention Manipulation (AtMan)
AtMan is our method for manipulating the attention paid to part of an input sequence (a token, a word, or a sentence) in order to steer the model's prediction in a different contextual direction.
📄️ Explainability
In the previous section, we explained how you can steer the attention of our models and either suppress or amplify parts of the input sequence.
📄️ Steering
Large language models (LLMs) generate text based on patterns that they have learned from vast amounts of data. In many use cases, however, we need to influence how the LLMs respond.
🗃️ Endpoints
7 items
📄️ Multimodality
Multimodal capabilities allow large language models to process and understand multiple types of
📄️ Structured Output with Chat Completions
Introduction
📄️ Embeddings
Embeddings are dense vector representations of text that capture semantic meaning, enabling machines to understand
📄️ Tool Calling
Introduction
📄️ Troubleshooting
Describing infrastructure-related issues (e.g. network issues, Kubernetes scheduling issues) is beyond the scope of this part of the documentation.