PhariaInference
PhariaInference is a core component of the PhariaAI stack that runs tasks against large language models (LLMs) in a structured and controlled manner.
Tasks are implemented using the PhariaInference SDK, which relies on Pydantic for type-safe serialization and deserialization of inputs and outputs to and from JSON.
PhariaInference is designed to integrate with various LLMs and is used within the broader PhariaAI ecosystem for developing, deploying, and managing enterprise-grade generative AI applications.
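As a rough illustration of the Pydantic-based task boundary described above, the sketch below defines input and output models for a hypothetical summarization task. The class names and task shape here are illustrative assumptions, not the actual PhariaInference SDK API; only the Pydantic serialization mechanics are real.

```python
from pydantic import BaseModel


# Hypothetical input/output models for a summarization task.
# The real PhariaInference SDK task interface may look different;
# this only demonstrates the type-safe JSON (de)serialization layer.
class SummaryInput(BaseModel):
    text: str
    max_words: int = 50


class SummaryOutput(BaseModel):
    summary: str


# Incoming JSON is validated and parsed into a typed input model...
raw = '{"text": "PhariaInference runs LLM tasks.", "max_words": 10}'
task_input = SummaryInput.model_validate_json(raw)

# ...and the task result is serialized back to JSON the same way.
task_output = SummaryOutput(summary=task_input.text)
print(task_output.model_dump_json())
```

Because both sides of the task boundary are Pydantic models, malformed input fails validation with a clear error instead of propagating into the task logic.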
With PhariaInference, you can:
- Call tools to let AI models interact with external systems and APIs in a structured way.
- Use embeddings for use cases such as semantic search, text similarity, fraud detection, clustering, and classification.
- Produce structured output by applying JSON schemas to chat completions.
- Rerank a list of items against a query to improve the accuracy of search results.
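For the structured-output capability, a JSON schema constrains what the model may return, and the response can then be parsed into a typed object. The sketch below uses Pydantic to derive such a schema and validate a simulated model reply; the `Invoice` model and the response text are made-up examples, and the actual mechanism for attaching a schema to a PhariaInference chat completion may differ.

```python
from pydantic import BaseModel


# Hypothetical target shape for a structured chat-completion response.
class Invoice(BaseModel):
    vendor: str
    total: float


# The JSON schema that would constrain the model's output.
schema = Invoice.model_json_schema()
print(schema["required"])

# Validate a simulated model response against the typed model.
response_text = '{"vendor": "ACME", "total": 99.5}'
invoice = Invoice.model_validate_json(response_text)
print(invoice.vendor, invoice.total)
```

The benefit of this pattern is that downstream code works with validated, typed fields rather than re-parsing free-form model text.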
In this section: