Configuring steering

Large language models (LLMs) generate text based on patterns that they have learned from vast amounts of data. In many use cases, however, we need to influence how the LLMs respond.

Steering is a technique that nudges a model’s responses in a particular direction without changing the model itself. Instead of describing the desired change in the prompt, which takes up valuable context space, steering works by identifying underlying patterns in the model’s internal representations.

By providing a steering concept consisting of a set of positive and negative examples, we can compute a direction that subtly guides the model’s responses towards the desired style or behaviour. For example, we can coax it to speak more formally, use slang, or adopt a specific tone. This approach offers an efficient and flexible way to control LLMs' responses in a user-defined manner.
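To make the idea of "computing a direction" concrete, here is a minimal sketch. It assumes a difference-of-means approach, which is one common way to derive a steering direction; this page does not state how PhariaInference actually computes it. The function names and the toy three-dimensional vectors are hypothetical stand-ins for real internal representations.

```python
from statistics import fmean


def mean_vector(vectors):
    """Average a list of equal-length vectors component-wise."""
    return [fmean(component) for component in zip(*vectors)]


def steering_direction(positive, negative):
    """Return the mean positive representation minus the mean negative one.

    Assumption: difference of means is used purely for illustration;
    the real computation may differ.
    """
    pos_mean = mean_vector(positive)
    neg_mean = mean_vector(negative)
    return [p - n for p, n in zip(pos_mean, neg_mean)]


def apply_steering(hidden_state, direction, strength=1.0):
    """Shift a representation along the steering direction."""
    return [h + strength * d for h, d in zip(hidden_state, direction)]


# Toy 3-dimensional "representations" of positive and negative examples.
positive_examples = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
negative_examples = [[0.0, 1.0, 2.0], [0.0, 3.0, 2.0]]

direction = steering_direction(positive_examples, negative_examples)
steered = apply_steering([0.5, 0.5, 0.5], direction, strength=0.5)
```

The model's weights are never touched in this sketch: only the representation is shifted, and the hypothetical `strength` parameter controls how strongly the response is pushed towards the positive examples.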

There are two different methods for defining steering concepts:

  • User-defined steering concepts, which you create using the PhariaInference HTTP API. This method is enabled by default for llama-3.1-8b-instruct.
  • Worker-defined steering concepts, whose examples are loaded from text files during worker startup. Please note that all worker-defined steering concepts must be prefixed with _worker/ in completion and chat requests.
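The prefix rule for worker-defined concepts can be sketched as a small client-side helper. The function name and the example concept name are hypothetical. The helper only builds the name string; it makes no assumption about the request field in which the name is sent, and it leaves user-defined names unchanged.

```python
WORKER_PREFIX = "_worker/"


def qualify_concept_name(name, worker_defined=False):
    """Return the concept name as it should appear in a request.

    Worker-defined concepts get the "_worker/" prefix (added only once);
    other names are returned unchanged.
    """
    if worker_defined and not name.startswith(WORKER_PREFIX):
        return WORKER_PREFIX + name
    return name


# "formal" is a made-up concept name used only for illustration.
worker_name = qualify_concept_name("formal", worker_defined=True)
```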

Creating and using steering concepts in completion and chat requests is described on the Steering page.

The following pages describe how to enable user-defined steering concepts for other models and how to use the obsolete approach of creating steering concepts at the worker level.