

Steering Configuration

Large language models (LLMs) generate text based on patterns learned from vast amounts of data, but sometimes we want to influence how they respond. Steering is a technique that nudges a model's responses in a particular direction without changing the model itself. Instead of manually inserting examples into the prompt, which takes up valuable context space, this method works by identifying underlying patterns in the model's internal representations. By providing a set of positive and negative examples, we can compute a direction that subtly guides the model's responses toward the desired style or behavior, such as making it speak more formally, use slang, or adopt a specific tone. This approach offers an efficient and flexible way to control LLM responses in a user-defined manner. Note that this is an experimental feature that will be refined in future versions and is currently only supported for the llama-3.1-8b-instruct model.
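The idea above can be sketched numerically. A common way to compute such a direction (not necessarily the exact method this feature uses internally) is the difference of means: average the model's internal activations over the positive examples, average them over the negative examples, and subtract. The function names and the toy 4-dimensional vectors below are illustrative assumptions, standing in for real hidden states:

```python
import numpy as np

def steering_vector(pos_acts, neg_acts):
    """Difference-of-means direction: mean positive activation
    minus mean negative activation. Illustrative, not the product API."""
    pos_acts = np.asarray(pos_acts, dtype=float)
    neg_acts = np.asarray(neg_acts, dtype=float)
    return pos_acts.mean(axis=0) - neg_acts.mean(axis=0)

def apply_steering(hidden, direction, strength=1.0):
    """Nudge a hidden state along the steering direction;
    strength scales how strongly generation is biased."""
    return hidden + strength * direction

# Toy activations standing in for hidden states collected from the
# positive ("desired style") and negative ("undesired style") examples.
pos = [[1.0, 0.0, 0.0, 0.0], [0.8, 0.2, 0.0, 0.0]]
neg = [[0.0, 1.0, 0.0, 0.0], [0.2, 0.8, 0.0, 0.0]]

v = steering_vector(pos, neg)             # [0.8, -0.8, 0.0, 0.0]
steered = apply_steering(np.zeros(4), v)  # hidden state nudged toward the positive style
```

During generation, the same vector is added to the model's hidden states at every step, so the bias costs no prompt tokens, unlike in-context examples.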