
Set up worker-defined steering concepts

Adding a steering concept to the worker configuration

Before steering can be applied to model responses, you must first define and configure the desired steering concepts. This involves two key steps:

  • Creating the steering configuration – Define the necessary files and organize them in a new folder.

  • Updating the worker configuration – Modify the Helm chart to reference the newly created steering configuration.

At a high level, these steps ensure that the steering configuration is properly loaded during the worker’s startup phase. As a result, whenever a new steering concept is added or an existing one is modified, the corresponding workers must be restarted for the changes to take effect.
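
If the worker pods are already running, such a restart can be triggered with a rollout restart, for example as sketched below; the deployment name is a placeholder, so substitute the one used by your installation:

kubectl rollout restart deployment <worker-deployment-name> -n <pharia-ai-install-namespace>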

This document describes how to create your own custom steering concepts. Using these concepts in completion and chat requests is described on the Steering page.

1. Creating the steering configuration

Let's explore this in practice by creating a slang steering concept. This concept will be referred to as _worker/slang during completion and chat requests, as we define it at the worker level.
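
As a preview of the request side, a completion call might reference the concept as in the sketch below. The endpoint path, headers, and the steering_concepts field are assumptions here; treat the Steering page as authoritative for the exact request format:

curl https://<inference-api-host>/complete \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama-3.1-8b-instruct", "prompt": "Give me some feedback.", "maximum_tokens": 64, "steering_concepts": ["_worker/slang"]}'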

Note that all steering concept names must match the regex ^[a-zA-Z0-9-_]{1,64}$, which ensures that the entire string is 1 to 64 characters long and contains only letters, digits, hyphens, and underscores.
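
A candidate name can be sanity-checked against this pattern, for example with grep (the dash is moved to the end of the bracket expression, which denotes the same character set):

echo "my-concept_1" | grep -qE '^[A-Za-z0-9_-]{1,64}$' && echo valid || echo invalid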

First, we create a new directory to store the steering configuration: steering/.

This directory will contain the following files:

  • config.yml
  • slang-negative.txt
  • slang-positive.txt

A steering concept is defined by specifying a set of paired negative and positive examples. In this case, the negative examples might be very formal phrases, while the positive examples would be their paraphrased slang counterparts. The examples on a given line of the two files must correspond semantically; that is, both files must be ordered the same way, with line N of one file pairing with line N of the other.
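
Because this pairing is positional, it is worth confirming that both files contain the same number of lines and spot-checking a few pairs side by side, for example:

wc -l slang-negative.txt slang-positive.txt
paste slang-negative.txt slang-positive.txt | head -n 3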

The config.yml file

This file defines the steering strength and registered concepts. The strength parameter must be set to 0.062 for the llama-3.1-8b-instruct model.

{
  "version": ".unknown.",
  "steering_config": {
    "strength": 0.062,
    "concepts": ["slang"] # List[string]
  }
}

The slang-negative.txt file

This file contains a set of N counterexamples that we want to steer the model away from. For example:

I appreciate your valuable feedback on this matter.
The financial projections indicate significant growth potential.
Please ensure all documentation is submitted by the deadline.
The restaurant's ambiance was sophisticated and the cuisine was exceptional.
Your assistance in this project has been invaluable.
The meteorological forecast predicts inclement weather conditions.
This academic paper requires substantial revision before submission.
The theatrical performance exceeded all expectations.
I encountered significant traffic during my morning commute.
The cellular device appears to be malfunctioning.
Our quarterly earnings have surpassed initial projections.
The real estate market shows signs of increasing stability.
Your conduct during the meeting was highly unprofessional.
I require additional time to complete this assignment.
The automobile requires immediate maintenance attention.
This establishment's customer service is subpar.
I must depart from this gathering immediately.
The compensation package appears quite competitive.
Your argumentative position lacks sufficient evidence.
The social gathering was extremely enjoyable.
Please coordinate with relevant stakeholders regarding this matter.
The apartment's condition has deteriorated significantly.
I found the cinematographic experience rather disappointing.
Your sartorial choices are quite impressive today.
The technological interface requires optimization.
This culinary creation is absolutely magnificent.
The musical composition was incredibly moving.
Please refrain from excessive noise after designated quiet hours.
The romantic relationship has reached its natural conclusion.
The examination results were less than satisfactory.

The slang-positive.txt file

This file contains a set of N examples that showcase the desired style or theme we want the model to follow. For example:

Thanks for the real talk, fam.
Yo, these numbers are looking mad stacked!
Get those papers in ASAP or it's gonna be big yikes.
That spot was mad fancy and the food was straight fire!
You're the real MVP on this one, no cap.
Heads up, weather's gonna be straight trash.
This paper needs major work before it's gucci.
That show was bussin' fr fr!
Traffic was stupid thick this morning, ngl.
My phone's acting mad sus rn.
We're making bank, way more than we thought!
The housing game's finally chilling out.
You were wildin' in that meeting, not gonna lie.
I need a min to get this done, dawg.
Whip's acting up, needs a mechanic ASAP.
This place's service be straight garbage.
Gotta bounce, no cap.
This bag they're offering is pretty lit.
Your receipts ain't adding up, fam.
That party was straight vibing!
Hit up the crew about this real quick.
This crib's gotten mad sketchy.
That movie was mid af.
You drippin' today, no cap!
This app needs mad work fr fr.
This food slaps so hard!
That track hit different, on god.
Keep it down after hours or it's gonna be beef.
We ain't a thing no more, it's done done.
These grades ain't it, chief.

Using additional steering concepts

To add another steering concept, such as formalgerman, modify the concepts field in config.yml to be ["slang", "formalgerman"] (see the example after this list). Additionally, create two new files:

  • formalgerman-negative.txt - containing a collection of informal phrases in German.
  • formalgerman-positive.txt - containing a collection of formal phrases in German.
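
With this change, config.yml would look as follows (keeping the same strength value):

{
  "version": ".unknown.",
  "steering_config": {
    "strength": 0.062,
    "concepts": ["slang", "formalgerman"]
  }
}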

2. Updating the worker configuration

We now create a ConfigMap from the newly created directory containing all the files.

kubectl create configmap steering-llama-3-1-8b-instruct --from-file=steering -n <pharia-ai-install-namespace>
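
To verify that all three files ended up in the ConfigMap, you can inspect it, for example:

kubectl describe configmap steering-llama-3-1-8b-instruct -n <pharia-ai-install-namespace>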

As outlined here, we would then need to overwrite inference-worker.checkpoints in values.yaml and reference the newly created ConfigMap:

inference-worker:
  checkpoints:
    ...
    - generator:
        type: "luminous"
        pipeline_parallel_size: 1
        tensor_parallel_size: 1
        tokenizer_path: "llama-3.1-8b-instruct/tokenizer.json"
        weight_set_directories: ["llama-3.1-8b-instruct"]
      queue: "llama-3.1-8b-instruct"
      replicas: 1
      modelVolumeClaim: "models-llama-3.1-8b-instruct"
      version: 0
      steeringConfigMap: "steering-llama-3-1-8b-instruct" # <-- new line
  models:
    llama-3.1-8b-instruct:
      ...

The Helm chart must now be redeployed for the changes to take effect and for the worker to read these files during its startup phase.
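
For example, the redeploy might look like the following; the release name and chart reference are placeholders for your installation:

helm upgrade <release-name> <pharia-ai-chart> -f values.yaml -n <pharia-ai-install-namespace>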