Create Examples for an evaluation dataset

An Example is a single item of an evaluation dataset. Each Example contains an input that a user could provide and the expected output that the AI logic should return. In real-world applications, Examples are typically collected from annotation tools or from validated user feedback.

Prerequisites for implementing a task

Make sure you have a Poetry project with Intelligence Layer as a dependency, as explained here.

Add the necessary dependencies

from pydantic import BaseModel  # base class for the input and output types below
from intelligence_layer.evaluation import Example

Create a list of examples

In How to implement a task, we created the input and output types for our task. They are repeated here for convenience.

class TellAJokeTaskInput(BaseModel):
    topic: str

class TellAJokeTaskOutput(BaseModel):
    joke: str

With these types defined, we can now create our list of examples as follows:

examples = [
    Example(
        input=TellAJokeTaskInput(topic="******"),
        expected_output=TellAJokeTaskOutput(joke="@@@@@"),
        metadata={
            "author": "Shakespeare"
        }
    ),
    # ...
]

As shown, each example can carry arbitrary metadata, which can help when running or analyzing evaluations.
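One way metadata might be used is to select a subset of examples before an evaluation run. The following is a minimal sketch under stated assumptions, not part of the Intelligence Layer API. To stay runnable without the library installed, it uses a hypothetical dataclass `ExampleStandIn` that mirrors only the three fields used above (`input`, `expected_output`, `metadata`). The helper `filter_by_metadata` and the joke texts are likewise made up for illustration. With the real `Example`, the same list comprehension should carry over, assuming `metadata` is readable as an optional dictionary attribute.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class JokeInput:
    topic: str


@dataclass
class JokeOutput:
    joke: str


@dataclass
class ExampleStandIn:
    """Hypothetical stand-in mirroring the fields of Example used above."""

    input: JokeInput
    expected_output: JokeOutput
    metadata: Optional[dict[str, Any]] = None


def filter_by_metadata(
    examples: list[ExampleStandIn], key: str, value: Any
) -> list[ExampleStandIn]:
    """Keep only the examples whose metadata maps `key` to `value`."""
    # Examples without metadata are treated as having an empty dictionary.
    return [e for e in examples if (e.metadata or {}).get(key) == value]


# Placeholder data for illustration only.
examples = [
    ExampleStandIn(
        input=JokeInput(topic="cats"),
        expected_output=JokeOutput(joke="placeholder joke A"),
        metadata={"author": "Shakespeare"},
    ),
    ExampleStandIn(
        input=JokeInput(topic="dogs"),
        expected_output=JokeOutput(joke="placeholder joke B"),
        metadata={"author": "Anonymous"},
    ),
    ExampleStandIn(
        input=JokeInput(topic="birds"),
        expected_output=JokeOutput(joke="placeholder joke C"),
    ),
]

shakespeare_examples = filter_by_metadata(examples, "author", "Shakespeare")
print([e.input.topic for e in shakespeare_examples])
```

Keeping the filter as a plain function over a list means it does not depend on how the examples were created or where they are stored later.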