Creating examples for an evaluation dataset

An example is a single entry in an evaluation dataset. Each example contains an input that a user might provide and the output that the AI logic is expected to return for that input.

In real-world applications, these examples are collected from annotation tools or from validated user feedback.


Prerequisites

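Ensure your project has the PhariaInference SDK as a dependency, as explained in Adding PhariaAI SDKs to your project. The code on this page imports from pharia_studio_sdk, so make sure that package is available in your environment.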

Add required dependencies

from pydantic import BaseModel  # base class for the input and output types used on this page

from pharia_studio_sdk.evaluation import Example

Create a list of examples

In Implementing a simple task, we defined the input and output types for our task. They are repeated here for convenience:

class TellAJokeTaskInput(BaseModel):
    topic: str

class TellAJokeTaskOutput(BaseModel):
    joke: str

With these types defined, we can now create our list of examples as follows:

examples = [
    Example(
        input=TellAJokeTaskInput(topic="******"),
        expected_output=TellAJokeTaskOutput(joke="@@@@@"),
        metadata={
            "author": "Shakespeare"
        }
    ),
    # ...
]

You can attach any metadata to an example that helps with evaluation, such as the author recorded in the example above.
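When examples come from an annotation tool or from validated feedback, the list is usually built from collected records rather than written by hand. The following is a minimal sketch of that idea, not the SDK's own loading mechanism. To keep it runnable without the SDK or pydantic installed, it uses plain dataclasses as stand-ins for `Example` and the two task types; the record layout, the `validated` flag, and the helper names `to_examples` and `group_by_author` are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


# Stand-ins so this sketch runs on its own. In a real project, use the
# pydantic models and the SDK's Example class shown earlier on this page.
@dataclass
class TellAJokeTaskInput:
    topic: str


@dataclass
class TellAJokeTaskOutput:
    joke: str


@dataclass
class Example:
    input: TellAJokeTaskInput
    expected_output: TellAJokeTaskOutput
    metadata: dict[str, Any] = field(default_factory=dict)


# Hypothetical records, shaped like an export from an annotation tool.
records = [
    {"topic": "cats", "joke": "<reference joke 1>", "author": "alice", "validated": True},
    {"topic": "dogs", "joke": "<reference joke 2>", "author": "bob", "validated": False},
    {"topic": "birds", "joke": "<reference joke 3>", "author": "alice", "validated": True},
]


def to_examples(rows: list[dict[str, Any]]) -> list[Example]:
    """Turn validated records into examples, keeping the author as metadata."""
    return [
        Example(
            input=TellAJokeTaskInput(topic=row["topic"]),
            expected_output=TellAJokeTaskOutput(joke=row["joke"]),
            metadata={"author": row["author"]},
        )
        for row in rows
        if row["validated"]  # skip records that were never validated
    ]


def group_by_author(items: list[Example]) -> dict[str, list[Example]]:
    """Group examples by the 'author' metadata key."""
    groups: dict[str, list[Example]] = {}
    for item in items:
        groups.setdefault(item.metadata.get("author", "unknown"), []).append(item)
    return groups


examples = to_examples(records)
by_author = group_by_author(examples)
print({author: len(group) for author, group in by_author.items()})
```

Keeping provenance such as the author in `metadata`, rather than in the input or expected output, leaves the task types unchanged while still allowing the dataset to be sliced later.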