Storing an evaluation dataset in PhariaStudio

In this article:

Prerequisites
Add required dependencies
Submit the dataset using code
View the dataset in PhariaStudio
Upload the dataset using the PhariaStudio portal

Prerequisites

Before you create and store an evaluation dataset, follow the instructions in Creating examples for an evaluation dataset to create a list of examples.

Add required dependencies

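# Client that connects to a PhariaStudio project
from pharia_studio_sdk.connectors import StudioClient
# Example type and the repository that stores datasets in PhariaStudio
from pharia_studio_sdk.evaluation import (
    Example,
    StudioDatasetRepository,
)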

Submit the dataset using code

Initialise the PhariaStudio client and link it to an existing project (here, "Test Evaluation"):

studio_client = StudioClient("Test Evaluation")

Once the client is initialised and points to a project, you can submit the dataset:

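# Repository that stores datasets in the project the client points to
studio_dataset_repo = StudioDatasetRepository(studio_client=studio_client)

# `examples` is the list of examples created in the prerequisite step
studio_dataset = studio_dataset_repo.create_dataset(
    examples=examples,
    dataset_name="Jokes",
    metadata={"description": "This is an extensive list of jokes"},
)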

View the dataset in PhariaStudio

After you submit the dataset, you can view it in the Datasets section of PhariaStudio.

You need the dataset ID to create a benchmark object. To copy the ID, click the kebab menu icon and select Copy ID.

Click a row in the Datasets table to display the content of that dataset.

Upload the dataset using the PhariaStudio portal

The dataset file must be in the JSON Lines format (one JSON object per line), and each example must contain a unique ID.
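
As a rough illustration of the file format described above, the following sketch writes a list of records to a JSON Lines file using only the Python standard library. The field names `input` and `expected_output`, the helper `write_jsonl`, and the file name `jokes.jsonl` are assumptions made for this example, not a confirmed PhariaStudio schema; only the unique `id` per example comes from the requirement stated above.

```python
import json
import tempfile
import uuid
from pathlib import Path


def write_jsonl(records, path):
    """Write one JSON object per line, adding a unique ID where one is missing.

    Hypothetical helper for illustration; it is not part of the PhariaStudio SDK.
    """
    seen_ids = set()
    with Path(path).open("w", encoding="utf-8") as handle:
        for record in records:
            row = dict(record)  # copy so the caller's data is not modified
            row.setdefault("id", str(uuid.uuid4()))
            if row["id"] in seen_ids:
                raise ValueError(f"Duplicate example ID: {row['id']}")
            seen_ids.add(row["id"])
            handle.write(json.dumps(row, ensure_ascii=False) + "\n")
    return len(seen_ids)


# Hypothetical records; the field names other than "id" are assumptions.
records = [
    {"id": "joke-1", "input": {"topic": "cats"}, "expected_output": "A cat joke"},
    {"id": "joke-2", "input": {"topic": "dogs"}, "expected_output": "A dog joke"},
]

# Written to the system temp directory here; any writable path works.
output_path = Path(tempfile.gettempdir()) / "jokes.jsonl"
written = write_jsonl(records, output_path)
```

Raising an error on a duplicate ID, instead of silently overwriting it, surfaces the problem before the file reaches the upload dialog.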

To upload a new dataset in the PhariaStudio portal:

In the Evaluate menu in the sidebar, select Datasets.
If the current project does not have any datasets, PhariaStudio displays a code snippet that shows how to create a dataset using code.
Click Upload Dataset.
Upload a dataset file, or drag and drop one into the popup.
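
Before uploading, it can help to check a file locally against the two requirements stated in this section: every line is valid JSON, and every example has a unique ID. The following is a minimal sketch of such a check using the Python standard library. The function name `find_jsonl_problems` and the key name `id` are assumptions for illustration; PhariaStudio may apply additional checks that are not covered here.

```python
import json


def find_jsonl_problems(lines):
    """Return a list of human-readable problems found in JSON Lines content.

    Hypothetical helper for illustration; it is not part of the PhariaStudio SDK.
    """
    problems = []
    seen_ids = set()
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            # Blank lines are reported; this is stricter than some parsers.
            problems.append(f"line {number}: empty line")
            continue
        try:
            row = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append(f"line {number}: invalid JSON ({error.msg})")
            continue
        if not isinstance(row, dict) or "id" not in row:
            problems.append(f"line {number}: missing 'id'")
            continue
        # Serialise the ID so that any JSON value can be stored in the set.
        key = json.dumps(row["id"], sort_keys=True)
        if key in seen_ids:
            problems.append(f"line {number}: duplicate id {row['id']!r}")
        seen_ids.add(key)
    return problems


# Hypothetical content with one duplicate ID, one missing ID, and one bad line.
sample = [
    '{"id": "joke-1", "input": "cats"}',
    '{"id": "joke-1", "input": "dogs"}',
    '{"input": "birds"}',
    "not json",
]
problems = find_jsonl_problems(sample)
```

To check a file on disk, pass an open file handle instead of a list, for example `find_jsonl_problems(open("jokes.jsonl", encoding="utf-8"))`; an empty result means that neither of the two checks failed.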