Store an evaluation dataset in PhariaStudio
Prerequisites to submit a dataset
Before creating and storing the evaluation dataset, follow the instructions here to create a list of examples. The code below expects that list in a variable named examples.
Add the necessary dependencies
from intelligence_layer.connectors import StudioClient
from intelligence_layer.evaluation import (
    Example,
    StudioDatasetRepository,
)
Submit the Dataset via Code
Initialize the PhariaStudio client and link it to an existing project by passing the project name.
studio_client = StudioClient("Test Evaluation")
Once the client is initialized and points to a project, you can submit the dataset.
studio_dataset_repo = StudioDatasetRepository(studio_client=studio_client)
studio_dataset = studio_dataset_repo.create_dataset(
    examples=examples,
    dataset_name="Jokes",
    metadata={"description": "This is an extensive list of jokes"},
)
Check Your Dataset in PhariaStudio
Once you have submitted the dataset, you can inspect it in the Dataset section of PhariaStudio.
From the three-dots menu, you can delete the dataset or copy its ID for use when creating the Benchmark object.
Clicking a dataset row displays the content of that dataset.
Upload the Dataset via Studio UI
To upload a new dataset, open Evaluate > Datasets from the sidebar. If the current project does not have any datasets, the UI shows a code snippet for creating a dataset via code.
Clicking the Upload Dataset button opens a modal with further instructions.
The dataset file must be in the JSON Lines format (one JSON object per line), and each example must contain a unique ID.
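As a rough illustration of the file requirements above, the following sketch writes a JSON Lines file with one example per line and a unique ID for each. It uses only the Python standard library. The field names input, expected_output, and id are assumptions made for this sketch, as are the helper names write_jsonl and read_jsonl and the sample jokes; the upload modal in PhariaStudio remains the reference for the exact schema.

```python
import json
import tempfile
import uuid
from pathlib import Path


def write_jsonl(records, path):
    """Write one JSON object per line, adding an "id" where one is missing.

    Raises ValueError if two records share the same id.
    """
    seen_ids = set()
    with Path(path).open("w", encoding="utf-8") as handle:
        for record in records:
            row = dict(record)  # copy so the caller's data is left untouched
            row.setdefault("id", str(uuid.uuid4()))
            if row["id"] in seen_ids:
                raise ValueError(f"duplicate example id: {row['id']}")
            seen_ids.add(row["id"])
            handle.write(json.dumps(row, ensure_ascii=False) + "\n")
    return len(seen_ids)


def read_jsonl(path):
    """Read a JSON Lines file back into a list of dictionaries."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Hypothetical sample data; the field names are assumptions, not a confirmed schema.
records = [
    {
        "input": "Tell me a joke about cats.",
        "expected_output": "Why was the cat at the computer? To watch the mouse.",
    },
    {
        "input": "Tell me a joke about math.",
        "expected_output": "Parallel lines have so much in common. Shame they never meet.",
    },
]

# Written to the temp directory here; any writable path works.
output_path = Path(tempfile.gettempdir()) / "jokes.jsonl"
count = write_jsonl(records, output_path)
rows = read_jsonl(output_path)
print(f"wrote {count} examples to {output_path}")
```

Reading the file back before uploading is a cheap way to confirm that every line parses as JSON and that no ID is repeated.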