How to use the PhariaFinetuning API

The finetuning service allows users to train models by submitting jobs via the API. The API consists of multiple routes that can be used to start, cancel, and list both running and previous jobs.

Submitting a Job

Currently, the easiest way to submit a finetuning job is through the Swagger UI. Your Swagger UI will be accessible at:

https://pharia-finetuning-api.<YOUR_CONFIGURED_URL_POSTFIX>/docs#/

where <YOUR_CONFIGURED_URL_POSTFIX> is the URL postfix configured during the installation of the Pharia-finetuning Helm chart.

1. Authorize

To use the API, you need a Pharia Studio bearer token. Follow these steps to retrieve it:

Go to the Pharia Studio page and log in if necessary.
In the upper-right corner, click on your profile.
In the popup, click on Copy Bearer Token.

[Screenshot: Copy Bearer Token option in the profile popup]

Once you have the token:

Click on the "Authorize" button in the top-right corner of the Swagger UI.
Paste your token to authenticate.
After authorization, you can safely close the popup window.

[Screenshot: Authorize dialog in the Swagger UI]
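If you prefer to call the API from a script instead of the Swagger UI, the same token can be sent as an HTTP header. The sketch below is a minimal example under two assumptions: the API is served from the same host as the Swagger UI, and the service expects the standard `Authorization: Bearer <token>` header. The helper names `api_base_url` and `auth_headers` are hypothetical and not part of the service.

```python
def api_base_url(url_postfix: str) -> str:
    """Return the base URL of the finetuning API for a configured URL postfix.

    ASSUMPTION: the API shares its host with the Swagger UI shown above.
    """
    return f"https://pharia-finetuning-api.{url_postfix.strip('.')}"


def auth_headers(bearer_token: str) -> dict[str, str]:
    """Build request headers that carry the Pharia Studio bearer token.

    ASSUMPTION: the service accepts the standard HTTP bearer scheme.
    """
    token = bearer_token.strip()
    if not token:
        raise ValueError("bearer token must not be empty")
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    }
```

Stripping whitespace guards against a stray newline picked up when the token is pasted from the clipboard.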

2. Submit a Job

Follow these steps to submit a finetuning job:

Click on the POST /api/v1/finetuning/jobs endpoint to start a new job.
Click "Try it out" to enter your parameters.
Fill in the request body with the necessary parameters:

a. Supported Models

Aleph-Alpha/Pharia-1-LLM-7B-control-hf
meta-llama/Llama-3.1-8B-Instruct
roneneldan/TinyStories-1M

b. Dataset Parameters

dataset_id is required and has no default value.
validation_dataset_id is optional (dataset_id is used if it is left unspecified).
limit_samples is optional and sets the maximum number of samples to use.

c. Supported Finetuning Types

full → Full finetuning
lora → LoRA finetuning

d. Configurable Finetuning Hyperparameters

n_epochs → Number of epochs for training.
learning_rate_multiplier → Multiplier of the default learning rate (default lr = 2.0e-5).
batch_size → Size of per-device batches.
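To make the effect of learning_rate_multiplier concrete, the sketch below computes the resulting learning rate. It assumes the multiplier scales the default rate of 2.0e-5 linearly, which is the plain reading of "multiplier"; the function name `effective_learning_rate` is hypothetical.

```python
DEFAULT_LEARNING_RATE = 2.0e-5  # default lr stated in the parameter list


def effective_learning_rate(learning_rate_multiplier: float) -> float:
    """Return the learning rate that results from a given multiplier.

    ASSUMPTION: the service multiplies the default rate by this value.
    """
    if learning_rate_multiplier <= 0:
        raise ValueError("learning_rate_multiplier must be positive")
    return DEFAULT_LEARNING_RATE * learning_rate_multiplier
```

Under this assumption, a multiplier of 0.5 gives a learning rate of about 1.0e-5, and a multiplier of 2 gives about 4.0e-5.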

Once submitted, you will receive a submission_id in the response under job/id. This serves as the unique identifier for your job.
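The same submission can be scripted. The following is a sketch, not the service's documented client: the parameter names dataset_id, validation_dataset_id, limit_samples, n_epochs, learning_rate_multiplier, and batch_size come from the lists above, while the flat body layout, the keys `model` and `finetuning_type`, the POST method, and the bearer header are assumptions. Check the request schema shown in the Swagger UI before relying on it.

```python
import json
import urllib.request
from typing import Any

SUPPORTED_MODELS = {
    "Aleph-Alpha/Pharia-1-LLM-7B-control-hf",
    "meta-llama/Llama-3.1-8B-Instruct",
    "roneneldan/TinyStories-1M",
}
SUPPORTED_FINETUNING_TYPES = {"full", "lora"}
OPTIONAL_PARAMETERS = {
    "validation_dataset_id",
    "limit_samples",
    "n_epochs",
    "learning_rate_multiplier",
    "batch_size",
}


def build_job_payload(
    model: str, dataset_id: str, finetuning_type: str, **optional: Any
) -> dict[str, Any]:
    """Validate the inputs and assemble a request body.

    ASSUMPTION: the flat layout and the keys "model" and "finetuning_type"
    are hypothetical; the real schema is shown in the Swagger UI.
    """
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model}")
    if finetuning_type not in SUPPORTED_FINETUNING_TYPES:
        raise ValueError(f"unsupported finetuning type: {finetuning_type}")
    unknown = set(optional) - OPTIONAL_PARAMETERS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    payload: dict[str, Any] = {
        "model": model,
        "dataset_id": dataset_id,
        "finetuning_type": finetuning_type,
    }
    # Send only the optional values that were set, so service defaults apply.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def build_submit_request(
    base_url: str, bearer_token: str, payload: dict[str, Any]
) -> urllib.request.Request:
    """Prepare (but do not send) the job submission request."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/finetuning/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the prepared request to `urllib.request.urlopen` sends it. The job identifier can then be read from the decoded JSON response under job/id, as described above.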

Viewing Job Status

The finetuning API allows you to check the status of all jobs (running, canceled, or failed) or view details for a specific job.

View Statuses of All Jobs

To get a list of all jobs and their statuses, use the /api/v1/finetuning/jobs route in Swagger UI.

Since this is a GET route, simply click "Try it out", then "Execute".
The response will be a JSON array containing job details and statuses.

The schema of the response is as follows:

[Image: response schema]
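Outside the Swagger UI, the job list can be fetched with a plain GET request. The sketch below prepares that request and checks that the response body is a JSON array, as described above. The helper names and the bearer header are assumptions, and the code does not rely on any particular field inside the job objects.

```python
import json
import urllib.request
from typing import Any


def build_list_jobs_request(
    base_url: str, bearer_token: str
) -> urllib.request.Request:
    """Prepare (but do not send) the GET request that lists all jobs."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/finetuning/jobs",
        headers={
            "Authorization": f"Bearer {bearer_token}",  # ASSUMPTION: bearer scheme
            "Accept": "application/json",
        },
        method="GET",
    )


def parse_job_list(response_body: bytes) -> list[Any]:
    """Decode the response body and verify that it is a JSON array."""
    jobs = json.loads(response_body)
    if not isinstance(jobs, list):
        raise ValueError("expected a JSON array of jobs")
    return jobs
```

To run it, pass the request to `urllib.request.urlopen` and hand the bytes returned by `read()` to `parse_job_list`.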

View Status of a Single Job

To get the status of a specific job, use the /api/v1/finetuning/jobs/{job_id} route.

Since this is a GET route, click "Try it out".
Enter the job_id you want details for and click "Execute".
The response will be a single JSON object containing job details.
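A scripted lookup of a single job only needs the job_id placed into the route path. The sketch below assumes the route is read with GET and that the bearer header is accepted; `build_job_status_request` is a hypothetical helper name. The job_id is percent-encoded so that unexpected characters cannot change the path.

```python
import urllib.request
from urllib.parse import quote


def build_job_status_request(
    base_url: str, bearer_token: str, job_id: str
) -> urllib.request.Request:
    """Prepare (but do not send) the request for one job's details."""
    if not job_id:
        raise ValueError("job_id must not be empty")
    # safe="" also encodes "/" so the ID stays a single path segment.
    encoded_id = quote(job_id, safe="")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/finetuning/jobs/{encoded_id}",
        headers={
            "Authorization": f"Bearer {bearer_token}",  # ASSUMPTION: bearer scheme
            "Accept": "application/json",
        },
        method="GET",  # ASSUMPTION: status lookups are read-only GET calls
    )
```

Sending the request with `urllib.request.urlopen` and decoding the body with `json.loads` yields the single JSON object described above.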

Monitoring Jobs via Aim

In addition to the finetuning API, an Aim dashboard is available for tracking training progress. This dashboard displays both model metrics (loss, perplexity, etc.) and system metrics (CPU/GPU usage, etc.).

Accessing the Aim Dashboard

https://pharia-aim.<YOUR_CONFIGURED_URL_POSTFIX>/runs

where <YOUR_CONFIGURED_URL_POSTFIX> is the URL postfix configured during installation.

Key Features of Aim:

"Runs" page → View all finetuning runs.
"Metrics" page → Monitor combined training metrics.

Aim is fully integrated with the finetuning service and enabled by default, so no extra setup is required to track training progress.
