Installation process

Prerequisites

Make sure you have completed the installation prerequisites before starting.

Preliminaries: Helm chart and Kubernetes

PhariaAI can be operated on any suitable Kubernetes cluster using the Helm chart provided in this JFrog repository.

The Helm chart installs the necessary components to run the PhariaAI models on your cluster.

Set up the registry credentials

If you have not already done so, you can create a token on your Aleph Alpha account on JFrog as follows:

  1. In JFrog, click your profile icon at the top right.

  2. In the dropdown menu, click Edit Profile.

  3. In the window that opens, click Generate an Identity Token under Authentication Settings.

  4. Optionally, enter a description for the token.

  5. Click Next.

The identity token is generated and displayed to you.

Tokens are not stored on the JFrog platform. You must copy the token and store it carefully before you close the popup. Also, the provided credentials must be authorised to read from the registry via API.

For the purpose of this instruction, we export the credentials to environment variables:

export AA_REGISTRY_USERNAME=<username> # the account provided to you
export AA_REGISTRY_PASSWORD=<password> # your generated token for Helm

You can create Hugging Face user access tokens as described in the Hugging Face documentation.

To export this token to an environment variable, run the following:

export HUGGINGFACE_TOKEN=<token> # your generated token for Hugging Face access

Once your credentials are set, authenticate your local Helm client with the repository. This step ensures Helm has the necessary access to fetch the PhariaAI Helm chart:

helm registry login https://alephalpha.jfrog.io -u "$AA_REGISTRY_USERNAME" -p "$AA_REGISTRY_PASSWORD"

Create the target namespace

All setup steps following this section assume the target namespace exists within your Kubernetes cluster. You can create it like this:

kubectl create namespace <pharia-ai-install-namespace>

Download the model weights for the LLMs

Depending on the model, the weights are available in JFrog or on Hugging Face. They need to be downloaded before they can be used by the inference stack of the PhariaAI installation.

We have prepared a separate Helm chart for downloading the model weights to persistent volumes in your cluster. The Helm chart deploys persistent volume claims and Kubernetes jobs for triggering the download.

By default, the chart deployment downloads the model weights for luminous-base, llama-3.1-8b-instruct, llama-3.3-70b-instruct, and llama-guard-3-8b. If you want to download only these default models, run the following commands:

helm install pharia-ai-models oci://alephalpha.jfrog.io/inference-helm/models \
  --set modelCredentials.username=$AA_REGISTRY_USERNAME \
  --set modelCredentials.password=$AA_REGISTRY_PASSWORD \
  --set huggingFaceCredentials.token=$HUGGINGFACE_TOKEN \
  -n <pharia-ai-install-namespace>

You may get an out-of-memory (OOM) error after running the command above. Because llama-3.3-70b-instruct is a highly resource-intensive model, we suggest that you update the configuration before deploying the models and exclude it from the installation. You do this as follows:

  1. Run the following command:
    helm pull oci://alephalpha.jfrog.io/inference-helm/models --untar

  2. Copy the values.yaml file to values-override.yaml.

  3. Delete the third entry of the models array, which starts with the line name: models-llama-3.3-70b-instruct.

Additionally, note that changing resources.limits.memory from 2Gi to 4Gi can also help to avoid potential OOM errors.
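
After these edits, you can install the models chart from the untarred local directory and pass your override file. The following is a minimal sketch, assuming the chart was untarred into ./models and that your values-override.yaml sits in that directory:

helm install pharia-ai-models ./models \
  --set modelCredentials.username=$AA_REGISTRY_USERNAME \
  --set modelCredentials.password=$AA_REGISTRY_PASSWORD \
  --set huggingFaceCredentials.token=$HUGGINGFACE_TOKEN \
  --values ./models/values-override.yaml \
  -n <pharia-ai-install-namespace>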

If you want to download additional models, see the configuration guide: Configuring model weights downloaders.

Whether you download the default models or additional models, you can check the status of the download job by running:

kubectl get jobs -n <pharia-ai-install-namespace>

An incorrect Helm configuration can result in pod errors of the Kubernetes download job. Adapting the configuration and upgrading the Helm deployment may require the prior deletion of the Kubernetes jobs.
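
For example, a minimal sketch of removing an existing download job before upgrading the Helm deployment (the job name is whatever kubectl get jobs reports):

kubectl delete job <download-job-name> -n <pharia-ai-install-namespace>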

The names of the created persistent volume claims are required for the Helm configuration of the PhariaAI chart and can be obtained using:

kubectl get pvc -n <pharia-ai-install-namespace>

Once the download job is completed, you can proceed with the installation of the PhariaAI Helm chart.

To use any features of PhariaAI that depend on embedding models, such as PhariaAssistant Chat or document indexing, it is essential to have the luminous-base model. Note that the Pharia-1-LLM-7B models do not currently support embedding functionalities.

Installing PhariaAI

Before you can install the PhariaAI Helm chart, you need to provide your access credentials to Helm. If you have not already done so, see Set up the registry credentials.

Download the Helm chart

Step 1: Log in

helm registry login https://alephalpha.jfrog.io -u "$AA_REGISTRY_USERNAME" -p "$AA_REGISTRY_PASSWORD"

Step 2: Pull and unpack the latest chart version

helm pull oci://alephalpha.jfrog.io/pharia-ai-helm/pharia-ai --untar

Step 3: Change into the chart directory

The --untar flag in the previous command creates a pharia-ai directory containing all the dependencies and a default values.yaml file. Change into this directory with the cd pharia-ai command.

Configure the Helm chart

The Helm chart configuration is provided by a values.yaml file. The initial values in the bundled values.yaml are suitable for a default installation, but they can be modified to meet your specific configuration needs.

Although you can modify the default values.yaml file directly, we strongly recommend creating a copy named values-override.yaml and making your changes to the default configuration there. The rest of this article assumes you are working in a values-override.yaml file.

You can find additional comments and documentation on suitable configuration overrides in the relevant sections of the values.yaml/values-override.yaml file.

Set the PhariaAI edition

You can set an explicit PhariaAI "edition" in your Helm chart. This is useful to decouple your installation from breaking changes introduced by future PhariaAI artifact releases.

Setting a PhariaAI edition allows you to upgrade to a newer version of the Helm chart without incurring breaking changes, as long as you do not change your global.phariaAiEdition.

You set an explicit PhariaAI edition in your values-override.yaml file as follows:

global:
  phariaAiEdition: "1"

Currently, we support only PhariaAI edition 1.

Configure Kubernetes Ingress

External access to PhariaAI services with API or UI endpoints is provided using Kubernetes Ingress resources.

Major Ingress configuration is provided globally for all subcharts simultaneously:

global:
  # Global config for all Ingress resources
  ingress:
    # -- The ingressClassName globally defined for all ingress resources.
    #    See also: https://kubernetes.io/docs/concepts/services-networking/ingress/#the-ingress-resource
    ingressClassName: "nginx"
    # -- Domain for external access / Ingress to Pharia AI services via {service}.{domain}
    #    e.g. {service}.pharia-ai.example.com
    ingressDomain: "pharia-ai.local"
    # -- Additional annotations globally defined for all ingress-resources. This can be used to add Ingress controller specific annotations.
    additionalAnnotations: {}

Specifically, the following entries may require custom overrides in your values-override.yaml:

  • global.ingress.additionalAnnotations: annotations added globally to dependency-specific Ingress annotations. They may be needed for allowing automated certificate generation for TLS support (see also Annotated Ingress resource).

  • global.ingress.ingressClassName: relates to the installed Kubernetes Ingress controller in the deployment target cluster (see also Kubernetes Ingress resource).

For each dependency, specific Ingress configuration is provided individually in the respective section of the values-override.yaml file:

<sub-chart>:
  ingress:
    enabled: true
    # -- Hostname for the Ingress (without domain). The domain is read from global.ingress.ingressDomain.
    #    This needs to be changed if multiple instances are deployed to the same cluster using the same domain.
    hostname: "<sub-chart>"
    # -- Annotations for the ingress-resource. This can be used to add Ingress controller-specific annotations.
    annotations: {}
    tls:
      # -- Enable TLS configuration for this Ingress
      enabled: false
      # -- The name of the secret containing the TLS certificate.
      #    See also: https://kubernetes.io/docs/concepts/services-networking/ingress/#tls
      secretName: "<sub-chart>-tls"

Specifically, the following entries may require custom overrides:

  • <sub-chart>.ingress.tls.enabled: enable TLS for a specific Ingress host.

  • <sub-chart>.ingress.tls.secretName: name of the secret containing the TLS certificates or used for certificate generation using an installed cert-manager.
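
For illustration, a hedged sketch of such a per-dependency override with TLS enabled; the sub-chart name pharia-assistant-api is only an example placeholder, so use the relevant dependency names from your values file:

pharia-assistant-api:
  ingress:
    enabled: true
    hostname: "pharia-assistant-api"
    tls:
      enabled: true
      secretName: "pharia-assistant-api-tls"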

Configure database connections

Several PhariaAI applications require PostgreSQL databases as a persistence layer. For a productive PhariaAI installation, we highly recommend the use of external (managed) database instances.

By default, Kubernetes PostgreSQL instances are enabled. For each database configuration, you can either provide the necessary values directly during the Helm installation (in values-override.yaml) or reference an existing Kubernetes secret that stores the required values.

The necessary database deployments automatically connect to client applications. While PostgreSQL deployment is enabled by default for each dependency, you must define a password in values-override.yaml:

<sub-chart>:
  postgresql:
    # -- This is used to indicate whether the internal PostgreSQL is to be used or not.
    enabled: true
    auth:
      # -- If internal PostgreSQL is used, a dedicated password must be provided for setup of application authentication.
      password: ""

Ensure that you set an initial password using Helm values to enable authentication between the application and the database instance. Otherwise, the installation will fail.

External managed databases

We recommend using external database instances for production environments. The connection configuration and credential setup for each PhariaAI dependency can be managed with Helm chart values:

<sub-chart>:
  postgresql:
    # -- Disable the built-in Postgresql chart
    enabled: false
  databaseConfig:
    # -- Default secret name is used to create a secret if `external.existingSecret` is not provided.
    defaultSecret: default-secret-name
    secretKeys:
      # -- The key in the secret that contains the host of the database
      hostKey: "host"
      # -- The key in the secret that contains the port of the database
      portKey: "port"
      # -- The key in the secret that contains the user of the database
      userKey: "user"
      # -- The key in the secret that contains the password of the database
      passwordKey: "password"
      # -- The key in the secret that contains the database name
      databaseNameKey: "databaseName"
    # -- Provide an existing database if you want to use an external database
    external:
      # -- Set this value if a k8s Secret with PostgreSQL values already exists. Make sure that all the keys exist in the secret with valid values.
      existingSecret: ""
      # -- The host of the database
      host: ""
      # -- The port of the database
      port: ""
      # -- The user of the database
      user: ""
      # -- The password of the database
      password: ""
      # -- The name of the database
      databaseName: ""
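
If you use the existingSecret option, the following is a minimal sketch of creating a secret with the expected keys; the secret name and values are placeholders:

kubectl create secret generic <existing-db-secret-name> \
  --from-literal=host='<db-host>' \
  --from-literal=port='5432' \
  --from-literal=user='<db-user>' \
  --from-literal=password='<db-password>' \
  --from-literal=databaseName='<db-name>' \
  -n <pharia-ai-install-namespace>

Then set <sub-chart>.databaseConfig.external.existingSecret to that secret name.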

Optional: Disable reloader

We use Reloader to ensure our deployments are restarted if relevant configuration or secrets change. This happens transparently and does not affect the users. However, if you already have Reloader deployed in the target cluster, we recommend disabling the Reloader deployment we ship with our stack.

To do so, set these values in the values-override.yaml:

pharia-reloader:
  enabled: false

Restarting of your deployments is now handled exclusively by your existing Reloader deployment.

Configure authentication

PhariaAI supports two identity provider options: Zitadel and Dex.

Zitadel is deprecated. Support for it will end in a future version of PhariaAI. We recommend migrating to Dex to continue receiving updates and support. For new installations, we recommend using Dex.
Understanding the identity provider options

Dex (recommended):

  • Website: https://dexidp.io

  • Federated OIDC provider that acts as a bridge to external identity providers

  • Lightweight with lower resource footprint

  • Does not store user credentials directly

  • Requires an external OIDC-compliant identity provider (such as Google, Microsoft, Okta)

  • No username/password authentication or self-registration support

  • Faster and more reliable installation

Zitadel (legacy):

  • Website: https://zitadel.com

  • Full-featured identity provider with local credential storage

  • Supports username/password authentication

  • Allows self-registration and user account creation within PhariaAI

  • Suitable for standalone deployments without external identity providers

  • Only use if you cannot migrate to Dex immediately. Plan your migration as soon as possible following the migration guide.

  • For complete Zitadel configuration instructions, see Configuring user and login options (legacy Zitadel documentation).

For production deployments, configure Dex with an external OIDC provider. Ensure you have an operational external OIDC identity provider configured before installation.

For test and development purposes without an external OIDC provider, you can configure Dex with static users. See Static users configuration for test/development.
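
As an illustration only, static users in Dex are typically configured through the upstream Dex options enablePasswordDB and staticPasswords. The exact PhariaAI keys and the hash below are assumptions, so follow the linked guide for the authoritative configuration:

pharia-iam:
  config:
    dex:
      enabled: true
      config:
        # Assumption: upstream Dex options passed through the chart's dex.config section
        enablePasswordDB: true
        staticPasswords:
          - email: "dev@example.com"
            username: "dev"
            # bcrypt hash of the password, e.g. generated with htpasswd -bnBC 10
            hash: "<bcrypt-hash>"
            userID: "<any-unique-id>"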

The following is an example configuration using Google as the OIDC provider:

pharia-iam:
  config:
    # -- Email address of the user who should receive admin privileges
    #    This user will be assigned the Admin role when they log in via the external OIDC provider
    #    The email must match the email provided by your OIDC provider
    adminEmail: "max.mustermann@your-company-domain.com"

    zitadel:
      enabled: false

    dex:
      enabled: true

      # Environment variables for injecting sensitive credentials
      envVars:
        - name: "OIDC_CLIENT_ID"
          valueFrom:
            secretKeyRef:
              name: "<your-oidc-secret-name>"
              key: "client-id"
        - name: "OIDC_CLIENT_SECRET"
          valueFrom:
            secretKeyRef:
              name: "<your-oidc-secret-name>"
              key: "client-secret"
        # Database connection (if using external database)
        - name: "DB_PASSWORD"
          valueFrom:
            secretKeyRef:
              name: "<your-db-secret-name>"
              key: "password"
        - name: "DB_NAME"
          value: "<your-database-name>"
        - name: "DB_HOST"
          value: "<your-database-host>"
        - name: "DB_USER"
          value: "<your-database-user>"

      config:
        # Configure your OIDC connector
        # For other connector options please refer to dex documentation https://dexidp.io/docs/connectors/
        connectors:
        - type: oidc
          id: google
          name: Google
          config:
            issuer: https://accounts.google.com
            # Reference environment variables for sensitive credentials
            clientID: $OIDC_CLIENT_ID
            clientSecret: $OIDC_CLIENT_SECRET
            redirectURI: https://pharia-iam.{{ .Values.global.ingress.ingressDomain }}/oidc/callback
            scopes:
              - openid
              - profile
              - email
            getUserInfo: true

    # For production: Use external PostgreSQL database
    dexPostgresql:
      enabled: false

Creating the OIDC credentials secret:

Before deploying, create a Kubernetes secret with your OIDC provider credentials:

kubectl create secret generic <your-oidc-secret-name> \
  --from-literal=client-id='<YOUR_GOOGLE_CLIENT_ID>' \
  --from-literal=client-secret='<YOUR_GOOGLE_CLIENT_SECRET>' \
  -n <pharia-ai-install-namespace>

Important notes:

  • Environment variables are referenced in the connector config using $VARIABLE_NAME syntax.

  • Set adminEmail to the email address of the user who must have admin privileges. This user will be automatically assigned the Admin role when they first log in through your OIDC provider.

  • The redirectURI uses a template variable that will be automatically populated with your configured ingress domain.

  • For production deployments, use an external PostgreSQL database instead of the included dexPostgresql.

  • You can configure other OIDC providers (Microsoft, Okta, LDAP, SAML, and so on) by adjusting the connector configuration. See the Dex connectors documentation (https://dexidp.io/docs/connectors/) for a complete list of supported connectors and their configuration options.

  • Note: adminPassword is not used with Dex as authentication is handled by your external OIDC provider.

Configure the PhariaAssistant API

Configuring which models are used by PhariaAssistant

Model configuration is done through environment variables in your Helm values. You can configure the Summarization and Generation models in the pharia-assistant-api chart:

pharia-assistant-api:
  env:
    SUMMARY_MODEL_NAME: "llama-3.1-8b-instruct"
    GENERATE_MODEL_NAME: "llama-3.1-8b-instruct"

To configure the Chat model, use the PHARIA_CHAT_DEFAULT_MODEL variable in the pharia-chat Helm chart:

pharia-chat:
  env:
    values:
      PHARIA_CHAT_DEFAULT_MODEL: "llama-3.3-70b-instruct"

For guidance on selecting appropriate models for different tasks and hardware configurations, see Model recommendations for PhariaAssistant.

Configuring which collections are visible to PhariaAssistant Chat

To configure which collections are accessible to PhariaAssistant Chat, you must define the environment variables RETRIEVER_QA_INDEX_NAME and DOCUMENT_INDEX_FILE_UPLOAD_NAMESPACE in the pharia-chat chart:

pharia-chat:
  env:
    values:
      DOCUMENT_INDEX_FILE_UPLOAD_NAMESPACE: "Assistant"
      RETRIEVER_QA_INDEX_NAME: "Assistant-Index"

The namespace and index are created automatically, and the index configuration conforms to our recommended index configuration.

To appear in PhariaAssistant Chat, collections must have the RETRIEVER_QA_INDEX_NAME index assigned. Also, all users interacting with PhariaAssistant (by default AssistantUser) must be authorised as follows in the permissionModel:

permission: AccessNamespace
namespace: $DOCUMENT_INDEX_FILE_UPLOAD_NAMESPACE

Configure the PhariaData API

The PhariaData API requires a RabbitMQ service. We recommend using an external RabbitMQ instance (see the next section). However, by default, an internal RabbitMQ instance is provided with the built-in Helm chart and enabled automatically; you must define a password in values-override.yaml:

pharia-data-api:
  rabbitmq:
    # Enable or disable the internal RabbitMQ service.
    enabled: true
    auth:
      # Set the RabbitMQ application username.
      username: user
      # Set the RabbitMQ application password.
      password: ""

Configuring an external RabbitMQ instance

For production environments, we recommend using an external RabbitMQ instance. To do this, you must disable the built-in RabbitMQ service in values-override.yaml and configure the external connection settings.

The following is an example configuration for using an external RabbitMQ instance:

pharia-data-api:
  rabbitmq:
    enabled: false
  rabbitmqConfig:
    # Default secret name used to create a secret if `external.existingSecret` is not provided.
    defaultSecret: pharia-data-api-rabbitmq-secret
    # The load definitions secret must hold the RabbitMQ topology configuration.
    defaultLoadDefinitionsSecret: pharia-data-api-rabbitmq-load-definitions-secret
    secretKeys:
      # The key in the secret that contains the host of RabbitMQ.
      hostKey: "rabbitmq-host"
      # The key in the secret that contains the port of RabbitMQ.
      portKey: "rabbitmq-port"
      # The key in the secret that contains the user of RabbitMQ.
      userKey: "rabbitmq-username"
      # The key in the secret that contains the password of RabbitMQ.
      userPasswordKey: "rabbitmq-password"
    external:
      # Set this value if a Kubernetes Secret with RabbitMQ values already exists. Ensure all keys exist in the secret with valid values.
      existingSecret: ""
      # The user of RabbitMQ.
      rabbitmqUser: ""
      # The password of the RabbitMQ user.
      rabbitmqUserPassword: ""
      # The load definitions secret name.
      loadDefinitionsSecret: ""
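
If you reference an existing secret, the following is a minimal sketch of creating one with the expected keys; the secret name and values are placeholders:

kubectl create secret generic <existing-rabbitmq-secret-name> \
  --from-literal=rabbitmq-host='<rabbitmq-host>' \
  --from-literal=rabbitmq-port='5672' \
  --from-literal=rabbitmq-username='<rabbitmq-user>' \
  --from-literal=rabbitmq-password='<rabbitmq-password>' \
  -n <pharia-ai-install-namespace>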

Configuring storage and database SSL variables and storage providers

If SSL is enabled for the storage provider and for the database, be sure to set the values described in the code below. Also, set PHARIA_DATA_STORAGE_PROVIDER to either minio or stackit, depending on your storage provider.

pharia-data-api:
  env:
    values:
      PHARIA_STORAGE_SSLMODE: "true"
      PHARIA_DATA_POSTGRES_SSL_MODE: "true"
      PHARIA_DATA_STORAGE_PROVIDER: "minio|stackit"

Migrate to the new ETL (Extract, Transform, Load)

You can switch to the new ETL (Extract, Transform, Load) by adding the etlSelector key to the values-override.yaml file and setting it to "v2". To keep the previous version of the ETL, remove this key or set it to "v1". If etlSelector is set to "none", no ETL runs.

The new ETL requires a Temporal server setup. You can enable the PhariaAI provided Temporal server by setting the value enabled in the pharia-temporal key to true. Then, under the pharia-data-api key, specify the information for this Temporal server using the externalTemporalConfig key:

pharia-data-api:
  etlSelector: "v2"
  externalTemporalConfig:
    host: "pharia-temporal"
    port: "7233"
    sslEnable: "false"
    namespace: "pharia-data"
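
If you rely on the PhariaAI-provided Temporal server mentioned above, a minimal sketch of enabling it alongside this configuration (the key path is taken from the description above):

pharia-temporal:
  enabled: true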

Configure which models to run

Our default set of models is luminous-base, llama-3.1-8b-instruct, llama-3.3-70b-instruct, and llama-guard-3-8b. To change this, you must overwrite inference-worker.checkpoints. See Deploying workers.

Each checkpoint requires the correct reference to the persistent volume claim (PVC) which relates to the volume (PV) in which the model weights are stored (see also model download).

The model to be used in PhariaAssistant must be set in your values-override.yaml file, referencing the model (queue) names configured for the workers above. For example:

pharia-assistant-api:
  env:
    ...
    QA_MODEL_NAME: llama-3.1-8b-instruct
    SAFETY_MODEL_NAME: llama-3.1-8b-instruct
    SUMMARY_MODEL_NAME: llama-3.1-8b-instruct

Schedule on GPU nodes

To install the inference stack, the Kubernetes cluster requires GPU nodes (node pool) to run the respective application pods (relevant for PhariaAI subcharts inference-worker and pharia-translate).

The scheduling of the worker and translate deployment to the GPU nodes can be achieved with node taints and tolerations (see also Kubernetes taint and toleration). The tolerations configuration can be applied using overrides and Helm configuration as part of the values-override.yaml file:

inference-worker:
  tolerations:
    - effect: NoSchedule
      key: nvidia.com/gpu # key used for node taints
      operator: Exists

pharia-translate:
  tolerations:
    - effect: NoSchedule
      key: nvidia.com/gpu # key used for node taints
      operator: Exists

Tolerations can also be specified for individual worker checkpoints to assign worker pods to different node pools in the context of the respective model used (for example, large models to nodes with multi-GPU support).

inference-worker:
  checkpoints:
    - ...
      tolerations:
        - effect: NoSchedule
          key: nvidia.com/gpu # key used for node taints
          operator: Exists

The total number of required GPUs for each worker deployment is calculated using the specified checkpoints configuration entries pipeline_parallel_size and tensor_parallel_size. It is automatically added to the worker Kubernetes deployment resources section:

resources:
  limits:
    nvidia.com/gpu: <number-of-gpus>

This configuration also controls the scheduling to GPU nodes with the respective number of available GPUs.
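
For illustration, a hedged sketch of a checkpoint excerpt and the GPU count it implies, assuming the total is the product of the two values (other checkpoint fields omitted):

inference-worker:
  checkpoints:
    - ...
      pipeline_parallel_size: 1
      tensor_parallel_size: 2
      # assumed result: nvidia.com/gpu: 2 in the worker deployment's resource limits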

Configure the PhariaInference API

The Helm configuration of the inference-api dependency requires the initial setup of some credentials. Secrets can be directly passed as Helm values or by using existing Kubernetes secrets that are already available in the cluster.

Add the required configurations to your installed values-override.yaml file; for example:

inference-api:
  inferenceApiServices:
    # -- Name of an existing inferenceApiServices secret.
    #    If you want to provide your own secret, set this to the name of your secret.
    #    Keep in mind to set global.inferenceApiServicesSecretRef to the same name if an existing secret is used.
    #    The secret is expected to have a key-value-pair with key `secret`.
    existingSecret: ""
    # -- Manually added services secret
    #    If no existing external secret is provided via inferenceApiServices.existingSecret, a secret value has to be applied during installation
    secret: ""
  jwt:
    # -- Name of an existing jwt secret to use
    #    The secret is expected to have a key-value-pair with key `secret`.
    existingSecret: ""
    # -- Manually added jwt secret
    #    If no existing external secret is provided via jwt.existingSecret, a secret value has to be applied during installation
    secret: ""
  admin:
    # -- Email of the admin user to create on startup
    email: "tools@aleph-alpha.com"
    # -- Initial password of the admin user. If no existing external secret is provided via admin.existingSecret, a password value has to be applied during installation
    password: ""
    # -- Existing secret to use instead of email/password.
    existingSecret: ""
    # -- The email key in the secret
    emailKey: "email"
    # -- The password key in the secret
    passwordKey: "password"

Ensure that you set global.inferenceApiServicesSecretRef to the same name if an existing secret is used for inference-api.inferenceApiServices.existingSecret.
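
If you prefer existing Kubernetes secrets over inline Helm values, the following is a minimal sketch of creating them with the expected key `secret`; the secret names and the use of openssl to generate random values are assumptions:

kubectl create secret generic <inference-api-services-secret-name> \
  --from-literal=secret="$(openssl rand -hex 32)" \
  -n <pharia-ai-install-namespace>

kubectl create secret generic <inference-api-jwt-secret-name> \
  --from-literal=secret="$(openssl rand -hex 32)" \
  -n <pharia-ai-install-namespace>

Reference these names in inference-api.inferenceApiServices.existingSecret and inference-api.jwt.existingSecret, and mirror the former in global.inferenceApiServicesSecretRef.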

Configure PhariaFinetuning

If you want to use the PhariaFinetuning service to finetune models on custom data, additional GPU and CPU resources are required; see PhariaFinetuning service resource requirements.

We recommend configuring a separate GPU node pool for finetuning jobs and attaching a custom taint to it. In this way, you avoid interference with the GPU workloads required for the PhariaInference API.

GPUs for finetuning are not occupied constantly, but only when finetuning jobs are running. Therefore, we recommend using autoscaling for the finetuning node pool to free GPUs when they are not needed, thus reducing costs.

Additionally, we recommend that you connect the finetuning service to an external S3 storage bucket and an external database. Although we do ship PhariaAI with a built-in storage solution and database, we cannot guarantee the persistence of your finetuning artifacts this way.

Using MLflow as backend store

PhariaAI ships MLflow as an experiment-tracking backend. By default, MLflow's tracking service stores experiment and run data on the local filesystem. While this works for simple use cases, in production environments or when multiple users are involved we recommend a more robust backend store, which is why we suggest connecting to an external database.

MLflow supports multiple database backends, including:

  • SQLite (default)

  • PostgreSQL

  • MySQL

  • Microsoft SQL Server

  • Oracle

To configure the finetuning service to use an external storage bucket, first create the bucket and generate credentials for access. Then, create a Kubernetes secret in the namespace where you install PhariaAI as follows:

apiVersion: v1
kind: Secret
data:
  bucketName: <base64 encoded name of the created storage bucket>
  bucketPassword: <base64 encoded password to your bucket>
  bucketUser: <base64 encoded username>
  endpointUrl: <base64 encoded endpoint URL of your S3 storage, e.g. https://object.storage.eu01.onstackit.cloud>
  region: <base64 encoded region of your bucket, e.g. EU01>
metadata:
  name: <your storage secret name>
type: Opaque
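
To produce the base64-encoded values for this secret, you can, for example, encode each plain value on the command line:

echo -n '<plain value, e.g. the bucket name>' | base64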

To configure the finetuning service to use an external database, first create the database and generate credentials for access. Then, create a Kubernetes secret in the namespace where you install PhariaAI as follows:

apiVersion: v1
kind: Secret
data:
  dbPassword: <base64 encoded database password>
metadata:
  name: <your db secret name>
type: Opaque

With this, you can now configure the following values in your values-override.yaml file to allow you to finetune models:

pharia-finetuning:
  rayCluster:
    workerGroups:
      gpu-group:
        # -- Tolerations matching the taints of the GPU nodes you want to use for finetuning
        tolerations:
          - effect: NoSchedule
            key: nvidia.com/gpu # key used for node taints
            operator: Exists
          - effect: NoSchedule
            key: pharia-finetuning # key used for node taints
            operator: Exists
  minio:
    # -- For production installations, we highly recommend to disable the built-in Minio service and to configure an external storage backend via the `storageConfig` section
    enabled: false
  # -- See reference of Helm chart values for detailed information on the `storageConfig` section.
  storageConfig:
    fromSecret:
      secretName: <your storage secret name>
  mlflow:
    postgresql:
      # -- For production installations, we highly recommend to disable the built-in postgres database and to configure an external database via the `mlflow.externalDatabase` section
      enabled: false
    minio:
      # -- For production installations, we highly recommend to disable the built-in storage solution and to configure an external storage backend via the `mlflow.externalS3` section
      enabled: false
    externalDatabase:
      dialectDriver: postgresql
      host: <your DB host>
      port: <your DB port>
      user: <your DB user>
      database: <your DB name>
      existingSecret: <your db secret name>
      existingSecretPasswordKey: dbPassword
    externalS3:
      existingSecret: <your storage secret name>
      existingSecretAccessKeyIDKey: bucketUser
      existingSecretKeySecretKey: bucketPassword
      bucket: <your bucket name>
      host: <your storage host>
      port: <your storage port>

Disable the finetuning service

If you are not planning to use PhariaAI to finetune models on custom data, you can disable the finetuning service by adding the following to your values-override.yaml file:

pharia-finetuning:
  enabled: false

Configure PhariaOS

PhariaOS uses KServe to dynamically deploy machine learning models. It currently supports pulling models from two sources:

  • Aleph Alpha Artifactory

  • Hugging Face

To enable model deployment from these sources, you must configure access credentials during installation. The following is an example configuration snippet for the phariaos-manager component:

phariaos-manager:
  kserve:
    enabled: true
    storage:
      http:
        existingSecret: ""  # Optional: Use an existing Kubernetes secret
        secretKeys:
          httpToken: "http-token"  # Key name in the secret
        endpoint: "alephalpha.jfrog.io"  # Aleph Alpha model registry endpoint
        token: ""  # Direct token (used if existingSecret is not set)

      huggingFace:
        existingSecret: ""  # Optional: Use an existing Kubernetes secret
        secretKeys:
          huggingFaceToken: "huggingface-token"  # Key name in the secret
        token: ""  # Direct Hugging Face token (used if existingSecret is not set)

  • You can either supply a token directly or reference an existing Kubernetes secret.

  • Set kserve.enabled to true if you want PhariaOS to deploy models via KServe.
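
For example, a hedged sketch of creating the referenced secrets with the key names from the configuration above; whether both sources can share a single secret is not specified here, so the sketch creates two, and the secret names are placeholders:

kubectl create secret generic <phariaos-http-secret-name> \
  --from-literal=http-token='<your JFrog identity token>' \
  -n <pharia-ai-install-namespace>

kubectl create secret generic <phariaos-huggingface-secret-name> \
  --from-literal=huggingface-token='<your Hugging Face token>' \
  -n <pharia-ai-install-namespace>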

Install the Helm chart

The Helm chart is installed using helm upgrade --install. For the Helm installation, you must choose a target Kubernetes namespace: <pharia-ai-install-namespace>.

The access credentials for the image registry must be provided.

There are two recommended options:

Option 1. Set the credentials directly by passing them to Helm

helm upgrade --install pharia-ai . \
  --set imagePullCredentials.username=$AA_REGISTRY_USERNAME \
  --set imagePullCredentials.password=$AA_REGISTRY_PASSWORD \
  --values values.yaml --values values-override.yaml \
  -n <pharia-ai-install-namespace>

This command assumes that the default value for the registry imagePullCredentials.registry: "alephalpha.jfrog.io" is used. You can override the registry with --set imagePullCredentials.registry=<private-registry>.

During the installation, the Kubernetes (image-pull) secrets with names defined at global.imagePullSecretName and global.imagePullOpaqueSecretName are generated in the installation namespace.

Option 2. If you already have a Docker secret in your Kubernetes cluster, you can pass the secret name to Helm

helm upgrade --install pharia-ai . \
  --set global.imagePullSecretName=<secretName> \
  --set global.imagePullOpaqueSecretName=<opaqueSecretName> \
  --values values.yaml --values values-override.yaml \
  -n <pharia-ai-install-namespace>

The credentials are expected to be set with the following keys:

  • registryUser

  • registryPassword
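
For example, a hedged sketch of providing such secrets yourself; the mapping of the docker-registry secret to global.imagePullSecretName and of the opaque secret (with the registryUser and registryPassword keys) to global.imagePullOpaqueSecretName is an assumption:

kubectl create secret docker-registry <secretName> \
  --docker-server=alephalpha.jfrog.io \
  --docker-username="$AA_REGISTRY_USERNAME" \
  --docker-password="$AA_REGISTRY_PASSWORD" \
  -n <pharia-ai-install-namespace>

kubectl create secret generic <opaqueSecretName> \
  --from-literal=registryUser="$AA_REGISTRY_USERNAME" \
  --from-literal=registryPassword="$AA_REGISTRY_PASSWORD" \
  -n <pharia-ai-install-namespace>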