Installation Process

Note: The installation process requires familiarity with Kubernetes and Helm.

Note: This documentation assumes you are using Linux or macOS for your installation, but this is not required.

Prerequisites

Make sure you have completed the prerequisites before starting.

How to operate PhariaAI

PhariaAI can be operated on any suitable Kubernetes cluster using the Helm chart provided in this repository: https://alephalpha.jfrog.io/artifactory/helm/pharia-ai/. The Helm chart will install the necessary components to run the PhariaAI models on your cluster.

Registry Credential Setup

If you have not already done so, you can create a token with your account on Software Self-Service under this artifact path:

https://alephalpha.jfrog.io/ui/repos/tree/General/helm

Click the "Set me Up" button to generate a token.

For the purpose of this instruction, the credentials are exported to environment variables.

export AA_REGISTRY_USERNAME=<username> # the account provided to you
export AA_REGISTRY_PASSWORD=<password> # your generated token for the Helm registry

Once your credentials are set, authenticate your local Helm client with the repository. This step ensures Helm has the necessary access to fetch the PhariaAI chart.

helm registry login https://alephalpha.jfrog.io -u "$AA_REGISTRY_USERNAME" -p "$AA_REGISTRY_PASSWORD"
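
Passing the token with -p can leave it in your shell history and process list. As a hedged alternative, the sketch below first checks that both variables are set and then pipes the token to Helm via --password-stdin. The function names check_registry_env and registry_login are illustrative and not part of the PhariaAI tooling.

```shell
# Illustrative helpers; the function names are assumptions, not part of PhariaAI.

# Return non-zero if either credential variable is unset or empty.
check_registry_env() {
  if [ -z "${AA_REGISTRY_USERNAME:-}" ] || [ -z "${AA_REGISTRY_PASSWORD:-}" ]; then
    echo "AA_REGISTRY_USERNAME and AA_REGISTRY_PASSWORD must be set" >&2
    return 1
  fi
}

# Log in without putting the token on the command line.
registry_login() {
  check_registry_env || return 1
  printf '%s' "$AA_REGISTRY_PASSWORD" |
    helm registry login https://alephalpha.jfrog.io \
      -u "$AA_REGISTRY_USERNAME" --password-stdin
}
```

After exporting the two variables, call registry_login once per shell session.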

Create the target namespace

All setup steps following this section assume the target namespace already exists within your Kubernetes cluster. You can create it like this:

kubectl create namespace <pharia-ai-install-namespace>

Download the model weights for the LLMs

The model weights are available in our Software Self-Service instance and need to be downloaded beforehand to be used by the inference stack of the PhariaAI installation.

We have prepared a separate Helm chart for downloading the model weights to persistent volumes in your cluster. The Helm chart deploys persistent volume claims and Kubernetes jobs that trigger the download.

By default, the chart deployment downloads the model weights for luminous-base and llama-3.1-8b-instruct. If you want to download only those default models, run the following commands:

helm install pharia-ai-models oci://alephalpha.jfrog.io/inference-helm/models \
  --set modelCredentials.username=$AA_REGISTRY_USERNAME \
  --set modelCredentials.password=$AA_REGISTRY_PASSWORD \
  -n <pharia-ai-install-namespace>

If you want to download additional models, you can configure the models to download in a separate values.yaml file like this:

models:
  - name: luminous-base
    pvcSize: 100Gi
    weights:
      - repository:
          download: luminous-base.tar.gz
          targetDirectory: luminous-base-2022-04
  - name: pharia-1-llm-7b-control
    pvcSize: 30Gi
    weights:
      - repository:
          download: Pharia-1-LLM-7B-control.tar
          targetDirectory: Pharia-1-LLM-7B-control

You might need to set the storage class for the persistent volumes that are created. This can be done via the top-level value persistence.storageClass. For example, on k3s you need to set it to "local-path".

To run the model download with the additional models, run the following command:

helm install pharia-ai-models oci://alephalpha.jfrog.io/inference-helm/models \
  --set modelCredentials.username=$AA_REGISTRY_USERNAME \
  --set modelCredentials.password=$AA_REGISTRY_PASSWORD \
  --values values.yaml \
  -n <pharia-ai-install-namespace>

Note: To restrict the model download to persistent volumes in a dedicated availability zone, define the respective Kubernetes node tolerations and node selectors (cf. values.yaml of the models Helm chart).

Whether you download the default models or additional models, you can check the status of the download job by running:

kubectl get jobs -n <pharia-ai-install-namespace>
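
Instead of polling the job list manually, you can block until the download has finished. The following is a minimal sketch using kubectl wait. The function name and the default timeout of two hours are assumptions, and --all waits for every Job in the namespace, not only the download Jobs.

```shell
# Illustrative helper; the function name and default timeout are assumptions.
# Blocks until every Job in the namespace reports the Complete condition,
# or fails once the timeout expires.
wait_for_model_download() {
  # $1: namespace, $2: optional timeout (defaults to 2h)
  kubectl wait --for=condition=complete job --all \
    -n "$1" --timeout="${2:-2h}"
}

# Usage: wait_for_model_download <pharia-ai-install-namespace>
```

Note that a failed Job never reaches the Complete condition, so in that case the command only returns once the timeout expires.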

Note: An incorrect Helm configuration can cause Pod errors in the download Job. After correcting the configuration, you might need to delete the affected Jobs before upgrading the Helm deployment.

The names of the created persistent volume claims are required for the Helm config of the PhariaAI chart and can be obtained via:

kubectl get pvc -n <pharia-ai-install-namespace>

Once the download job is completed, you can proceed with the installation of the PhariaAI Helm chart.

Note: Features of PhariaAI that depend on embedding models, such as Assistant Q&A or document indexing, require the luminous-base model. The Pharia 1 LLM 7B models do not currently support embedding functionalities.

How to install PhariaAI

Before you can install the PhariaAI Helm chart from Software Self-Service, you need to provide your access credentials to Helm. If you have not already done so, see Registry Credential Setup.

Download the Helm chart

The following steps download the pharia-ai Helm chart.

Step 1: Login

helm registry login https://alephalpha.jfrog.io -u "$AA_REGISTRY_USERNAME" -p "$AA_REGISTRY_PASSWORD"

Step 2: Pull and unpack the latest chart version

helm pull oci://alephalpha.jfrog.io/pharia-ai-helm/pharia-ai --untar

Step 3: Change into the chart directory

The previous pull command will have created a pharia-ai directory containing all the dependencies and the default values.yaml file. Change into the directory:

cd pharia-ai

Configure the Helm chart

The Helm chart configuration is provided via a respective Helm values.yaml file. The initial values in the bundled values.yaml are suitable for a default installation, and they may be modified to meet your specific configuration needs.

You will find additional comments and documentation on suitable config overrides directly added to the respective sections of the bundled values.yaml file.

Instead of modifying the default values.yaml, you can make a copy called values-override.yaml where you will make changes to the default configuration.
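
Before installing, it can help to confirm that the override file still renders together with the defaults. The sketch below creates the copy once and renders the chart locally with helm template. The function names are illustrative, and the release name pharia-ai is taken from the install command used later in this guide.

```shell
# Illustrative helpers; the function names are assumptions.

# Create values-override.yaml from the bundled defaults unless it already exists.
init_values_override() {
  [ -f values-override.yaml ] || cp values.yaml values-override.yaml
}

# Render the chart locally with both files to catch YAML or templating errors
# before anything is applied to the cluster.
render_with_overrides() {
  helm template pharia-ai . \
    --values values.yaml --values values-override.yaml >/dev/null
}
```

Run both functions from inside the pharia-ai chart directory. Rendering may fail until required values such as passwords are set, which is a useful early signal.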

Configure Ingress

External access to PhariaAI services that expose an API or UI endpoint is provided via Kubernetes Ingress resources.

The main ingress configuration is set globally for all sub-charts:

global:
  # Global config for all ingress resources
  ingress:
    # -- The ingressClassName globally defined for all ingress resources.
    #    See also: https://kubernetes.io/docs/concepts/services-networking/ingress/#the-ingress-resource
    ingressClassName: "nginx"
    # -- Domain for external access / ingress to Pharia AI services via {service}.{domain}
    #    e.g. {service}.pharia-ai.example.com
    ingressDomain: "pharia-ai.local"
    # -- Additional annotations globally defined for all ingress-resources. This can be used to add ingress controller specific annotations.
    additionalAnnotations: {}

Specifically the following entries might require custom overrides in your values-override.yaml:

global.ingress.additionalAnnotations: annotations added globally to dependency specific ingress annotations. Might be needed for allowing automated certificate generation for TLS support (cf. https://cert-manager.io/docs/usage/ingress/).
global.ingress.ingressClassName: relates to the Kubernetes ingress controller installed in the deployment target cluster (cf. https://kubernetes.io/docs/concepts/services-networking/ingress/#the-ingress-resource).

For each dependency, specific ingress configuration is provided individually in the respective section of the values-override.yaml file:

<sub-chart>:
  ingress:
    enabled: true
    # -- Hostname for the ingress (without domain). The domain is read from global.ingress.ingressDomain.
    #    This needs to be changed, if multiple instances are deployed to the same cluster using the same domain.
    hostname: "<sub-chart>"
    # -- Annotations for the ingress-resource. This can be used to add ingress controller specific annotations.
    annotations: {}
    tls:
      # -- Enable TLS configuration for this Ingress
      enabled: false
      # -- The name of the secret containing the TLS certificate.
      #    See also: https://kubernetes.io/docs/concepts/services-networking/ingress/#tls
      secretName: "<sub-chart>-tls"

Specifically the following entries might require custom overrides:

<sub-chart>.ingress.tls.enabled: enable TLS for specific ingress host.
<sub-chart>.ingress.tls.secretName: name of the secret containing the TLS certificates or used for certificate generation via an installed cert-manager.
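
If you do not use cert-manager and already have a certificate and key, the secret referenced by secretName can be created up front. The following is a minimal sketch; the function name, the argument order, and the file names in the usage line are assumptions.

```shell
# Illustrative helper; the function name and argument order are assumptions.
# Creates a kubernetes.io/tls Secret from an existing certificate and key.
create_ingress_tls_secret() {
  # $1: secret name, $2: certificate file, $3: key file, $4: namespace
  kubectl create secret tls "$1" --cert="$2" --key="$3" -n "$4"
}

# Usage:
# create_ingress_tls_secret <sub-chart>-tls tls.crt tls.key <pharia-ai-install-namespace>
```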

Configure Database Connections

Several PhariaAI applications require PostgreSQL databases as a persistence layer. For a production PhariaAI installation, we highly recommend using external (managed) database instances.

By default, Kubernetes PostgreSQL instances are enabled. For each database configuration, you can either provide the necessary values directly during the Helm installation (via values-override.yaml) or reference an existing Kubernetes secret that stores the required values.

The necessary database deployments automatically connect to client applications. While PostgreSQL deployment is enabled by default for each dependency, you must define a password in values-override.yaml:

<sub-chart>:
  postgresql:
    # -- This is used to indicate whether the internal PostgreSQL should be used or not.
    enabled: true
    auth:
      # -- If internal PostgreSQL is used a dedicated password has to be provided for setup of application authentication
      password: ""

Make sure to set an initial password via Helm values to enable authentication between the application and the database instance.

External Managed databases

We recommend using external database instances for production environments. The connection configuration and credential setup for each PhariaAI dependency can be managed via Helm Chart values:

<sub-chart>:
  postgresql:
    # -- Disable the built-in Postgresql chart
    enabled: false
  databaseConfig:
    # -- Default secret name is used to create a secret if `external.existingSecret` is not provided.
    defaultSecret: default-secret-name
    secretKeys:
      # -- The key in the secret that contains the host of the database
      hostKey: "host"
      # -- The key in the secret that contains the port of the database
      portKey: "port"
      # -- The key in the secret that contains the user of the database
      userKey: "user"
      # -- The key in the secret that contains the password of the database
      passwordKey: "password"
      # -- The key in the secret that contains the database name
      databaseNameKey: "databaseName"
    # -- Provide an existing database if you want to use an external database
    external:
      # -- Set this value if a k8s Secret with PostgreSQL values already exists. Make sure that all the keys exist in the secret with valid values.
      existingSecret: ""
      # -- The host of the database
      host: ""
      # -- The port of the database
      port: ""
      # -- The user of the database
      user: ""
      # -- The password of the database
      password: ""
      # -- The name of the database
      databaseName: ""
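
If you prefer to reference an existing secret via external.existingSecret, the secret needs one entry per key listed under secretKeys. The sketch below creates such a secret with the default key names shown above. The function name and argument order are assumptions, and all values in the usage line are placeholders.

```shell
# Illustrative helper; the function name and argument order are assumptions.
# The literal key names match the default secretKeys shown above.
create_db_secret() {
  # $1: secret name, $2: namespace, $3: host, $4: port,
  # $5: user, $6: password, $7: database name
  kubectl create secret generic "$1" -n "$2" \
    --from-literal=host="$3" \
    --from-literal=port="$4" \
    --from-literal=user="$5" \
    --from-literal=password="$6" \
    --from-literal=databaseName="$7"
}

# Usage:
# create_db_secret my-db-secret <pharia-ai-install-namespace> db.example.com 5432 app '<password>' pharia
```

Values passed with --from-literal are visible in the local process list while the command runs, so for production setups you may prefer a secret manager or --from-file.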

Configuring Pharia Assistant API

Pharia Assistant API requires a Redis service. By default, an internal Redis instance is provided via the built-in Helm chart and enabled automatically. You must define a password in values-override.yaml:

  redis:
    # -- Indicate whether the internal Redis should be used.
    enabled: true
    auth:
      # -- Redis Password
      password: ""

External Pharia Assistant API Redis

However, it's recommended to use an external Redis instance. To do this, disable the built-in Redis service (via values-override.yaml) and configure the external connection settings.

Example configuration for using an external Redis instance:

  redis:
    # -- Indicate whether the internal Redis should be used.
    enabled: false
  redisConfig:
    external:
      existingSecret: ""
      host: "my-redis"
      port: "6379"
      password: "redispassword"

Configuring Pharia Data API

The Pharia Data API requires a RabbitMQ service. By default, an internal RabbitMQ instance is provided via the built-in Helm chart and enabled automatically. You must define a password in values-override.yaml:

pharia-data-api:
  rabbitmq:
    # Enable or disable the internal RabbitMQ service.
    enabled: true
    auth:
      # Set the RabbitMQ application username.
      username: user
      # Set the RabbitMQ application password.
      password: ""

External RabbitMQ

For production environments, it's recommended to use an external RabbitMQ instance. To do this, disable the built-in RabbitMQ service in values-override.yaml and configure the external connection settings.

Example configuration for using an external RabbitMQ instance:

pharia-data-api:
  rabbitmq:
    enabled: false
  rabbitmqConfig:
    # Default secret name used to create a secret if `external.existingSecret` is not provided.
    defaultSecret: pharia-data-api-rabbitmq-secret
    # The load definitions secret must hold the RabbitMQ topology configuration.
    defaultLoadDefinitionsSecret: pharia-data-api-rabbitmq-load-definitions-secret
    secretKeys:
      # The key in the secret that contains the host of RabbitMQ.
      hostKey: "rabbitmq-host"
      # The key in the secret that contains the port of RabbitMQ.
      portKey: "rabbitmq-port"
      # The key in the secret that contains the user of RabbitMQ.
      userKey: "rabbitmq-username"
      # The key in the secret that contains the password of RabbitMQ.
      userPasswordKey: "rabbitmq-password"
    external:
      # Set this value if a Kubernetes Secret with RabbitMQ values already exists. Ensure all keys exist in the secret with valid values.
      existingSecret: ""
      # The user of RabbitMQ.
      rabbitmqUser: ""
      # The password of the RabbitMQ user.
      rabbitmqUserPassword: ""
      # The load definitions secret name.
      loadDefinitionsSecret: ""

Configuring which models to run

Our default set of models is luminous-base and llama-3.1-8b-instruct. To be able to use these models, you have to configure the PhariaAI Helm chart by adding the following to the values-override.yaml, into the inference-worker.checkpoints section:

inference-worker:
  checkpoints:
    - generator:
        type: "luminous"
        pipeline_parallel_size: 1
        tensor_parallel_size: 1
        tokenizer_path: "luminous-base-2022-04/alpha-001-128k.json"
        weight_set_directories: ["luminous-base-2022-04"]
      queue: "luminous-base"
      replicas: 1
      modelVolumeClaim: "models-luminous-base"
    - generator:
        type: "luminous"
        pipeline_parallel_size: 1
        tensor_parallel_size: 1
        tokenizer_path: "llama-3.1-8b-instruct/tokenizer.json"
        weight_set_directories: ["llama-3.1-8b-instruct"]
      queue: "llama-3.1-8b-instruct"
      replicas: 1
      modelVolumeClaim: "models-llama-3.1-8b-instruct"

Note: Each checkpoint requires the correct reference to the persistent volume claim (PVC) bound to the persistent volume (PV) on which the model weights are stored (cf. model download).

The model to be used in Pharia Assistant must be set in your values-override.yaml file based on the queue name used above, e.g.

pharia-assistant-api:
  env:
    ...
    QA_MODEL_NAME: llama-3.1-8b-instruct
    SAFETY_MODEL_NAME: llama-3.1-8b-instruct
    SUMMARY_MODEL_NAME: llama-3.1-8b-instruct

Scheduling on GPU Nodes

For installing the inference stack, the Kubernetes cluster requires GPU nodes (node pool) to run the respective application Pods (relevant for PhariaAI sub-charts inference-worker and pharia-translate).

The scheduling of the worker and translate deployment to the GPU nodes can be achieved via node taints and tolerations (cf. https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/). The tolerations config can be applied via overrides and Helm config as part of the values-override.yaml file.

inference-worker:
  tolerations:
    - effect: NoSchedule
      key: nvidia.com/gpu # key used for node taints
      operator: Exists
pharia-translate:
  tolerations:
    - effect: NoSchedule
      key: nvidia.com/gpu # key used for node taints
      operator: Exists

Tolerations can also be specified for individual worker checkpoints in order to assign worker Pods to different node pools in context of the respective model used (e.g. large models to nodes with multi-GPU support).

inference-worker:
  checkpoints:
    - ...
      tolerations:
        - effect: NoSchedule
          key: nvidia.com/gpu # key used for node taints
          operator: Exists

The total number of GPUs needed for each worker deployment is calculated from the checkpoint config entries pipeline_parallel_size and tensor_parallel_size and is automatically added to the resources section of the worker K8s Deployment:

resources:
  limits:
    nvidia.com/gpu: <number-of-gpus>

This config also controls the scheduling to GPU nodes with the respective number of available GPUs.
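
To estimate how many GPUs a node needs for a given checkpoint, the calculation can be sketched as follows. This assumes the GPU count is the product of pipeline_parallel_size and tensor_parallel_size, which is the usual convention for combined pipeline and tensor parallelism but is not stated explicitly in this guide. The function name is illustrative.

```shell
# Illustrative helper; assumes the GPU count is the product of both sizes.
gpus_per_worker() {
  # $1: pipeline_parallel_size, $2: tensor_parallel_size
  echo $(( $1 * $2 ))
}

gpus_per_worker 1 1   # the default checkpoints above; prints 1
gpus_per_worker 2 4   # a hypothetical larger model; prints 8
```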

Inference API Configuration

The Helm config of the inference-api dependency requires the initial setup of certain credentials. Secrets can be directly passed as Helm values or via existing Kubernetes secrets, already available in the cluster.

Add the respective config to your values-override.yaml:

inference-api:
  inferenceApiServices:
    # -- Name of an existing inferenceApiServices secret.
    #    If you want to provide your own secret, set this to the name of your secret.
    #    Keep in mind to set global.inferenceApiServicesSecretRef to the same name if an existing secret is used.
    #    The secret is expected to have a key-value-pair with key `secret`.
    existingSecret: ""
    # -- Manually added services secret
    #    If no existing external secret is provided via inferenceApiServices.existingSecret, a secret value has to be applied during installation
    secret: ""
  jwt:
    # -- Name of an existing jwt secret to use
    #    The secret is expected to have a key-value-pair with key `secret`.
    existingSecret: ""
    # -- Manually added jwt secret
    #    If no existing external secret is provided via jwt.existingSecret, a secret value has to be applied during installation
    secret: ""
  admin:
    # -- Email of the admin user to create on startup
    email: "tools@aleph-alpha.com"
    # -- Initial password of the admin user. If no existing external secret is provided via admin.existingSecret, a password value has to be applied during installation
    password: ""
    # -- Existing secret to use instead of email/password.
    existingSecret: ""
    # -- The email key in the secret
    emailKey: "email"
    # -- The password key in the secret
    passwordKey: "password"

Keep in mind to set global.inferenceApiServicesSecretRef to the same name if an existing secret is used for inference-api.inferenceApiServices.existingSecret.
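
The secret values themselves are not prescribed here. As a hedged example, random values for the services and jwt secrets can be generated with standard tools. The function name and the length of 32 bytes are assumptions.

```shell
# Illustrative helper; the function name and the 32-byte length are assumptions.
# Prints 32 random bytes as 64 lowercase hex characters.
random_secret() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

# Usage: generate one value per secret and paste it into values-override.yaml.
# random_secret
```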

Finetuning Service Configuration

In case you want to use the Finetuning Service to finetune models on custom data, you need to configure the following values in your values-override.yaml file:

pharia-finetuning:
  rayCluster:
    workerGroups:
      gpu-group:
        # -- Tolerations matching the taints of the GPU nodes you want to use for finetuning
        tolerations: {}
  minio: 
    # -- For production installations, we highly recommend to disable the built-in Minio service and to configure an external storage backend via the `storageConfig` section
    enabled: false
  # -- See reference of helm chart values for detailed information on the `storageConfig` section.
  storageConfig: {}

IAM Configuration

You may provide custom credentials for the initial user account in the section pharia-iam.config. This user account is used for user management in the PhariaAI stack via PhariaOS.

pharia-iam:
  config:
    # -- Init password of initial user. To be valid it requires to have 10-70 characters, including at least one uppercase letter, one lowercase letter, and one digit. User will need to change this password on the first login.
    adminPassword:
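
To avoid a failed setup due to an invalid initial password, you can check a candidate locally first. The sketch below mirrors the rule stated in the comment above (10-70 characters, at least one uppercase letter, one lowercase letter, and one digit). The function name is illustrative, and the actual validation performed by PhariaAI may be stricter.

```shell
# Illustrative helper; the function name is an assumption. It mirrors the rule
# in the comment above: 10-70 characters with at least one uppercase letter,
# one lowercase letter, and one digit.
is_valid_admin_password() {
  pw="$1"
  len=${#pw}
  [ "$len" -ge 10 ] && [ "$len" -le 70 ] || return 1
  case "$pw" in *[[:upper:]]*) ;; *) return 1 ;; esac
  case "$pw" in *[[:lower:]]*) ;; *) return 1 ;; esac
  case "$pw" in *[[:digit:]]*) ;; *) return 1 ;; esac
  return 0
}

# Usage:
# is_valid_admin_password '<candidate>' && echo "ok" || echo "rejected"
```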

If you want to enable extra sign-up options such as self-sign-up or sign-up via SSO, you need to enable the rights to configure Zitadel, which is the internal identity provider used in the PhariaAI stack. Please enable the flag pharia-iam.config.adminEnableZitadelManagement which grants the rights to configure sign-up options to your initial user account. For further information about the sign-up options, see "How to Configure Sign-Up Options".

Service Deployment

In the PhariaAI stack, you can deploy a Dockerized service. However, please note that the PhariaAI installation does not include a built-in container registry for hosting these service images. To deploy a custom service, you must provide the Docker registry's authentication secret. This secret is necessary for the system to pull and verify your custom images.

You can disable the service feature when it is not needed by setting phariaos-manager.usecase.enabled: false in the configuration.

To configure this, you must supply the Docker image pull secret in your values file as shown below:

phariaos-manager:
  usecase:
    enabled: true
    # -- Docker secret for authenticating with the container registry to pull/verify service images.
    # -- Ensure the secret is available in the same namespace as the PhariaAI release.
    imagePullSecrets:
      - "your-registry-secret"
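
If the pull secret does not exist yet, it can be created with kubectl. The following is a minimal sketch; the function name and argument order are assumptions, and the registry server and credentials in the usage line are placeholders for your own registry.

```shell
# Illustrative helper; the function name and argument order are assumptions.
# Creates a kubernetes.io/dockerconfigjson Secret for pulling service images.
create_registry_pull_secret() {
  # $1: secret name, $2: registry server, $3: user, $4: password, $5: namespace
  kubectl create secret docker-registry "$1" \
    --docker-server="$2" \
    --docker-username="$3" \
    --docker-password="$4" \
    -n "$5"
}

# Usage:
# create_registry_pull_secret your-registry-secret registry.example.com '<user>' '<token>' <pharia-ai-install-namespace>
```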

Skill Deployment

In the PhariaAI stack, you can deploy custom Skills as OCI images. Similar to service deployments, you must provide an OCI registry for deploying Skills. Popular choices for such registries include GitHub Container Registries, GitLab Container Registries, and JFrog Artifactory. The credentials for the used registries must be provided.

Namespace Config

Skills are managed in namespaces. Namespaces are configured in your value file as shown below. From a configuration point of view, each namespace consists of two parts:

An OCI registry to load Skills from.
A namespace configuration (a toml file, typically checked into a Git repository). This file lists the Skills that are deployed to a specific namespace.

You must specify a registry and base repository for each namespace to pull the images from. You must also provide the credentials to access the registry and the credentials to access the namespace configuration file.

In the example below, one namespace my-team is configured. The configuration file is hosted on GitLab and is accessed with the token specified in the NAMESPACES__MY_TEAM__CONFIG_ACCESS_TOKEN environment variable. The Skills are pulled from registry.acme.com, which is accessed through BasicAuth with the configurable environment variables NAMESPACES__MY_TEAM__REGISTRY_USER and NAMESPACES__MY_TEAM__REGISTRY_PASSWORD.

namespaces:
  # camelCase namespace here, will be converted to kebab-case automatically
  myTeam:
    configUrl: "https://gitlab.acme.com/api/v4/projects/42/repository/files/assistant.toml/raw?ref=main"
    registry: "registry.acme.com"
    baseRepository: "engineering/pharia-ai-skills/assistant"
env:
  - name: NAMESPACES__MY_TEAM__CONFIG_ACCESS_TOKEN
    valueFrom:
      secretKeyRef:
        name: pharia-kernel-secrets
        key: skillRegistryPassword
  - name: NAMESPACES__MY_TEAM__REGISTRY_USER
    valueFrom:
      secretKeyRef:
        name: pharia-kernel-secrets
        key: skillRegistryUser
  - name: NAMESPACES__MY_TEAM__REGISTRY_PASSWORD
    valueFrom:
      secretKeyRef:
        name: pharia-kernel-secrets
        key: skillRegistryPassword
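
The relationship between the camelCase key, the kebab-case namespace name, and the environment variable names in the example above can be sketched as follows. The function names are illustrative, and the conversion only covers simple camelCase names such as myTeam; it is not the conversion code used by PhariaAI.

```shell
# Illustrative helpers; the function names are assumptions and the conversion
# only covers simple camelCase names such as myTeam.

# myTeam -> my-team
namespace_kebab() {
  printf '%s\n' "$1" |
    sed 's/\([[:lower:][:digit:]]\)\([[:upper:]]\)/\1-\2/g' |
    tr '[:upper:]' '[:lower:]'
}

# myTeam + REGISTRY_USER -> NAMESPACES__MY_TEAM__REGISTRY_USER
namespace_env_var() {
  upper=$(printf '%s\n' "$1" |
    sed 's/\([[:lower:][:digit:]]\)\([[:upper:]]\)/\1_\2/g' |
    tr '[:lower:]' '[:upper:]')
  printf 'NAMESPACES__%s__%s\n' "$upper" "$2"
}

namespace_kebab myTeam                  # prints my-team
namespace_env_var myTeam REGISTRY_USER  # prints NAMESPACES__MY_TEAM__REGISTRY_USER
```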

Install the Helm chart

The Helm chart is installed using helm upgrade --install into the target Kubernetes namespace <pharia-ai-install-namespace>.

You must provide the access credentials for the image registry. There are two recommended options.

Option 1. Set the credentials directly by passing them to Helm

helm upgrade --install pharia-ai . \
  --set imagePullCredentials.username=$AA_REGISTRY_USERNAME \
  --set imagePullCredentials.password=$AA_REGISTRY_PASSWORD \
  --values values.yaml --values values-override.yaml \
  -n <pharia-ai-install-namespace>

This command assumes that the default registry value imagePullCredentials.registry: "alephalpha.jfrog.io" is used. You can override the registry via --set imagePullCredentials.registry=<private-registry>.

During the installation, the Kubernetes (image-pull) secrets with name defined at global.imagePullSecretName and global.imagePullOpaqueSecretName are generated in the install namespace.

Option 2. If you already have a Docker secret in your Kubernetes cluster, you can pass the secret name to Helm

helm upgrade --install pharia-ai . \
  --set global.imagePullSecretName=<secretName> \
  --set global.imagePullOpaqueSecretName=<opaqueSecretName> \
  --values values.yaml --values values-override.yaml \
  -n <pharia-ai-install-namespace>

Post Installation Steps

After installation, navigate to https://login.<YOUR_CONFIGURED_DOMAIN>/ and log in with the initial user account (configured in the helm chart values pharia-iam.config) to complete the setup of the initial user credentials. If you did not provide a custom initial user account password, you can display the autogenerated password with the following command:

kubectl get secret pharia-iam-admin-password -n <pharia-ai-install-namespace> -o jsonpath="{.data.password}" | base64 -d

How to upgrade PhariaAI

Ensure that your registry credentials are up to date and that Helm has access (see Registry Credential Setup).

Thanks to Helm's idempotent operations, the upgrade instruction is the same as for installation. Only the new pharia-ai-version has to be referenced when changing chart version. You also need to run the same installation command when you change configuration values in the values.yaml or values-override.yaml for the same pharia-ai-version.
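
To reference a specific chart version rather than the latest one, the pull step can pin the version. The following is a minimal sketch; the function name is illustrative, and the version argument stands for the pharia-ai-version you are upgrading to.

```shell
# Illustrative helper; the function name is an assumption.
# Pulls and unpacks one specific chart version instead of the latest one.
pull_chart_version() {
  # $1: chart version, i.e. the pharia-ai-version to upgrade to
  helm pull oci://alephalpha.jfrog.io/pharia-ai-helm/pharia-ai \
    --version "$1" --untar
}

# Usage: pull_chart_version <pharia-ai-version>
```

Run this from a directory that does not already contain a pharia-ai folder, since helm pull --untar refuses to overwrite an existing one. Keep your values-override.yaml outside the unpacked chart so that it survives the upgrade.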

See the How to Upgrade guide for more details.

Next Steps

For setting up namespaces, creating collections, and uploading documents for indexing to facilitate Q&A in Pharia Assistant, consult the Turn files into collections guide.
