Prerequisites

important

The installation process requires familiarity with Kubernetes and Helm.

Credentials

A user account with access to Aleph Alpha Artifactory. We will provide this to you.
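Once you have received your credentials, you will typically log in to the artifact repository with both Helm and your container tooling. The following is a minimal sketch; the registry host and username shown are placeholders, not the actual values, which come with your credentials.

```shell
# Sketch: authenticate to the artifact repository with the provided credentials.
# REGISTRY and USERNAME are placeholders -- use the values Aleph Alpha gives you.
REGISTRY="alephalpha.jfrog.io"
USERNAME="your-username"

# Read the token from stdin so it never appears on the command line.
# Both commands are guarded so a missing tool or bad credentials print a
# message instead of aborting.
echo "${ARTIFACTORY_TOKEN:-}" | helm registry login "$REGISTRY" --username "$USERNAME" --password-stdin 2>/dev/null \
  || echo "helm registry login failed for $REGISTRY"
echo "${ARTIFACTORY_TOKEN:-}" | docker login "$REGISTRY" --username "$USERNAME" --password-stdin 2>/dev/null \
  || echo "docker login failed for $REGISTRY"
```

Storing the token in an environment variable (here `ARTIFACTORY_TOKEN`) avoids leaking it into your shell history.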

On your local machine

note

The documentation assumes you are using Linux or macOS for your installation, but this is not a requirement.

Container Orchestration Platform: Kubernetes client v1.29 or above
• You can check this using kubectl version
• Check your connectivity using kubectl get nodes

Package Manager: Helm v3.0 or above
• You can check this using helm version
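The checks above can be run as a single script. Each command is guarded so that a missing tool prints a message instead of aborting:

```shell
# Verify local tooling versions.
kubectl version --client 2>/dev/null || echo "kubectl not found"
helm version --short 2>/dev/null || echo "helm not found"

# Confirm connectivity to the target cluster.
kubectl get nodes 2>/dev/null || echo "cannot reach the cluster (check your kubeconfig)"
```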

On your Kubernetes Cluster

Hardware

GPU
Minimal setup: 2 GPUs (with MIG), 3 GPUs (without MIG)

Recommended setup: 6 GPUs (with MIG), 7 GPUs (without MIG)

Note: The actual number of GPUs depends on the models selected for deployment.

Type: NVIDIA Ampere, Lovelace or Hopper generation. Currently, only NVIDIA GPUs are supported. Support for other vendors may be added in the future.

GPU Nodes: Your Kubernetes cluster must include GPU nodes to run the inference stack application pods.

Finetuning models requires additional GPUs; see Finetuning Service Resource Requirements.

CPU & Memory
24 CPU cores, 128 GB RAM

The exact requirements depend on the number of users and on which components of the stack you intend to use:
• Resource requirements: DataPlatform & Document Index
• Resource requirements: PhariaAssistant
Object Storage
Quantity: 3x

Type: MinIO or any other S3-compatible backend, for PhariaData and PhariaFinetuning

IOPS (input/output operations per second): 1,000 or above

Throughput: 100 Mb/s or above
Persistent Volumes
Persistent volumes accessible by all GPU nodes in the cluster are essential for storing model weights.

If applicable in your environment, ensure your persistent volumes are configured to be accessible across availability zones.
Software

Networking
Installed in a single namespace, with open communication between all services in the namespace.

NVIDIA GPU Operator
We strongly recommend using the NVIDIA GPU Operator v24 or above with default settings to manage NVIDIA drivers and libraries on your GPU nodes. More details on the GPU Operator setup can be found at GPU Operator Setup.
Ingress Controller & Domain
The cluster must include an ingress controller to enable external access to the PhariaAI service.

A certificate manager must also be configured to support secure access via TLS (Transport Layer Security).

A dedicated domain must be assigned to the Kubernetes cluster so that each service can host its application under a subdomain of that domain (e.g., https://<service-name>.<ingress-domain>).
Relational Database Management
Postgres v14.0 or above

Quantity: 1x Large
Storage: 800 GB
CPU: 8 cores
Memory: 16 GB
Network Access & Whitelisting
Not required if the networking requirements above are met.

If you require multiple namespaces, please discuss this with our Product Support team.
Artifact Management
Ability to pull the pharia-ai-helm Helm chart and container images from an external artifact repository manager, such as JFrog Artifactory. Credentials for this will be provided to you.
Monitoring & Observability
No fixed requirements, but we recommend Prometheus & Grafana.
Cert Manager
Cert Manager is required to provision webhook certificates for the Dynamic Model Management feature.
ClusterRole
PhariaOS requires a ClusterRole for hardware discovery and model management. By default, the chart creates the necessary ClusterRole and ClusterRoleBinding. For detailed configuration, refer to PhariaOS Manager Settings and How to use existing cluster role.
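Several of the cluster-side prerequisites above can be verified with read-only kubectl queries. The following is a minimal sketch; the `cert-manager` namespace and the `DATABASE_URL` variable are assumptions about your environment, and every command is guarded so a missing prerequisite is reported rather than fatal.

```shell
# GPU nodes: allocatable NVIDIA GPUs per node (the nvidia.com/gpu resource is
# populated once the GPU Operator or device plugin is running).
kubectl get nodes -o custom-columns='NAME:.metadata.name,GPUS:.status.allocatable.nvidia\.com/gpu' 2>/dev/null \
  || echo "cannot query nodes"

# Storage classes backing the persistent volumes for model weights.
kubectl get storageclass 2>/dev/null || echo "cannot list storage classes"

# Ingress controller: at least one IngressClass should be available.
kubectl get ingressclass 2>/dev/null || echo "no ingress classes visible"

# Cert Manager pods (assumes the conventional cert-manager namespace).
kubectl get pods -n cert-manager 2>/dev/null || echo "cert-manager namespace not found"

# Postgres version (assumes DATABASE_URL points at your instance).
psql "${DATABASE_URL:-}" -tc 'SHOW server_version;' 2>/dev/null || echo "cannot reach Postgres"
```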
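If the NVIDIA GPU Operator recommended above is not yet installed, NVIDIA documents a Helm-based installation. The sketch below follows that flow with default settings; the release name and namespace are conventional choices, not requirements, and each step is guarded so failures are reported rather than fatal.

```shell
# Add NVIDIA's Helm repository and install the GPU Operator with default settings.
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia 2>/dev/null || echo "could not add nvidia Helm repo"
helm repo update 2>/dev/null || echo "could not update Helm repos"
helm install gpu-operator nvidia/gpu-operator \
  --namespace gpu-operator --create-namespace --wait 2>/dev/null \
  || echo "gpu-operator install failed (check cluster access)"
```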