Modelplane docs

InferenceClass Custom Resource

Hardware recipe defining GPU type, count, and provisioning for a node pool.

Concept guide: Define Hardware Classes →

#Metadata

API version: modelplane.ai/v1alpha1
Kind: InferenceClass
Scope: Cluster
Short names: icl

#Example

Manifest
apiVersion: modelplane.ai/v1alpha1
kind: InferenceClass
metadata:
  name: gke-l4-1x-g2
spec:
  description: "GKE g2-standard-8, 1x NVIDIA L4"
  provisioning:
    provider: GKE
    gke:
      machineType: g2-standard-8
      diskSizeGb: 100
      accelerator:
        type: nvidia-l4
        count: 1
  devices:
    - name: gpu
      claim: DRA
      driver: gpu.nvidia.com
      deviceClassName: gpu.nvidia.com
      count: 1
      attributes:
        architecture: { string: Ada Lovelace }
      capacity:
        memory: { value: "23034Mi" }

#Spec

# description optional string

Human-readable description of the class.

# devices required object[] 1–16 items
# attributes optional map[string]object

DRA-style typed attributes for this device. Keys are bare names (e.g. architecture); the domain comes from the device’s driver. Each value sets exactly one typed field.
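As a rough illustration of the "exactly one typed field" rule, the fragment below sets several attributes on a single device. Only the string form appears in the example manifest on this page; the int, bool, and version field names are an assumption that Modelplane mirrors the upstream Kubernetes DRA attribute type, and every attribute name other than architecture is hypothetical.

```yaml
devices:
  - name: gpu
    driver: gpu.nvidia.com
    deviceClassName: gpu.nvidia.com
    attributes:
      # From the example manifest: each value sets one typed field.
      architecture: { string: Ada Lovelace }
      # Hypothetical attribute names; the int/bool/version field names
      # are assumed from upstream DRA, not confirmed by this page.
      cudaComputeMajor: { int: 8 }
      migCapable: { bool: false }
      driverVersion: { version: "550.54.15" }
```

Under the stated rule, a value that sets two typed fields at once, such as { string: Ada, int: 8 }, would not be valid.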

# capacity optional map[string]object

DRA-style capacity quantities for this device. Keys are bare names (e.g. memory); values are Kubernetes Quantities.

# claim optional enum: DRA | Synthetic default: DRA

How Modelplane treats this device. DRA emits it as a request in a ResourceClaim, so DRA binds a matching device to the pod at admission time; use it for hardware a real DRA driver exposes. Synthetic describes the device for fleet scheduling only and never claims it; use it for hardware that matters for placement but has no DRA driver yet, like an InfiniBand fabric.
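The two claim modes can sit side by side in one class. The sketch below pairs a DRA-claimed GPU with a Synthetic InfiniBand device; the driver domain infiniband.example.com and the linkType attribute are hypothetical placeholders, not names any real driver is known to publish.

```yaml
devices:
  # Claimed through DRA: emitted as a request in a ResourceClaim.
  - name: gpu
    claim: DRA
    driver: gpu.nvidia.com
    deviceClassName: gpu.nvidia.com
    count: 8
  # Described for fleet scheduling only; never claimed.
  - name: nic
    claim: Synthetic
    driver: infiniband.example.com   # hypothetical domain
    count: 8
    attributes:
      linkType: { string: InfiniBand }   # hypothetical attribute
```

The Synthetic entry leaves out deviceClassName, since that field is only required for claim: DRA devices.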

# count optional integer 1–64 default: 1

How many of this device a node has.

# deviceClassName optional string 1–253 chars

Name of the cluster-scoped DRA DeviceClass to claim this device through. Required for claim: DRA devices; the DRA driver install creates the DeviceClass (e.g. gpu.nvidia.com). Ignored for Synthetic devices.

# driver required string 1–253 chars

DRA driver that owns this device (e.g. gpu.nvidia.com). Becomes the attribute/capacity domain that a nodeSelector reads as device.attributes["<driver>"].<attribute>.

# name required string 1–63 chars

Name of this device within the class (e.g. gpu, nic).

# provisioning optional object

How to provision a node pool of this class. Omit for classes that describe BYO node pools that already exist.
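A class for a bring-your-own node pool might look like the sketch below: the same device description as the example manifest, with the provisioning block left out. The class name and description are hypothetical.

```yaml
apiVersion: modelplane.ai/v1alpha1
kind: InferenceClass
metadata:
  name: byo-l4-1x   # hypothetical name
spec:
  description: "Existing node pool, 1x NVIDIA L4"
  # No provisioning block: the node pool already exists, so the class
  # only describes the devices on its nodes.
  devices:
    - name: gpu
      claim: DRA
      driver: gpu.nvidia.com
      deviceClassName: gpu.nvidia.com
      count: 1
      attributes:
        architecture: { string: Ada Lovelace }
      capacity:
        memory: { value: "23034Mi" }
```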

# eks optional object
# accelerator required object

GPU accelerator to attach when provisioning the node group. Provisioning input only: the scheduler matches against spec.devices, not this block.

# count required integer 1–16
# type required string 1–63 chars

GPU accelerator type (e.g. nvidia-a10g, nvidia-h100). Informational; reported on the consuming InferenceCluster’s status.

# diskSizeGb optional integer ≥ 10 default: 100
# instanceType required string ≥ 1 chars

EC2 instance type (e.g. g6.xlarge, p4d.24xlarge). The instance family determines the GPU model; the accelerator block is informational.
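For comparison with the GKE example manifest, an EKS class built from the fields above could be sketched as follows. The class name is hypothetical, and the architecture value assumes the NVIDIA DRA driver reports "Ampere" for the A10G on a g5.xlarge; check what the driver publishes on a real node before relying on it.

```yaml
apiVersion: modelplane.ai/v1alpha1
kind: InferenceClass
metadata:
  name: eks-a10g-1x-g5   # hypothetical name
spec:
  description: "EKS g5.xlarge, 1x NVIDIA A10G"
  provisioning:
    provider: EKS
    eks:
      instanceType: g5.xlarge   # the instance family fixes the GPU model
      diskSizeGb: 100
      accelerator:
        type: nvidia-a10g       # informational on EKS
        count: 1
  devices:
    - name: gpu
      claim: DRA
      driver: gpu.nvidia.com
      deviceClassName: gpu.nvidia.com
      count: 1
      attributes:
        architecture: { string: Ampere }   # assumed driver-reported value
```

The scheduler matches against spec.devices, so the devices entry is what determines placement; the accelerator block only feeds provisioning and status.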

# gke optional object
# accelerator required object

GPU accelerator to attach when provisioning the node pool. Provisioning input only: the scheduler matches against spec.devices, not this block, so count here is the GCP machine’s GPU count and need not be restated in devices.

# count required integer 1–16
# type required string 1–63 chars

GPU accelerator type passed to GCP (e.g. nvidia-l4, nvidia-h100-80gb).

# diskSizeGb optional integer ≥ 10 default: 100
# machineType required string ≥ 1 chars
# provider required enum: GKE | EKS

#Status

# conditions optional object[]