Version: v2.5.1

Global Config

Device Configs: ConfigMap

Note: All the configurations listed below are managed within the hami-scheduler-device ConfigMap.

You can update these configurations using one of the following methods:

  1. Directly edit the ConfigMap: If HAMi has already been successfully installed, you can update the hami-scheduler-device ConfigMap manually with the kubectl edit command:

    kubectl edit configmap hami-scheduler-device -n <namespace>

    After making changes, restart the related HAMi components to apply the updated configuration (see the example commands after the configuration table below).

  2. Modify the Helm Chart: Update the corresponding chart values, then reapply the Helm Chart to regenerate the ConfigMap.

Argument (Type, Default): Description

nvidia.deviceMemoryScaling (Float, default: 1): The ratio for NVIDIA device memory scaling; it can be greater than 1 (enables virtual device memory, an experimental feature). For an NVIDIA GPU with M memory, if set to S, each vGPU split from this GPU gets S * M memory in Kubernetes.
nvidia.deviceSplitCount (Integer, default: 10): Maximum number of jobs assigned to a single GPU device.
nvidia.migstrategy (String, default: "none"): "none" ignores MIG features; "mixed" allocates MIG devices as separate resources.
nvidia.disablecorelimit (String, default: "false"): "true" disables the core limit; "false" enables it.
nvidia.defaultMem (Integer, default: 0): Default device memory for the current job, in MB. 0 means 100% of the device memory.
nvidia.defaultCores (Integer, default: 0): Percentage of GPU cores reserved for the current job. 0 allows scheduling onto any GPU with enough memory; 100 reserves the entire GPU exclusively.
nvidia.defaultGPUNum (Integer, default: 1): Default number of GPUs. If set to 0, it is filtered out. If nvidia.com/gpu is not set in the pod resources, the webhook checks nvidia.com/gpumem, nvidia.com/gpumem-percentage, and nvidia.com/gpucores, and adds nvidia.com/gpu with this default value if any of them is set.
nvidia.resourceCountName (String, default: "nvidia.com/gpu"): vGPU number resource name.
nvidia.resourceMemoryName (String, default: "nvidia.com/gpumem"): vGPU memory size resource name.
nvidia.resourceMemoryPercentageName (String, default: "nvidia.com/gpumem-percentage"): vGPU memory fraction resource name.
nvidia.resourceCoreName (String, default: "nvidia.com/cores"): vGPU core resource name.
nvidia.resourcePriorityName (String, default: "nvidia.com/priority"): vGPU job priority resource name.
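
For example, after editing the ConfigMap directly (method 1), the HAMi components need a restart to reload it; with the Helm method (method 2), upgrading the release regenerates the ConfigMap. A minimal sketch, assuming the default component names hami-scheduler and hami-device-plugin and a kube-system installation namespace (adjust these to your deployment):

# Restart the HAMi components so they pick up the edited ConfigMap
kubectl rollout restart deployment hami-scheduler -n kube-system
kubectl rollout restart daemonset hami-device-plugin -n kube-system

# Or regenerate the ConfigMap from chart values (the value shown is illustrative)
helm upgrade hami hami-charts/hami -n kube-system --set devicePlugin.deviceMemoryScaling=2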

Chart Configs: arguments

You can customize your vGPU support by setting the following arguments with --set, for example:

helm install hami hami-charts/hami --set devicePlugin.deviceMemoryScaling=5 ...

Argument (Type, Default): Description

devicePlugin.service.schedulerPort (Integer, default: 31998): Scheduler webhook service nodePort.
scheduler.defaultSchedulerPolicy.nodeSchedulerPolicy (String, default: "binpack"): GPU node scheduling policy. "binpack" packs jobs onto the same GPU node as much as possible; "spread" spreads jobs across different GPU nodes as much as possible.
scheduler.defaultSchedulerPolicy.gpuSchedulerPolicy (String, default: "spread"): GPU scheduling policy. "binpack" packs jobs onto the same GPU as much as possible; "spread" spreads jobs across different GPUs as much as possible.
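
The scheduler policies can be set the same way at install or upgrade time. A small sketch extending the example above (the namespace is illustrative):

helm install hami hami-charts/hami -n kube-system \
  --set scheduler.defaultSchedulerPolicy.nodeSchedulerPolicy=spread \
  --set scheduler.defaultSchedulerPolicy.gpuSchedulerPolicy=binpack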

Pod configs: annotations

Argument (Type, Example): Description

nvidia.com/use-gpuuuid (String, example: "GPU-AAA,GPU-BBB"): If set, devices allocated to this pod must be among the UUIDs listed in this string.
nvidia.com/nouse-gpuuuid (String, example: "GPU-AAA,GPU-BBB"): If set, devices allocated to this pod must NOT be among the UUIDs listed in this string.
nvidia.com/nouse-gputype (String, example: "Tesla V100-PCIE-32GB, NVIDIA A10"): If set, devices allocated to this pod must NOT be among the types listed in this string.
nvidia.com/use-gputype (String, example: "Tesla V100-PCIE-32GB, NVIDIA A10"): If set, devices allocated to this pod must be among the types listed in this string.
hami.io/node-scheduler-policy (String, example: "binpack" or "spread"): GPU node scheduling policy for this pod. "binpack" schedules the pod onto GPU nodes that are already in use; "spread" schedules the pod onto different GPU nodes.
hami.io/gpu-scheduler-policy (String, example: "binpack" or "spread"): GPU scheduling policy for this pod. "binpack" places the pod on an already-used GPU card; "spread" places the pod on different GPU cards.
nvidia.com/vgpu-mode (String, example: "hami-core" or "mig"): The type of vGPU instance this pod wishes to use.
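
As an illustration, the annotations above go on the pod metadata alongside the usual vGPU resource requests. A minimal sketch (pod name, image, and resource amounts are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: gpu-annotation-example                                    # illustrative name
  annotations:
    nvidia.com/use-gputype: "Tesla V100-PCIE-32GB, NVIDIA A10"    # only schedule onto these GPU types
    hami.io/gpu-scheduler-policy: "binpack"                       # pack onto an already-used GPU card
spec:
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.0-base-ubuntu22.04
      command: ["sleep", "infinity"]
      resources:
        limits:
          nvidia.com/gpu: 1         # one vGPU (resourceCountName)
          nvidia.com/gpumem: 4096   # 4096 MB of device memory (resourceMemoryName)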

Container configs: env

Argument (Type, Default): Description

GPU_CORE_UTILIZATION_POLICY (String, default: "default"): Defines the GPU core utilization policy:
  • "default": default utilization policy.
  • "force": limits core utilization to below "nvidia.com/gpucores".
  • "disable": ignores the utilization limit set by "nvidia.com/gpucores" during job execution.
CUDA_DISABLE_CONTROL (Boolean, default: false): If "true", HAMi-core is not used inside the container, resulting in no resource isolation or limitation (for debugging purposes).
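
For example, these variables are set in the container's env section of the pod spec. A minimal sketch (container name, image, and resource amounts are illustrative):

spec:
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.0-base-ubuntu22.04
      env:
        - name: GPU_CORE_UTILIZATION_POLICY
          value: "force"                 # strictly enforce the nvidia.com/gpucores limit
        # - name: CUDA_DISABLE_CONTROL   # debugging only: disables isolation and limits
        #   value: "true"
      resources:
        limits:
          nvidia.com/gpu: 1
          nvidia.com/gpucores: 50        # limit this container to 50% of the GPU's cores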