
Real-time device usage endpoint

Each GPU node exposes real-time device memory and core utilization at <GPU-node-ip>:31992/metrics. You can query this endpoint directly, or add it as a Prometheus scrape target. To query it directly, run:

curl <GPU-node-ip>:31992/metrics
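The same request can be made from a script. The following is a minimal sketch, assuming the endpoint is served over plain HTTP and returns the standard Prometheus text format. The helper names `parse_metrics` and `fetch_metrics` are hypothetical and not part of HAMi, and the parser is deliberately simple: it leaves escape sequences inside label values as they are.

```python
import re
import urllib.request

# One sample line of the Prometheus text format: name{labels} value
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair inside the braces: key="value"
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse text-format samples into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def fetch_metrics(node_ip, port=31992, timeout=5.0):
    """Fetch and parse the metrics page of one GPU node (assumes plain HTTP)."""
    url = f"http://{node_ip}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

Calling `fetch_metrics("<GPU-node-ip>")` with a real node address returns one tuple per sample, which can then be filtered by metric name or label.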

The response contains the following host-level metrics:

Metric	Description	Example
hami_host_gpu_utilization_ratio	GPU core utilization ratio on host (0-100)	`{device_index="0",device_type="NVIDIA-NVIDIA H200",device_uuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",zone="vGPU"}` 0
hami_host_gpu_memory_used_bytes	GPU real-time device memory usage on host	`{device_index="0",device_type="NVIDIA-NVIDIA H200",device_uuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",zone="vGPU"}` 2.87244288e+08

It also exposes per-container and per-vGPU metrics for each scheduled task:

Metric	Description	Example
hami_container_device_utilization_ratio	Container device SM utilization ratio	`{container="cuda",device_uuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",namespace="default",pod="vgpu-share",vdevice_index="0",zone="vGPU"}` 0
hami_container_device_memory_bytes	Container device memory usage breakdown in bytes	`{buffer_size="0",container="cuda",context_size="0",device_uuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",module_size="0",namespace="default",offset="0",pod="vgpu-share",vdevice_index="0",zone="vGPU"}` 0
hami_container_last_kernel_elapsed_seconds	Seconds since last kernel execution in container	`{container="cuda",device_uuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",namespace="default",pod="vgpu-share",vdevice_index="0",zone="vGPU"}` 3664
hami_vgpu_memory_used_bytes	vGPU device memory usage in bytes	`{container="cuda",device_uuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",namespace="default",pod="vgpu-share",vdevice_index="0",zone="vGPU"}` 0
hami_vgpu_memory_limit_bytes	vGPU device memory limit in bytes	`{container="cuda",device_uuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",namespace="default",pod="vgpu-share",vdevice_index="0",zone="vGPU"}` 2.097152e+10
hami_vgpu_memory_buffer_bytes	Container device memory buffer size in bytes	`{container="cuda",device_uuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",namespace="default",pod="vgpu-share",vdevice_index="0",zone="vGPU"}` 6.83935744e+08
hami_vgpu_memory_context_bytes	Container device memory context size in bytes	`{container="cuda",device_uuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",namespace="default",pod="vgpu-share",vdevice_index="0",zone="vGPU"}` 0
hami_vgpu_memory_module_bytes	Container device memory module size in bytes	`{container="cuda",device_uuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",namespace="default",pod="vgpu-share",vdevice_index="0",zone="vGPU"}` 0
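As an illustration of how the per-vGPU metrics can be combined, the sketch below divides hami_vgpu_memory_used_bytes by hami_vgpu_memory_limit_bytes for each vGPU. It assumes the samples are already available as (name, labels, value) tuples, which is a hypothetical in-memory shape and not something HAMi provides. The function name `vgpu_memory_usage` and the used value in the example are likewise made up for illustration.

```python
def vgpu_memory_usage(samples):
    """Return {(namespace, pod, container, vdevice_index): used / limit}."""
    used, limit = {}, {}
    for name, labels, value in samples:
        key = (
            labels.get("namespace"),
            labels.get("pod"),
            labels.get("container"),
            labels.get("vdevice_index"),
        )
        if name == "hami_vgpu_memory_used_bytes":
            used[key] = value
        elif name == "hami_vgpu_memory_limit_bytes":
            limit[key] = value
    # Skip vGPUs with a missing or zero limit to avoid dividing by zero.
    return {key: used[key] / limit[key] for key in used if limit.get(key)}


labels = {"namespace": "default", "pod": "vgpu-share",
          "container": "cuda", "vdevice_index": "0"}
example = [
    ("hami_vgpu_memory_used_bytes", labels, 5.24288e+09),   # illustrative value
    ("hami_vgpu_memory_limit_bytes", labels, 2.097152e+10),  # limit from the table
]
usage = vgpu_memory_usage(example)
```

Here `usage` maps the key `("default", "vgpu-share", "cuda", "0")` to the fraction of the memory limit that the vGPU currently uses.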

Note: The context_size, module_size, buffer_size, and offset labels on hami_container_device_memory_bytes will be deprecated in v2.10.0. Use hami_vgpu_memory_context_bytes, hami_vgpu_memory_module_bytes, and hami_vgpu_memory_buffer_bytes instead.
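Code that currently reads the breakdown from those labels could instead collect the three dedicated metrics. The sketch below shows one possible approach, assuming the samples are already available as (name, labels, value) tuples. That input shape, the `memory_breakdown` helper, and the short part names are hypothetical choices made for this example.

```python
# Dedicated metrics named in the note, mapped to short part names
# (the part names are chosen for this example only).
BREAKDOWN_METRICS = {
    "hami_vgpu_memory_context_bytes": "context",
    "hami_vgpu_memory_module_bytes": "module",
    "hami_vgpu_memory_buffer_bytes": "buffer",
}


def memory_breakdown(samples):
    """Group context/module/buffer sizes by (namespace, pod, container, vdevice_index)."""
    breakdown = {}
    for name, labels, value in samples:
        part = BREAKDOWN_METRICS.get(name)
        if part is None:  # not one of the three breakdown metrics
            continue
        key = (
            labels.get("namespace"),
            labels.get("pod"),
            labels.get("container"),
            labels.get("vdevice_index"),
        )
        breakdown.setdefault(key, {})[part] = value
    return breakdown


labels = {"namespace": "default", "pod": "vgpu-share",
          "container": "cuda", "vdevice_index": "0"}
example = [
    ("hami_vgpu_memory_buffer_bytes", labels, 6.83935744e+08),
    ("hami_vgpu_memory_context_bytes", labels, 0.0),
    ("hami_vgpu_memory_module_bytes", labels, 0.0),
]
result = memory_breakdown(example)
```

With the example values taken from the table, `result` holds one entry for the `cuda` container with its buffer, context, and module sizes in bytes.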