Version: v1.3.0

Real-time device usage

Real-time device usage endpoint

You can get real-time device memory and core utilization by visiting {GPU node ip}:31992/metrics, or by adding that address as a Prometheus scrape target. To query it from the command line:

curl {GPU node ip}:31992/metrics
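
The same request can be made from a script. The following is a minimal sketch using only the Python standard library; the helper names `metrics_url` and `fetch_metrics` are illustrative and not part of the project, and plain HTTP is assumed because that is what `curl` uses when no scheme is given.

```python
from urllib.request import urlopen

METRICS_PORT = 31992  # port given in the text above


def metrics_url(node_ip, port=METRICS_PORT):
    """Build the metrics URL for one GPU node (hypothetical helper)."""
    return f"http://{node_ip}:{port}/metrics"


def fetch_metrics(node_ip, timeout=5.0):
    """Return the raw metrics text served by the node (hypothetical helper)."""
    with urlopen(metrics_url(node_ip), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_metrics("192.0.2.10")` would return the same text the curl command prints; the address here is only a placeholder for a real GPU node IP.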

The endpoint exposes the following metrics:

| Metrics | Description | Example |
| --- | --- | --- |
| Device_memory_desc_of_container | Real-time device memory usage of a container | {context="0",ctrname="2-1-3-pod-1",data="0",deviceuuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",module="0",offset="0",podname="2-1-3-pod-1",podnamespace="default",vdeviceid="0",zone="vGPU"} 0 |
| Device_utilization_desc_of_container | Real-time device utilization of a container | {ctrname="2-1-3-pod-1",deviceuuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",podname="2-1-3-pod-1",podnamespace="default",vdeviceid="0",zone="vGPU"} 0 |
| HostCoreUtilization | Real-time GPU utilization on the host | {deviceidx="0",deviceuuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",zone="vGPU"} 0 |
| HostGPUMemoryUsage | Real-time GPU device memory usage on the host | {deviceidx="0",deviceuuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",zone="vGPU"} 2.87244288e+08 |
| vGPU_device_memory_limit_in_bytes | Device memory limit for a given container | {ctrname="2-1-3-pod-1",deviceuuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",podname="2-1-3-pod-1",podnamespace="default",vdeviceid="0",zone="vGPU"} 2.62144e+09 |
| vGPU_device_memory_usage_in_bytes | Device memory usage for a given container | {ctrname="2-1-3-pod-1",deviceuuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",podname="2-1-3-pod-1",podnamespace="default",vdeviceid="0",zone="vGPU"} 0 |
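
To show how these samples might be consumed, here is a rough Python sketch that parses metric lines of the form shown in the examples and computes, per container, the ratio of `vGPU_device_memory_usage_in_bytes` to `vGPU_device_memory_limit_in_bytes`. The function names and the grouping key are assumptions made for illustration, and the regular expressions cover only simple label values without escaped quotes; a full Prometheus client library would be more robust.

```python
import re

# One sample line: metric name, optional {labels}, value, optional timestamp.
SAMPLE_RE = re.compile(
    r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
# Simplification: label values are assumed to contain no escaped quotes.
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')

# The two vGPU sample lines from the table above.
SAMPLE = '''
vGPU_device_memory_limit_in_bytes{ctrname="2-1-3-pod-1",deviceuuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",podname="2-1-3-pod-1",podnamespace="default",vdeviceid="0",zone="vGPU"} 2.62144e+09
vGPU_device_memory_usage_in_bytes{ctrname="2-1-3-pod-1",deviceuuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",podname="2-1-3-pod-1",podnamespace="default",vdeviceid="0",zone="vGPU"} 0
'''


def parse_sample(line):
    """Parse one line into (name, labels, value); return None if not a sample."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # skip blank lines and HELP/TYPE comments
    match = SAMPLE_RE.match(line)
    if match is None:
        return None
    name, raw_labels, raw_value = match.groups()
    labels = dict(LABEL_RE.findall(raw_labels or ""))
    return name, labels, float(raw_value)


def memory_usage_ratio(text):
    """Map (podnamespace, podname, ctrname, vdeviceid) to usage / limit."""
    usage, limit = {}, {}
    for line in text.splitlines():
        sample = parse_sample(line)
        if sample is None:
            continue
        name, labels, value = sample
        key = tuple(
            labels.get(k)
            for k in ("podnamespace", "podname", "ctrname", "vdeviceid")
        )
        if name == "vGPU_device_memory_usage_in_bytes":
            usage[key] = value
        elif name == "vGPU_device_memory_limit_in_bytes":
            limit[key] = value
    # Skip containers with a missing or zero limit to avoid division by zero.
    return {k: usage[k] / limit[k] for k in usage if limit.get(k)}
```

Calling `memory_usage_ratio(SAMPLE)` yields one entry for the container `2-1-3-pod-1` in namespace `default`, with a ratio of 0.0 because the sampled usage is 0 bytes against a limit of 2.62144e+09 bytes.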