
HAMi vNPU Core Integration

HAMi-vnpu-core is an Ascend NPU in-container resource controller written in Rust. It implements user-space interception via libvnpu.so (interceptor) and Limiter (manager). Two environment variables are used to declare resource quotas: NPU_MEM_QUOTA for memory limits and NPU_PRIORITY for scheduling priority. This design integrates that capability into HAMi scheduling to support Ascend NPU memory virtualization and compute time-slice soft partitioning.

Prerequisites

Ascend driver version 25.5 or later is required. The chip must have device-share mode enabled:

npu-smi set -t device-share -i <id> -d <value>
| Parameter | Description |
| --- | --- |
| id | Device ID, obtained via npu-smi info -l |
| value | Container sharing mode: 0 = disabled (default), 1 = enabled |

HAMi Scheduler Changes

Extended Resource Names

The existing huawei.com/Ascend910B3-memory resource is reused for memory allocation. A new huawei.com/Ascend910B3-core resource is added; pods that declare this resource use vnpu soft partitioning instead of the original hard partitioning logic.

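| Resource Name | Unit | Meaning | Example |
| --- | --- | --- | --- |
| huawei.com/Ascend910B3 | integer | Number of NPU cards | 1 |
| huawei.com/Ascend910B3-memory | MiB | Memory quota | 28672 (28 GiB) |
| huawei.com/Ascend910B3-core | integer | Compute share, as a percentage of one card | 20, 40 |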

Filter Phase

The core logic of the Fit function is updated so that the total compute share requested by all containers on a single card does not exceed 100 percent. PatchAnnotations is also updated to write the new annotation format.
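
The capacity rule can be illustrated with a minimal sketch. The cardUsage type, the fits function, and the 65536 MiB card size are hypothetical names and values chosen for illustration, not HAMi's actual Fit implementation; the memory check is an added assumption, since the text above only states the compute rule.

```go
package main

import "fmt"

// cardUsage is a hypothetical summary of what is already allocated on one NPU card.
type cardUsage struct {
	TotalMemMiB int // physical memory of the card (assumed known to the scheduler)
	UsedMemMiB  int // memory already promised to containers
	UsedCore    int // sum of core percentages already promised
}

// fits reports whether a request for memMiB of memory and corePct percent of
// compute can be placed on the card without exceeding 100 percent of compute
// or (by assumption) the card's total memory.
func fits(c cardUsage, memMiB, corePct int) bool {
	if memMiB < 0 || corePct < 0 || corePct > 100 {
		return false
	}
	return c.UsedCore+corePct <= 100 && c.UsedMemMiB+memMiB <= c.TotalMemMiB
}

func main() {
	card := cardUsage{TotalMemMiB: 65536, UsedMemMiB: 28672, UsedCore: 80}
	fmt.Println(fits(card, 28672, 20)) // true: 80 + 20 reaches exactly 100
	fmt.Println(fits(card, 28672, 40)) // false: 80 + 40 exceeds 100
}
```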

Expected Pod quota annotation format:

{
  "huawei.com/Ascend910B3": "[
    {
      \"UUID\": \"xxx\",
      \"memory\": 28672,
      \"core\": 20
    }
  ]"
}

The PatchAnnotations function in pkg/device/ascend/device.go adds memory and core fields for vnpu soft partitioning:

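func (dev *Devices) PatchAnnotations(pod *corev1.Pod, annoInput *map[string]string, pd device.PodDevices) map[string]string {
	commonWord := dev.CommonWord()
	devList, ok := pd[commonWord]
	if ok && len(devList) > 0 {
		var rtInfo []RuntimeInfo
		for _, dp := range devList {
			for _, val := range dp {
				// tempName, memory, and core are derived from val;
				// that logic is omitted in this excerpt.
				rtInfo = append(rtInfo, RuntimeInfo{
					UUID:     val.UUID,
					Temp:     tempName,
					MemQuota: memory, // memory quota in MiB
					Priority: core,   // compute share in percent
				})
			}
		}
		// Writing rtInfo into *annoInput is omitted in this excerpt.
	}
	return *annoInput
}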
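
As a rough illustration of how such a list could end up in the annotation format shown earlier, the sketch below marshals a slice of records to JSON. The runtimeInfo type, its JSON tags, and the encodeQuota helper are assumptions made to match the expected format; they are not taken from the HAMi source, and the Temp field is left out for brevity.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// runtimeInfo mirrors the fields of the expected annotation.
// The JSON tags are an assumption chosen to reproduce that format.
type runtimeInfo struct {
	UUID     string `json:"UUID"`
	MemQuota int    `json:"memory"` // MiB
	Priority int    `json:"core"`   // percent of one card
}

// encodeQuota serializes per-device quotas into a single annotation value.
func encodeQuota(infos []runtimeInfo) (string, error) {
	b, err := json.Marshal(infos)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	anno := map[string]string{}
	v, err := encodeQuota([]runtimeInfo{{UUID: "xxx", MemQuota: 28672, Priority: 20}})
	if err != nil {
		panic(err)
	}
	anno["huawei.com/Ascend910B3"] = v
	fmt.Println(anno["huawei.com/Ascend910B3"])
	// prints [{"UUID":"xxx","memory":28672,"core":20}]
}
```

Storing the list as a JSON string keeps the annotation value a plain string, which is what Kubernetes annotations require.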

Limiter Process Startup

The Limiter process is started via a Kubernetes postStart lifecycle hook injected by MutateAdmission in pkg/device/ascend/device.go:

lifecycle:
  postStart:
    exec:
      command:
        - "bash"
        - "-c"
        - |
          export RUST_LOG=info
          /usr/local/hami-vnpu-core/limiter > /usr/local/hami-vnpu-core/inst1_manager.log 2>&1 &

Because Kubernetes does not guarantee that the postStart hook runs before the container entrypoint, libvnpu.so polls until the Limiter's shared memory is available, for up to about 60 seconds, before allowing the workload to proceed:

impl SchedulerClient {
    pub fn new() -> Self {
        let pid = std::process::id();
        let shmem_name = local_shmem_name();
        let shm_path = format!("/dev/shm/{}", shmem_name);
        let mut retry_count = 0;

        // Poll every 100 ms until the Limiter has created its shared-memory file.
        while !std::path::Path::new(&shm_path).exists() {
            std::thread::sleep(std::time::Duration::from_millis(100));
            retry_count += 1;
            // 600 retries x 100 ms = 60 seconds.
            if retry_count > 600 {
                panic!("[Scheduler] FATAL: Limiter not found after 60 seconds.");
            }
        }

        let shmem = shmem::shm_setup::open_shmem::<LocalContainerShmem>(shmem_name.as_str());
        // The rest of the constructor is omitted in this excerpt.
    }
}

Ascend Device Plugin Changes

Host Path Layout

The Limiter binary and libvnpu.so are placed at a fixed host path so they can be mounted into containers:

/usr/local/hami-vnpu-core/
├── limiter
├── libvnpu.so
└── ld.so.preload

ld.so.preload content:

/usr/local/hami-vnpu-core/libvnpu.so

Shared Memory Directory

sudo mkdir -p /usr/local/hami-shared-region
sudo chmod 777 /usr/local/hami-shared-region

Allocate Function Enhancements

The Allocate function in the device plugin is updated to inject the following into each container:

func (ps *PluginServer) Allocate(ctx context.Context, reqs *v1beta1.AllocateRequest) {
	/*
	   1. Volume mounts:
	      A. Huawei driver and SMI toolchain
	      B. vnpu-core binary path: /usr/local/hami-vnpu-core
	      C. HAMi interceptor library via /etc/ld.so.preload
	      D. HAMi compute-partition shared directory: /usr/local/hami-shared-region:/hami-shared-region

	   2. Environment variables:
	      A. Visible device IDs
	      B. Shared memory path: NPU_GLOBAL_SHM_PATH = /hami-shared-region/{ID}_global_registry
	      C. Memory quota: NPU_MEM_QUOTA (read from Annotation, e.g. 28672)
	      D. Priority: NPU_PRIORITY (read from Annotation, e.g. 20)
	*/
}
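
The environment-variable part of this outline might look roughly like the sketch below. The quota type and the vnpuEnvs helper are hypothetical, and the id argument stands in for the {ID} placeholder, whose exact meaning is not specified above; the real plugin would place these values into the container's allocate response rather than return a bare map.

```go
package main

import (
	"fmt"
	"strconv"
)

// quota is a hypothetical view of one entry parsed from the Pod quota annotation.
type quota struct {
	MemoryMiB int // value of the "memory" field
	Core      int // value of the "core" field
}

// vnpuEnvs builds the vNPU-related environment variables for one container.
// id fills the {ID} placeholder in the registry file name (assumed to be a string).
func vnpuEnvs(id string, q quota) map[string]string {
	return map[string]string{
		"NPU_GLOBAL_SHM_PATH": "/hami-shared-region/" + id + "_global_registry",
		"NPU_MEM_QUOTA":       strconv.Itoa(q.MemoryMiB),
		"NPU_PRIORITY":        strconv.Itoa(q.Core),
	}
}

func main() {
	envs := vnpuEnvs("0", quota{MemoryMiB: 28672, Core: 20})
	fmt.Println(envs["NPU_GLOBAL_SHM_PATH"])                 // /hami-shared-region/0_global_registry
	fmt.Println(envs["NPU_MEM_QUOTA"], envs["NPU_PRIORITY"]) // 28672 20
}
```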