Patch now: Critical Nvidia bug allows container escape, complete host takeover

33% of cloud environments using the toolkit impacted, we're told

Jessica Lyons Thu 26 Sep 2024 // 21:42 UTC

A critical bug in Nvidia's widely used Container Toolkit could allow a rogue user or software to escape their containers and ultimately take complete control of the underlying host.

The flaw, tracked as CVE-2024-0132, earned a 9.0 out of 10 CVSS severity rating, and affects all versions of Container Toolkit up to and including v1.16.1, and Nvidia GPU Operator up to and including 24.6.1.

Nvidia issued a fix on Wednesday with the latest version of Container Toolkit (v1.16.2) and Nvidia GPU Operator (v24.6.2). The vulnerability does not impact use cases where Container Device Interface (CDI) is used.
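For admins triaging a fleet, the version cut-offs above can be turned into a quick check. What follows is a minimal sketch, not an Nvidia tool: the function names, the component labels, and the `uses_cdi` flag are our own hypothetical inventions, and it assumes plain dotted version strings such as `v1.16.1`, ignoring any pre-release suffix.

```python
# Last affected releases, as stated in Nvidia's advisory and reported above.
# The dictionary keys are hypothetical labels chosen for this sketch.
LAST_AFFECTED = {
    "container-toolkit": (1, 16, 1),
    "gpu-operator": (24, 6, 1),
}


def parse_version(text):
    """Turn 'v1.16.1' or '1.16.1' into the tuple (1, 16, 1).

    Assumption: anything after a '-' (for example a pre-release tag) is ignored.
    """
    core = text.strip().lstrip("v").split("-")[0]
    return tuple(int(part) for part in core.split("."))


def is_affected(component, version, uses_cdi=False):
    """Return True if the given version falls in the affected range.

    uses_cdi reflects the advisory's note that Container Device Interface
    setups are not impacted; how you detect CDI use is up to you.
    """
    if uses_cdi:
        return False
    return parse_version(version) <= LAST_AFFECTED[component]


if __name__ == "__main__":
    for component, version in [("container-toolkit", "v1.16.1"),
                               ("container-toolkit", "v1.16.2"),
                               ("gpu-operator", "v24.6.1")]:
        print(component, version, "affected:", is_affected(component, version))
```

A check like this only tells you what the version string says. Confirm against Nvidia's advisory before treating a host as safe.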

The toolkit is used across clouds and AI workloads. According to infosec house Wiz, 33 percent of cloud environments have a buggy version of Nvidia Container Toolkit installed, rendering them vulnerable.

Wiz security researchers found and disclosed the bug on September 1, and the GPU giant has confirmed it is as concerning as the cloud security shop makes it out to be.

"A successful exploit of this vulnerability may lead to code execution, denial of service, escalation of privileges, information disclosure, and data tampering," Nvidia warned in its security advisory.

Again, this is exploitable by someone or something that has been allowed to, or has managed to, run a container or run code within one on a vulnerable host.

CVE-2024-0132 is a Time of Check Time of Use (TOCTOU) vulnerability, a type of race condition in which a resource changes between the moment software checks it and the moment software uses it. This can allow the attacker to gain access to resources that they should not have access to.
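To make the TOCTOU idea concrete, here is a generic, textbook-style sketch of the bug class. It is not the Nvidia flaw, whose details Wiz has withheld. Every name in it is hypothetical, the race window is simulated with a callback so the result is repeatable, and it assumes a Unix-like system where `os.O_NOFOLLOW` is available.

```python
import os
import stat
import tempfile


def racy_read(path, between_check_and_use=None):
    """Check-then-use: the check and the open resolve the path separately."""
    if os.path.islink(path):                  # time of check
        raise PermissionError("symlinks not allowed")
    if between_check_and_use:                 # stand-in for the race window
        between_check_and_use()
    with open(path) as f:                     # time of use
        return f.read()


def safer_read(path, before_open=None):
    """Open once, refusing a symlink, then inspect what was actually opened."""
    if before_open:
        before_open()
    # O_NOFOLLOW makes the open fail if the final path component is a symlink.
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    with os.fdopen(fd) as f:
        if not stat.S_ISREG(os.fstat(f.fileno()).st_mode):
            raise PermissionError("not a regular file")
        return f.read()


def demo():
    """Return (data leaked by racy_read, whether safer_read refused)."""
    with tempfile.TemporaryDirectory() as d:
        secret = os.path.join(d, "secret.txt")
        target = os.path.join(d, "input.txt")
        with open(secret, "w") as f:
            f.write("host secret")

        def reset_target():
            if os.path.lexists(target):
                os.remove(target)
            with open(target, "w") as f:
                f.write("harmless")

        def swap():
            # The "attacker" replaces the checked file with a symlink.
            os.remove(target)
            os.symlink(secret, target)

        reset_target()
        leaked = racy_read(target, swap)

        reset_target()
        try:
            safer_read(target, swap)
            blocked = False
        except OSError:
            blocked = True
        return leaked, blocked


if __name__ == "__main__":
    print(demo())
```

The point of the second function is that the check is made on the file descriptor that was actually opened, so there is no gap for a swap to land in. Note that `O_NOFOLLOW` only guards the final path component; real-world fixes usually need more than this.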

Specific to Nvidia Container Toolkit: "Any environment that allows the use of third party container images or AI models – either internally or as-a-service – is at higher risk given that this vulnerability can be exploited via a malicious image," Wiz kids Shir Tamari, Ronen Shustin, and Andres Riancho said in a write-up about the bug.

To exploit CVE-2024-0132, an attacker would need to craft a specially designed image and then get the image to run on the target platform, either indirectly, by convincing/tricking the user into running the malicious image, or directly, if the attacker has access to shared GPU resources.

In a single-tenant compute environment, this could happen if a user downloads a malicious container image — say, via a social engineering attack where the user believes the container image is coming from a trusted source. In this scenario, the attacker could then take over the user's workstation.

In a shared environment, such as a Kubernetes-powered one, however, a miscreant with permission to deploy a container could escape it and then access data or secrets of other applications on the same node or cluster, the researchers noted.

This second scenario "is especially relevant for AI service providers that allow customers to run their own GPU-enabled container images," they warned.

"An attacker could deploy a harmful container, break out of it, and use the host machine's secrets to target the cloud service's control systems," the researchers continued. "This could give the attacker access to sensitive information, like the source code, data, and secrets of other customers using the same service."

Wiz isn't providing too many technical details about how to exploit the vuln because the security shop wants to ensure that vulnerable organizations have time to deploy the fix — and not have their host system taken over with root privileges.

But the researchers promised more to come soon, including exploit details, so it's a good idea to get ahead of the would-be attackers on this one. ®
