New Vulnerability in R’s Deserialization Discovered
Security researchers have identified a vulnerability, CVE-2024-27322, in the R programming language that permits arbitrary code execution when untrusted data is deserialized. The flaw can be exploited when loading RDS (R Data Serialization) files or packages, which are commonly shared among developers and data scientists. An attacker can craft a malicious RDS file or package with embedded R code that executes on the victim’s device when the file or package is loaded.
What is R?
R is an open-source language and software environment for statistical computing, data visualization, and machine learning. With a robust core language and extensive library support, R is widely adopted, often as the primary language for statistics students. Due to its prowess in analyzing large datasets, its usage extends across industries like healthcare, finance, and government. Additionally, R has gained traction in AI/ML for handling complex data.
The Comprehensive R Archive Network (CRAN) repository hosts over 20,000 packages, and R-Forge hosts over 2,000 projects with 15,000 users.
Impact
The vulnerability stems from how the readRDS function deserializes data. readRDS is responsible for loading RDS and RDX files, which carry serialized R objects for processing. Serialization captures the state of an object so that it can be stored or exchanged. An RDS file stores the state of a single object, while an RDX file paired with an RDB file carries multiple objects. The issue lies in the RDS format’s support for the PROMSXP (promise) object type, which represents an expression that has not yet been computed. Such expressions are evaluated with “eval” during deserialization, so an attacker who substitutes the expression in an RDS or RDX file can potentially achieve arbitrary code execution.
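The two ideas in this section, serialization and uncomputed expressions, can be illustrated with harmless code. The following is a minimal sketch and not the exploit itself: it round-trips an object through an RDS file, then uses the base function delayedAssign to create a promise and shows that the stored expression runs only when its value is first needed. All variable names are assumptions made for illustration.

```r
# Round-trip a single object through an RDS file.
path <- tempfile(fileext = ".rds")
saveRDS(list(id = 1L, label = "example"), path)
restored <- readRDS(path)

# A flag that records whether the promise's expression has been evaluated.
evaluated <- FALSE

# delayedAssign() binds `lazy_value` to a promise: the expression is stored
# unevaluated and only runs when the value is first requested.
delayedAssign("lazy_value", {
  evaluated <- TRUE
  6 * 7
})

before <- evaluated   # nothing has run yet
result <- lazy_value  # first access forces the promise
after <- evaluated    # the expression has now been evaluated
```

The point of the sketch is that a promise carries code rather than data, which is why a serialization format that can transport promises needs to be handled with care.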
R Supply Chain Attacks
Sharing Objects
A search of GitHub by our team found readRDS, a potential vector for exploiting this vulnerability, referenced in over 135,000 R source files. In the repositories we reviewed, a substantial portion of these usages involved untrusted, user-provided data, which could lead to a complete compromise of the system running the program.
Notably, source files containing potentially vulnerable code were found in projects maintained by R Studio, Facebook, Google, Microsoft, AWS, and other major software vendors.
R Packages
R packages facilitate sharing of compiled R code and data for statistical tasks. As of the time of writing, the CRAN package repository boasts 20,681 available packages. Anybody can upload packages to this repository, provided they meet certain criteria, such as containing specific files (e.g., a description) and passing automated checks (which do not currently assess this vulnerability).
R packages utilize the RDS format to save and load data. During compilation of a package, two files are generated:
- .rdb file: Contains serialized objects as binary blobs of data.
- .rdx file: Includes metadata for each serialized object within the .rdb file, including its offset.
When a package is loaded, metadata stored in RDS format within the .rdx file is utilized to locate objects within the .rdb file. These objects are then decompressed and deserialized, effectively loading them as RDS files.
Consequently, R packages are susceptible to deserialization vulnerabilities and can be exploited in supply chain attacks via package repositories. An attacker can take over an R package by simply replacing the .rdx file with a maliciously crafted version. When the package is loaded, the code will execute automatically.
Furthermore, if one of the core system packages (e.g., compiler) has been tampered with, the malicious code will execute upon R initialization. One particularly perilous aspect of this vulnerability is that instead of merely replacing the .rdx file, the exploit can be injected into any offset within the RDB file, rendering detection extremely challenging.
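To make the .rdx and .rdb layout described above more concrete, the sketch below inspects the index of the stats package that ships with R. It assumes a standard installation in which the package's files live at R/stats.rdx and R/stats.rdb, and that the index is a list whose `variables` element maps each object name to an offset and length within the .rdb file. This mirrors how base R's lazy-loading code reads the index, but it is an internal detail and may differ between R versions. Because reading the index itself goes through readRDS, this kind of inspection is only appropriate for packages that are already trusted.

```r
# Locate the lazy-load files of a trusted base package (assumed layout).
r_dir <- system.file("R", package = "stats")
rdx_path <- file.path(r_dir, "stats.rdx")
rdb_path <- file.path(r_dir, "stats.rdb")

if (file.exists(rdx_path) && file.exists(rdb_path)) {
  # The .rdx index is itself stored in RDS format.
  index <- readRDS(rdx_path)

  # Each entry is assumed to be c(offset, length) into the .rdb file.
  entry <- index$variables[["median"]]
  cat("median: offset", entry[1], "length", entry[2], "\n")
  cat(".rdb size:", file.size(rdb_path), "bytes\n")
}
```

Seen this way, the index is only a table of pointers, so a payload placed at any offset in the .rdb file is reached through the same deserialization path as a standalone RDS file.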
Recommendations
CERT Coordination Center (CERT/CC) has issued an advisory for CVE-2024-27322, cautioning against arbitrary code execution via malicious RDS or RDX files.
- Update R to version 4.4.0 or later promptly.
- Until then, avoid interaction with untrusted RDS files or packages to mitigate risks.
There is currently no public proof-of-concept or exploitation evidence available.
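As a minimal sketch of how the update recommendation might be enforced in a script, the helper below refuses to deserialize a file unless the running R version is at least 4.4.0. The function names is_patched_r and read_rds_if_patched are hypothetical and not part of R. The check only avoids loading files on releases known to be affected; it says nothing about whether a given file is trustworthy.

```r
# Hypothetical helper: TRUE when the given R version includes the fix.
is_patched_r <- function(version = getRversion()) {
  version >= "4.4.0"
}

# Hypothetical wrapper: stop before calling readRDS() on an unpatched R.
read_rds_if_patched <- function(path, version = getRversion()) {
  if (!is_patched_r(version)) {
    stop("R ", as.character(version),
         " predates 4.4.0; refusing to read ", path)
  }
  readRDS(path)
}
```

A script could call read_rds_if_patched("data.rds") wherever it would otherwise call readRDS directly, with the file name here being a placeholder.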