Move VM Vulnerability: Network Shutdown and Potential Hard Fork in Sui, Aptos, and Public Blockchain

BEOSIN
Jun 23, 2023

[Xangle Digest]

※ This article contains content originally published by a third party. Please refer to the bottom of the article for the copyright notice regarding this content.

Background

Move is a new blockchain programming language used by platforms such as Aptos and Sui. Recently, the Beosin security research team discovered a stack overflow vulnerability in the Move virtual machine caused by recursive calls. This vulnerability can lead to a total network shutdown, prevent new validators from joining the network, and potentially result in a hard fork.

Upon discovering and verifying this vulnerability, we immediately contacted the Sui team by email on May 30, 2023. Following their advice, we submitted the vulnerability to the Immunefi bug bounty platform on June 2, 2023. However, the Sui team responded that they had identified the issue internally a month earlier and had been working on a private security fix, which they released on the same day as our Immunefi submission (June 2, 2023). We understand and respect their response.


The vulnerability has been fixed in the current version, so we are now publicly disclosing our research findings. 

 

Knowledge Basics

The Move virtual machine is implemented in the Rust programming language. The main unit of organization and distribution of Move code is a Package. A Package consists of a set of modules, which are defined in separate files with the extension .move. These files include Move functions and type definitions.

The minimum package directory structure consists of a manifest file (Move.toml), a lock file (Move.lock), and a sources subdirectory containing one or more module files.

 

Packages can be published on the blockchain. A Package can contain multiple Modules, and a Module can contain multiple functions and structs.

Function parameters can be structs, and structs can be nested within other structs.

In Rust, recursive function calls with no limit on call depth can overflow the stack or exhaust CPU and memory resources. Because the Move virtual machine is implemented in Rust, it is exposed to the same risk.
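To make the risk concrete, here is a minimal Rust sketch that is not taken from the Move virtual machine. The `Nested` enum, the `DepthExceeded` error, and the `build` and `depth_with_limit` functions are hypothetical names chosen for illustration. The recursive function returns an error once a caller-chosen limit is exceeded, rather than continuing to consume stack frames.

```rust
/// Hypothetical nested value: a leaf, or a wrapper around another value.
#[derive(Debug)]
pub enum Nested {
    Leaf,
    Wrap(Box<Nested>),
}

/// Error returned when the nesting is deeper than the allowed limit.
#[derive(Debug, PartialEq)]
pub struct DepthExceeded {
    pub limit: usize,
}

/// Build a value with `levels` wrappers around a leaf (iteratively).
pub fn build(levels: usize) -> Nested {
    let mut value = Nested::Leaf;
    for _ in 0..levels {
        value = Nested::Wrap(Box::new(value));
    }
    value
}

/// Recursively count wrappers, refusing to descend past `limit`.
pub fn depth_with_limit(
    value: &Nested,
    current: usize,
    limit: usize,
) -> Result<usize, DepthExceeded> {
    if current > limit {
        // Stop here instead of recursing further and growing the call stack.
        return Err(DepthExceeded { limit });
    }
    match value {
        Nested::Leaf => Ok(current),
        Nested::Wrap(inner) => depth_with_limit(inner, current + 1, limit),
    }
}
```

Calling `depth_with_limit(&build(3), 0, 10)` succeeds, while the same call on `build(20)` with a limit of 5 returns `DepthExceeded`. Without the guard, the recursion depth would be controlled entirely by whoever supplied the input.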

 

Vulnerability Description

Within the Move virtual machine, recursive functions are frequently used to handle various structured data, such as serialized data, nested structs, nested arrays, and generic nesting. To prevent stack overflow caused by recursive calls, it is necessary to check the depth of recursive calls.

For example, the Move virtual machine limits the depth to which it parses both simple and complex type structures.

Similarly, the Move virtual machine bytecode limits the depth of a SIGNATURE_TOKEN.

Although the Move virtual machine has recursive call depth checks in many places, there are still certain cases that have not been taken into account.

Let's consider an attack scenario: define a struct A, nest struct B within A, nest struct C within B, and so on, continuing the nesting indefinitely. If the Move virtual machine uses a recursive function to handle this nesting relationship, it will crash due to stack overflow or insufficient resources. Although Move limits the number of structs that can be defined within each module, we can create an unlimited number of modules.

This gives us an attack strategy:

  1. Generate 25 packages (can be more than 25), each containing 1 module.
  2. Each module defines 64 structs (can be more than 64 in Aptos) with a chained nesting relationship. The first struct in each module nests the last struct from the previous module.
  3. Each module includes a callable entry function. This function takes a parameter of the type of the last struct (the 64th struct) from the previous module. The function creates and returns an instance of the last struct in the current module.
  4. Publish each package in order.
  5. Call the entry function in each module in order.
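The reason a per-module struct limit does not bound the overall nesting depth can be sketched with a small model. This is an illustration only, not the PoC: `StructId`, `build_links`, and `nesting_depth` are hypothetical names, and the model only records which struct nests which, using the 25 modules and 64 structs per module from the list above.

```rust
use std::collections::HashMap;

/// Hypothetical identifier: (module index, struct index within that module).
pub type StructId = (usize, usize);

/// Record the nesting links: each struct nests the previous struct in its
/// module, and the first struct of a module nests the last struct of the
/// previous module.
pub fn build_links(modules: usize, structs_per_module: usize) -> HashMap<StructId, StructId> {
    let mut contains = HashMap::new();
    for m in 0..modules {
        for s in 0..structs_per_module {
            if s > 0 {
                contains.insert((m, s), (m, s - 1));
            } else if m > 0 {
                contains.insert((m, 0), (m - 1, structs_per_module - 1));
            }
        }
    }
    contains
}

/// Follow the nesting links iteratively and count how many structs are chained.
pub fn nesting_depth(contains: &HashMap<StructId, StructId>, start: StructId) -> usize {
    let mut depth = 1;
    let mut current = start;
    while let Some(&inner) = contains.get(&current) {
        depth += 1;
        current = inner;
    }
    depth
}
```

In this model each module stays within the 64-struct limit, yet `nesting_depth(&build_links(25, 64), (24, 63))` reports a chain of 25 × 64 = 1600 structs. The depth grows linearly with the number of published modules, which the per-module limit does not restrict.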

 

When testing Sui mainnet_v1.1.1 in our test environment with 4 validators, we observed the following behavior:

  1. After running the PoC once, all 4 validators immediately crash due to stack overflow.
  2. After at least 3 validators crash and restart, all full nodes crash.
  3. After at least 3 validators crash and restart, new validators joining the network crash at least once.
  4. After at least 3 validators crash and restart, new full nodes joining the network sometimes crash once.
  5. In some cases, a validator or full node cannot be restarted after a crash unless all of its local databases are deleted.

When testing Sui mainnet_v1.2.0 in our test environment with 4 validators, we observed the following behavior:

  1. After running the PoC once, at least 1 validator crashes due to stack overflow or out of memory.
  2. Running the PoC again can crash a second validator. After that, the entire network cannot accept new transactions.
  3. Crashed validators may be unable to restart. If all local databases of a crashed validator are deleted and it is started again, it crashes again after some time and can no longer be restarted.
  4. When a new validator joins the network, it crashes.

 

We conducted a simple test on Aptos and found that Aptos also crashes.

 

PoC

Sui PoC

Each generated module is published to the Sui chain, and its "mint" function is called to obtain the created "object." That "object" is then passed as a parameter to the "mint" function of the next module, and the process repeats until the Sui node crashes.

 

Aptos PoC

Each generated module is published to the Aptos chain, and its "mint" function is called; the process repeats until the Aptos node crashes.

 

Vulnerability Fix

Sui mainnet_v1.2.1 (June 2, 2023), Aptos mainnet_v1.4.3 (June 3, 2023), and Move-language versions released after June 10, 2023 have addressed this vulnerability.

Sui patch:

https://github.com/MystenLabs/sui/commit/8b681515c0cf435df2a54198a28ab4ef574d202b

The patch code imposes limitations on the depth of type references in the creation of structs, vectors, and generics. The key function added is "check_depth_of_type."

 

Aptos patch:

https://github.com/aptos-labs/aptos-core/commit/47a0391c612407fe0b1051ef658a29e35d986963

Similar to Sui, the patch code also imposes limitations on the depth of type references in the creation of structs, vectors, and generics. The key function added is "check_depth_of_type."

 

Move-language patch:

https://github.com/move-language/move/commit/8f5303a365cf9da7554f8f18c393b3d6eb4867f2

Similar to Sui and Aptos, the patch code also imposes limitations on the depth of type references in the creation of structs, vectors, and generics. The key function added is "check_depth_of_type."
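The idea behind such a check can be sketched in a few lines of Rust. This is a simplified illustration under stated assumptions, not the code from any of the three patches: only the function name `check_depth_of_type` comes from the text above, while the `Type` enum, the `TypeTooDeep` error, the `check_from` helper, and the way depth is counted are hypothetical.

```rust
/// Hypothetical, simplified model of a type: a primitive, a vector of some
/// type, or a struct whose fields (including generic arguments) are types.
#[derive(Debug, Clone)]
pub enum Type {
    U64,
    Vector(Box<Type>),
    Struct { fields: Vec<Type> },
}

/// Error returned when a type nests deeper than the allowed maximum.
#[derive(Debug, PartialEq)]
pub struct TypeTooDeep {
    pub max_depth: usize,
}

/// Return the depth of `ty`, or an error if it exceeds `max_depth`.
pub fn check_depth_of_type(ty: &Type, max_depth: usize) -> Result<usize, TypeTooDeep> {
    check_from(ty, 1, max_depth)
}

fn check_from(ty: &Type, current: usize, max_depth: usize) -> Result<usize, TypeTooDeep> {
    if current > max_depth {
        // Reject the type before descending any further.
        return Err(TypeTooDeep { max_depth });
    }
    match ty {
        Type::U64 => Ok(current),
        Type::Vector(inner) => check_from(inner, current + 1, max_depth),
        Type::Struct { fields } => {
            // A struct is as deep as its deepest field.
            let mut deepest = current;
            for field in fields {
                deepest = deepest.max(check_from(field, current + 1, max_depth)?);
            }
            Ok(deepest)
        }
    }
}
```

In this sketch, a struct containing a `u64` and a vector of vectors of `u64` has depth 4, so it passes with a maximum of 8 and is rejected with a maximum of 3. Because the check itself never descends past the maximum, an over-deep type produces an error rather than unbounded recursion.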

 

Vulnerability Impact

Exploiting this vulnerability is simple and consumes very little gas per attack. However, its impact is significant: it can lead to a total network shutdown, prevent new validator nodes from joining the network, and potentially cause a hard fork. This vulnerability affects versions earlier than Sui mainnet_v1.2.1 and Aptos mainnet_v1.4.3, as well as versions of Move-language released before June 10, 2023.

Why can this vulnerability potentially cause a hard fork?

  1. Attackers can create struct nesting relationships of arbitrary depth and deploy these malicious structs on the blockchain. They can then send malicious transactions targeting these structs; once recorded, those transactions are immutable. Although this process may crash the network, some of the malicious transactions will still be recorded on the chain.
  2. To patch this vulnerability, the depth of recursive calls can be limited. However, the patched virtual machine can then no longer reference the malicious structs already deployed on the blockchain, and so cannot verify historical transactions involving them. Only a hard fork can resolve this issue.
  3. Because testing a hard fork would severely affect the current network, we did not carry out that test. In theory, however, we believe the scenario is feasible.

 

Summary

A simple recursive function call leading to a stack overflow can cause a total network shutdown, and with additional manipulation, it may even result in a hard fork. Therefore, the security of the blockchain should always be the top priority. We recommend that project teams pay close attention to such vulnerabilities and consider engaging professional blockchain security organizations for comprehensive audits.

 


 

 

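Notice
We confirm that the content of this article accurately reflects the author's own opinions and was written without undue external pressure or interference. The content represents the author's own views and does not represent the official position or opinion of CrossAngle Co., Ltd. This article is distributed for informational purposes. It does not constitute investment advice or a solicitation to invest. Unless otherwise stated, users are responsible for decisions regarding investments, investment strategies, or the use of other products or services, and should make investment decisions themselves, taking into account their investment objectives, personal circumstances, and financial situation. For further details, please consult a financial professional. Past returns or forecasts do not necessarily guarantee future returns.
Copyright in these materials and content belongs to the company or its affiliated partners. Please be advised that editing in violation of copyright, unauthorized copying, unauthorized reproduction, or redistribution will result in criminal charges being filed without prior warning.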