ZeroValidation

Admin Forgery in LayerZero

Source

Executive Summary

We describe two critical trusted-party vulnerabilities in LayerZero contracts: one in the Endpoint contract, and one in the UltraLightNodeV2 contract. These vulnerabilities allow the LayerZero MultiSig to exploit user applications by passing arbitrary messages to the application without Relayer or Oracle sign-off. We show that one of these vulnerabilities is being actively used, on mainnet, to modify messages sent to the Stargate contracts. Because the vulnerability is being actively exploited by LayerZero code, we believe the LayerZero team to be aware of it. Because they have publicly denied this capability, we believe they may be deliberately concealing the extent of their control over applications.

Note on Disclosure

A copy of this post was provided to the LayerZero team via Twitter DM, as no security email could be found. Because this is a trusted-party vulnerability, the risk of full disclosure is low. We have chosen to fully disclose because we believe that LayerZero is aware of these issues, and public disclosure is the best way to prompt app developers to set explicit configuration. This is covered in more detail below.

Overview

Applications using LayerZero with default configurations may receive arbitrary, unauthenticated messages via system processing channels, even absent Oracle/Relayer collusion. This allows the LayerZero MultiSig to forge any cross-chain message to any application that uses the default configuration.

Forgery may occur in two ways, both of which require executing contract calls from the LayerZero team multisig. These methods exploit a “drag-along” mechanism in the company’s idiosyncratic upgrade system to bypass all security checks without compromising any protocol actor.

  • Crit 1. For default-configured apps, the LayerZero MultiSig may arbitrarily submit messages via the Endpoint by changing the default Receiving library.
    • This allows fraudulent message delivery to local applications.
    • This attack bypasses the Oracle & Relayer 2-of-2 multisig entirely.
    • I have labeled this critical, as it represents an easily-exploited departure from the stated security model, and could result in theft of all user funds.
  • Crit 2. For default-configured apps, the LayerZero MultiSig may arbitrarily modify message payloads during processing by the UltraLightNode, after Oracle and Relayer sign-off.
    • This allows fraudulent message delivery to local applications.
    • This may occur even when the application has configured a trusted Relayer.
    • This attack alters message payloads after the Oracle & Relayer 2-of-2 multisig has signed the message.
    • I have labeled this critical, as it represents an easily-exploited departure from the stated security model, and could result in theft of all user funds.
    • The LayerZero MultiSig may also disable the User Application’s mitigation paths.

The following informational items do not seem to be documented in any source I can find. They are presented as evidence that LayerZero was aware of the above vulnerabilities, and chose not to disclose or otherwise address them.

  • Info 1. For Aptos connections, the UltraLightNode does not use block-header data, and does not process transaction inclusion proofs.
    • This occurs whenever the application relies on the verification library “FPValidator”, which is default for the Aptos network.
    • The UltraLightNode logic is deliberately subverted to substitute message hashes for block hashes.
    • This has no security impact, because ULN security relies only on the Oracle and Relayer 2-of-2 multisig, not on block hashes.
    • This is evidence that LayerZero is aware they can use Verification Libraries to bypass intended UltraLightNode behavior.
  • Info 2. UltraLightNode Proof Validator Libraries already modify in-flight messages (as described in Crit 2), in a non-malicious manner in order to special-case certain Stargate operations.
    • Correct special-casing of these payload modifications in the proof validation library seems to be a requirement of correct Stargate function.
    • These modifications perform a safety check on contract recipients for Stargate messages.
    • It appears that this is a backdoor after-the-fact patch to Stargate.
    • It is unclear whether this has any security implications.
    • This is evidence that LayerZero is aware that they can use Verification Library modifications to corrupt application logic.

Checking for Vulnerability

Checking for vulnerability is complicated, as the applications use different configuration APIs with overlapping names. This is needlessly confusing. We have tried to be as straightforward as possible with these instructions.

Crit 1

In the endpoint contract, call uaConfigLookup with the address of the application. If the receivingLibrary is set to 0x0, the application is vulnerable. See the Mitigation section below.

Crit 2

Crit 2 occurs on a chain-by-chain basis. In the ULNv2 contract, call appConfig with the address of the application and the relevant chain id. If the inboundProofLibraryVersion returned is 0, the application is vulnerable on that chain. See the Mitigation section below.
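The two checks above reduce to simple predicates over the values those lookups return. The following is a minimal Python sketch, not a client for the real contracts: the dataclasses, function names, and chain ids are hypothetical, and only the field names (receivingLibrary, inboundProofLibraryVersion) and the "unset means default" rule come from the text above. Fetching the values from chain is left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EndpointUaConfig:
    # Hypothetical mirror of the relevant field returned by uaConfigLookup.
    receiving_library: str


@dataclass
class UlnAppConfig:
    # Hypothetical mirror of the relevant field returned by appConfig.
    inbound_proof_library_version: int


def vulnerable_to_crit1(cfg: EndpointUaConfig) -> bool:
    """An unset (all-zero) receive library means the Endpoint default is used."""
    return int(cfg.receiving_library, 16) == 0


def chains_vulnerable_to_crit2(per_chain: Dict[int, UlnAppConfig]) -> List[int]:
    """Return the chain ids whose proof library version is 0 (the default)."""
    return sorted(
        chain_id
        for chain_id, cfg in per_chain.items()
        if cfg.inbound_proof_library_version == 0
    )


# Illustrative values only; the chain ids here are arbitrary.
unset = EndpointUaConfig(receiving_library="0x" + "00" * 20)
pinned = EndpointUaConfig(receiving_library="0x" + "11" * 20)
per_chain = {101: UlnAppConfig(0), 102: UlnAppConfig(2), 106: UlnAppConfig(0)}
```

Crit 2 has to be checked once per remote chain, which is why the second helper returns a list rather than a single flag.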

Stargate

At time of writing, Stargate does not have a receive library set. It is therefore vulnerable to Crit 1. Stargate also does not have a proof validation library set for any chain id. It is therefore vulnerable to Crit 2.

Mitigation

To mitigate this attack, LayerZero applications must configure send and receive libraries within the Endpoint. If using the UltraLightNode verifying library, the application must also configure a proof verification library for each chain. Any time a new mainnet chain is added, the application must also specify a proof verification library for that chain before accepting messages from it. Permanent mitigation of Crit 2 is not possible, as new chains may be enrolled in the future.

These calls must be made from the Application to the Endpoint.

LayerZero may make a new version of the UltraLightNode contract which does not automatically upgrade users to new proof verification libraries. This will address Crit 2, but require applications to actively switch to new proof verifying versions. This will not address Crit 1.

Details

Resources

Terminology

  • User Application (UA) – A contract that consumes and dispatches messages via the LayerZero protocol, e.g. Stargate.
  • Verifying Library (VL) – A contract that receives messages via an Oracle-Relayer 2-of-2 multisig, and sends messages via transaction receipts. A UA should connect to EXACTLY 1 of these. Connecting to different send + receive libraries is permissible but has adverse consequences.
  • Endpoint – this contract manages the relationship between UAs and VLs. It records the send + receive VL for each UA. It passes verified inbound messages from VL to UA, and outbound messages from UA to VL. It also allows configuration of the VL by the UA.
  • UA Configuration – The instructions that the UA gives the protocol via the Endpoint. This includes its choice of VL, Relayer, etc. If UA Configuration contains unset keys (as it does by default), a default is used.
  • UltraLightNode (ULN) – A VL intended to store block headers, and process transactions according to generic proofs under those headers. Proof libraries contain message parsing, and may be configured on a per-remote-chain basis. The current ULN is ULNv2.
  • Proof Library (PL) – A library used by the ULN to validate message proofs under block headers, authenticate the message, and parse the message into a deliverable format.
  • LayerZero MultiSig – A 2-of-5 multisig that owns the Endpoint and ULN contracts. This address chooses permissible VLs in the Endpoint, and PLs in the ULN.
  • Default Verifying/Proof Library (DVL or DPL) – when applications do not configure a specific library, a default is used. The LayerZero MultiSig may set the Default Library at any time. The Endpoint has a DVL, the ULN has a DPL.

Drag-along overview

The LayerZero contracts use a bespoke upgradability mechanism to add new functionality at specific points using external library calls. The LayerZero team allowlist new libraries for these callsites via the LayerZero MultiSig. Applications may use any allowlisted library. The MultiSig cannot remove a library from the allowlist.

When a UA does not specify a library, the Default Library is used. When the LayerZero MultiSig selects a Default Library, all UAs with default UA Configurations begin to use that library. In this sense the LayerZero MultiSig “drags” all applications along with it, unless they have explicitly configured a library. In this way the LayerZero MultiSig may upgrade any application that has not specifically revoked that right.

We call this right the “drag-along” right. It is present on the Endpoint and the ULN, although implementations differ.

Crit 1: ZeroValidation Verifier Drag-along

The Endpoint Verifier Drag-along is straightforward. Three functions implement VL addition. 

  1. newVersion is used to add a new permissible version
  2. setDefaultSendVersion is used to set the DVL used for sending messages
  3. setDefaultReceiveVersion is used to set the DVL used for receiving messages

Because there are no restrictions on the input to newVersion, the LayerZero MultiSig may add any VL of their choosing. If they install a malicious VL, they can then call setDefaultReceiveVersion to upgrade the DVL to their new evil VL.

During message receipt, if the UA Configuration does not have a configured receiveLibraryAddress, the default is used. The message is authenticated because it came from the (evil) DVL. It is then passed to the UA receiver interface, which does not receive info about its DVL.

The simplified TX flow is:

  1. Add a malicious VL
  2. Set the malicious VL to default
  3. Exploit UAs that have been dragged along

Because the evil VL can pass any message to a UA, it can interfere with the application and cause incorrect operation. For Stargate, this means arbitrary theft of tokens.
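To make the flow concrete, here is a toy Python model of the Endpoint drag-along. It is a sketch under stated assumptions, not the Endpoint's actual code: the method names mirror newVersion and setDefaultReceiveVersion from the text, while set_receive_version, receive_payload, lz_receive, the string library names, and the chain id are hypothetical stand-ins.

```python
from typing import Dict, List


class UserApp:
    """Toy UA. It is told nothing about which VL verified a message."""

    def __init__(self, endpoint: "Endpoint") -> None:
        self.endpoint = endpoint
        self.received: List[bytes] = []

    def lz_receive(self, caller: object, src_chain_id: int, payload: bytes) -> None:
        # The only check available to the UA: did the Endpoint call me?
        if caller is not self.endpoint:
            raise PermissionError("only the Endpoint may deliver messages")
        self.received.append(payload)


class Endpoint:
    def __init__(self, owner: str) -> None:
        self.owner = owner
        self.libraries: List[str] = []  # append-only; version v is libraries[v - 1]
        self.default_receive_version = 0
        self.ua_receive_version: Dict[UserApp, int] = {}  # missing or 0 = default

    def _only_owner(self, sender: str) -> None:
        if sender != self.owner:
            raise PermissionError("owner only")

    def _check_version(self, version: int) -> None:
        if not 1 <= version <= len(self.libraries):
            raise ValueError("unknown library version")

    def new_version(self, sender: str, library: str) -> int:
        self._only_owner(sender)
        self.libraries.append(library)  # no restriction on what is added
        return len(self.libraries)

    def set_default_receive_version(self, sender: str, version: int) -> None:
        self._only_owner(sender)
        self._check_version(version)
        self.default_receive_version = version

    def set_receive_version(self, ua: UserApp, version: int) -> None:
        # Called by the UA itself to pin a library (the mitigation).
        self._check_version(version)
        self.ua_receive_version[ua] = version

    def receive_payload(self, calling_library: str, ua: UserApp,
                        src_chain_id: int, payload: bytes) -> None:
        version = self.ua_receive_version.get(ua, 0) or self.default_receive_version
        self._check_version(version)
        if self.libraries[version - 1] != calling_library:
            raise PermissionError("caller is not this UA's receive library")
        ua.lz_receive(self, src_chain_id, payload)


def demo():
    ep = Endpoint(owner="multisig")
    honest = ep.new_version("multisig", "honest-vl")
    ep.set_default_receive_version("multisig", honest)
    default_app, pinned_app = UserApp(ep), UserApp(ep)
    ep.set_receive_version(pinned_app, honest)

    evil = ep.new_version("multisig", "evil-vl")      # 1. add a malicious VL
    ep.set_default_receive_version("multisig", evil)  # 2. make it the default
    ep.receive_payload("evil-vl", default_app, 101, b"forged")  # 3. exploit
    try:
        ep.receive_payload("evil-vl", pinned_app, 101, b"forged")
        pinned_rejected = False
    except PermissionError:
        pinned_rejected = True
    return default_app.received, pinned_rejected
```

In this model the default-configured app accepts the forged payload, while the app that pinned its receive library rejects it. That is the whole difference between a dragged-along UA and a mitigated one.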

Crit 2: ZeroValidation Proof Drag-along 

The ULN Proof Drag-along attack is much more interesting. It occurs because the Proof Library has two functions: 

  1. It validates proofs
  2. It parses the payload from the proven message

In ULNv2, the validateTransactionProof function is responsible for accepting proofs and messages from the relayer. It may be called only by the relayer. After checking the confirmations mapping for the Oracle’s confirmation of the block data, it prepares for message dispatch. At this point it has received sign-off from both the Oracle and the Relayer, as required by the 2-of-2 multisig security model.

On L91-93 the function loads the per-remote-chain PL, and passes it a block hash and a proof. The PL is responsible for validating the proof, and extracting the message from the proof. It returns a packet containing the message payload and metadata. This packet is parsed from the proof by the PL, and is treated as trusted data.

The remainder of the function verifies the metadata before passing the packet to the endpoint for dispatch to the UA. However, no checks are performed on _packet.payload, which contains the message to the UA. As a result, the PL can substitute the message with arbitrary data. This omission is necessary to preserve the existing PL abstraction.

In addition to allowing arbitrary message payload substitutions, the PL is not constrained by staticcall (as the ILayerZeroValidationLibrary interface does not define the function as view). Therefore the PL can perform any arbitrary set of function calls to outside contracts, update local state, and generally do anything that any contract could do. For example, a malicious PL could behave normally when transactions originate with the relayer, fooling gas estimation and pre-flight checks.

This channel is interesting because it substitutes the message after the Oracle and Relayer have signed off on the contents. In other words, it allows the LayerZero MultiSig to compromise the system even if the application uses a trusted Relayer. A valid, intended message may be substituted with a fraudulent message, without compromising the trusted parties.

This PL drag-along could have been prevented by requiring the payload and proof to be delivered separately, rather than in a single argument. The ULN could then check for modification of the payload by the PL. Other cross-chain protocols follow this interface, as parsers are well-known to be trusted code. Forcing explicit separation of message and metadata would allow the ULN to verify that the message was not substituted with a malicious payload by the PL. This cannot be implemented in the current ULN. We recommend making this change in future versions.
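The recommended separation can be sketched in a few lines of Python. This is an illustration of the idea only, not a proposed ULN implementation: the Packet fields, the toy proof format (the proof is simply the payload bytes), and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Packet:
    dst_address: str
    payload: bytes


# A proof library maps (block_hash, proof) to a parsed packet.
ProofLibrary = Callable[[bytes, bytes], Packet]


def honest_pl(block_hash: bytes, proof: bytes) -> Packet:
    # Toy format: the "proof" is just the payload bytes.
    return Packet(dst_address="ua", payload=proof)


def rewriting_pl(block_hash: bytes, proof: bytes) -> Packet:
    # Parses like the honest PL, then substitutes the payload.
    return Packet(dst_address="ua", payload=b"substituted")


def deliver_checked(pl: ProofLibrary, block_hash: bytes, proof: bytes,
                    relayer_payload: bytes) -> Packet:
    """The payload arrives as its own argument, so the caller can
    detect any difference between it and what the PL returned."""
    packet = pl(block_hash, proof)
    if packet.payload != relayer_payload:
        raise ValueError("proof library altered the payload")
    return packet
```

With this shape, the PL can still reject a message, but it can no longer change what is delivered without the caller noticing.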

The simplified TX flow is:

  1. Add a malicious PL that modifies packet payloads
  2. Wait for the Oracle and Relayer to agree on a message
  3. While the Relayer TX is in flight, set the Default PL to the evil PL
  4. Exploit UAs that have been dragged along
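The same flow can be modeled in Python. This is a simplified sketch, not the ULNv2 source: validate_transaction_proof and the confirmations check follow the description above, but the proof format, the library registry methods, the single-chain simplification, and the omission of ownership and relayer checks are all assumptions made to keep the example short.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Packet:
    dst_address: str
    payload: bytes


ProofLibrary = Callable[[bytes, bytes], Packet]


def honest_pl(block_hash: bytes, proof: bytes) -> Packet:
    # Toy proof format: b"<destination>|<payload>".
    dst, payload = proof.split(b"|", 1)
    return Packet(dst.decode(), payload)


def evil_pl(block_hash: bytes, proof: bytes) -> Packet:
    packet = honest_pl(block_hash, proof)  # metadata stays valid
    packet.payload = b"release all funds to attacker"
    return packet


class UltraLightNode:
    def __init__(self) -> None:
        self.libraries: List[ProofLibrary] = []  # append-only
        self.default_version = 0
        self.app_version: Dict[str, int] = {}    # missing or 0 = default
        self.confirmed: Set[bytes] = set()       # Oracle-confirmed block hashes

    def add_proof_library(self, pl: ProofLibrary) -> int:
        self.libraries.append(pl)
        return len(self.libraries)

    def validate_transaction_proof(self, dst_address: str, block_hash: bytes,
                                   proof: bytes) -> Packet:
        if block_hash not in self.confirmed:
            raise PermissionError("Oracle has not confirmed this block")
        version = self.app_version.get(dst_address, 0) or self.default_version
        packet = self.libraries[version - 1](block_hash, proof)
        # Metadata is checked; the payload is not.
        if packet.dst_address != dst_address:
            raise ValueError("packet is addressed to a different UA")
        return packet  # would be handed to the Endpoint for delivery


def demo():
    uln = UltraLightNode()
    honest = uln.add_proof_library(honest_pl)
    uln.default_version = honest
    uln.app_version["pinned-ua"] = honest

    evil = uln.add_proof_library(evil_pl)  # 1. add a malicious PL
    uln.confirmed.add(b"block-1")          # 2. Oracle and Relayer agree
    uln.default_version = evil             # 3. swap the default in flight
    dragged = uln.validate_transaction_proof(
        "default-ua", b"block-1", b"default-ua|mint 1 token")  # 4. exploit
    pinned = uln.validate_transaction_proof(
        "pinned-ua", b"block-1", b"pinned-ua|mint 1 token")
    return dragged.payload, pinned.payload
```

In this model the Oracle confirmation and the Relayer's proof are both honest, yet the default-configured UA receives the attacker's payload. The UA that pinned a proof library version still receives the original message.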

Info 1: Aptos Proof Validation Does Not Use the ULN Header Model

The FP validator used to accept Aptos messages does not validate any proof with respect to the header. Instead message hashes are substituted for block header hashes, in a deliberate misuse of ULN logic.

Where Ethereum-based chains use an EL header signed by the Oracle, and a MPT validator to check inclusion proofs under the committed trie, the Aptos FP Validator instead substitutes message hashes for block hashes. The Oracle adds confirmations to the message hash, and the ULN passes the message and the message hash to the FP Validator Library. The PL simply checks that the hash corresponds to the proof.
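The difference from header-based validation is easiest to see in code. Below is a minimal Python sketch of the check as described above, assuming SHA-256 as a stand-in for the contract's hash function; the function names and message bytes are hypothetical, and this is not the FPValidator source.

```python
import hashlib


def toy_hash(data: bytes) -> bytes:
    # Stand-in hash for illustration only.
    return hashlib.sha256(data).digest()


def fp_validate(confirmed_hash: bytes, proof: bytes) -> bytes:
    """The slot that would hold a block hash holds a message hash.
    'Validation' is a direct comparison; no inclusion proof is walked."""
    if toy_hash(proof) != confirmed_hash:
        raise ValueError("proof does not match the Oracle-confirmed hash")
    return proof  # the proof bytes are the message itself


message = b"aptos -> evm: transfer 5"
oracle_confirmed = toy_hash(message)  # what the Oracle signs off on
```

No header, trie, or inclusion path appears anywhere in this check: the security of the result rests entirely on the Oracle confirming the hash and the Relayer supplying matching bytes.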

Because Aptos is a per-txn consensus, parsing and proving connection to the consensus process does not neatly fit the ULN’s header/block expectation. We suspect the FP Validator is a deliberate departure from the ULN’s originally intended usage by the LayerZero team. The team used a custom PL to subvert and modify the behavior of the ULN to accommodate an unintended blockchain design. This change also undermines a core design goal of the ULN: that each message must be connected to a blockheader.

This is likely a workaround to another limitation of the LayerZero design: because the Endpoint does not allow multiple verification libraries per application, messages from all chains connected to an application must pass through the same VL. Therefore, the Aptos verifier must use the ULN. But because Aptos does not follow the header/block model like EVM chains, and the ULN assumes this model, there’s a fundamental incompatibility that is addressed by shoehorning Aptos messages into the LayerZero system by subverting the ULN.

It would be significantly cheaper to use LayerZero if all validation libraries followed this model. Given that both the ULN header sync and the simpler message sync have the same security (non-collusion of the Relayer and Oracle 2-of-2 multisig, with censorship propagating to the entire channel), it seems odd to use block hashes and MPTs at all. They add validation overhead without adding security. We recommend that the ULN be dropped in favor of a simpler 2-of-2 multisig.

While this modification changes the semantics of the ULN’s behavior, it does not reflect active exploitation of Crit 2. To our reading, this indicates that the LayerZero team understands that the ULN is equivalent to 2-of-2 multisig, and uses the ULN+PL primarily to obscure this security model.

Info 2: Proof Validation Contracts Contain Backdoor Upgrades for Stargate

The MerklePatriciaTrie validator appears to validate inclusion under a standard Ethereum Merkle-Patricia Trie. We have not verified that it functions as intended.

Both the MPT validator and the FP validator provide special escapes during parsing & validation. In FPValidator these escapes occur on L50-51. In MPTValidator, on L53-55.

The _secureStgTokenPayload function checks that a stargate token transfer message does not transfer tokens to the 0 address. If the transfer appears to do that, it is modified to transfer tokens to a “dead” address instead. It seems likely that this is used to prevent Stargate from erroring on ERC20 contracts that cannot transfer to the zero address, by re-writing the zero address to a burn address. It appears to prevent a minor repeatable DoS on Stargate. We have not validated this belief.

The _secureStgPayload function checks Stargate Swap payloads. It ensures that if a contract call is present in the payload, that the target of that call is a contract. If the target address is not a contract, the message is rewritten to remove the contract call. It seems likely that this is used to prevent Stargate from erroring on calls to accounts with no code, which would cause stuck messages. It appears to prevent a minor repeatable DoS on Stargate. We have not validated this belief.
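The two rewrites described above can be sketched as follows. This is a Python illustration of the described behavior, not a port of _secureStgTokenPayload or _secureStgPayload: the message dataclasses, the is_contract callback, and the specific "dead" address constant are assumptions, since the real functions operate on encoded payload bytes whose layout is not reproduced here.

```python
from dataclasses import dataclass, replace
from typing import Callable

ZERO_ADDRESS = "0x" + "00" * 20
# Placeholder burn address for illustration; the text only says "dead" address.
DEAD_ADDRESS = "0x" + "00" * 18 + "dead"


@dataclass(frozen=True)
class TokenTransfer:
    to: str
    amount: int


@dataclass(frozen=True)
class Swap:
    to: str
    call_payload: bytes  # empty means "no contract call attached"


def secure_token_payload(msg: TokenTransfer) -> TokenTransfer:
    """Redirect transfers to the zero address to a burn address."""
    if int(msg.to, 16) == 0:
        return replace(msg, to=DEAD_ADDRESS)
    return msg


def secure_swap_payload(msg: Swap, is_contract: Callable[[str], bool]) -> Swap:
    """Strip the attached call when the target has no code."""
    if msg.call_payload and not is_contract(msg.to):
        return replace(msg, call_payload=b"")
    return msg


contracts = {"0x" + "aa" * 20}


def is_contract(address: str) -> bool:
    return address in contracts
```

Both helpers return a different message from the one that was proven. That is the property that matters here, whether or not the rewrite is benign.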

Based on the behavior and special casing, it appears that these checks were added to proof validation to prevent Stargate users from accidentally burning assets. Because Stargate is immutable, the checks had to be inserted via owner-controlled mutability at the message parsing layer. In effect, the proof parsing & validation libraries were used as a backdoor upgrade to Stargate’s messaging logic. It seems unlikely that other UAs know that they are paying extra gas to fix Stargate issues.

This is an active, and apparently non-malicious, exploitation of Crit 2. LayerZero is using Crit 2 to rewrite message packets in the wild, today.

Analysis

This issue likely exists on all deployed LayerZero chains (except Aptos?). I have not examined LayerZero deployments on non-Ethereum chains. This is left as an exercise to the reader.

Governance is not a fix, only applications can mitigate effectively

LayerZero’s security model is explicitly 2-of-2 multisig between Relayer and Oracle. The addition of the 2-of-5 LayerZero corporate multisig is a significant weakening of this security model. This weakening occurs at multiple layers of the LayerZero architecture, and in unpatchable contracts. All future LayerZero applications must explicitly mitigate this vulnerability.

If contract ownership is passed to a larger multisig or a token-voting scheme, the capability is not removed. It merely passes to the new owner. There is no way to remove this capability from the LayerZero contracts without burning ownership and removing the VL extension capability entirely.

The LayerZero upgradability pattern allows vulnerabilities and prevents mitigations

LayerZero’s library registration pattern is a non-standard upgradability feature. Its security properties are not well understood, and it is no wonder several auditors repeatedly missed these issues. As implemented, it does not provide a check on owner capabilities in default-configured scenarios.

More to the point, the upgradability pattern allows security-critical elements to be upgraded (the VL and PL), while forbidding upgrades to core business logic. The business logic that maps UAs to VLs and PLs cannot change. The business logic that passes messages from VLs to Endpoint to UA cannot change. The business logic that manages UA configuration of Endpoint and VL cannot change. However, message parsing and handling, which can always result in total system compromise, can change. These critical vulnerabilities exist because of the non-standard upgrade pattern, and cannot be fixed because of the immutability of that upgrade pattern.

Which is to say, Crit 1 cannot be addressed by the LayerZero team, because the Endpoint cannot be upgraded. Crit 2 can only be addressed by the introduction of a new VL.

For LayerZero upgrades, the security-critical code is mutable, but the non-critical code is immutable. This is an objectively bad design for a secure upgradability system.

LayerZero seems to know the ULN offers no security

LayerZero’s use of header hashes is security theater. Because the ultimate root of trust is the Oracle & Relayer 2-of-2 multisig’s agreement on a header, valid MPT proofs can be created to forge any message the Oracle and Relayer agree upon. LayerZero’s subversion of their own ULN to achieve this for the Aptos channel shows that they are aware that the ULN offers no security over the simple agreement of Oracle and Relayer. LayerZero accepts 2-of-2 security without headers or other consensus information for the Aptos channel. This is a strong indication that the LayerZero team does not believe that the ULN header model adds any security, and chose the ULN model for non-security reasons. It also calls into question their other claims about security and the ULN.

Why public disclosure?

Public disclosure likely does not significantly increase risk of exploit

Because the exploit may only be performed by insiders with known identities, who (according to on-chain evidence) are aware of the exploit, disclosing publicly does not increase the likelihood that they will exploit this contract right now. If they intended to do so, they likely would have done so already.

Disclosure is likely to prompt mitigation actions from LayerZero’s Stargate arm, and other applications integrating with LayerZero.

LayerZero certainly knew PLs could modify message payloads

It does not appear that this capability was purposefully designed. Rather, it seems that LayerZero discovered the capability after the fact and used it to “patch” a deployed application. It is unclear why they failed to disclose the capability to arbitrarily re-write message payloads. It is likely that they are not actively malicious, although the reliance on 2-of-5 employees is worrisome.

LayerZero is actively exploiting a vulnerability in their protocol to modify Stargate message payloads in the wild.

Despite that, docs and audit reports did not accurately reflect risk

LayerZero docs claim to require no application configuration. And while the ability to set new proof validation logic within the ULN verifying library (used in Crit 2) is explicitly desired and disclosed, the ability to automatically migrate apps to new verifying libraries (used in Crit 1 & Crit 2) is not. Since it seems apparent that LayerZero knew about the trusted-party vulnerability (because they were exploiting it), the omission is glaring.

LayerZero may be deliberately concealing these issues

LayerZero explicitly denied the existence of this capability while we were documenting their use of it. Yes, this happened.

The owner of the ULN contract cannot choose proof libraries on behalf of the user application. The owner can only add new proof libraries (i.e. append-only registry) of which the user application can select from. – raz (Ryan Zarick, CTO)

This is false. The owner of the ULN can choose verifying libraries and proof libraries on behalf of Stargate and other applications, in the wild, today. They can use this capability to compromise the application at will.

The LayerZero CTO concealed by omission a vulnerability that the team is actively exploiting. This is in line with the history of dissembling and misrepresentation in LayerZero security discussion. Because there appears to be active concealment of the issue, we doubt that LayerZero would correctly handle a private disclosure. We feel compelled to publish these vulnerabilities so that applications may mitigate. We expect this publication to be greeted with denial and deflection.

Timeline

Jan 29

  • 11:10 ET – James was asked to analyze LayerZero code in connection with a proposed governance action
  • Approx 13:45 ET – James began to analyze contracts
  • Approx 14:30 ET – James discovers Crit 2.
  • 15:07 ET – James notifies a second qualified security researcher that a trusted-party vulnerability may exist, and asks for analysis and confirmation. They wish to remain anonymous.
  • Approximately 15:45 ET – James discovers Crit 1 while documenting Crit 2.
  • 16:37 ET – James notifies employer of the existence of trusted-party vulnerabilities, and of his intent to fully disclose
  • Approximately 17:33 ET – James discovers Info 1 & Info 2 while documenting Crit 2.
  • 19:35 ET – James finishes drafting this document, submits to outside security researcher for review

Jan 30

  • 8:47 ET – James submits this doc to a co-worker for additional review and validation.
  • 10:45 ET – James notifies LayerZero of intent to fully disclose
  • 11:00 ET – James publishes this document

Corrections:

  • Timeline mistakenly labeled Jan 20/21 instead of Jan 29/30
