A Deep-Dive Into MultiSig Architecture on Arch

This piece is by Amine ElQaraoui, Chief Technology Officer at Arch

Arch Network

23 Jul 2024 • 10 min read

BUIDL in Public: Arch's Multisig Solution

When building anything novel, each decision comes with potential pros and cons. That’s especially true when crafting on blockchains, where every choice made about architecture can have profound implications for the long-term health and efficiency of your project.

I’ve been thinking a lot about these trade-offs while working with our team to build Arch, a bridgeless Bitcoin-native application platform that helps developers unlock smart contract functionality on the Bitcoin base layer.

Rather than keep our decision-making siloed, I’m going to start publicly sharing some of the choices we’re making at Arch and why we’re making them — a process that I hope will promote healthy and interesting discussions with other blockchain builders.

Today, I’m going to focus on how we’re implementing multisigs on Arch, which is a critical step in building a robust, efficient, and scalable execution platform for Bitcoin.

First, a TL;DR on Bitcoin multisigs + Arch

On Ethereum and Solana, smart contract wallets and program-derived addresses create the necessary infrastructure to allow for “multisignatures,” a security mechanism that requires multiple private keys to authorize a transaction.

While multisigs don’t directly add programmability to blockchains, they are a necessary precursor because they make it possible to create conditional execution — that is, the network will only execute an action if multiple signatures have approved of it.

Various signature schemes have been proposed to add this functionality to Bitcoin.

Scripted multisigs, including P2MS (Pay-to-Multisig) and Tapscript multisigs, offer high security. However, they offer little privacy because every public key and signature is published on-chain, which also means higher transaction fees and more network bloat.
Scriptless multisigs keep transactions secure while putting less data on-chain. However, protocols such as MuSig rely on all signers being honest and cooperative during the signing process. If even one participant behaves maliciously or provides incorrect data, the protocol fails to produce a valid signature, delaying or blocking the transaction. In most cases, this makes them unsuitable for real-world applications.
FROST (Flexible Round-Optimized Schnorr Threshold) is being relied upon by some Bitcoin L2s, including Stacks and Botanix, to provide an alternative. Unlike MuSig, this protocol allows transactions to go through so long as a threshold of signers participates and is responsive. This model offers significant security while also reducing transaction fees and increasing execution rates. However, its major weakness is that it can’t guarantee robustness. Networks must be able to guarantee that a signature will be produced within a certain time range, but networks that rely on FROST alone can be prone to failure, leading to temporary disruptions or pauses in network activity.

At Arch, we’ve decided to use FROST, but have added on top of it a communication protocol called ROAST (Robust Asynchronous Schnorr Threshold Signatures) that helps address some of its gaps in robustness.

In a typical FROST signature session, a subset of signers is chosen, and the entire session fails if one node doesn’t behave correctly. ROAST coordinates with the various nodes to make such failure significantly less likely.

That allows Arch to guarantee robustness by ensuring that a signature can be produced so long as just 51% of the network is honest and cooperative. This positions Arch to not just offer more efficient, scalable programmability, but also greater reliability in our performance.

A more technical deep-dive

For those who are interested, I’ve also included a breakdown of the concepts mentioned above, with a deeper technical explanation of their strengths and weaknesses.

Scripted multisigs

Scripted multisigs, including P2MS (Pay-to-Multisig) and Tapscript multisigs, are pivotal features in Bitcoin’s transaction scripting capabilities, offering flexibility and enhanced security in managing digital assets.

P2MS allows multiple public keys to be linked to a single Bitcoin address, specifying a required number of those keys to authorize spending.

For example, a 2-of-3 multisig script necessitates any two out of three designated parties to provide their signatures for transaction execution:

OP_2 <pubkey_1> <pubkey_2> <pubkey_3> OP_3 OP_CHECKMULTISIG

While powerful, raw P2MS scripts are complex and limited to 3 public keys before becoming non-standard. They are typically embedded within P2SH (Pay-to-Script-Hash) or P2WSH (Pay-to-Witness-Script-Hash) constructs, which simplify transaction creation and extend multisig to up to 15 public keys in P2SH and up to 20 in P2WSH.
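To make the layout of the 2-of-3 script concrete, here is a minimal Python sketch that serializes it to bytes. The helper names (`small_int_opcode`, `p2ms_script`) are made up for this illustration, and the demo keys are placeholders rather than valid curve points.

```python
from typing import Sequence

OP_CHECKMULTISIG = 0xAE  # Bitcoin Script opcode value

def small_int_opcode(n: int) -> int:
    """Return the single-byte opcode OP_n for n in 1..16 (OP_1 is 0x51)."""
    if not 1 <= n <= 16:
        raise ValueError("only 1..16 have single-byte opcodes")
    return 0x50 + n

def p2ms_script(m: int, pubkeys: Sequence[bytes]) -> bytes:
    """Serialize `OP_m <pubkey_1> ... <pubkey_n> OP_n OP_CHECKMULTISIG`."""
    n = len(pubkeys)
    if not 1 <= m <= n:
        raise ValueError("need 1 <= m <= n")
    script = bytes([small_int_opcode(m)])
    for pk in pubkeys:
        if len(pk) != 33:
            raise ValueError("expected 33-byte compressed public keys")
        script += bytes([len(pk)]) + pk  # direct push: length byte, then data
    script += bytes([small_int_opcode(n), OP_CHECKMULTISIG])
    return script

# Placeholder keys (NOT real curve points), only to show the byte layout.
demo_keys = [bytes([0x02]) + bytes([i]) * 32 for i in (1, 2, 3)]
script = p2ms_script(2, demo_keys)
print(len(script), script.hex())
```

Every public key costs 34 bytes of script here, which is why script size grows linearly with the number of participants.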

Tapscript, an evolution of Bitcoin scripting, introduces OP_CHECKSIGADD alongside OP_CHECKSIG, allowing for more efficient multisig constructions.

<pubkey_1> OP_CHECKSIG <pubkey_2> OP_CHECKSIGADD <pubkey_3> OP_CHECKSIGADD OP_2 OP_NUMEQUAL

OP_CHECKSIGADD does not merge signatures. It verifies one signature against one public key and adds the result to a running counter, which the script then compares with the required threshold. Tapscript multisigs improve on traditional P2MS by optimizing script efficiency and accommodating more complex spending conditions. Because Taproot reveals only the script path that is actually spent, unused spending conditions stay private, and because it was deployed as a soft fork, these scripts remain compatible with legacy nodes.
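The counting behaviour of an OP_CHECKSIGADD script can be modelled in a few lines of Python. This is a toy model rather than a script interpreter: `checksigadd_threshold` and `toy_verify` are hypothetical names, and the "signatures" are plain byte strings, not Schnorr signatures. It assumes the Tapscript rule that a non-signing key supplies an empty witness element, while a non-empty but invalid signature fails the whole script.

```python
from typing import Callable, Optional, Sequence

def checksigadd_threshold(
    pubkeys: Sequence[bytes],
    sigs: Sequence[Optional[bytes]],
    threshold: int,
    verify: Callable[[bytes, bytes], bool],
) -> bool:
    """Model `<pk1> OP_CHECKSIG <pk2> OP_CHECKSIGADD ... <m> OP_NUMEQUAL`.

    sigs[i] is the witness element paired with pubkeys[i]; None stands for
    the empty element supplied on behalf of a key that did not sign.
    """
    if len(pubkeys) != len(sigs):
        raise ValueError("one witness element per public key is required")
    counter = 0
    for pk, sig in zip(pubkeys, sigs):
        if sig is None:
            continue          # empty signature: counter is left unchanged
        if not verify(pk, sig):
            return False      # non-empty invalid signature fails the script
        counter += 1
    return counter == threshold  # exact match, like a numeric equality check

def toy_verify(pk: bytes, sig: bytes) -> bool:
    """Stand-in for Schnorr verification, for illustration only."""
    return sig == b"signed-by-" + pk

keys = [b"A", b"B", b"C"]
print(checksigadd_threshold(keys, [b"signed-by-A", None, b"signed-by-C"], 2, toy_verify))
```

Each participating signer still contributes a separate witness element, so the on-chain cost grows with the number of signers.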

However, this approach comes at a cost in terms of transaction fees and network bloat due to increased script complexity and size — a problem we can start to address with scriptless multisigs.

Scriptless multisigs

Scriptless multisignatures represent a transformative leap in digital signature technology within the Bitcoin ecosystem, offering profound benefits over traditional scripted multisig setups.

Unlike scripted multisig, which relies on Bitcoin Script opcodes such as OP_CHECKMULTISIG, scriptless multisignatures leverage two or more private keys to generate a single signature that can be verified using just one public key.

This approach drastically reduces the on-chain footprint by publishing only one public key and one signature per transaction, regardless of the number of signers involved.

This efficiency not only results in lower transaction fees — where multiple signers collectively pay the same fee as a single signer for an equivalent transaction — but also enhances privacy significantly.

By making multisignature transactions indistinguishable from single-signature transactions on the blockchain, scriptless multisignatures contribute to greater anonymity and confidentiality for Bitcoin users.

One critical reason why this streamlined approach is more challenging with ECDSA, compared to newer signature schemes like Schnorr, lies in the inherent limitations of ECDSA itself.

ECDSA (Elliptic Curve Digital Signature Algorithm) is a digital signature scheme built on elliptic curve cryptography. Each signer generates a unique signature using their private key and the transaction’s digest, which is then verified against their corresponding public key.

However, ECDSA does not inherently support the aggregation of multiple signatures into a single compact signature that can be verified against a single public key. Therefore, in ECDSA multisignature setups, each signer must individually sign the transaction, resulting in multiple signatures being published on-chain.

In contrast, Schnorr signatures, also based on elliptic curve cryptography, offer several advantages over ECDSA. In addition to their simplicity and efficiency, Schnorr signatures allow for the aggregation of multiple signatures into a single combined signature. This aggregated signature can be verified against a single combined public key, reducing the transaction’s size and computational overhead.

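Aggregation is possible because the Schnorr signing equation is linear: each signer contributes a partial signature, and the partial signatures sum to a single signature that is valid for the combined public key. Protocols built on this property add safeguards so that the combination preserves the security guarantees of the algorithm.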
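The linearity that makes Schnorr aggregation work can be checked numerically. The sketch below is a toy, assuming plain summation of keys and nonces on secp256k1 with a SHA-256 challenge. It is not BIP340-compatible, and naive key summation is insecure against rogue-key attacks, which is the problem protocols like MuSig are designed to fix. The private keys and nonces are fixed demo values.

```python
import hashlib

# secp256k1 domain parameters.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)

def point_add(a, b):
    """Add two affine points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % P == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], -1, P) % P
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % P, -1, P) % P
    x = (lam * lam - a[0] - b[0]) % P
    return (x, (lam * (a[0] - x) - a[1]) % P)

def point_mul(k, pt):
    """Double-and-add scalar multiplication (not constant time)."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result

def challenge(R, X, msg):
    """e = SHA256(R || X || msg) mod N (toy encoding of the points)."""
    data = b"".join(c.to_bytes(32, "big") for c in (*R, *X)) + msg
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def verify(X, msg, R, s):
    """Schnorr check: s*G == R + e*X."""
    return point_mul(s, G) == point_add(R, point_mul(challenge(R, X, msg), X))

msg = b"spend output 0"
keys = [0x1111, 0x2222, 0x3333]    # toy private keys
nonces = [0xAAAA, 0xBBBB, 0xCCCC]  # toy nonces; real ones must be secret and fresh

X_agg = R_agg = None
for x, k in zip(keys, nonces):
    X_agg = point_add(X_agg, point_mul(x, G))  # combined public key
    R_agg = point_add(R_agg, point_mul(k, G))  # combined nonce point

e = challenge(R_agg, X_agg, msg)
partials = [(k + e * x) % N for x, k in zip(keys, nonces)]  # one per signer
s_agg = sum(partials) % N

print(verify(X_agg, msg, R_agg, s_agg))
```

A verifier only ever sees one public key and one `(R, s)` pair, regardless of how many signers contributed a partial signature.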

Protocols like MuSig build on Schnorr signatures to enable secure and efficient multisignature transactions in Bitcoin, facilitating enhanced privacy and scalability without compromising on security.

As a result, scriptless multisignatures are particularly advantageous with Schnorr signatures, offering enhanced security, efficiency, and privacy benefits that are pivotal for the future evolution of Bitcoin’s transactional capabilities.

MuSig

MuSig is a protocol designed specifically for aggregating public keys and Schnorr signatures. Its primary goal is to allow multiple users, each possessing their own private key, to collaboratively create a combined public key that appears indistinguishable from any other Schnorr public key.

This combined public key retains the same compact size as a single-user pubkey, preserving efficiency in transaction data storage and management. The protocol defines a secure method through which these users can engage in a cooperative process to generate a multisignature that corresponds to this aggregated public key.

Similarly, the resulting multisignature is designed to be indistinguishable from any other Schnorr signature, ensuring uniformity and compatibility within Bitcoin.

In contrast to traditional script-based multisig approaches, which require the inclusion of multiple signatures and public keys in the transaction script, MuSig significantly reduces the on-chain footprint.

By aggregating signatures into a single compact form that corresponds to a single combined public key, MuSig transactions consume less block space. This reduction not only contributes to lower transaction fees but also enhances privacy by making multisignature transactions appear identical to single-signature transactions on the blockchain.
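In symbols, the aggregation described above can be sketched as follows. The notation is chosen for this sketch: $x_i$ and $X_i = x_i G$ are signer $i$'s private and public key, $r_i$ and $R_i = r_i G$ are their nonce and nonce point, $L$ is the list of all participating public keys, $m$ is the message, $q$ is the group order, and $H_{\text{agg}}$, $H_{\text{sig}}$ are hash functions.

```latex
\begin{aligned}
a_i &= H_{\text{agg}}(L, X_i), &\qquad \tilde{X} &= \sum_{i=1}^{n} a_i X_i, \\
R &= \sum_{i=1}^{n} R_i, &\qquad c &= H_{\text{sig}}(\tilde{X}, R, m), \\
s_i &= r_i + c\, a_i x_i \bmod q, &\qquad s &= \sum_{i=1}^{n} s_i \bmod q, \\
sG &= R + c\,\tilde{X}.
\end{aligned}
```

The last line is the ordinary Schnorr verification equation for the key $\tilde{X}$, which is why the result looks like a single-signer signature. The per-key coefficients $a_i$ prevent a participant from choosing their key to cancel out the others.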

The MuSig protocol family includes three variants, each offering different trade-offs in terms of implementation complexity and interaction requirements:

MuSig (MuSig1): This variant is the foundational protocol, known for its simplicity in implementation. However, it necessitates three rounds of communication among participants during the signing process, which can increase the time and effort required to finalize a transaction.
MuSig2: Building upon MuSig1, MuSig2 streamlines the signing process by removing one round of communication. It also allows part of the remaining communication to be combined with the key exchange phase, which can bring the signing flow closer to that of conventional script-based multisig. However, this requires careful handling to prevent unintended repetition of signing steps, ensuring the security and integrity of the multisignature.
MuSig-DN (Deterministic Nonce): Known for its robustness against certain types of attacks, MuSig-DN is the most complex variant to implement. It maintains strict separation between communication and key exchange phases to mitigate vulnerabilities associated with repeated session attacks. Although more intricate in its design, MuSig-DN offers enhanced security assurances, making it suitable for scenarios where utmost security is paramount.

Still, MuSig does come with its own set of limitations.

One critical limitation is its reliance on all signers being honest and cooperative during the signing process. Unlike some other cryptographic schemes, MuSig requires all participants to provide their input correctly and honestly to produce a valid multisignature.

If any participant behaves maliciously or provides incorrect data during the signing process, the protocol fails to produce a valid signature, leading to potential delays or failures in completing transactions.

This aspect necessitates trust and coordination among all signers involved.

FROST

The FROST (Flexible Round-Optimized Schnorr Threshold) signature scheme represents a significant advancement in the realm of threshold signatures, particularly within the context of Schnorr digital signatures.

Unlike single-party signatures, threshold signatures require collaboration among a specified threshold of participants, each holding a share of a common private key. This collaborative nature introduces network overhead, particularly notable when participants use network-limited devices or unreliable networks.

FROST addresses these challenges by optimizing the signing process to minimize network rounds, thereby reducing costs associated with communication delays and ensuring efficient use of resources.

Threshold signature schemes are cryptographic protocols designed to enable multiple parties to collectively generate a single valid signature. This capability is particularly valuable in scenarios where trust must be distributed among multiple entities.

Traditional threshold schemes often require multiple rounds of communication among participants to ensure the integrity and security of the signature generation process. In contrast, FROST stands out by offering an approach that minimizes the number of required communication rounds while preserving the threshold property.

This means that only a specified subset of participants — typically denoted as ‘t’ out of ’n’ total participants — is required to cooperate to produce a valid signature. This flexibility allows FROST to support various practical use cases where some participants may be offline or unresponsive during signing operations, without compromising overall security.
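The t-of-n property rests on secret sharing. The sketch below illustrates it with Shamir sharing and Lagrange interpolation over the secp256k1 scalar field. It assumes a trusted dealer and reconstructs the secret purely for demonstration. In a threshold signing protocol such as FROST the key is never reassembled: each signer instead applies their Lagrange coefficient to their own signature share. The function names are invented for this example.

```python
import secrets

# Order of the secp256k1 group; any prime modulus would do for this demo.
Q = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def make_shares(secret, t, n):
    """Split `secret` into n shares so that any t of them determine it."""
    # Random polynomial of degree t-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(Q) for _ in range(t - 1)]

    def f(x):
        return sum(c * pow(x, i, Q) for i, c in enumerate(coeffs)) % Q

    return {i: f(i) for i in range(1, n + 1)}  # participant id -> share

def lagrange_at_zero(i, ids):
    """Lagrange coefficient of participant i for interpolation at x = 0."""
    num = den = 1
    for j in ids:
        if j != i:
            num = num * j % Q
            den = den * (j - i) % Q
    return num * pow(den, -1, Q) % Q

def reconstruct(shares):
    """Interpolate f(0) from a dict of participant id -> share."""
    ids = list(shares)
    return sum(shares[i] * lagrange_at_zero(i, ids) for i in ids) % Q

secret = 123456789
shares = make_shares(secret, t=3, n=5)

# Any 3 of the 5 participants are enough; participants 2 and 4 are "offline".
online = {i: shares[i] for i in (1, 3, 5)}
print(reconstruct(online) == secret)
```

Fewer than `t` shares leave the secret undetermined, which is the property that lets a network tolerate offline or unresponsive participants without weakening the key.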

The protocol introduces several innovative features aimed at enhancing both security and efficiency. One key innovation is its ability to perform signing in a single round when the signers’ nonce commitments are generated ahead of time in a preprocessing stage, which significantly reduces latency and enhances throughput. This single-round capability is crucial for applications requiring rapid transaction processing and where network constraints pose operational challenges.

Moreover, FROST maintains the fundamental threshold property, ensuring that only a predefined number of participants need to cooperate to produce a valid signature. This flexibility allows for robust deployment scenarios where some participants may be offline or inaccessible during signing operations, without compromising security.

Security is paramount in FROST, as it incorporates mechanisms to detect and mitigate potential forgery attempts by malicious participants. By allowing honest participants to identify and exclude misbehaving peers from future operations, FROST ensures the integrity of the signing process even in adversarial environments. This capability is essential for maintaining trust and reliability in applications such as financial transactions and network consensus protocols, where the authenticity of signatures is critical.

In practical terms, FROST supports both two-round and optimized single-round signing protocols, depending on operational requirements and network conditions. This flexibility makes it suitable for diverse use cases, from real-time transaction validation in cryptocurrencies to secure multi-party consensus in distributed networks.

However, it’s important to note that FROST prioritizes efficiency over robustness: every signer selected for a signing session must behave honestly, because even one incorrect signature share causes the entire session to fail.

ROAST

Existing Schnorr threshold signature schemes face significant challenges in real-world applications.

They often assume synchronous network conditions and lack robustness, meaning they cannot guarantee the successful generation of a valid signature even when the required threshold of honest signers is present, especially in the presence of disruptive or malicious participants.

This limitation severely restricts the widespread adoption of threshold signatures within the cryptocurrency ecosystem, particularly in second-layer protocols built atop blockchain networks.

In response to these critical challenges, ROAST (Robust Asynchronous Schnorr Threshold signatures) aims to enhance any semi-interactive threshold signature scheme.

ROAST transforms these schemes into robust and asynchronous signing protocols, addressing the crucial need for reliable signature generation in dynamic and potentially adversarial environments.

The core requirement for ROAST is that the underlying signing protocol supports identifiable aborts, remains resistant to forgery under concurrent signing sessions, and follows a semi-interactive structure involving one preprocessing round and one signing round.

By leveraging these essential features, ROAST ensures that even in scenarios where disruptive or malicious participants are present, a valid signature can be consistently and securely generated, provided that a threshold number of honest signers participate.

ROAST achieves robustness and asynchronous operation by orchestrating multiple instances of the underlying semi-interactive signing protocol concurrently, each involving subsets of signers that collectively meet the threshold requirement.

This parallel execution ensures that the failure of one signing session due to malicious behavior or network issues does not compromise the overall signature generation process.

Importantly, ROAST includes mechanisms to identify and exclude disruptive signers, allowing the remaining honest participants to continue the signing process without restarting from scratch. This approach not only enhances the reliability and security of threshold signatures but also paves the way for their practical deployment in decentralized systems requiring strong cryptographic assurances.
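The coordination logic described above can be illustrated with a small simulation. This is a simplified model, not Arch's implementation: it contains no cryptography, messages arrive in a fixed queue order, and the behaviour labels (`honest`, `silent`, `bad_share`) and the `run_roast` function are hypothetical. It assumes every signer submits an initial nonce, and that each valid signature share arrives together with a fresh nonce so the signer can join another session.

```python
from collections import deque

def run_roast(behaviour, t):
    """Simulate a ROAST-style coordinator over an underlying t-of-n scheme.

    behaviour maps signer id -> "honest" | "silent" | "bad_share".
    Returns (signers of the first completed session or None,
             number of sessions opened, set of signers caught misbehaving).
    """
    responsive = []   # signers with an unused nonce on file
    malicious = set()
    sessions = []     # each session: {"members": set, "shares": set}
    inbox = deque(("nonce", s, None) for s in sorted(behaviour))

    while inbox:
        kind, signer, sid = inbox.popleft()
        if kind == "share":
            if behaviour[signer] == "bad_share":
                malicious.add(signer)  # identifiable abort: blame and drop
                continue
            sessions[sid]["shares"].add(signer)
            if sessions[sid]["shares"] == sessions[sid]["members"]:
                return sorted(sessions[sid]["members"]), len(sessions), malicious
        # An initial nonce, or a valid share carrying a fresh nonce, marks
        # the signer as available for a new session.
        responsive.append(signer)
        if len(responsive) >= t:
            members, responsive = set(responsive[:t]), responsive[t:]
            sessions.append({"members": members, "shares": set()})
            for s in sorted(members):
                if behaviour[s] != "silent":  # silent signers never reply
                    inbox.append(("share", s, len(sessions) - 1))
    return None, len(sessions), malicious

behaviour = {1: "silent", 2: "bad_share", 3: "honest", 4: "honest", 5: "honest"}
print(run_roast(behaviour, t=3))
```

In this run the first session (signers 1, 2, 3) stalls because signer 1 goes silent and signer 2 sends a bad share. The coordinator does not wait: once signer 3's valid share arrives, it opens a second session with signers 3, 4, and 5, which completes. With a single fixed session, as in plain FROST, the same inputs would produce no signature.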

By facilitating secure and efficient cooperative signing across distributed networks, ROAST contributes significantly to advancing the scalability and usability of blockchain technologies beyond traditional multisignature schemes.

Final Thoughts

The evolution of multisignature technologies in blockchain systems, particularly within the Bitcoin ecosystem, demonstrates a clear trajectory towards enhanced efficiency, privacy, and security. From scripted multisigs to scriptless solutions like MuSig, and advanced threshold signature schemes such as FROST and ROAST, each iteration addresses specific limitations of its predecessors while introducing new capabilities.

We’re implementing FROST and ROAST together to build Arch, presenting what we believe will be a secure, scalable, and robust foundation for builders to create the next wave of Bitcoin dApps — enabling a number of meaningful innovations in Bitcoin-native DeFi, DAOs, and other decentralized use cases.

I would love to get your feedback on this article and the work we’re doing. The best way to reach me directly is to follow me @0xfinetuned on X.com and DM me.

Also, our multi-phased testnet is underway!

About the Author:

Amine ElQaraoui, Co-Founder and CTO

Amine is an alumnus of ENSIAS with a specialized engineering degree in artificial intelligence, and has applied his extensive technical knowledge to the cryptocurrency and Web3 sectors. His technological journey began at 13, leading him to a deep interest in Bitcoin and distributed systems. With over a decade of programming experience, Amine has also contributed to the banking industry, managing data infrastructure for Moroccan banks. His fascination with the ICO boom and smart contracts steered him towards dedicating himself to building cryptocurrency technologies.