Privacy and Anonymity in Monero

Prerequisites, Primitives & Notations
Monero Privacy and Anonymity
Monero Transaction Structure and Data

Prerequisites, Primitives & Notations

Bitcoin Transactions

Before getting into Monero, let’s first see how Bitcoin (BTC) transactions work.

In BTC, every wallet has an Elliptic Curve Public-Private Key Pair. Let $G$ be the Generator of the Elliptic Curve used by BTC. If $x$ is someone’s private key, then $P = xG$ is his public key.

Let’s dive into a transaction between Alice & Bob. Alice rents an apartment from Bob & wants to send 0.015 BTC as rent to Bob. Though we talk about coins, there really isn’t any coin being passed around with each transaction. Everyone other than miners spend what is called as Unspent Transaction Output (UTXO) - i.e. everyone has BTC because someone else sent it to them in a transaction. They can spend only their UTXOs. So if Alice wants to send some money to Bob, she has to have UTXO(s) which was sent to her by someone else. Just today, Alice was paid 0.1 BTC as salary by her employer - so she has a UTXO of 0.1 BTC & she hasn’t spent any of it yet. She creates a new transaction where she sends 0.015 BTC to Bob. However, she cannot only spend part of a UTXO, she has to spend it fully, so she also sends the remaining change to herself as part of the same transaction. So she sends 0.0845 BTC to herself & 0.015 BTC to Bob. The balance (0.1 - 0.0845 - 0.015 = 0.0005) is automatically assumed to be the mining fee. Alice locks Bob’s output (i.e. the 0.015 BTC she sent to Bob) with Bob’s public key & locks her own output (0.0845 BTC) with her own public key. When Bob wants to spend his output, he has to show proof of possession of the private key corresponding to the public key which locks that output. Alice’s employer had locked Alice’s output of the salary transaction with Alice’s Public Key. So for Alice’s spend to be valid, Alice needs to provide proof that possession of the corresponding private key. Alice takes a modified version of the transaction & signs it with her private key & attaches it to the transaction - the signature is her proof of possession of the private key. Any verifier (the nodes) can verify the signature was indeed created by the actual owner of the UTXO.

Few more things are verified

is the sum of inputs is greater than sum of the outputs? Alice’s transaction has only one input (0.1 BTC) & has 2 outputs (0.015 BTC & 0.0845 BTC). The verifiers verify that $0.1 \stackrel {?}{>} 0.015 + 0.0845$
Was this really a Unspent Transaction Output - i.e. has already spent the BTC she got as salary earlier? A UTXO is defined by a Transaction Id (TxId) & an output Index for each output. Each node in the blockchain maintains a list of UTXOs & updates it whenever a new block is added to the blockchain. They go through each transaction in a newly minted block & remove all the inputs in each transaction from their list & add all the outputs in each transaction to their list of UTXOs. If any of the input UTXOs do not exist in their list, it means that particular UTXO has already been spent & they reject the block. This is how double spending is prevented in a BTC Blockchain.

Hash function Notations used in this post

$H()$ denotes a Hash function like Keccak/SHA3.
$H_s()$ denotes a Hash to Scalar function. This function may call a Hash function like $H()$ & then convert the output hash to an integer and return the integer which to be used as a scalar.
$H_p()$ denotes a Hash to Point function. This function may call a Hash function like $H()$ & after converting the output to an integer, it maps the integer to an Elliptic Curve point & returns the point.
When we write something like $H(inp_1, inp_2, …, inp_n)$, it usually means that these inputs are first encoded as byte arrays & then concatenated before passing it as a single input to the hash function.
In my examples, I use sha3_256 hash function from hashlib (Monero uses a variant of SHA-3 named Keccak256 which is similar except that it uses different padding).

Pedersen Commitments

You can look up just the What is a commitment part of this linked post to understand what is a commitment.

Alice claims that she can predict the winner of a horse race in which 12 horses are competing. She thinks Horse #$5$ will win the race. But she doesn’t want to tell Bob before the race is over. However, if she tells Bob only after the race is over, then he obviously won’t believe that she predicted it before hand. Hence she commits to the number before the race & opens the commitment after the race is over to prove that she did predict it correctly.

Let’s try to build a simple commitment scheme from scratch to help Alice. A hash function $H$ can work as a commitment. She can send the hash of $5$ i.e. $H(5)$ to Bob as a commitment. After the race is over & Horse #5 has won the race, Alice tells Bob that this was the same as what she predicted. Bob can hash the number $5$ himself & compare it to hash sent by Alice beforehand to check if she had in fact predicted it correctly. Such a commitment though binding isn’t hiding. Since there are only 12 horses - even before the race is complete, Bob can hash all numbers from 1 to 12 & check which hash matches to find the horse chosen by Alice. Also, in the next race, if Alice again thinks Horse #5 will win & sends $H(5)$ as the commitment, Bob can compare with the earlier commitment & figure out that Alice has sent a commitment to 5.

Instead, Alice samples a random number $b$ called as a blinding factor & sends $H(5, b)$ & sends it to Bob (as mentioned earlier, both $5$ & $b$ will be converted to bytes & concatenated before hashing). Since $b$ can be a huge number, Bob can’t brute force all combinations from 1 to 12 to figure out which was the number chosen by Alice. Once the race is over, Alice can disclose $5$ & $b$ to Bob & he can compute $H(5, b)$ to check if she had indeed chosen $5$. And for every new commitment, Alice choses a new blinding factor even if she choses 5 in a future race, the commitment won’t reveal the value to Bob till she reveals the new blinding factor.

A function $C(y) = y G$ can also work as commmitment for the number $y$ - here $G$ is a generator of an Elliptic Curve where the discrete log problem is hard.

Again, this is binding but not hiding. The trick of $C(5, b)$ will not work to fix this because unlike the SHA-3 Hash function, this is an algebraic function- converting to bytes & concatenating makes no sense here & just adding up the 2 numbers will not be Binding. Let’s say Alice had sampled $b = 25$ & sent $C(5, 25)$ i.e. $C(30)$ to Bob as the commitment. If Horse 7 won instead of 5, even then Alice can be dishonest and say her blinding factor was $b=23$ & claim that she did predict correctly.

Pedersen Commitments solves this by having 2 Generators $G$ & $B$ such that the relation between $B$ & $G$ isn’t known - i.e. nobody knows an $n$ such that $B = nG$. The 2nd generator $B$ is used for the blinding factor. Monero uses $B = H_p(G)$ to compute a 2nd generator which satisfies this requirement.

A Pedersen commitment for $u$ is defined as

$C_u = u G + b B $

One very useful quality of a Pedersen commitment is that it’s additively homomorphic. If we create 2 commitments - one for the number 5 & the other for number 12.

$C_1 = 5G + b_1B$

$C_2 = 12G + b_2G$

Then, $C_3 = C_1 + C_2 = 17G + (b_1 + b_2)B$

$C_3$ automatically becomes the commitment for 17 with a blinding factor of $b_1 + b_2$

Schnorr Signature

Let’s first look at the Schnorr Proof of Possession of a Private Key without involving messages.

Let $G$ be the Generator of an Elliptic Curve. Alice wants to prove that she knows a private key $x$ such that her Public Key $P$ is $P = xG$.

Alice samples a random $r$ & computes $R = rG$ & sends $R$ to Bob.

Bob samples a random $c$ & sends it to Alice. Alice computes $s = r - cx$ & sends $s$ to Bob.

Bob checks if $sG \stackrel {?}{=} R -cP$ - if it is, then he is convinced that Alice is in possession of private key corresponding to $P$.

Why does this proof work?

$s = r - cx$

Multiply both sides by $G$

$sG = rG - cxG$

But $R = rG$ and $P = xG$.

So $sG = R -cP$

If Alice knows $c$ before she choses $R$, then she can easily chose a random $s$ & an $R$ such that $R = sG + cP$. That’s why for this proof to be sound, Bob should send $c$ only after Alice has sent him $R$. The problem with this however is that it is interactive. But, it can easily be made non-interactive by using Fiat Shamir. Instead of Bob sending $c$, Alice herself computes $c = H_s(R)$. Since $c$ is computed from $R$, Alice cannot choose a $R$ beforehand such that $R = sG + cP$, so the proof is sound.

Schnorr Signature of a message $M$ is the same except $c$ is computed as

$c = H_s(M, R)$

Signing a message also provides proof of possession of the private key used to sign the message.

Below is a sagemath implementation. I use an insecure toy Elliptic Curve for all the examples in this for ease of understanding.

import hashlib

p = 7919
Fp = GF(p)
a = Fp(6628)
b = Fp(1900)
Es = EllipticCurve(Fp, [a,b])
G = Es(1892, 4372) # Generator
q = G.order()
Fq = GF(q)

def point_to_bytes(pt_):
    if(pt_ == 0):
        return "None,None".encode("utf8")

    bx =  int(pt_.x()).to_bytes(32, byteorder="big")
    by =  int(pt_.y()).to_bytes(32, byteorder="big")

    return bx + by

def bytes_to_point(bpt_):
    if(bpt_ == b'None,None'):
        return Es(0)
       
    x = int.from_bytes(bpt_[:32], byteorder="big")
    y = int.from_bytes(bpt_[32:], byteorder="big")

    return Es(x,y)

def hash_s(msg_, pt_):
    h =  hashlib.sha3_256(msg_ + point_to_bytes(pt_)).digest()
    return int.from_bytes(h, byteorder="big")

def schnorr_sign(x_, msg_):
    r = Fq.random_element()
    R = r * G
    c = hash_s(msg_,R)
    s = int(r - c*x_).to_bytes(32, byteorder="big")
    Rb = point_to_bytes(R)
    sig = (Rb, s)

    return sig

def schnorr_verify(Pub_, sig_, msg_):
    R = bytes_to_point(sig_[0])
    s = int.from_bytes(sig_[1], byteorder="big")
    c = hash_s(msg_, R)
    left = s*G
    right = R - c*Pub_
    return(left == right) 

x = int(Fq.random_element()) #Private Key
Pub = x*G #Public Key

msg = 'HelloWorld'.encode('utf8') #Message to Sign

sig = schnorr_sign(x, msg)

if(schnorr_verify(Pub, sig, msg)):
    print("Signature Verified")

Note:

The Discrete Log problem isn’t hard in this Curve like it should be.
For simplicity, I don’t compress Elliptic Curve Points like Monero does.
Protocols/methods I use for converting inputs before hashing may not be same used by Monero - but it’s not relevant for understanding the cryptographic concepts.
Many Schnorr implementations use $s = r + cx$ but Monero uses $s = r - cx$ (and also adjusts verification accordingly) - not much of a difference between the two ways - except that the first is optimised for signing & Monero’s way is optimised for verification. On a blockchain, each signing is verified many, many times & hence it may help to optimise verification over signing.

Ring Signature

Consider a group/ring of Key Pairs which are unrelated & belong to different people. The Owner of one of the Pairs in the Ring wants to sign a message in a way that it’s possible for a verifier to verify that one of the ring members signed the message without being able to determine which of the members did it. For constructing a ring signature like this, the Signer uses her own Public-Private keypair, but only the Public Keys of the others. The other members of the Ring aren’t involved in the signing operation. Here, I discuss Ring Signatures where the underlying signature scheme is a Schnorr Signature.

Signing:

Let’s consider a Ring of Public Keys $\lbrace P_0, P_1, …, P_k, …, P_{n-1} \rbrace$

One of these keys $P_k$ is Alice’s Public key - i.e. Alice knows a private key $x$ such that $P_k = xG$

Alice randomly samples $r$ and computes $R_k= rG$
Alice computes $c_{k+1} = H_s(M, R_k)$
Alices samples random $s_{k+1}$ & computes $R_{k+1} = s_{k+1}G + c_{k+1}P_{k+1}$
Alice computes $c_{k+2} = H_s(M, R_{k+1} )$
Alice then samples $s_{k+2}$ & computes $R_{k+2} = s_{k+2}G + c_{k+2}P_{k+2}$ & $c_{k+3} = H_s(M, R_{k+2} )$
Alice continues this all the way to the $n^{th}$ element where she samples a random $s_{n-1}$, computes $R_{n-1}$ & computes $c_0 = H_s(M, R_{n-1})$
Alice samples $s_0$, computes $c_1$, continues all the way to sampling $s_{k-1}$ & computing $c_k = H_s(M, R_{k-1})$
Alice has sampled random $s_0$ to $s_{k-1}$ & $s_{k+1}$ to $s_{n-1}$ except for $s_k$
Alice sets $s_k = r_k - c_kx$

The signature resulting from this operation is $\lbrace c_0, s_0, s_1, …, s_{n-1}\rbrace$

Verification:

Verifier computes $R_0 = s_0G + c_0P_0$
Verifier computes $c_1 = H_s(M, R_0)$
Verifier computes $R_1 = s_1G + c_1P_1$
Verifier computes $c_2 = H_s(M, R_1)$
Verifier continues all the way to $c_{n-1} = H_s(M, R_{n-2})$ & $R_{n-1} = s_{n-1}G + c_{n-1}P_{n-1}$
Then $c_0 = H_s(M, R_{n-1} )$

If the $c_0$ computed by the verifier equals the $c_0$ sent by the signer, then it means the signature has verified.

Why does it work?

The Signer closed the loop with

$s_k = r - c_kx$

So $c_{k+1} = H_s(M, rG) $

$\qquad \quad = H_s(M, (s_k + c_kx)G$

$\qquad \quad = H_s(M, s_kG + c_kxG)$

But $xG = P_k$

So $c_{k+1} = H_s(M, s_kG + c_kP_k)$

Which is a similar structure as the other $c$’s. And this closes the loop.

A sagemath example

def ring_sign(msg_, Pubs_, s_, k_, x_):
    sc = s_.copy()
    r = Fq.random_element()
    c = hash_s(msg_, r*G)
    
    c0=c

    for i in range(n-1):
        idx = (i + k_ + 1)%n
        R = sc[idx]*G + c*Pubs_[idx]
        c = hash_s(msg_, R)
        if (idx == (n-1)):
            c0 = c

    # Overwite the k'th s with a computed one
    sc[k_] = r - c*x_
    
    return (c0,sc)

def sig_verify(msg_, sig_, Pubs_):
    C = sig_[0]
    for i in range(n):
        c = hash_s(msg_, sig_[1][i]*G + C*Pubs_[i])
        C = c
    
    return (sig[0] == c)   

# Main
n = 5 # Total Number of Public Keys used by Signer including her own
k = 2 # The index position of Signer's Public Key in the List of Public Keys

# Generate the main Public/Private Key Pair for the signer
x = Fq.random_element()
P = x*G

Pubs = [] # List of Public Keys
s = [] # List of Randomly sampled s's

# We generate n random Public Key. We also sample n s's
for i in range(n):
    Pubs.append(Fq.random_element()*G)
    s.append(Fq.random_element())

# Overwrite the k'th element with the Signer's Public Key
Pubs[k] = P

msg = 'HelloWorld'.encode('utf8')

sig = ring_sign(msg, Pubs, s, k, x)

if sig_verify(msg, sig, Pubs):
    print("Signature Verified")

Ring Signature with multiple private keys

What if a person has multiple Private Keys & wants to do ring signatures proving proof of possesion of all of them? One way is to do multiple rings, one for each private key - i.e. let’s say you have 2 private keys, you create 2 rings each having it’s own dummy public keys. And create 2 separate ring signatures.

However, it’s also possible to create just 1 signature which proves possession of multiple keys owned by the same person.

Signer has $m$ private keys $ \lbrace x_0, …, x_{m-1}\rbrace$. For each of the $m$ private keys, he collects $n-1$ dummy Public Keys (whose Private Keys he doesn’t know) and creates $m$ rings each with $n$ Public Keys

Let’s consider an example of 3 Private Keys ($m = 3$) & a ring of 5 for each ($n=5$). Let’s assume the signer’s Public Key is in Column $k = 2$ (3rd column considering 0 based indexing) of the below matrix

\[\left[ \begin{matrix} & &Signer\\ P_{0,4}&P_{0,1}&||P_{0,2}||& P_{0,3}& P_{0,4}\\ P_{1,0}&P_{1,1}& ||P_{1,2}||& P_{1,3}& P_{1,4}\\ P_{2,0}&P_{2,1}& ||P_{2,2}||& P_{2,3}& P_{2,4}\\ \end{matrix} \right]\]

The Signer knows Private Keys $x_0, x_1$ & $x_2$ corresponding to public keys $P_{0,2}, P_{1,2}$ & $P_{2,2}$ the remaining Public Keys in each row aren’t his, so he doesn’t know the Private Keys corresponding to them

Alice samples a random $r_i$ & computes $R_{i,k} = r_iG$ for all $i \in \lbrace 0, …, m-1 \rbrace$
Alice computes $c_{k+1} = H_s(M, R_{0,k},R_{1,k}, … , R_{m-1,k})$
Alices samples a random $s_{i,{k+1}}$ & computes $R_{i,k+1} = s_{i,k+1}G + c_{k+1}P_{i,k+1}$ for all $i \in \lbrace 0, …, m-1 \rbrace$
Alice computes $c_{k+2} = H_s(M, R_{0,k+1},R_{1,k+1}, … , R_{m-1,k+1})$
Alice continues this all the way to the $n^{th}$ key where she samples a random $i, s_{n-1}$ & computes $R_{i,n-1}$ similarly for all $i \in \lbrace 0, …, m-1 \rbrace$
Alice computes $c_0 = H_s(M, R_{0,n-1},R_{1,n-1}, … , R_{m-1,n-1})$
Alice continues the same all the way upto sampling $s_{i, k-1}$ for all $i \in \lbrace 0, …, m-1 \rbrace$ & computes $c_k$
For all $i \in \lbrace 0, …, m-1 \rbrace$, Alice sets $s_{i,k} = r_i - c_k x_i$

The signature resulting from this operation is $\lbrace c_0, s_{0,0}, s_{0,1},…, s_{0,n-1}, …, s_{m-1,0}, s_{m-1,1},…, s_{m-1,n-1} \rbrace$

Verification:

Verifier computes $R_{i,0} = s_{i,0}G + c_{0}P_{i,0}$ for all $i \in \lbrace 0, …, m-1 \rbrace$
Verifier computes $c_1 = H_s(M, R_{0,0},R_{1,0}, … , R_{m-1,0})$
Verifier computes for $c_2, c_3$ similarly all the way till she computes

$\qquad \qquad c_0 = H_s(M, R_{0,n-1},R_{1,n-1}, … , R_{m-1,n-1})$

If the computed $c_0$ equals the $c_0$ sent by the signer, then it means the signature is verified.

Elliptic Curve Diffie Hellman

This post on Cofactor Clearing describes Elliptic Curve Diffie Hellman & Cofactor Clearing.

Since Monero uses a Composite Order Curve with a cofactor of $8$, Monero multiplies by $8$ during Diffie Hellman for Cofactor Clearing. Since our examples use a Prime Order Curve, we don’t do Cofactor Clearing in our examples.

This completes the pre-requisites/primitives.

Monero Privacy and Anonymity

Bitcoin isn’t anonymous and there is very little privacy. If you have ever sent BTC to Alice, you would know her BTC address. You can trawl the ledger & check how many UTXOs are locked with Alice’s Public Key & thus you would know the sum total of BTCs in Alice’s wallet (this is exactly the same way Alice’s wallet knows how much money she has - the coins aren’t actually in the wallet). Likewise, if you know Alice’s employer’s public key, you can figure out Alice’s salary. If you know anyone else’s Public Key, you can figure out if Alice has paid them or received money from them.

Monero fixes these issues & makes transactions anonymous & private.

Stealth Addresses

In Monero, each person (actually wallet) has 2 Elliptic Curve Key Pairs

View Public & Private Key
Spend Public & Private Key

Let Bob’s View Private Key be $v_b$ & Spend Private Key be $s_b$ respectively. His View Public Key & Spend Public key are $V_b = v_b G$ & $S_b = s_b G$ respectively.

When Alice wants to send Money to Bob, Alice samples a random number $r$ & computes $K_{dh}$

$K_{dh} = H_s(r V_b \space)$

If a transaction has multiple outputs, then index of the output is added to the Hash input to compute $K_{dh}$. I am ignoring it here (and other places also) to keep it simple.

Alice then computes a new derived one-time Output Public Key $P_o$ & also computes $R$

$P_o = K_{dh} G + S_b$

$R = r G$

Instead of sending money to Bob’s address as done in BTC, Alice sends the money to $P_o$ & publishes $R$ in cleartext as part of the transaction data.

In the above expression $G$ and $S_b$ are publicly known. But $K_{dh} = H_s(r V_b \space)$ has a term $r$ which is known only to Alice.

However, $V_b = v_b G$ & $R = r G$

$r V_b = r (v_b G) = v_b (r G) = v_b R$

Since $v_b$ is known to Bob (and no one else), only Bob can compute $K_{dh} = H_s(v_b R)$ even without knowing $r$.

So above, Alice used Diffie-Hellman to share a secret $K_{dh}$ with Bob.

But most transactions have more than one output - either sender sends money to more than one receiver in the same transaction. If nothing else, she has to send change back to herself because the amount she wants to send to Bob will not match the exact amount of the input(s) she is spending. So, typically instead of just $K_{dh} = H_s(r V_b \space)$, we will have

$K^0_{dh} = H_s(r V^0_b \space)$

$K^1_{dh} = H_s(r V^1_b \space)$

where the superscript $0, 1, …$ etc is the output index - if a transaction has multiple ouputs, then it’s indexed as $0, 1, …$ etc

$V^i_b$ is the View Output Key of the receiver of the $i$-th receiver & Alice will compute the corresponding one time Public Address as

$P^i_o = K^i_{dh} G + S^i_b$

where $S^i_b$ is the $i$-th receiver’s spend Public Key

In BTC, everytime a new mined block appears, your wallet scans the block and finds transactions sent to your address. In Monero, your wallet uses the $R$ of each new transaction to compute $P_o$ from it (using Bob’s View Private Key) & checks if the transaction is sent to $P_o$ to find transactions meant for him.

Only the View Private Key is required for viewing transactions coming to your wallet & the Spend Private Key isn’t required.

The Output address $P_o =K_{dh} G + S_b$

can be considered as

$P_o = K_{dh} G + s_b G$

$\quad = (K_{dh} + s_b) G$

Let $x_o = K_{dh} + s_b$

$P_o = x_o G$

So the private key for this output is considered as $x_o$.

When Bob wants to spend this output, he has to prove possession of $x_o$ - the view private key ($v_b$) & the spend private key ($s_b$) both need to be known to know $x_o$ i.e. to spend the UXTO. So Bob can hand off his View Private to someone to delegate the responsibility of checking transactions without allowing them to spend it.

In BTC, where money is sent to non-secret addresses, in Monero, it’s sent to a one-time address to preserve privacy & anonymity.

Amount Commitments

In BTC, the amount is represented in the transaction entry as the actual value. In Monero, the transaction data has Pedersen commitments of the amounts instead. Since these commitments are Hiding, nobody else can figure out the amounts in any transaction.

The sender chooses the blinding factor $b$ for the commitments of the output as $b_v = H_s(“commitment\space mask”, K_{dh})$

The sender creates a Pedersen commitment for amount $v$ computed as

$C_v = v G + b_v B$

There are already commitments for all inputs (created by whoever sent those amounts to sender). Sender creates commitments for all outputs. She chooses blinding factors for different outputs except one randomly. And that one is chosen explicitly such that the sum of the input blinding factors is equal to the sum of the output blinding factors.

In BTC, the nodes verify that for every transaction, sum of inputs is greater than or equal to the sum of the outputs. Since Pedersen commitments are additively homomorphic, the nodes can verify if the sum of commitments of the inputs is greater than or equal to the sum of the commitments of the output if the blinding factors are chosen as above.

The sender also encrypts the amount value ($v$) as below & stores it in the transaction data

$EncAmt = v\space \bigoplus_8 \space H_s(“amount”, K_{dh})$

Here $\bigoplus_8$ denotes XOR between the first 8 bytes of $v$ & the Hash $H_s(…)$. Amounts on the blockchain are restricted to 8-byte values & hence only 8 bytes are XORed.

Receiver also knows $K_{dh}$, so he can do a $\bigoplus_8$ of EncAmt with $H_s(“amount”, K_{dh})$ to get the Amount Value in cleartext. He also can compute the Blinding factor & thus check the Pedersen Commitment to see if he received the amount he was expecting.

The strings passed as first param to the hash functions are for doing Domain Separation

If there was no domain separation string used & the computation was as below

$b_v = H_s(K_{dh})$

$Enc(val) = v\space \bigoplus_8 \space H_s(K_{dh})$

This can lead to an attack to find out the actual amount being spent. If you combine both of the above, then you have

$C_v = v G + b_v B$
$b_v = H_s(K_{dh})$
$Enc(v) = v\space \bigoplus_8 \space b_v$

$Enc(v)$ & $C_v$ are known - they are present in the transaction data. So if an attacker knows (or can guess) the general range of a transaction (that’s it’s between some amounts x to y XMR) - then they can try each of the values in the range to check if that fits in all 3 of the above equations. Domain Separation helps to prevent this.

Another thing here is that $K_{dh}$ is already a hash

$K_{dh} = H_s(r V_b \space)$

There is no security reason for hashing the hash $K_{dh}$ again. Earlier versions of Monero didn’t use domain separation & instead used a single hash & double hash in the 2 places so as to prevent the exhaustive search of a range to find out the cleartext amount. Once domain separation was implemented, I am guessing they had the choice to replace both places with either $K_{dh}$ or $H_s(K_{dh})$. $H_s(K_{dh})$ is already computed in other places in the code, so it didn’t make a difference as to which of the two options they chose & they chose the double hash.

RingCT

When spending a UTXO, if the sender signs it with the one time Private/Public Key pair, then even though others don’t know the Public Key, the person who sent the UTXO to the sender does know it (it was she who created the Diffie Hellman one time address). So she would know who is the spender in a new transaction spending the output she sent & would know when it’s spent. To hide this, Monero uses Ring Confidential Transactions (RingCT). For ease of understanding, let’s take a transaction which has only 1 input. The spender collects multiple other TXOs from the Monero Blockchain to mix with her transaction - each TXO has a One-time Public Key & Pedersen Commitments associated with it. The sender signs the Current Transaction with a Ring Signature using the Ring of Public Keys associated with the unrelated transactions to hide which UTXO is being spent. However, it’s not yet foolproof. The fact that the sender has set blinding factors such that the sum of the input commitments are equal to the sum of output commitments causes a privacy problem. Even though the sender signs the transaction using a Ring Signature to provide anonymity, only one of the those input commitments which would sum up to the output commitments of the current transaction - the output commitment of the current transactions won’t sum up to the input commitments of any of the dummy transactions in the Ring. This would make it easy to idenfity the sender’s UTXOs from the Ring of UTXOs. To avoid this issue, Monero adds another tweak.

The input UTXO amounts already have existing commitments created by the sender of those Transactions.

The existing input commitment of value $i$ is

$C_i = i G + b B $

The sender knows $b$ & $i$ as described in the Amount Commitments section. The sender creates a new commitment for the same commitment with a different blinding factor

$C’_i = i G + b’B$

This is called as a Pseudo Commitment

Now $C_i - C’_i = (b - b’)B$ is a commitment to a $0$ amount because the $i G$ terms would cancel out. Only the sender knows the $z = b-b’$ which is the private key for the zero commitment. Let $P_z = z B$

We already have one Private Key ($x_o$) from the Stealth Addresses section for which the Signer has to prove & this is now a second one.

The spender chooses a blinding factor for both the Pseudo Commitments & Output Commitments so that their blinding factors cancel each other out

Let a transaction be represented by 2 Inputs (A & B) & 2 Outputs (C & D).

The check if $A + B \stackrel{?}{=} C + D$ can be done by checking if

$Pseudo_A + Pseudo_B \stackrel{?}{=} Out_C + Out_D$

where $Pseudo_A$ & $Out_C$ are Pseudo Pedersen Commitment of Input A & Pedersen Commitment of Output B respectively & so on.

Key Image

Ring Signatures ensure that no one can figure out which is the TXO which is being spent. However, the downside of this is that the validators also can’t figure out the same & hence they cannot validate if the TXO is being spent for the second time. The nodes can no longer maintain a list of UTXOs to prevent double spending. So Monero also embeds a Key Image in each transaction & makes it part of the transaction data & also the signature.

This is how a Key Image is computed

$KI = x_o\star H_p(P_o)$

where $P_o$ is the Public Key corresponding to the One-time Address & $x_o$ is the correponding private key of $P_0$

If you notice, the Key Image is unique to a UTXO. 2 different transactions get linked if the same Key Pair is used to generate signature containing the same Key Image. This is called Linkability. Nodes maintain list of all Key Images which are already spent to prevent double spending. The Nodes verify if the Key Image of each transaction in a new block already exists in their list - if it does, that means that transaction is spending a TXO which has already been spent & the nodes reject the block.

Also, the Key Image can be computed only by the person spending the UTXO & not by the person who sent him the UTXO. So your employer who paid your salaries in Monero will not be able to find out which transaction you started spending it in.

A Ring Signature which allows validation of a Key Image is called as a Linkable Group Signature.

MLSAG

The signature has to prove possession of 2 private keys (the Private Key for the stealth address & the one for the commitment to zero). Each unrelated TXO which Alice has rounded up to form the ring has a one-time Public Key of the receiver of that output ($P_o$). It also includes the Pseudo Pedersen Commitments of that transaction. The Pseudo Pedersen Commitment of that Transaction minus the original Input Pedersen commitment linked to that transaction gives us the second Public Key ($P_z$) - the signer knows these 2 Public Keys for each member of the Ring without knowing the corresponding Private Keys. She could create a Ring Signature with multiple private keys by signing the transaction data. However, this signature doesn’t include the Key Image, so it’s tweaked further to create a Multilayer Linkable Spontaneous Anonymous Group Signatures (MLSAG).

Each Input TXO which is being spent in a transaction has a different stealth address even if belongs to the same spender. Each such address is a private key whose possession has to be proven by the spender. Other than that, the spender has to prove that he knows the private key to the zero commitment (Pseudo Commitment minus the Input Commitment).

In Monero, the Signer generates a signature for each UTXO she is spending in a transaction - i.e. if she spends money from 3 UTXOs in a transaction, she generates 3 MLSAG signatures for the transaction - each signatures is a proof of possession for the private key corresponding to the one time address of that UTXO & also the private key for the commitment to zero generated from that UTXO.

For each input TXO, the spender collects other TXOs (not belonging to her) from the blockchain. Each such dummy TXO also has the Pseudo Commitments & Output Commitments & she can compute the Public Key corresponding to the zero commitment of that transaction (as can the validator) though she won’t know the private key to the zero commitment.

To the signature described in the Ring Signature with multiple private keys section, we add Key Image linking as described below.

This is how $c_{k+1}$ was computed

$c_{k+1} = H_s(M, R_{0,k},R_{1,k}, … , R_{m-1,k})$

We change this to

$KI = x_{m-1}\star hash_p(P_{m-1, k})$

$DKI_k = r_{m-1}\star hash_p(P_{m-1, k})$

$c_{k+1} = H_s(M, R_{0,k},R_{1,k}, … , R_{m-1,k}, DKI_k)$

Then for computing $c_{k+2}$,

$DKI_{k+1} = s_{m-1,k+1}\star hash_p(P_{m-1, k+1}) + c*KI$

$c_{k+2} = H_s(M, R_{0,k+1},R_{1,k+1}, … , R_{m-1,k+1}, DKI_{k+1})$

and so on

The signature resulting from this operation is $\lbrace KI, c_0,s_0,s_1,…,s_{n-1} \rbrace$

Verification:

Verifier computes $R_{i,0} = s_{i,0}G + c_{0}P_{i,0}$ for all $i \in \lbrace 0, …, m-1 \rbrace$
Verifier computes $c_1 = H_s(M, R_{0,0},R_{1,0}, … , R_{m-1,0})$
Verifier computes for $c_2, c_3$ similarly all the way till she computes

$\qquad \qquad c_0 = H_s(M, R_{0,n-1},R_{1,n-1}, … , R_{m-1,n-1})$

If the computed $c_0$ equals the $c_0$ sent by the signer, then it means the signature has verified.

An additional check also needs done in Monero because unlike our example, Monero uses a Composite Order Curve. A signer could add a point on the Cofactor subgroup to the Key Image to get a new Key Image. And then create a signature with all $c_i$’s which are multiples of the order of the Cofactor Subgroup - this can be done by trial & error. This will give the signer a 2nd Key Image & Signature for the same UTXO which will verify correctly - thereby allowing double spending. Hence Monero validators check if the Key Image is a point on the main group of the elliptic curve, this is trivially checked by checking if $q KI \stackrel {?}{=}0$

A generalisation of MLSAG could compute a KeyImage for each one of the Private Keys in possession of the signer & publish a Key Image Vector instead of just a Key Image. However, in the Monero usecase, it only makes sense to compute the KeyImage for the 1st key - i.e. the spend key. Hence we do not do a Key Image for the commitment to zero.

Below is the sagemath program for a signature for spending one of the input UTXO of a transaction. Since each signature will always prove possession of 2 keys (One-time Address & Commitment to Zero), we will always have $m=2$. Monero also fixes that the first row of the Matrix to be the One Time Address Public Key & the 2nd row to be the Commitment to Zero Key - so the Key Image is generated from the First Row.

def hash_s(msg_, pts_, kpt_):
    b = msg_
    for j in range(m):
        b = b + point_to_bytes(pts_[j]) 
    b = b + point_to_bytes(kpt_)
    h =  hashlib.sha3_256(b).digest()
    return Fq(int.from_bytes(h, byteorder="big"))

def hash_p(pt_):
    h = hashlib.sha3_256(point_to_bytes(pt_)).digest()
    return int.from_bytes(h, byteorder="big")*G

def ring_multi_sign(msg_, Pubs_, s_, xs_):
    sc = [a[:] for a in s_] ## Deep Copy a 2d list
    
    r = [None]*m
    R = [None]*m

    for i in range(m):
        r[i] = Fq.random_element()
        R[i] = r[i]*G
    
    ki = xs_[0]*hash_p(Pubs_[0][k])
    kpt = r[0]*hash_p(Pubs_[0][k])
    
    c = hash_s(msg_, R, kpt)
    c0 = c

    for j in range(n-1):
        idx = (j + k + 1)%n
        for i in range(m):
            R[i] = sc[i][idx]*G + c*Pubs_[i][idx]
            
        kpt = sc[0][idx]*hash_p(Pubs_[0][idx]) + c*ki
        c = hash_s(msg_, R, kpt)
        if(idx == (n-1)):
            c0 = c

        for i in range(m):   
            sc[i][k] = r[i] - c*xs_[i]

    return (c0, sc, ki)

def multi_ring_verify(msg_, sig_, Pubs_):
    c = sig_[0]
    s = sig_[1]
    ki = sig_[2]

    for j in range(n):
        R = [None]*m
        for i in range(m):
            R[i] = s[i][j]*G + c*Pubs_[i][j]
        kpt = s[0][j]*hash_p(Pubs_[0][j]) + c*ki
        c = hash_s(msg_, R, kpt)

    return (sig_[0] == c)

#Main
m = 2 # No of private keys known by signer (Rows)
n = 5 # Total number of Public Keys in each Ring (Columns)
k = 2 # The index Column index of the Signer's Keys

Pubs = [[None]*n for _ in range(m)]
s = [[None]*n for _ in range(m)]

Ps = [None]*m # Signer's Public Keys
xs = [None]*m # Signer's Private Keys

# Generate the main Public/Private Key Pair for the signer
for i in range(m):
    xs[i] = Fq.random_element()
    Ps[i] = xs[i]*G

# We generate n*m random Public Keys. We also sample n*m s's
for i in range(m):
    for j in range(n):
        Pubs[i][j] = Fq.random_element()*G
        s[i][j] = Fq.random_element()

# Overwrite the k'th element with the Signer's Public Key
for i in range(m):
    Pubs[i][k] = Ps[i]

msg = 'HelloWorld'.encode('utf8')

sig = ring_multi_sign(msg, Pubs, s, xs)
if(multi_ring_verify(msg, sig, Pubs) == True):
    print("Signature Verified")

One more validation needs to be done - a check that each of the outputs lies in the range between $0$ and $2^{64} - 1$. I have covered that in a Separate Post - Bulletproofs & Range Proofs in Monero using Bulletproofs

Monero Transaction Structure and Data

Below is the JSON structure of a Monero Spend Transaction data - the JSON field name in bold & what it represents in brackets. I have skipped some fields which aren’t significant

Tx hash (Transaction Id)
- version (Version - version 2 means RingCT signatures)
vin (below fields for each Input/UTXO being spent)
- key_offsets (a link to each input transaction on the blockchain including the dummy transactions used in the signature)
- k_image (the Key Image generated for that Input)
vout (below fields for each output of the transaction)
- target:[key] (one-time stealth address of the recipient)
- extra (the $R$ corresponding to the stealth address)
rct_signatures (though the name is rct_signatures, this field doesn’t contain the signature which is actually stored in the field MGs)
- txnfee (Transaction Fee)
- pseudoOuts (list of all psuedo commitments for the transaction - one for each vin)
- ecdhInfo (contains below list of items - one item for each vout)
  - mask (blinding factor for the commitment)
  - amount (EncAmt)
- outPk (list of output Pedersen commitments)
rangeSigs (Range Proofs)
MGs (list of signatures, one for each vin. Each signature containing the following fields)
- ss (the list - $\lbrace s_0,s_1,…,s_{n-1} \rbrace$
- cc ($c_0$ of the signature)

Written on December 4, 2024

Share this: