The KZG/Kate Polynomial Commitment Scheme
Introduction
Commitment schemes are fundamental components of many cryptographic protocols. A secure commitment scheme allows a committer to publish a value, called the commitment, which binds her to a message (binding) without revealing it (hiding). Later, she may open the commitment and reveal the committed message to a verifier, who can check that the message is consistent with the commitment.
Consider the following scenario.
-
Peggy wants to commit to a message $m$. To do so, she writes down $m$ on a piece of paper, puts it in a box, and locks it using a padlock.
-
Peggy gives the box to Victor.
-
If Peggy wants to, she can later open the commitment by giving Victor the key to the padlock.
There are two basic properties here, which are essential to any commitment scheme:
-
Having given away the box, Peggy cannot anymore change what is inside. Hence, when the box is opened, we know that what is revealed really was the choice that Peggy committed to originally. This is called the Binding property.
-
When Victor receives the box, he cannot tell what is inside before Peggy decides to give him the key. This is called the Hiding property.
Polynomials are frequently used in cryptography because they can be used to encode a lot of information. A Polynomial Commitment Scheme (PCS) allows the committer to commit to a polynomial with a commitment string that can be used by a verifier to confirm claimed evaluations of the committed polynomial. We will look at one such PCS called KZG (Kate, Zaverucha, Goldberg). It’s also referred to as the Kate Commitment Scheme.
Pre-requisite topic before we discuss the KZG PCS
Elliptic Curve Pairings
An Elliptic Curve Pairing is defined as a map
$e : \mathbb G_1 \times \mathbb G_2 \mapsto \mathbb G_T$
$\mathbb G_1$ can be equal to $\mathbb G_2$ i.e. $\mathbb G_1 = \mathbb G_2 = \mathbb G$. Such a pairing is called as a Symmetric Pairing. If they are not equal, then it’s called as an Asymmetric Pairing.
Let $G_1$ & $G_2$ be generators of $\mathbb G_1$ & $\mathbb G_2$ respectively. Let $a$ & $b$ be scalars. The core property of Elliptic Curve Pairings which we use the most is the following.
$e(a G_1, b G_2) = e(b G_1, a G_2) = e(ab G_1, G_2) = e(G_1, ab G_2) = e(G_1, a G_2)^{b} = e(G_1, G_2)^{ab}$
If you are not familiar with pairings, I have given a small introduction to Pairings as part of this post - Weil Pairing and the MOV attack.
The Schwartz-Zippel Lemma
Let $f$ be a polynomial from $\mathbb F_p[x]$ of degree less than or equal to $d$. Let $p \approx 2^{256}$ & $d \le 2^{40}$. If we test $f$ at a random value $r \in \mathbb F_p$, then the probability that $f(r) = 0$ would be $\frac {d}{p}$ which for these values of $d$ & $p$ would be very, very small. So, if $f(r) = 0$, then with high probability, $f$ is a zero polynomial (i.e., zero at all points). This is a simple test to check if a committed polynomial is a zero polynomial. Open the commitment to the polynomial at a random point & if it opens to zero, then it is a zero polynomial.
This also helps to prove that 2 polynomials are equal. If the 2 polynomials are $f$ & $g$, we create a new polynomial $p(x) = f(x)\space - \space g(x)$.
If we test $p$ at a random $r \in \mathbb F_p$ & $f(r)$ turns out to be $0$, then $p$ is a zero polynomial with very high probability. If $p$ is a zero polynomial, then it obviously means $f = g$.
The Commitment Scheme tests the polynomials only at a small number of points ($a$ & the opening point). The Binding propery of the commitment scheme is based on the Schwartz-Zippel Lemma.
The Commitment Scheme
Let $F(x)\in \mathbb F_p[x]$ be the polynomial which needs to be committed. $F(x)$ is a polynomial of degree $d$ or less.
We will first describe the commitment using Symmetric Elliptic Curve Pairing - i.e. $e : \mathbb G \times \mathbb G \mapsto \mathbb G_T$. $\mathbb G$ is an Elliptic Curve group of order $p$ with a generator $G$.
Trusted Setup
The first stage of the commitment scheme is a trusted setup. During the trusted setup, a random value $a \in \mathbb F_p$ is sampled. Then the following tuple of size $d+1$ known as the Reference String is generated - $\lbrace G, aG,a^2G, a^3G, …, a^dG \rbrace$. After generating the tuple, the random value $a$ is destroyed such that it’s not known to anyone including the Prover or the Verifier. The destroyed value is called as toxic waste. If toxic waste is known to anyone, they can use it to generate fraud proofs.
Commitment
Let’s denote the commitment of the polynomial $F(x)$ as $C_F$ where
$C_F = F(a)\cdot G$ where $F(a)$ is the polynomial evaluated at $x=a$.
Though $a$ has been deleted, the committer can stil compute the commitment using the Reference String.
Let $F(x) = f_0 + f_1x + f_2x^2 + … + f_dx^d$
$F(a) = f_0 + f_1a + f_2a^2 + … + f_da^d$
So $C_F = F(a)\cdot G = (f_0 + f_1a + f_2a^2 + … + f_da^d)\cdot G$
$C_F = f_0\cdot G + f_1a\cdot G + f_2a^2\cdot G + … + f_da^d\cdot G$
Though the committer doesn’t know $a$, she knows $\lbrace G, aG,a^2G, a^3G, …, a^dG \rbrace$ from the reference string & can evaluate $C_F$ & evaluate the commitment. The committer sends the commitment to the verifier.
Since $G$ is a generator of the Elliptic Curve group $\mathbb G$, the commitment is a point on the Elliptic Curve.
The Hiding property of the commitment is based on the Discrete Logarithm problem being hard in $\mathbb G$
Full Open & Verify
In a full open, the committer sends the polynomial to the verifier & the verifier can use the reference string & compute the commitment himself & verify if it matches with the commitment sent originally by the committer.
Partial Open or Evaluation Proof
In KZG, committer can also do a partial open (i.e. evaluation at a single value) which is called as the Evaluation Proof.
The verifier sends a value $b$ randomly selected from $\mathbb F_p$ to the committer. The committer evaluates the polynomial $F(x)$ at $x=b$ as $F(b) = c$ & sends $c$ to the verifier. The committer also has to provide a proof to the verifier that $F(b) = c$.
The proof which the committer provides to the verifier is the commitment of the Quotient Polynomial $Q(x)$ i.e. $C_Q$
$Q(x)$ is defined as
$Q(x) = \frac {F(x) - F(b)} {x-b} = \frac {F(x) - c} {x-b}$
As per Little Bezout’s Theorem, if $F(x)$ is a polynomial, then $F(x) - F(b)$ is perfectly divisible by $(x-b)$ i.e. the remainder is $0$. So $Q(x)$ above is a polynomial i.e. it won’t have any variable in the denominator or any negative exponents. Since $Q(x)$ is a polynomial, then the committer can calculate the commitment of $Q(x)$ using the Reference String as
$C_Q = Q(a)\cdot G$
If $F(x)$ weren’t perfectly divisible by $(x-b)$, then $Q(x)$ would have a denominator & negative exponents & the commiter wouldn’t be able to evaluate it using just the Reference String & without knowing $a$.
The committer sends $C_Q$ to the verifier as the Evaluation Proof.
Verifying
A commitment scheme is said to be complete if anything which is true is provable. It is said to be sound if everything which is provable is true - i.e. anything which is false cannot be proven by the scheme.
The verifier has $C_F$, $C_Q$ & $c$. He needs to verify that $F(b) = c$
$Q(x) = \frac {F(x) - c} {x-b}$
So,
$(x-b)\cdot Q(x) = F(x) - c$
If you evaluate this at $x = a$, it becomes
$(a-b)\cdot Q(a) = F(a) - c$
Multiplying both sides by the generator $G$, we get
$(a-b)\cdot Q(a)\cdot G = F(a)\cdot G - c\cdot G$
Now, $C_Q = Q(a)\cdot G$ & $C_F = F(a)\cdot G$. So substituting, we get
$(a-b)\cdot C_Q = C_F - c\cdot G$
If the verifier is able to verify that the above equality holds good, then he has verified the commitment. But since the verifier doesn’t know the value of $a$, he cannot directly verify if this equality holds good.
However, the verifier can use Elliptic Curve Pairings to verify whether the above equality holds even without knowing $a$. The pairing is denoted by the map
$e:\space \mathbb G$ X $\mathbb G \mapsto \mathbb G_T$
This is the equality to check
$(a-b)\cdot C_Q \stackrel {?}{=} C_F - c\cdot G$
Each input to the pairing map needs to be an element of the Group. Any commitment is a scalar multiple of the generator of the group, so both $C_F$ & $C_Q$ are elements of $\mathbb G$. Since $C_Q$ is an element of the group, multiplying it with the scalar $a-b$ also will result in an element of the group. So $(a-b)\cdot C_Q$ is also an element of $\mathbb G$. $C_F$ & $c\cdot G$ are both elements of the group, so $C_F - c\cdot G$ is also an element of the group. So all the terms in the equality are group elements. So they can be inputs to the pairing map.
By passing the sides of the equality as the first parameter to the map & pass $G$ as the 2nd param for each side, we can turn the equality to be verified into
$e((a-b)\cdot C_Q, G) \stackrel {?}{=} e(C_F - c\cdot G, G)$
This still doesn’t help us because we don’t know $a$.
However, bilinearity property of the pairing map means that
$e(\alpha A,B) = e(A, \alpha B)$ if $\alpha$ is a scalar.
So we can rewrite it as
$e(C_Q, (a-b)\cdot G) \stackrel {?}{=} e(C_F - c\cdot G, G)$
Further simplifying
$e(C_Q, aG - bG) \stackrel {?}{=} e(C_F - cG, G)$
Though we don’t know $a$, we do know $aG$ from our Reference String. So now the verifier can check whether the above equality is true or not. This ends the proof.
Note that any pairing based equality check $e(A, B) \stackrel {?}{=} e(P,Q)$ can also be expressed in a diff way.
$e(A, B) \stackrel {?}{=} e(P,Q)$
$e(A, B) \stackrel {?}{=} e((-1)\cdot (-P),Q)$
Again using the bilinearly property on the Right hand side, we get
$e(A, B) \stackrel {?}{=} e(-P,Q)^{-1}$
which is the same as
$e(A, B)\cdot e(-P,Q) \stackrel {?}{=} 1$
Many texts & papers represent pairing based checks in the above way.
Asymmetric Pairings
Above, we used a symmetric pairing to prove the correctness of the above verification equation. However, an Asymmetric Pairing (i.e. $\mathbb G_1 \ne \mathbb G_2$) can also be used & is usually preferred. The proof for the verify equation using Asymmetric Pairings is very similar & given below.
Let’s start from this point in the earlier proof.
$(a-b)\cdot Q(a) \stackrel {?}{=} F(a) - c$
Multiplying both sides by $G_1$
$(a-b)\cdot Q(a)\cdot G_1 \stackrel {?}{=} F(a) \cdot G_1 - c \cdot G_1$
$(a-b)\cdot C_Q \stackrel {?}{=} C_F - c \cdot G_1$
Using pairings,
$e((a-b)\cdot C_Q,G_2) \stackrel {?}{=} e(C_F - c \cdot G_1, G_2)$
Using the bilinear property,
$e(C_Q,(a-b)\cdot G_2) \stackrel {?}{=} e(C_F - c \cdot G_1, G_2)$
$e(C_Q,a\cdot G_2 - b\cdot G_2) \stackrel {?}{=} e(C_F - c \cdot G_1, G_2)$
Everything except $a\cdot G_2$ is either calculated or found in the SRS for $G_1$. And $a\cdot G_2$ will be part of the SRS for $G_2$ & that is also known and thus the equality can be verified.
Batch Mode
Batch Mode Single Polynomial, multiple points
KZG commitments can also be opened & verified at multiple points using a single proof.
In the stand-alone opening, the committer evaluated $F(x)$ at $b \in \mathbb F_p$ as $F(b) = c$ & provided $c$ along with the proof to the verifier.
In batch mode, the verifier sends to the committer a set of values $ B = \lbrace b_1, b_2, b_3, …, b_t \rbrace$ such that $t \lt d$, the committer evaluates $f(b_1) = c_1, f(b_2) = c_2, …, f(b_t) = c_t$ and constructs the set $C = \lbrace c_1, c_2, c_3, …, c_t \rbrace$.
Let $P(x) = (x-b_1)(x-b_2)…(x-b_t)$
Since the degree of $F(x)$ is $d$ & $t \lt d$, we can divide $F(x)$ by $P(x)$. Let the quotient of the division be $Q(x)$ & the remainder be $R(x)$ (Note that we aren’t saying here that $F(x)$ is divisible by $Q(x)$, so we have remainder here).
i.e. $F(x) = P(x) Q(x) + R(x)$
The committer computes $Q(x)$ & also the commitment for $Q(x)$ i.e. $C_Q$ & sends those also to the verifier along with the set $C$
The committer can also send $R(x)$ to the verifier. Alternately, the verifier can also find the polynomial $R(x)$ himself as shown below.
$F(x) = P(x) Q(x) + R(x)$ with $P(x) = (x-b_1)(x-b_2)…(x-b_t)$
For any $b_i \in B = \lbrace b_1, b_2, b_3, …, b_t \rbrace$, it’s pretty apparent that $P(x)$ is zero. & hence the middle term vanishes at all $b_i \in B$.
So for all $b_i \in B$, $F(x) = R(x)$
Since $F(b_i) = c_i$ for all $\lbrace b_i \in B, c_i \in C \rbrace$, likewise
$R(b_i) = c_i$ for all $\lbrace b_i \in B, c_i \in C \rbrace$
Since the degree of $Q(x)$ is $t$ & $R(x)$ is the remainder of dividing $F(x)$ by $Q(x)$, the degree of $R(x)$ is less than $t$. Since the verifier knows evaluation of $R(x)$ at $t$ points, he can find $R(x)$ using Lagrange’s Interpolation. So now $R(x)$ is known to the verifier.
The Verifier can also find the polynomial $P(x)$ which is
$P(x) = (x-b_1)(x-b_2)…(x-b_t)$
The Verifier also computes the commitments of $P(x)$ & $R(x)$
$C_P = P(a)\cdot G$
$C_R = R(a)\cdot G$
Now the verifier can verify the Batch Evaluation by doing the following steps.
1) Verifier checks if $F(b_i) \stackrel {?}{=} R(b_i)$ for all $b_i \in B$. The committer has provided the set $C$ which are the evaluations of $F(x)$ at all $b_i \in B$. And the verifier knows $R(x)$, so he can also evaluate $R(x)$ at all the $b_i \in B$. So he can verify if $F(b_i) \stackrel {?}{=} R(b_i)$ for all $b_i \in B$
2) The verifier has to verify that the following equality holds
$F(x) \stackrel {?}{=} P(x) Q(x) + R(x)$
$F(x) - R(x) \stackrel {?}{=} P(x) Q(x)$
Multiply both sides by the generator $G$
$F(x)\cdot G - R(x)\cdot G \stackrel {?}{=} P(x) Q(x)\cdot G$
Evaluate the above at $a$
$F(a)\cdot G - R(a)\cdot G \stackrel {?}{=} P(a) Q(a)\cdot G$.
$F(a)\cdot G$ is the committment of $F$ i.e. $C_F$. Likewise, 2 other terms above are also committments. So,
$C_F - C_R \stackrel {?}{=} P(a)\cdot C_Q$
The verifier needs to evaluate the above to verify the proof. However, since $a$ is unknown, he cannot evaluate $P(a)$. But like before, he can use pairings.
$C_F$ & $C_R$ are both elements of $\mathbb G$, so $C_F - C_R$ is also an element of $\mathbb G$
$C_Q$ is an element of $\mathbb G$ & $P(a)$ is a scalar. So $P(a)\cdot C_Q$ is also an element of $\mathbb G$.
So we can apply the pairing map to both sides
$e(C_F - C_R, G) \stackrel {?}{=} e(P(a)\cdot C_Q, G)$
Note that $P(a)$ is a scalar here.And the bilinearity property of the pairing map means that
$e(\alpha A,B) = e(A, \alpha B)$ where $\alpha$ is a scalar.
So, we can rewrite the Right Hand Side as
$e(C_F - C_R, G) \stackrel {?}{=} e(C_Q, P(a)\cdot G)$
$C_P= P(a).G$, so
$e(C_F - C_R, G) \stackrel {?}{=} e(C_Q, C_P)$
The verifier knows all the terms of this pairing & can evaluate to the check if the equality holds or not.
Batch Mode Multiple Polynomials, same point
Note: The KZG paper covers standalone openings & batch opening of a single polynomial at multiple points.
However, there are also protocols for batch opening of multiple polynomials at the same point & batch opening of multiple polynomials at multiple points. Both are covered in the PLONK paper. These protocols are described below.
Consider $f_1, f_2,…, f_t \in \mathbb F_p[x]$ with $t < d$ & $z\in \mathbb F_p$. Let $C_{f_1},C_{f_2}, …, C_{f_t}$ be the commitments to these polynomials.
The openings $f_1(z)=s_1, f_2(z) = s_2, … ,f_t(z)= s_t$ have to be verified.
-
Verifier sends random $\gamma \in \mathbb F_p$ to Prover
-
Prover computes the polynomial
$\qquad h(x) = \sum_{i=1}^t \gamma^{i-1} \cdot \frac {f_i(x) - f_i(z)}{x-z}$
- Prover computes & sends $C_h$, the commitment for $h(x)$ to the verifier.
Now,
$f_i(z) = s_i$
Verifier has to check if
$h(x) \stackrel {?}{=} \sum_{i=1}^t \gamma^{i-1} \cdot \frac {f_i(x) - s_i}{x-z}$
$h(x)(x-z) \stackrel {?}{=} \sum_{i=1}^t \gamma^{i-1} \cdot (f_i(x) - s_i)$
$h(x)(x-z) \stackrel {?}{=} \sum_{i=1}^t \gamma^{i-1} \cdot f_i(x) - \sum_{i=1}^t \gamma^{i-1} \cdot s_i$
Multiplying both sides by G & evaluating at $x=a$
$h(a)\cdot G \cdot (a-z) \stackrel {?}{=} \sum_{i=1}^t \gamma^{i-1} \cdot f_i(a)\cdot G - \sum_{i=1}^t \gamma^{i-1} \cdot s_i\cdot G $
$C_h = h(a)\cdot G$
$C_{f_i} = f_i(a)\cdot G$
$C_h \cdot (a-z) \stackrel {?}{=} \sum_{i=1}^t \gamma^{i-1} \cdot C_{f_i} - \sum_{i=1}^t \gamma^{i-1} \cdot s_i\cdot G \quad — (Eq\space\space 1)$
Let $F = \sum_{i=1}^t \gamma^{i-1} \cdot C_{f_i}$
Let $v = \sum_{i=1}^t \gamma^{i-1} \cdot s_i\cdot G$
Substituting the above in $(Eq\space\space 1)$
$C_f \cdot (a-z) \stackrel {?}{=} F - v \quad — (Eq\space\space 2)$
Using pairings,
$e(C_f \cdot (a-z), G) \stackrel {?}{=} e(F - v, G)$
Using the bilinearity property,
$e(C_f, (a-z)\cdot G) \stackrel {?}{=} e(F - v, G)$
$e(C_f, aG - zG) \stackrel {?}{=} e(F - v, G)$
$z$ is known, so $zG$ is known. $aG$ is known from the SRS. $F$ & $v$ are also computable. So all the terms in the above are known & the verifier can check if the equality holds or not.
Batch Mode Multiple Polynomials, multiple points
In this case, we have $f_1, f_2,…, f_t \in \mathbb F_p[x]$ & $t < d$ with $z\in \mathbb F_p$. Let $C_{f_1},C_{f_2}, …, C_{f_t}$ be the commitments to these polynomials.
We also have $f’_1, f’_2,…, f’_t \in \mathbb F_p[x]$ with $t’ < d$ & $z’\in \mathbb F_p$.
Let $C_{f_1’},C_{f_2’}, …,C_{f_{t’}’}$
be the commitments to these polynomials.
And likewise $f’'_i$s, with $z’’$ and many more such sets.
Let’s, for a moment, assume we have only 2 such sets $f$ & $f’$ & batch them together. We can extend the same method for any number of such sets.
In the multiple polynomial, same point protocol, $F$ & $v$ were computed, now we also have $F’$ & $v’$. $F$ & $F’$ can be computed in parallel. And likewise $v$ & $v’$. So similar to $(Eq 2)$, we have
$C_f \cdot (a-z) \stackrel {?}{=} F - v$
$C_{f’} \cdot (a-z’) \stackrel {?}{=} F’ - v’$
Verifier sends a random $r \in \mathbb F_p$ & Prover uses it to combine the above 2 equatons in a linearly independent way so that they can be tested together. This trick is explained here
$C_f(a-z) + r\cdot C_{f’}\cdot (a-z’) \stackrel {?}{=} F + rF’ - v - rv’$
Let $H = F+rF’ - v - rv’$
$C_f(a-z) + r\cdot C_{f’}\cdot (a-z’) \stackrel {?}{=} H$
$a \cdot C_f + ra \cdot C_{f’} \stackrel {?}{=} H + z\cdot C_f + z’r\cdot C_{f’}$
$a(C_f + r \cdot C_{f’}) \stackrel {?}{=} H + z\cdot C_f + z’r\cdot C_{f’}$
Using pairings,
$e(a(C_f + r \cdot C_{f’}), G) \stackrel {?}{=} e(H + z\cdot C_f + z’r\cdot C_{f’}, G)$
Using the bilinearity property
$e(C_f + r \cdot C_{f’}, aG) \stackrel {?}{=} e(H + z\cdot C_f + z’r\cdot C_{f’}, G)$
So all the terms in the above are known & the verifier can check if the equality holds or not.
For more than 2 sets, we will combine them linearly independent using $\lbrace 1, r, r^2, r^3, …\rbrace$ as explained.
One of the advantages of the KZG PCS is that the commitment size is constant. Irrespective of how long the polynomial is (i.e. degree of the polynomial), the commitment is always just one element of the group $\mathbb G$. Likewise, the evaluation proof is also a commitment, so there again, the size is constant for the evaluation proof both at a single value and also in batch mode.