Few questions answered about PlonK


Multiplicative Subgroup

Q: Why does $\mathcal{P} \mathfrak{lon}\mathcal{K}$ use a multiplicative subgroup?

A: There are multiple reasons why $\mathcal{P} \mathfrak{lon}\mathcal{K}$ uses a multiplicative subgroup

$(1)\space$ Every element of a finite field is a root of unity. A Finite Field $\mathbb F_p$ has a multiplicative subgroup of order $n$ only if $n$ divides $p-1$. All primitive roots of unity in a finite field also form a multiplicative subgroup of the field. Let $\omega$ be a primitive $n$th root of unity in $\mathbb F_p$ i.e. $\omega^n = 1$ This forms a multiplicative subgroup of order $n$ - let’s call it $H$

$H = \lbrace 1, \omega, \omega^2, \omega^3, …, \omega^{n-1} \rbrace$

This can support a circuit with a maximum of n gates.

In Groth16 with $n$ gates, we use $\lbrace 1, 2, 3, …, n \rbrace$ to represent the gates. There, we compute the vanishing polynomial as

$z_H(X) = (X-1)\cdot (X-2)\cdot(X-3) \dots \dots(X-n)$ (where 1, 2, 3 etc are the gate numbers)

The number of gates is usually very large (may be a million gates or more). So, computing the vanishing polynomial which has a million such terms is quite expensive.

$\mathcal{P} \mathfrak{lon}\mathcal{K}$ numbers the gates using the elements of the multiplicative subgroup $H$. Now the vanishing polynomial becomes

$z_H(X) = (X-1) \cdot(X-\omega)\cdot (X-\omega^2)\cdot …\cdot (X-\omega^{n-1})$

Now,

Let’s consider the polynomial $X^n - 1$

  • For $X = 1$,

    $1^n = 1$, so $(X-1)$ is a root of $X^n - 1$

  • For $X = \omega$,

    Since $\omega$ is the $n$th root of unity, $X^n = 1$ & hence $\omega$ is a root of $X^n - 1$

  • For $X = \omega^2$

    ${\omega^2}^n = {\omega^n}^2 = {1}^2 = 1$.

    So $\omega^2$ is also a root of $X^n - 1$

  • Like this, we can prove that every element of $H$ is a root of $X^n - 1$ & since $X^n - 1$ is degree $n$, the maximum number of roots it can have is $n$.

So $X^n - 1 = (X-1) \cdot (X-\omega)\cdot (X-\omega^2) \dots \dots (X-\omega^{n-1})$

So $z_H = X^n - 1$

So now the vanishing polynomial $Z_H$ is very easy to compute instead of having to multiply a million terms.

$(2)$ Using a multiplicative subgroup allows an efficient & elegant Product Check on the subgroup.

Take a look at $\mathcal{P} \mathfrak{lon}\mathcal{K}$’s Permutation proof.

This proof uses

$z(\omega X) = z(X) \cdot m(X) \qquad\qquad$ (where $m(X) = \frac {f’(X)}{g’(X)})$

We can define $z$ so because we are operating in a multiplicative subgroup $H = \lbrace 1,\omega, \omega^2, \omega^3, …, \omega^{n-1} \rbrace$, where multiplying by each element $\omega$ gives us the next element & thus we get the “right shift” relation between $z$ at an element & $z$ at the next element.

$(3)$ Using a multiplicative subgroup of a Finite Field is required for usage of Fast Fourier Transform to speed up some operations.


Blinding

Q: In Round 1 (Page 28 of the $\mathcal{P} \mathfrak{lon}\mathcal{K}$ paper), random blinding scalars are used to modify the 3 wire polynomials (the polynomials representing the left, right & output of the gates). What exactly is blinding?

A: Let’s say you have a polynomial $f(X)$ of degree $d$ & it’s commitment $C_f$. Let’s say the verifier selects random value $r$ & the prover sends the evaluation of $f$ at $r$ i.e. $f(r) = z$

A polynomial of degree $d$ can be recreated with $d+1$ evaluations by using Lagrange Interpolation. So, each opening of a Polynomial Commitment leaks some info about the polynomial.

Hence SNARKs use a trick called blinding to make it zero knowledge. Multiply the vanishing polynomial $z_H$ by a random Polynomial $R(X)$ & add it to $f(X)$ to create a new polynomial $F(X)$

$F(X) = R(X)\cdot z_H + f(X)$

Instead of committing and opening $f(X)$, the prover commits & opens $F(X)$.

$z_H$ is zero on the set the constraints are checked on - so on this set $F(X) = f(X)$. So other than in the commitment & opening, the SNARK can continue to use $f$ instead of $F$

The next question is what should be the degree of the random polynomial $R$. That depends on how many points you open $f$ at - if there is an opening at only one point, then $R$ needs to be of minimum degree $1$, if there are two openings, then there $R$ needs to be at least of degree $2$. The new polynomial $F(X)$ which is going to be opened instead of $f(X)$ has to be of degree of $f$ plus the number of points at which $f$ needs to be opened.

In Round 1, the left, right & output polynomials are evaluated only at one point each & hence a random polynomial of degree $1$ is used for blinding - for e.g. for blinding the opening of $a(X)$, $R(X) = (b_1 X + b_2)$ is used as the random polynomial. The round 2 polynomial $z(X)$ is evaluated at 2 points & hence a degree 2 random polynomial $(b_7 X + b_8 X + b_9)$ is used.


Linear Independence

Q: In Round 3 (Page 29), when combining different polynomials to form $t(X)$, why are different powers of $\alpha$ i.e. $ \alpha^0, \alpha^1, \alpha^2$ used.

A: Let’s say we have 4 polynomials - $f_1$, $f_2$, $f_3$ & $f_4 \in \mathbb F_p[X]$ where the max degree of these polynomials is $d$ which is very, very small as compared to $p$.

We want to combine them into one polynomial $f$ such that if $f$ is 0 at some point, then all of $f_i$’s are also zero at the same point.

Consider the set ${1, z, z^2, z^3}$ This is a linearly independent set.

We can use this set to combine four variables $a_1, a_2, a_3$ & $a_4$ like this

$g(Z) = a1 + a2\cdot Z + a3\cdot Z^2 + a4\cdot Z^3$

If $g(Z)=0$ at some $Z \ne 0$, then it means $a_1 = a_2 = a_3 = a_4 = 0$ (by the definition of a linearly independent set)

So, we can combine the polynomials $f_i$’s as

$f(X,Z) = f1(Z) + Z\cdot f2(x) + Z^2 \cdot f3(X) + Z^3\cdot f4(X)$

At some $X = r_1$, let

$f_1(r_1) = a_1$, $f_2(r_1) = a_2$, $f_3(r_1) = a_3$, $f_4(r_1) = a_4$,

So now

$f(X=r_1, X) = a1 + a2\cdot Z + a3\cdot Z^2 + a4\cdot Z^3$

At some random value $r_2$ chosen from $\mathbb F_p$, if

$f(X=r_1, Z = r_2) = 0$

then it means $a1 = a2 = a3 = a4 = 0$

i.e. $f_1(r_1, r_2) =f_2(r_1, r_2) = f_3(r_1, r_2) = f_4(r_1, r_2) = 0$

If $f_1$, $f_2$, $f_3$, $f_4$ are all $0$ at some random value $r_2$, then by the Schwartz-Zippel lemma, $f_1$, $f_2$, $f_3$, $f_4$ are all zero polynomials with very high probability because the maximum degree of these polynomials is very, very small as compared to $p$

So if we want to test if multiple polynomials are zero polynomials or not, we combine them using a lineraly independent set so we can test them with just one evaluation at a random point rather than testing them separately. Round 3 in the $\mathcal{P} \mathfrak{lon}\mathcal{K}$ paper creates the polynomial $t(X)$ with the linearly independent set $[1, \alpha, \alpha^2]$ to do this. In Round 5, the set $\lbrace 1, v, v^2, v^3, v^4, v^5 \rbrace$ is used to combine several polynomials to form a single opening proof polynomial $W_\zeta(X)$. It may be used at other places also.


Split Polynomials

Q: Why does the $\mathcal{P} \mathfrak{lon}\mathcal{K}$ protocol split the Quotient Polynomial into 3 polynomials?

A: In Round 3 (Page 29 of the $\mathcal{P} \mathfrak{lon}\mathcal{K}$ Paper), the Quotient Polynomial $t(X)$ is split into 3 polynomials

$t(X) = t’_{lo}(X) + X^n t’_{mid}(X) + X^{2n}t’_{hi}(X)$

This means 3 commitments are needed instead of one which increases proof size, so why is it done?

The tradeoff is that opening time gets reduced because of having smaller polynomials & hence the prover time gets reduced.

The opening time of a polynomial depends on the degree of the polynomial. However, having smaller polynomials may not help if the 3 polynomials are opened separately . The $\mathcal{P} \mathfrak{lon}\mathcal{K}$ paper presents a technique for batched opening of different polynomials at different points (on Page 10) which ensures that the opening time of 3 smaller polynomis is much lower than the opening time of one polynomial of roughly 3 times the size.


Field Element Reduction Optimisation

Q: In Round 5 (Page 30), the Linearisation Polynomial $r(X)$ is computed. What exactly is the Linearisation Polynomial?

A: Let’s look at how one proves the identity $h_1(X)\cdot h_2(X) - h_3(X) = 0$?

The normal way would be for the prover to send commitments for $h_1, h_2, h_3$. Then the verifier chooses a random number $r$ & the prover sends evaluation proofs for the 3 polynomials at $r$. The verifier then checks if $h_1(r)\cdot h_2(r) - h_3(r) \stackrel {?}{=} 0$.

$\mathcal{P} \mathfrak{lon}\mathcal{K}$ uses an optimisation which allows it to send one less evaluation proof.

This optimisation is based on the fact that Polynomial commitments like KZG are additively homomorphic, but not multiplicatively homomorphic.

$F(X) = f_0 + f_1X + f_2X^2 + … + f_dX^d$

$G(X) = g_0 + g_1X + g_2X^2 + … + g_dX^d$

Commitment Reference String = $\lbrace G, aG,a^2G, a^3G, …, a^dG \rbrace$

Commitment of $F$ & $G$ would be

$C_{F} = F(a)\cdot G = (f_0 + f_1a + f_2a^2 + … + f_da^d)\cdot G$

$C_{F} = f_0\cdot G + f_1a\cdot G + f_2a^2\cdot G + … + f_da^d\cdot G$

and

$C_{G} = g_0\cdot G + g_1a\cdot G + g_2a^2\cdot G + … + g_da^d\cdot G$

Now if we have a polynomial

$E(X) = F(X) + G(X) = f_0 + f_1X + f_2X^2 + … + f_dX^d + g_0 + g_1X + g_2X^2 + … + g_dX^d$

It’s obvious that the commitment for $E$ would be the sum of the commitments for $F$ & $G$

i.e. $C_{E} = C_F + C_G$

However the commitment is not multiplicatively homomorphic, i.e. if $H(X) = F(X) \cdot G(X)$, then you wouldn’t be able to compute the commitment of $H$ from those of $F$ & $G$.

Getting back to how the prover proves the identity $h_1(X)\cdot h_2(X) - h_3(X) = 0$

The prover evaluates $h_1$ at $r$ as $h_1(r) = c$ & sends the evaluation proof for the same. The verifier creates a new polynomial called as the Linearisation Polynomial

$L(X) = c\cdot h_2(X) - h_3(X)$

Because $h_1$ has been replaced by a constant $c$ in $L(X)$, $L$ doesn’t contains a multiplication of two polynomials. Hence the verifier can compute the commitment of $L$ by himself as

$C_L = c\cdot C_{h_2} - C_{h_3}$

where $C_L$, $C_{h_2}$ & $C_{h_3}$ are commitments of $L$, $h_2$ & $h_3$ respectively.

Since the commitment to $L$ has been computed using comittments to $h_2$ & $h_3$, verifying if $L(r) =0$ & $h_1(r) = c$ would be enough for the verifier to verify that $h_1(r)\cdot h_2(r) - h_3(r) {=} 0$. The verifier doesnt need to know the evaluation of $h_2$ & $h_3$ & prover doesn’t need to send the evaluation proofs for $h_2$ & $h_3$.

So instead of sending commitments to $h_1, h_2, h_3$ & evaluation proofs for all 3, the prover needs to send only commitments to $h_1, h_2, h_3$ & evaluation proofs for $L$ & $h_1$.

(In the $\mathcal{P} \mathfrak{lon}\mathcal{K}$ paper, this is on Page 18 titled “Reducing the number of field elements”).

Hits

Written on July 21, 2023