In cryptography, a zero-knowledge proof is a protocol in which one party (the prover) can convince another party (the verifier) that some given statement is true, without conveying to the verifier any information beyond the mere fact of that statement's truth. The intuition underlying zero-knowledge proofs is that it is trivial to prove possession of the relevant information simply by revealing it; the hard part is to prove this possession without revealing this information (or any aspect of it whatsoever).[1]
Because a proof of a statement should be producible only by someone who holds certain secret information connected to that statement, the verifier, even after becoming convinced that the statement is true, should remain unable to prove it to any third party.
Zero-knowledge proofs can be interactive, meaning that the prover and verifier exchange messages according to some protocol, or non-interactive, meaning that the verifier is convinced by a single prover message and no other communication is needed. In the standard model, interaction is required, except for trivial proofs of BPP problems.[2] In the common random string and random oracle models, non-interactive zero-knowledge proofs exist. The Fiat–Shamir heuristic can be used to transform certain interactive zero-knowledge proofs into non-interactive ones.[3] [4] [5]
There is a well-known story presenting the fundamental ideas of zero-knowledge proofs, first published in 1990 by Jean-Jacques Quisquater and others in their paper "How to Explain Zero-Knowledge Protocols to Your Children".[6] The two parties in the zero-knowledge proof story are Peggy as the prover of the statement, and Victor, the verifier of the statement.
In this story, Peggy has uncovered the secret word used to open a magic door in a cave. The cave is shaped like a ring, with the entrance on one side and the magic door blocking the opposite side. Victor wants to know whether Peggy knows the secret word; but Peggy, being a very private person, does not want to reveal her knowledge (the secret word) to Victor or to reveal the fact of her knowledge to the world in general.
They label the left and right paths from the entrance A and B. First, Victor waits outside the cave as Peggy goes in. Peggy takes either path A or B; Victor is not allowed to see which path she takes. Then, Victor enters the cave and shouts the name of the path he wants her to use to return, either A or B, chosen at random. Provided she really does know the magic word, this is easy: she opens the door, if necessary, and returns along the desired path.
However, suppose she did not know the word. Then, she would only be able to return by the named path if Victor were to give the name of the same path by which she had entered. Since Victor would choose A or B at random, she would have a 50% chance of guessing correctly. If they were to repeat this trick many times, say 20 times in a row, her chance of successfully anticipating all of Victor's requests would be reduced to 1 in $2^{20}$, or about $9.54 \times 10^{-7}$.
Thus, if Peggy repeatedly appears at the exit Victor names, then he can conclude that it is extremely probable that Peggy does, in fact, know the secret word.
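The arithmetic of the cave story can be checked with a minimal Python sketch. The function names and the use of a seeded `random.Random` are assumptions made for illustration; they are not part of the original story.

```python
import random
from fractions import Fraction

def cheating_success_probability(rounds: int) -> Fraction:
    """Chance that a Peggy who does not know the word meets every request."""
    return Fraction(1, 2 ** rounds)

def run_cave_rounds(knows_word: bool, rounds: int, rng: random.Random) -> bool:
    """Return True if Peggy emerges from the requested path in every round."""
    for _ in range(rounds):
        entered = rng.choice("AB")    # path Peggy takes, unseen by Victor
        requested = rng.choice("AB")  # path Victor shouts at random
        # Without the word she can only come back the way she went in.
        if not knows_word and entered != requested:
            return False
    return True

if __name__ == "__main__":
    print(float(cheating_success_probability(20)))
    print(run_cave_rounds(knows_word=True, rounds=20, rng=random.Random(0)))
```

A Peggy who knows the word passes every round regardless of the random choices, while one who does not passes a single round only about half of the time.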
One side note with respect to third-party observers: even if Victor is wearing a hidden camera that records the whole transaction, the only thing the camera will record is in one case Victor shouting "A!" and Peggy appearing at A or in the other case Victor shouting "B!" and Peggy appearing at B. A recording of this type would be trivial for any two people to fake (requiring only that Peggy and Victor agree beforehand on the sequence of As and Bs that Victor will shout). Such a recording will certainly never be convincing to anyone but the original participants. In fact, even a person who was present as an observer at the original experiment should be unconvinced, since Victor and Peggy could have orchestrated the whole "experiment" from start to finish.
Further, if Victor chooses his As and Bs by flipping a coin on-camera, this protocol loses its zero-knowledge property; the on-camera coin flip would probably be convincing to any person watching the recording later. Thus, although this does not reveal the secret word to Victor, it does make it possible for Victor to convince the world in general that Peggy has that knowledge—counter to Peggy's stated wishes. However, digital cryptography generally "flips coins" by relying on a pseudo-random number generator, which is akin to a coin with a fixed pattern of heads and tails known only to the coin's owner. If Victor's coin behaved this way, then again it would be possible for Victor and Peggy to have faked the experiment, so using a pseudo-random number generator would not reveal Peggy's knowledge to the world in the same way that using a flipped coin would.
Peggy could prove to Victor that she knows the magic word, without revealing it to him, in a single trial. If both Victor and Peggy go together to the mouth of the cave, Victor can watch Peggy go in through A and come out through B. This would prove with certainty that Peggy knows the magic word, without revealing the magic word to Victor. However, such a proof could be observed by a third party or recorded by Victor, and it would be convincing to anybody. In other words, Peggy could not refute such a proof by claiming she colluded with Victor, and she is therefore no longer in control of who is aware of her knowledge.
Imagine your friend "Victor" is red-green colour-blind (while you are not) and you have two balls: one red and one green, but otherwise identical. To Victor, the balls seem completely identical. Victor is skeptical that the balls are actually distinguishable. You want to prove to Victor that the balls are in fact differently coloured, but nothing else. In particular, you do not want to reveal which ball is the red one and which is the green.
Here is the proof system. You give the two balls to Victor and he puts them behind his back. Next, he takes one of the balls and brings it out from behind his back and displays it. He then places it behind his back again and then chooses to reveal just one of the two balls, picking one of the two at random with equal probability. He will ask you, "Did I switch the ball?" This whole procedure is then repeated as often as necessary.
By looking at the balls' colours, you can, of course, say with certainty whether or not he switched them. On the other hand, if the balls were the same colour and hence indistinguishable, there is no way you could guess correctly with probability higher than 50%.
Since the probability that you would have randomly succeeded at identifying each switch/non-switch is 50%, the probability of having randomly succeeded at all switch/non-switches approaches zero ("soundness"). If you and your friend repeat this "proof" multiple times (e.g. 20 times), your friend should become convinced ("completeness") that the balls are indeed differently coloured.
The above proof is zero-knowledge because your friend never learns which ball is green and which is red; indeed, he gains no knowledge about how to distinguish the balls.[7]
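The ball game can be modelled in a few lines of Python. This is only a sketch: the names `ball_round` and `convinced`, and the modelling of a prover who cannot tell the balls apart as a fair coin flip, are assumptions introduced for illustration.

```python
import random

OTHER = {"red": "green", "green": "red"}

def ball_round(can_see_colour: bool, rng: random.Random) -> bool:
    """One round; True if the prover answers "Did I switch?" correctly."""
    shown_before = rng.choice(["red", "green"])
    switched = rng.random() < 0.5          # Victor's secret choice
    shown_after = OTHER[shown_before] if switched else shown_before
    if can_see_colour:
        answer = shown_before != shown_after   # compare the two colours seen
    else:
        answer = rng.random() < 0.5            # indistinguishable balls: guess
    return answer == switched

def convinced(can_see_colour: bool, rounds: int, rng: random.Random) -> bool:
    """Victor is convinced only if every answer is correct."""
    return all(ball_round(can_see_colour, rng) for _ in range(rounds))
```

Note that the prover's answers depend only on whether the colour changed, never on which ball is red, which mirrors the reason the verifier learns nothing about the colours themselves.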
One well-known example of a zero-knowledge proof is the "Where's Waldo" example. In this example, the prover wants to prove to the verifier that they know where Waldo is on a page in a Where's Waldo? book, without revealing his location to the verifier.[8]
The prover starts by taking a large black board with a small hole in it, the size of Waldo. The board is twice the size of the book in both directions, so the verifier cannot see where on the page the prover is placing it. The prover then places the board over the page so that Waldo is in the hole.[8]
The verifier can now look through the hole and see Waldo, but cannot see any other part of the page. Therefore, the prover has proven to the verifier that they know where Waldo is, without revealing any other information about his location.[8]
This example is not a perfect zero-knowledge proof, because the prover does reveal some information about Waldo's location, such as his body position. However, it is a decent illustration of the basic concept of a zero-knowledge proof.
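A zero-knowledge proof of some statement must satisfy three properties: completeness (if the statement is true, an honest prover following the protocol will convince an honest verifier), soundness (if the statement is false, no cheating prover can convince an honest verifier that it is true, except with some small probability), and zero-knowledge (if the statement is true, no verifier learns anything other than the fact that the statement is true).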
The first two of these are properties of more general interactive proof systems. The third is what makes the proof zero-knowledge.[9]
Zero-knowledge proofs are not proofs in the mathematical sense of the term because there is some small probability, the soundness error, that a cheating prover will be able to convince the verifier of a false statement. In other words, zero-knowledge proofs are probabilistic "proofs" rather than deterministic proofs. However, there are techniques to decrease the soundness error to negligibly small values (for example, guessing correctly on a hundred or a thousand binary decisions has a soundness error of $1/2^{100}$ or $1/2^{1000}$, respectively). As the number of bits increases, the soundness error decreases toward zero.
A formal definition of zero-knowledge must use some computational model, the most common one being that of a Turing machine. Let $P$, $V$, and $S$ be Turing machines. An interactive proof system with $(P, V)$ for a language $L$ is zero-knowledge if for any probabilistic polynomial time (PPT) verifier $\hat{V}$ there exists a PPT simulator $S$ such that:
$$\forall x \in L,\ z \in \{0,1\}^{*},\quad \operatorname{View}_{\hat{V}}\left[P(x) \leftrightarrow \hat{V}(x,z)\right] = S(x,z),$$
where $\operatorname{View}_{\hat{V}}\left[P(x) \leftrightarrow \hat{V}(x,z)\right]$ is a record of the interactions between $P(x)$ and $\hat{V}(x,z)$. The prover $P$ is modeled as having unlimited computation power (in practice, $P$ usually is a probabilistic Turing machine). Intuitively, the definition states that an interactive proof system $(P, V)$ is zero-knowledge if for any verifier $\hat{V}$ there exists an efficient simulator $S$ (depending on $\hat{V}$) that can reproduce the conversation between $P$ and $\hat{V}$ on any given input. The auxiliary string $z$ in the definition plays the role of "prior knowledge" (including the random coins of $\hat{V}$). The definition implies that $\hat{V}$ cannot use any prior knowledge string $z$ to mine information out of its conversation with $P$, because if $S$ is also given this prior knowledge then it can reproduce the conversation between $\hat{V}$ and $P$ just as before.
The definition given is that of perfect zero-knowledge. Computational zero-knowledge is obtained by requiring that the views of the verifier and the simulator are only computationally indistinguishable, given the auxiliary string.
These ideas can be applied to a more realistic cryptography application. Peggy wants to prove to Victor that she knows the discrete logarithm of a given value in a given group.[10]
For example, given a value $y$, a large prime $p$, and a generator $g$, she wants to prove that she knows a value $x$ such that $g^x \equiv y \pmod{p}$, without revealing $x$.
The protocol proceeds as follows: in each round, Peggy generates a random number $r$, computes $C = g^r \bmod p$ and discloses this to Victor. After receiving $C$, Victor randomly issues one of the following two requests: he either requests that Peggy disclose the value of $r$, or the value of $(x + r) \bmod (p - 1)$.
Victor can verify either answer: if he requested $r$, he can compute $g^r \bmod p$ and verify that it matches $C$. If he requested $(x + r) \bmod (p - 1)$, he can verify that $C$ is consistent with this by computing $g^{(x + r) \bmod (p - 1)} \bmod p$ and verifying that it matches $(C \cdot y) \bmod p$. If Peggy indeed knows the value of $x$, then she can respond to either one of Victor's possible challenges.
If Peggy knew or could guess which challenge Victor is going to issue, then she could easily cheat and convince Victor that she knows $x$ when she does not. If she knows that Victor is going to request $r$, then she proceeds normally: she picks $r$, computes $C = g^r \bmod p$, and discloses $C$ to Victor; she will be able to respond to Victor's challenge. On the other hand, if she knows that Victor will request $(x + r) \bmod (p - 1)$, then she picks a random value $r'$, computes $C' = g^{r'} \cdot (g^x)^{-1} \bmod p$, and discloses $C'$ to Victor as the value of $C$ that he is expecting. When Victor challenges her to reveal $(x + r) \bmod (p - 1)$, she reveals $r'$, for which Victor will verify consistency, since he will in turn compute $g^{r'} \bmod p$, which matches $(C' \cdot y) \bmod p$, since Peggy multiplied by the modular multiplicative inverse of $y$.
However, if in either one of the above scenarios Victor issues a challenge other than the one she was expecting and for which she manufactured the result, then she will be unable to respond to the challenge under the assumption of infeasibility of solving the discrete log for this group. If she picked $r$ and disclosed $C = g^r \bmod p$, then she will be unable to produce a valid $(x + r) \bmod (p - 1)$ that would pass Victor's verification, given that she does not know $x$. And if she picked a value $r'$ that poses as $(x + r) \bmod (p - 1)$, then she would have to respond with the discrete log of the value that she disclosed, but Peggy does not know this discrete log, since the value $C'$ she disclosed was obtained through arithmetic with known values, and not by computing a power with a known exponent.
Thus, a cheating prover has a 0.5 probability of successfully cheating in one round. By executing a large-enough number of rounds, the probability of a cheating prover succeeding can be made arbitrarily low.
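The rounds described above can be sketched in Python. The notation is assumed for illustration: Peggy's secret is `x`, the public value is `y = g^x mod p`, `r` is her per-round random number and `c` is the value she discloses. The parameters `P = 23` and `G = 5` are toy values chosen only so the example runs instantly and are far too small to be secure. The modular inverse via `pow(y, -1, p)` requires Python 3.8 or later.

```python
import secrets

# Toy parameters, assumed for illustration only (not secure).
P = 23  # prime modulus p
G = 5   # generator g of the multiplicative group modulo p

def commit(p=P, g=G):
    """Peggy picks r uniformly from [0, p-2] and discloses C = g^r mod p."""
    r = secrets.randbelow(p - 1)
    return r, pow(g, r, p)

def respond(x, r, challenge, p=P):
    """Peggy answers with r itself or with (x + r) mod (p - 1)."""
    return r if challenge == "r" else (x + r) % (p - 1)

def verify(y, c, challenge, answer, p=P, g=G):
    """Victor checks g^answer against C or against (C * y) mod p."""
    expected = c if challenge == "r" else (c * y) % p
    return pow(g, answer, p) == expected

def run_protocol(x, y, rounds, p=P, g=G):
    """Run several rounds with an honest Victor; True if all of them pass."""
    for _ in range(rounds):
        r, c = commit(p, g)
        challenge = secrets.choice(["r", "x+r"])
        if not verify(y, c, challenge, respond(x, r, challenge, p), p, g):
            return False
    return True

def forge_for_sum_challenge(y, p=P, g=G):
    """Cheating Peggy who expects the "x+r" challenge: C' = g^r' * y^-1 mod p."""
    r_prime = secrets.randbelow(p - 1)
    c_prime = (pow(g, r_prime, p) * pow(y, -1, p)) % p
    return r_prime, c_prime

if __name__ == "__main__":
    x = 7
    y = pow(G, x, P)
    print(run_protocol(x, y, rounds=20))
```

The forged pair from `forge_for_sum_challenge` passes the "x+r" check without any knowledge of `x`, but the forger has no valid answer to the "r" challenge for the same `c_prime`, which is where the one-half cheating probability per round comes from.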
To show that the above interactive proof gives zero knowledge other than the fact that Peggy knows $x$, one can use similar arguments as used in the above proof of completeness and soundness. Specifically, a simulator, say Simon, who does not know $x$ but does know the public value $y = g^x \bmod p$, can simulate the exchange between Peggy and Victor by the following procedure. Firstly, Simon randomly flips a fair coin. If the result is "heads", then he picks a random value $r$, computes $C = g^r \bmod p$, and discloses $C$ as if it is a message from Peggy to Victor. Then Simon also outputs a message "request the value of $r$" as if it is sent from Victor to Peggy, and immediately outputs the value of $r$ as if it is sent from Peggy to Victor. A single round is complete. On the other hand, if the coin flipping result is "tails", then Simon picks a random number $r'$, computes $C' = g^{r'} \cdot y^{-1} \bmod p$, and discloses $C'$ as if it is a message from Peggy to Victor. Then Simon outputs "request the value of $(x + r) \bmod (p - 1)$" as if it is a message from Victor to Peggy. Finally, Simon outputs the value of $r'$ as if it is the response from Peggy back to Victor. A single round is complete. By the previous arguments when proving the completeness and soundness, the interactive communication simulated by Simon is indistinguishable from the true correspondence between Peggy and Victor. The zero-knowledge property is thus guaranteed.
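Simon's procedure can be sketched as follows. The function names and the transcript layout `(C, challenge, answer)` are assumptions made for illustration, and the sketch covers only the case described above, in which Simon chooses the challenge himself. It takes the public value `y = g^x mod p` but never the secret exponent.

```python
import secrets

def simulate_round(y, p, g):
    """Produce a transcript (C, challenge, answer) without knowing x."""
    if secrets.randbelow(2) == 0:                  # "heads"
        r = secrets.randbelow(p - 1)
        return pow(g, r, p), "r", r
    r_prime = secrets.randbelow(p - 1)             # "tails"
    c_prime = (pow(g, r_prime, p) * pow(y, -1, p)) % p
    return c_prime, "x+r", r_prime

def transcript_verifies(y, p, g, transcript):
    """Apply Victor's check to a transcript, simulated or real."""
    c, challenge, answer = transcript
    expected = c if challenge == "r" else (c * y) % p
    return pow(g, answer, p) == expected
```

Every simulated transcript passes the same check Victor would apply to a real one, even though the simulator was never given the secret.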
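In summary, Peggy proves that she knows the value of $x$ (for example, her password) as follows:
1. Peggy and Victor agree on a prime $p$ and a generator $g$ of the multiplicative group modulo $p$.
2. Peggy calculates the value $y = g^x \bmod p$ and transfers it to Victor.
3. The following two steps are repeated a large number of times. First, Peggy picks a random value $r$ from $0$ to $p - 2$, calculates $C = g^r \bmod p$, and transfers $C$ to Victor. Second, Victor asks Peggy to calculate and transfer either the value $(x + r) \bmod (p - 1)$ or the value $r$. In the first case Victor verifies that $(C \cdot y) \bmod p = g^{(x + r) \bmod (p - 1)} \bmod p$; in the second case he verifies that $C = g^r \bmod p$.
The value $(x + r) \bmod (p - 1)$ can be seen as the encrypted value of $x \bmod (p - 1)$. If $r$ is truly random, uniformly distributed between zero and $p - 2$, then this does not leak any information about $x$ (see one-time pad).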
The following scheme is due to Manuel Blum.[11]
In this scenario, Peggy knows a Hamiltonian cycle for a large graph $G$. Victor knows $G$ but not the cycle (e.g., Peggy has generated $G$ and revealed it to him). Finding a Hamiltonian cycle given a large graph is believed to be computationally infeasible, since its corresponding decision version is known to be NP-complete. Peggy will prove that she knows the cycle without simply revealing it (perhaps Victor is interested in buying it but wants verification first, or maybe Peggy is the only one who knows this information and is proving her identity to Victor).
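To show that Peggy knows this Hamiltonian cycle, she and Victor play several rounds of a game:
1. At the beginning of each round, Peggy creates $H$, a graph isomorphic to $G$ (that is, $H$ is $G$ with its vertices relabelled). Because a Hamiltonian cycle is easily translated between isomorphic graphs when the isomorphism is known, Peggy also knows a Hamiltonian cycle in $H$.
2. Peggy commits to $H$, for example with a cryptographic commitment scheme, so that she cannot later change $H$ while Victor does not yet learn anything about it.
3. Victor randomly chooses one of two questions: he asks Peggy either to show the isomorphism between $H$ and $G$, or to show a Hamiltonian cycle in $H$.
4. In the first case, Peggy opens the whole commitment to $H$ and provides the vertex relabelling that maps $G$ to $H$; Victor verifies that the two graphs are indeed isomorphic.
5. In the second case, Peggy translates her cycle from $G$ to $H$ and opens only the commitments to the edges on that cycle; Victor checks that they form a Hamiltonian cycle in $H$.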
It is important that the commitment to the graph be such that Victor can verify, in the second case, that the cycle is really made of edges from $H$, the relabelled copy of $G$ to which Peggy committed. This can be done by, for example, committing to every edge (or lack thereof) separately.
If Peggy does know a Hamiltonian cycle in $G$, then she can easily satisfy Victor's demand for either the graph isomorphism producing $H$ from $G$ (which she had committed to in the first step) or a Hamiltonian cycle in $H$ (which she can construct by applying the isomorphism to the cycle in $G$).
Peggy's answers do not reveal the original Hamiltonian cycle in $G$. In each round, Victor will learn only $H$'s isomorphism to $G$ or a Hamiltonian cycle in $H$. He would need both answers for a single $H$ to discover the cycle in $G$, so the information remains unknown as long as Peggy can generate a distinct $H$ every round. If Peggy does not know of a Hamiltonian cycle in $G$, but somehow knew in advance what Victor would ask to see each round, then she could cheat. For example, if Peggy knew ahead of time that Victor would ask to see the Hamiltonian cycle in $H$, then she could generate a Hamiltonian cycle for an unrelated graph. Similarly, if Peggy knew in advance that Victor would ask to see the isomorphism, then she could simply generate an isomorphic graph $H$ (in which she also does not know a Hamiltonian cycle). Victor could simulate the protocol by himself (without Peggy) because he knows what he will ask to see. Therefore, Victor gains no information about the Hamiltonian cycle in $G$ from the information revealed in each round.
If Peggy does not know the information, then she can guess which question Victor will ask and generate either a graph isomorphic to $G$ or a Hamiltonian cycle for an unrelated graph, but since she does not know a Hamiltonian cycle for $G$, she cannot do both. With this guesswork, her chance of fooling Victor is $2^{-n}$, where $n$ is the number of rounds. For all realistic purposes, it is infeasibly difficult to defeat a zero-knowledge proof with a reasonable number of rounds in this way.
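One round of this game can be sketched in Python under several stated assumptions. The graph is given as a vertex count `n` and a set `g_edges` of sorted vertex pairs, and Peggy commits to every vertex pair of the relabelled graph separately. The commitment is a salted SHA-256 hash swapped in for illustration rather than a vetted commitment scheme, and all function names are hypothetical.

```python
import hashlib
import itertools
import secrets

def edge(u, v):
    """Undirected edge as a sorted pair of vertex numbers."""
    return (min(u, v), max(u, v))

def digest_of(nonce, bit, pair):
    """Toy hash commitment to one adjacency bit (vertex numbers must be < 256)."""
    return hashlib.sha256(nonce + bytes([bit, *pair])).hexdigest()

def peggy_commit(n, g_edges):
    """Relabel G into H with a random permutation; commit to every vertex pair."""
    perm = list(range(n))
    secrets.SystemRandom().shuffle(perm)
    h_edges = {edge(perm[u], perm[v]) for u, v in g_edges}
    openings = {pair: (secrets.token_bytes(16), int(pair in h_edges))
                for pair in itertools.combinations(range(n), 2)}
    commitments = {pair: digest_of(nonce, bit, pair)
                   for pair, (nonce, bit) in openings.items()}
    return perm, commitments, openings

def check_isomorphism(n, g_edges, commitments, perm, openings):
    """Victor's check when Peggy opens all of H and reveals the relabelling."""
    if sorted(perm) != list(range(n)) or set(openings) != set(commitments):
        return False
    expected = {edge(perm[u], perm[v]) for u, v in g_edges}
    return all(digest_of(nonce, bit, pair) == commitments[pair]
               and bit == int(pair in expected)
               for pair, (nonce, bit) in openings.items())

def check_cycle(n, commitments, h_cycle, nonces):
    """Victor's check when Peggy opens only the edges of a cycle in H."""
    if sorted(h_cycle) != list(range(n)):
        return False  # the cycle must visit every vertex exactly once
    pairs = [edge(h_cycle[k], h_cycle[(k + 1) % n]) for k in range(n)]
    return all(digest_of(nonces[pair], 1, pair) == commitments[pair]
               for pair in pairs)

def run_round(n, g_edges, g_cycle, challenge=None):
    """One round; Victor's challenge is random unless given explicitly."""
    perm, commitments, openings = peggy_commit(n, g_edges)
    challenge = challenge or secrets.choice(["isomorphism", "cycle"])
    if challenge == "isomorphism":
        return check_isomorphism(n, g_edges, commitments, perm, openings)
    h_cycle = [perm[v] for v in g_cycle]
    pairs = [edge(h_cycle[k], h_cycle[(k + 1) % n]) for k in range(n)]
    nonces = {pair: openings[pair][0] for pair in pairs}
    return check_cycle(n, commitments, h_cycle, nonces)
```

A prover holding a genuine cycle passes either challenge. A claimed cycle that uses a non-edge of the graph still passes the isomorphism challenge but fails the cycle challenge, because the commitment for that vertex pair opens to 0 rather than 1.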
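Different variants of zero-knowledge can be defined by formalizing the intuitive concept of what is meant by the output of the simulator "looking like" the execution of the real proof protocol in the following ways: perfect zero-knowledge, in which the distributions produced by the simulator and by the proof protocol are exactly the same; statistical zero-knowledge, in which the two distributions are not necessarily identical but are statistically close; and computational zero-knowledge, in which no efficient algorithm can distinguish the two distributions.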
There are various types of zero-knowledge proofs:
Zero-knowledge proof schemes can be constructed from various cryptographic primitives, such as hash-based cryptography, pairing-based cryptography, multi-party computation, or lattice-based cryptography.
Research in zero-knowledge proofs has been motivated by authentication systems where one party wants to prove its identity to a second party via some secret information (such as a password) but does not want the second party to learn anything about this secret. This is called a "zero-knowledge proof of knowledge". However, a password is typically too small or insufficiently random to be used in many schemes for zero-knowledge proofs of knowledge. A zero-knowledge password proof is a special kind of zero-knowledge proof of knowledge that addresses the limited size of passwords.
In April 2015, the one-out-of-many proofs protocol (a Sigma protocol) was introduced. In August 2021, Cloudflare, an American web infrastructure and security company, decided to use the one-out-of-many proofs mechanism for private web verification using vendor hardware.[13]
One of the uses of zero-knowledge proofs within cryptographic protocols is to enforce honest behavior while maintaining privacy. Roughly, the idea is to force a user to prove, using a zero-knowledge proof, that its behavior is correct according to the protocol.[14] Because of soundness, we know that the user must really act honestly in order to be able to provide a valid proof. Because of zero knowledge, we know that the user does not compromise the privacy of its secrets in the process of providing the proof.
In 2016, the Princeton Plasma Physics Laboratory and Princeton University demonstrated a technique that may have applicability to future nuclear disarmament talks. It would allow inspectors to confirm whether or not an object is indeed a nuclear weapon without recording, sharing, or revealing the internal workings, which might be secret.[15]
Zero-knowledge proofs were applied in the Zerocoin and Zerocash protocols, which culminated in the birth of Zcoin (later rebranded as Firo in 2020)[16] and Zcash cryptocurrencies in 2016. Zerocoin has a built-in mixing model that does not trust any peers or centralised mixing providers to ensure anonymity. Users can transact in a base currency and can cycle the currency into and out of Zerocoins.[17] The Zerocash protocol uses a similar model (a variant known as a non-interactive zero-knowledge proof)[18] except that it can obscure the transaction amount, while Zerocoin cannot. Given significant restrictions of transaction data on the Zerocash network, Zerocash is less prone to privacy timing attacks when compared to Zerocoin. However, this additional layer of privacy can cause potentially undetected hyperinflation of Zerocash supply because fraudulent coins cannot be tracked.[19] [20]
In 2018, Bulletproofs were introduced. Bulletproofs are an improvement on non-interactive zero-knowledge proofs in that no trusted setup is needed.[21] They were later implemented in the Mimblewimble protocol (on which the Grin and Beam cryptocurrencies are based) and in the Monero cryptocurrency.[22] In 2019, Firo implemented the Sigma protocol, which is an improvement on the Zerocoin protocol without trusted setup.[23] [24] In the same year, Firo introduced the Lelantus protocol, an improvement on the Sigma protocol that hides the origin and amount of a transaction.[25]
Zero-knowledge proofs by their nature can enhance privacy in identity-sharing systems, which are vulnerable to data breaches and identity theft. When integrated into a decentralized identifier system, ZKPs add an extra layer of encryption on DID documents.[26]
Zero-knowledge proofs were first conceived in 1985 by Shafi Goldwasser, Silvio Micali, and Charles Rackoff in their paper "The Knowledge Complexity of Interactive Proof-Systems". This paper introduced the IP hierarchy of interactive proof systems (see interactive proof system) and conceived the concept of knowledge complexity, a measurement of the amount of knowledge about the proof transferred from the prover to the verifier. They also gave the first zero-knowledge proof for a concrete problem, that of deciding quadratic nonresidues mod $m$. Together with a paper by László Babai and Shlomo Moran, this landmark paper invented interactive proof systems, for which all five authors won the first Gödel Prize in 1993.
In their own words, Goldwasser, Micali, and Rackoff say:
Of particular interest is the case where this additional knowledge is essentially 0 and we show that [it] is possible to interactively prove that a number is quadratic non residue mod m releasing 0 additional knowledge. This is surprising as no efficient algorithm for deciding quadratic residuosity mod m is known when m’s factorization is not given. Moreover, all known NP proofs for this problem exhibit the prime factorization of m. This indicates that adding interaction to the proving process, may decrease the amount of knowledge that must be communicated in order to prove a theorem.
The quadratic nonresidue problem has both an NP and a co-NP algorithm, and so lies in the intersection of NP and co-NP. This was also true of several other problems for which zero-knowledge proofs were subsequently discovered, such as an unpublished proof system by Oded Goldreich verifying that a two-prime modulus is not a Blum integer.[27]
Oded Goldreich, Silvio Micali, and Avi Wigderson took this one step further, showing that, assuming the existence of unbreakable encryption, one can create a zero-knowledge proof system for the NP-complete graph coloring problem with three colors. Since every problem in NP can be efficiently reduced to this problem, this means that, under this assumption, all problems in NP have zero-knowledge proofs.[28] The reason for the assumption is that, as in the above example, their protocols require encryption. A commonly cited sufficient condition for the existence of unbreakable encryption is the existence of one-way functions, but it is conceivable that some physical means might also achieve it.
On top of this, they also showed that the graph nonisomorphism problem, the complement of the graph isomorphism problem, has a zero-knowledge proof. This problem is in co-NP, but is not currently known to be in either NP or any practical class. More generally, Russell Impagliazzo and Moti Yung as well as Ben-Or et al. would go on to show that, also assuming one-way functions or unbreakable encryption, there are zero-knowledge proofs for all problems in IP = PSPACE, or in other words, anything that can be proved by an interactive proof system can be proved with zero knowledge.[29] [30]
Not liking to make unnecessary assumptions, many theorists sought a way to eliminate the necessity of one-way functions. One way this was done was with multi-prover interactive proof systems (see interactive proof system), which have multiple independent provers instead of only one, allowing the verifier to "cross-examine" the provers in isolation to avoid being misled. It can be shown that, without any intractability assumptions, all languages in NP have zero-knowledge proofs in such a system.[31]
It turns out that, in an Internet-like setting, where multiple protocols may be executed concurrently, building zero-knowledge proofs is more challenging. The line of research investigating concurrent zero-knowledge proofs was initiated by the work of Dwork, Naor, and Sahai.[32] One particular development along these lines has been the development of witness-indistinguishable proof protocols. The property of witness-indistinguishability is related to that of zero-knowledge, yet witness-indistinguishable protocols do not suffer from the same problems of concurrent execution.[33]
Another variant of zero-knowledge proofs is the non-interactive zero-knowledge proof. Blum, Feldman, and Micali showed that a common random string shared between the prover and the verifier is enough to achieve computational zero-knowledge without requiring interaction.
The most popular interactive or non-interactive zero-knowledge proof (e.g., zk-SNARK) protocols can be broadly categorized in the following four categories: Succinct Non-Interactive ARguments of Knowledge (SNARK), Scalable Transparent ARgument of Knowledge (STARK), Verifiable Polynomial Delegation (VPD), and Succinct Non-interactive ARGuments (SNARG). A list of zero-knowledge proof protocols and libraries is provided below along with comparisons based on transparency, universality, plausible post-quantum security, and programming paradigm.[34] A transparent protocol is one that does not require any trusted setup and uses public randomness. A universal protocol is one that does not require a separate trusted setup for each circuit. Finally, a plausibly post-quantum protocol is one that is not susceptible to known attacks involving quantum algorithms.
ZKP system | Publication year | Protocol | Programming paradigm
Pinocchio[35] | 2013 | zk-SNARK | Procedural
Geppetto[36] | 2015 | zk-SNARK | Procedural
TinyRAM[37] | 2013 | zk-SNARK | Procedural
Buffet[38] | 2015 | zk-SNARK | Procedural
ZoKrates[39] | 2018 | zk-SNARK | Procedural
xJsnark[40] | 2018 | zk-SNARK | Procedural
vRAM[41] | 2018 | zk-SNARG | Assembly
vnTinyRAM[42] | 2014 | zk-SNARK | Procedural
MIRAGE[43] | 2020 | zk-SNARK | Arithmetic Circuits
Sonic[44] | 2019 | zk-SNARK | Arithmetic Circuits
Marlin[45] | 2020 | zk-SNARK | Arithmetic Circuits
PLONK[46] | 2019 | zk-SNARK | Arithmetic Circuits
SuperSonic[47] | 2020 | zk-SNARK | Arithmetic Circuits
Bulletproofs | 2018 | Bulletproofs | Arithmetic Circuits
Hyrax[48] | 2018 | zk-SNARK | Arithmetic Circuits
Halo[49] | 2019 | zk-SNARK | Arithmetic Circuits
Virgo[50] | 2020 | zk-SNARK | Arithmetic Circuits
Ligero[51] | 2017 | zk-SNARK | Arithmetic Circuits
Aurora[52] | 2019 | zk-SNARK | Arithmetic Circuits
zk-STARK[53] | 2019 | zk-STARK | Assembly
Zilch | 2021 | zk-STARK | Object-Oriented