A locally decodable code (LDC) is an error-correcting code that allows a single bit of the original message to be decoded with high probability by only examining (or querying) a small number of bits of a possibly corrupted codeword.[1] [2] [3] This property could be useful, say, in a context where information is being transmitted over a noisy channel, and only a small subset of the data is required at a particular time and there is no need to decode the entire message at once. Locally decodable codes are not a subset of locally testable codes, though there is some overlap between the two.[4]
Codewords are generated from the original message using an algorithm that introduces a certain amount of redundancy into the codeword; thus, the codeword is always longer than the original message. This redundancy is distributed across the codeword and allows the original message to be recovered with good probability even in the presence of errors. The more redundant the codeword, the more resilient it is against errors, and the fewer queries required to recover a bit of the original message.
More formally, a
(q,\delta,\epsilon)
n
x
N
C(x)
xi
1-\epsilon
q
C(x)
\deltaN
Furthermore, a perfectly smooth local decoder is a decoder such that, in addition to always generating the correct output given access to an uncorrupted codeword, for every
j\in[q]
i\in[n]
jth
ith
[N]
[y]
\{1,\ldots,y\}
Local list decoders are another interesting subset of local decoders. List decoding is useful when a codeword is corrupted in more than
\delta/2
\delta
\delta
\epsilon
\epsilon
\delta
\epsilon
Locally decodable codes can also be concatenated, where a message is encoded first using one scheme, and the resulting codeword is encoded again using a different scheme. (Note that, in this context, concatenation is the term used by scholars to refer to what is usually called composition; see [5]). This might be useful if, for example, the first code has some desirable properties with respect to rate, but it has some undesirable property, such as producing a codeword over a non-binary alphabet. The second code can then transform the result of the first encoding over a non-binary alphabet to a binary alphabet. The final encoding is still locally decodable, and requires additional steps to decode both layers of encoding.
The rate of a code refers to the ratio between its message length and codeword length:
|x| | |
|C(x)| |
The rate of a code is inversely related to the query complexity, but the exact shape of this tradeoff is a major open problem.[7] [8] It is known that there are no LDCs that query the codeword in only one position, and that the optimal codeword size for query complexity 2 is exponential in the size of the original message.[7] However, there are no known tight lower bounds for codes with query complexity greater than 2. Approaching the tradeoff from the side of codeword length, the only known codes with codeword length proportional to message length have query complexity
k\epsilon
\epsilon>0
Locally decodable codes have applications to data transmission and storage, complexity theory, data structures, derandomization, theory of fault tolerant computation, and private information retrieval schemes.[8]
Locally decodable codes are especially useful for data transmission over noisy channels. The Hadamard code (a special case of Reed Muller codes) was used in 1971 by Mariner 9 to transmit pictures of Mars back to Earth. It was chosen over a 5-repeat code (where each bit is repeated 5 times) because, for roughly the same number of bits transmitted per pixel, it had a higher capacity for error correction. (The Hadamard code falls under the general umbrella of forward error correction, and just happens to be locally decodable; the actual algorithm used to decode the transmission from Mars was a generic error-correction scheme.)[9]
LDCs are also useful for data storage, where the medium may become partially corrupted over time, or the reading device is subject to errors. In both cases, an LDC will allow for the recovery of information despite errors, provided that there are relatively few. In addition, LDCs do not require that the entire original message be decoded; a user can decode a specific portion of the original message without needing to decode the entire thing.[10]
One of the applications of locally decodable codes in complexity theory is hardness amplification. Using LDCs with polynomial codeword length and polylogarithmic query complexity, one can take a function
L:\{0,1\}n → \{0,1\}
L':\{0,1\}N → \{0,1\}
Consider
L
t
L
2t
L(x)
x\in\{0,1\}t
C
L
2O(t)=2t'
L'
t'
L'
L'
1-\epsilon
L'
L
L'
L
L
A private information retrieval scheme allows a user to retrieve an item from a server in possession of a database without revealing which item is retrieved. One common way of ensuring privacy is to have
k
One can easily see that locally decodable codes have applications in this setting. A general procedure to produce a
k
k
Let
C
n
N
k
S1,\ldots,Sk
n
x
C
N
C(x)
ith
x
k
q1,\ldotsqk
xi
C(x) | |
q1 |
,\ldots
C(x) | |
qk |
A
C
A
xi
qj
The Hadamard (or Walsh-Hadamard) code is an example of a simple locally decodable code that maps a string of length
k
2k
x\in\{0,1\}k
aj\in\{0,1\}k
jth
x\odotaj
x\odoty=
k | |
\sum\limits | |
i=1 |
xiyi
n | |
2 |
The local decoding algorithm has query complexity 2, and the entire original message can be decoded with good probability if the codeword is corrupted in less than
1 | |
4 |
\rho<
1 | |
4 |
\rho
ith
1-2\rho
Proof: Given a codeword
H
i
ith
x
Let
ej
\{0,1\}k
jth
y\in\{0,1\}k
f(y)
H
x\odoty
y\in\{0,1\}k
y'=y ⊕ ei
⊕
f(y) ⊕ f(y')
Correctness: By linearity,
(x\odoty) ⊕ (x\odoty')=(x\odoty) ⊕ (x\odot(y ⊕ ei))=(x\odoty) ⊕ (x\odoty) ⊕ (x\odotei)=x\odotei
But
(x\odotei)=xi
f(y)=x\odoty
f(y')=x\odoty'
Since
y
y'
f(y)=x\odoty
f(y')=x\odoty'
1-2\rho
The main idea behind local decoding of Reed-Muller codes is polynomial interpolation. The key concept behind a Reed-Muller code is a multivariate polynomial of degree
d
l
S
x
S
x
x
More formally, let
F
l,d
d<|F|
F,l,d
F\binom{l+d{d}} →
|F|l | |
F |
l
P
F
d
P
Fl
P(x1,\ldots,xl)=
\sum\limits | |
i1+\ldots+il\led |
c | |
i1,\ldots,il |
i1 | |
x | |
1 |
i2 | |
x | |
2 |
…
il | |
x | |
l |
\binom{l+d}{d}
\{P(x1,\ldots,xl)\}
x1,\ldots,xl\inF
To recover the value of a degree
d
w\inFn
w
d+1
w
v\inFn
L=\{w+λv\midλ\inF\}
w
S
F
|S|=d+1
w+λv
λ\inS
\{eλ\}
h
d
h(λ)=eλ
λ\inS
w
h(0)
w
Each individual query is distributed uniformly at random over the codeword. Thus, if the codeword is corrupted in at most a
\delta
1-(d+1)\delta