Quadratic probing explained

Quadratic probing is an open addressing scheme in computer programming for resolving hash collisions in hash tables. Quadratic probing operates by taking the original hash index and adding successive values of an arbitrary quadratic polynomial until an open slot is found.

An example sequence using quadratic probing is:

H+12,H+22,H+32,H+42,...,H+k2

Quadratic probing is often recommended as an alternative to linear probing because it incurs less clustering.[1] Quadratic probing exhibits better locality of reference than many other hash table such as chaining; however, for queries, quadratic probing does not have as good locality as linear probing, causing the latter to be faster in some settings.[2]

Quadratic probing was first introduced by Ward Douglas Maurer in 1968.[3]

Quadratic function

Let h(k) be a hash function that maps an element k to an integer in [0, ''m''−1], where m is the size of the table. Let the ith probe position for a value k be given by the function

h(k,i)=h(k)+c1i+c2i2\pmod{m}

where c2 ≠ 0 (If c2 = 0, then h(k,i) degrades to a linear probe). For a given hash table, the values of c1 and c2 remain constant.

Examples:

h(k,i)=(h(k)+i+i2)\pmod{m}

, then the probe sequence will be

h(k),h(k)+2,h(k)+6,...

h(k),h(k)+1,h(k)+3,h(k)+6,...

(the triangular numbers) where the values increase by 1, 2, 3, ...

m=np

, where m, n, and p are integer greater or equal 2 (degrades to linear probe when p = 1), then

h(k,i)=(h(k)+i+ni2)\pmod{m}

gives cycle of all distinct probes. It can be computed in loop as:

h(k,0)=h(k)

, and

h(k,i+1)=(h(k,i)+2in+n+1)\pmod{m}

h(k,i)=h(k)+((i2+i)/2)\pmod{roundUp2(m)}

, and skip iteration when

h(k,i)>=m

. There is maximum

roundUp2(m)-m<m/2

skipped iterations, and these iterations do not refer to memory, so it is fast operation on most modern processors. Rounding up m can be computed by: uint64_t roundUp2(uint64_t v)

Limitations

Alternating signs

If the sign of the offset is alternated (e.g. +1, −4, +9, −16, etc.), and if the number of buckets is a prime number

p

congruent to 3 modulo 4 (e.g. 3, 7, 11, 19, 23, 31, etc.), then the first

p

offsets will be unique (modulo

p

). In other words, a permutation of 0 through

p-1

is obtained, and, consequently, a free bucket will always be found as long as at least one exists.

External links

Notes and References

  1. Book: Cormen . Thomas H. . Introduction to algorithms . Leiserson . Charles Eric . Rivest . Ronald Linn . Stein . Clifford . 2009 . MIT Press . 978-0-262-53305-8 . 3rd . Cambridge, Massachusetts London, England.
  2. Richter . Stefan . Alvarez . Victor . Dittrich . Jens . 2015 . A seven-dimensional analysis of hashing methods and its implications on query processing . Proceedings of the VLDB Endowment . 9 . 3 . 96–107 . 10.14778/2850583.2850585 . 2150-8097.
  3. Maurer . W. D. . 1968 . Programming Technique: An improved hash code for scatter storage . Communications of the ACM . 11 . 1 . 35–38 . 10.1145/362851.362880 . 0001-0782. free .
  4. The Art of Computer Science Volume 3 Sorting and Searching, Chapter 6.4, exercise 20, Donald Knuth