Version vector explained

A version vector is a mechanism for tracking changes to data in a distributed system, where multiple agents might update the data at different times. The version vector allows the participants to determine if one update preceded another (happened-before), followed it, or if the two updates happened concurrently (and therefore might conflict with each other). In this way, version vectors enable causality tracking among data replicas and are a basic mechanism for optimistic replication. In mathematical terms, the version vector generates a preorder that tracks the events that precede, and may therefore influence, later updates.

Version vectors maintain state identical to that in a vector clock, but the update rules differ slightly; in this example, replicas can either experience local updates (e.g., the user editing a file on the local node), or can synchronize with another replica:

Va[x]=Vb[x]=max(Va[x],Vb[x])

. After synchronization, the two replicas have identical version vectors.

Pairs of replicas,,, can be compared by inspecting their version vectors and determined to be either: identical (

a=b

), concurrent (

a\parallelb

), or ordered (

a<b

or

b<a

). The ordered relation is defined as: Vector

a<b

if and only if every element of

Va

is less than or equal to its corresponding element in

Vb

, and at least one of the elements is strictly less than. If neither

a<b

or

b<a

, but the vectors are not identical, then the two vectors must be concurrent.

Version vectors[1] or variants are used to track updates in many distributed file systems, such as Coda (file system) and Ficus, and are the main data structure behind optimistic replication.[2]

Other mechanisms

External links

Notes and References

  1. Douglas Parker, Gerald Popek, Gerard Rudisin, Allen Stoughton, Bruce Walker, Evelyn Walton, Johanna Chow, David Edwards, Stephen Kiser, and Charles Kline. Detection of mutual inconsistency in distributed systems. Transactions on Software Engineering. 1983
  2. David Ratner, Peter Reiher, and Gerald Popek. Dynamic version vector maintenance. Technical Report CSD-970022, Department of Computer Science, University of California, Los Angeles, 1997
  3. ByungHoon Kang, Robert Wilensky, and John Kubiatowicz.The Hash History Approach for Reconciling Mutual Inconsistency. ICDCS, pp. 670-677, IEEE Computer Society, 2003.
  4. [Dahlia Malkhi]
  5. Paulo Almeida, Carlos Baquero and Victor Fonte. Version Stamps: Decentralized Version Vectors. ICDCS, pp. 544-551, 2002.
  6. Paulo Almeida, Carlos Baquero and Victor Fonte. Interval Tree Clocks. OPODIS, Lecture Notes in Computer Science, Vol. 5401, pp. 259-274, Springer, 2008.
  7. José Almeida, Paulo Almeida and Carlos Baquero. Bounded Version Vectors. DISC: International Symposium on Distributed Computing, LNCS, 2004.
  8. Nuno Preguiça, Carlos Baquero, Paulo Almeida, Victor Fonte and Ricardo Gonçalves. Brief Announcement: Efficient Causality Tracking in Distributed Storage Systems With Dotted Version Vectors. ACM PODC, pp. 335-336, 2012.