Distributed data flow

Distributed data flow (also abbreviated as distributed flow) refers to a set of events in a distributed application or protocol.

Distributed data flows serve a purpose analogous to variables or method parameters in programming languages such as Java, in that they can represent state that is stored or communicated by a layer of software. Unlike variables or parameters, which represent a unit of state that resides in a single location, distributed flows are dynamic and distributed: they appear in multiple locations within the network at the same time. As such, distributed flows are a more natural way of modeling the semantics and inner workings of certain classes of distributed systems. In particular, the distributed data flow abstraction has been used as a convenient way of expressing the high-level logical relationships between parts of distributed protocols.[1][2][3]

Informal properties

A distributed data flow satisfies the following informal properties.

Formal representation

Formally, we represent each event in a distributed flow as a quadruple of the form (x,t,k,v), where x is the location (e.g., the network address of a physical node) at which the event occurs, t is the time at which this happens, k is a version, or a sequence number identifying the particular event, and v is a value that represents the event payload (e.g., all the arguments passed in a method call). Each distributed flow is a (possibly infinite) set of such quadruples that satisfies the following three formal properties.

Beyond these, flows can have a number of additional properties.
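The quadruple representation (x,t,k,v) can be made concrete with a short sketch. The Python below is an illustration only, not code from the cited papers: the class name `Event`, the helper functions, and the sample node names and values are all hypothetical, and a flow is modeled simply as a finite set of events. It does not attempt to encode the formal properties of a flow.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Set


@dataclass(frozen=True)
class Event:
    """One event of a flow, mirroring the quadruple (x, t, k, v)."""
    x: str    # location, e.g. a node address (hypothetical naming)
    t: float  # time at which the event occurs
    k: int    # version or sequence number
    v: Any    # payload; must be hashable so events can live in a set


def locations(flow: Set[Event]) -> Set[str]:
    """All locations at which the flow has at least one event."""
    return {e.x for e in flow}


def events_at(flow: Set[Event], x: str) -> List[Event]:
    """Events of the flow at location x, ordered by time, then version."""
    return sorted((e for e in flow if e.x == x), key=lambda e: (e.t, e.k))


def latest_value(flow: Set[Event], x: str, t: float) -> Optional[Any]:
    """Payload of the most recent event at x that occurred at or before t."""
    seen = [e for e in events_at(flow, x) if e.t <= t]
    return seen[-1].v if seen else None


# A tiny made-up flow: the same logical flow has events at two locations.
flow = {
    Event("node-a", 1.0, 1, "hello"),
    Event("node-b", 1.5, 1, "hello"),
    Event("node-a", 2.0, 2, "world"),
}

print(sorted(locations(flow)))
print(latest_value(flow, "node-a", 1.7))
```

Treating the flow as a plain set keeps the sketch close to the set-of-quadruples definition. A real system would observe events incrementally rather than hold the whole set in memory.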

Notes and References

  1. Ostrowski, K., Birman, K., Dolev, D., and Sakoda, C. (2009). "Implementing Reliable Event Streams in Large Systems via Distributed Data Flows and Recursive Delegation", 3rd ACM International Conference on Distributed Event-Based Systems (DEBS 2009), Nashville, TN, USA, July 6–9, 2009, http://www.cs.cornell.edu/~krzys/krzys_debs2009.pdf
  2. Ostrowski, K., Birman, K., and Dolev, D. (2009). "Distributed Data Flow Language for Multi-Party Protocols", 5th ACM SIGOPS Workshop on Programming Languages and Operating Systems (PLOS 2009), Big Sky, MT, USA. October 11, 2009, http://www.cs.cornell.edu/~krzys/krzys_plos2009.pdf
  3. Ostrowski, K., Birman, K., and Dolev, D. (2009). "Programming Live Distributed Objects with Distributed Data Flows", Submitted to the International Conference on Object Oriented Programming, Systems, Languages and Applications (OOPSLA 2009), http://www.cs.cornell.edu/~krzys/krzys_oopsla2009.pdf
  4. De Francesco, N., Perego, G., Vaglini, G., and Vanneschi, M. (1980). "A framework for data-flow distributed processing", Calcolo, 17 (4), pp. 333–363, doi:10.1007/BF02578622, ISSN 1126-5434
  5. Reif, J. H., and Smolka, S. A. (1990). "Data Flow Analysis of Distributed Communicating Processes", International Journal of Parallel Programming, 19 (1), pp. 1–30, doi:10.1007/BF01407862
  6. Gallizzi, E., and Zondervan, Q. (1992). "Distributed data flow computing system", Proceedings of the 30th Annual Southeast Regional Conference (ACM-SE 30), ACM Press, p. 421, doi:10.1145/503720.503770, ISBN 978-0-89791-506-9, http://portal.acm.org/citation.cfm?doid=503720.503770