Simplex tree explained

In topological data analysis, a simplex tree is a type of trie used to represent efficiently any general simplicial complex. Through its nodes, this data structure notably explicitly represents all the simplices. Its flexible structure allows the implementation of many basic operations useful to computing persistent homology. This data structure was invented by Jean-Daniel Boissonnat and Clément Maria in 2014, in the article The Simplex Tree: An Efficient Data Structure for General Simplicial Complexes.[1] This data structure offers efficient operations on sparse simplicial complexes. For dense or maximal simplices, Skeleton-Blocker[2] representations or Toplex Map[3] representations are used.

Definitions

Many researchers in topological data analysis consider the simplex tree to be the most compact simplex-based data structure for simplicial complexes, and a data structure allowing an intuitive understanding of simplicial complexes due to integrated usage of their mathematical properties.[4]

Heuristic definition

Consider any simplicial complex is a set composed of points (0 dimensions), line segments (1 dimension), triangles (2 dimensions), and their n-dimensional counterparts, called n-simplexes within a topological space. By the mathematical properties of simplexes, any n-simplex is composed of multiple

(n-1)

-simplexes. Thus, lines are composed of points, triangles of lines, and tetrahedrons of triangles. Notice each higher level adds 1 vertex to the vertices of the n-simplex. The data structure is simplex-based, therefore, it should represent all simplexes uniquely by the points defining the simplex. A simple way to achieve this is to define each simplex by its points in sorted order.

Let

\Kappa

be a simplicial complex of dimension k,

V

its vertex set, where vertices are labeled from 1 to

\left\vertV\right\vert

and ordered accordingly. Now, construct a dictionary size

\left\vertV\right\vert

containing all vertex labels in order. This represents the 0-dimensional simplexes. Then, for the path to the initial dictionary of each entry in the initial dictionary, add as a child dictionary all vertices fully-connected to the current set of vertices, all of which have a label greater than

l

. Represent this step on k levels. Clearly, considering the first dictionary as depth 0, any entry at depth

\tau

of any dictionary in this data structure uniquely represents a

\tau

-simplex within

\Kappa

. For completeness, the point to the initial dictionary is considered the representation of the empty simplex. For the practicality of the operations, labels that are repeated on the same level are linked together, forming a looped linked list. Finally, child dictionaries also have pointers to their parent dictionary, for fast ancestor access.

Constructive definition

Let

\Kappa

be a simplicial complex of dimension k. We begin by decomposing the simplicial complex into mutually exclusive simplexes. This can be achieved in a greedy way by iteratively removing from the simplicial complex the highest order simplexes until the simplicial complex is empty. We then need to label each vertex from 1 to

\left\vertV\right\vert

and associate each simplex with its corresponding "word", that is the ordered list of its vertices by label. Ordering the labels ensures no repetition in the simplex tree, as there is only one way to describe a simplex. We start with a null root, representing the null simplex. Then, we iterate through all simplexes, and through each label of each simplex word. If the label is available as a child to the current root, make that child the temporary root of the insertion process, otherwise, create a new node for the child, make it the new temporary root, and continue with the rest of the word. During this process, k dictionaries are maintained with all the labels and insert the address of the node for the corresponding label. If an address is already at that space in the dictionary, a pointer is created from the old node to the new node. Once the process is finished, all children of each node are entered into a dictionary, and all pointers are looped to make looped linked lists. A wide range of dictionaries could be applied here, like hash tables, but some operations assume the possibility of an ordered traversal of the entries, leading most of the implementations to use red-black trees are dictionaries.

Operations

While simplex trees are not the most space efficient data structures for simplicial complex representation, their operations on sparse data are considered state-of-art. Here, we give the bounds of different useful operations possible through this representation. Many implementations of these operations are available.[4] [5] [6] [7]

We first introduce the notation. Consider

s

is a given simplex,

\sigma

is a given node corresponding to the last vertex of

s

,

l

is the label associate to that node,

j

is the depth of that node,

k

is the dimension of the simplicial complex,

D\sigma

is the maximal number of operations to access

\sigma

in a dictionary (if the dictionary is a red-black tree,

D\sigma=O(log(deg(\sigma)))

is the complexity) . Consider

Cs

is the number of cofaces of

s

, and
>j
N
l
is the number of nodes of the simplex tree ending with the label

l

at depth greater than

j

. Notice
>j
N
l

\leqCs

.
  1. Search, insert and remove words are done in

O(jD\sigma)

.
  1. Insert and remove an entire simplex is done in

O(2jD\sigma)

.
  1. Computing persistent homology, or in a more involved way, computing Betti numbers, using a simplex tree most efficiently remains an open problem, however, current algorithms for this task on sparse simplicial complexes achieve state-of-art performance.[4]
  2. The structure of simplex trees allows for elementary collapse of collapsible simplexes, however the bounds of this operation in the general case are unknown.[5] [7]
  3. A subcase of elementary collapse is edge-contraction. Edge contraction can be

achieved in

O(k

>j
N
l

+CsD\sigma)

.
  1. Locating cofaces of given simplex can be achieved in

O(k

>j
N
l

)

.
  1. Locating cofacets of given simplex can be achieved in

O(j2D\sigma)

.

As for construction, as seen in the constructive definition, construction is proportional to the number and complexity of simplexes in the simplicial complex. This can be especially expensive if the simplicial complex is dense. However, some optimizations for particular simplicial complexes, including for Flag complexes, Rips complexes and Witness complexes.[8]

Applications

Simplex trees are efficient in sparse simplicial complexes. For this purpose, many persistent homology algorithms focusing on high-dimensional real data (often sparse) use simplex trees within these algorithms. While simplex trees are not as efficient as incidence matrices, their simplex-based structure allows them to be useful and efficient for simplicial complex storage within persistent homology algorithms.[9]

Notes and References

  1. Boissonnat. Jean-Daniel. Maria. Clément. November 2014. The Simplex Tree: an Efficient Data Structure for General Simplicial Complexes. Algorithmica. 70. 3. 406–427. 10.1007/s00453-014-9887-3. 2001.02581. 15335393. 0178-4617.
  2. Web site: Salinas. David. 7 February 2020. Skeleton-Blocker. 9 December 2021. Geometry Understanding in Higher Dimensions.
  3. Web site: Godi. Francois. 7 February 2020. Toplex Map. 9 December 2021. Geometry Understanding in Higher Dimensions.
  4. Web site: Boissonnat. Jean-Daniel. Simplex tree reference manual. 9 December 2021. Geometry Understanding in Higher Dimensions.
  5. Web site: Piekenbrock. Matt. 13 September 2020. simplex_tree: Simplex Tree. 9 December 2021. rdrr.io.
  6. Web site: Nanda. Vidit. Perseus, the Persistent Homology Software. 9 December 2021. The Perseus Software Project for Rapid Computation of Persistent Homology.
  7. Web site: Morozov. Dmitriy. 2019. Basics. 9 December 2021. Dionysus 2.
  8. Web site: Morozov. Dmitriy. 2019. Vietoris–Rips Complexes. 9 December 2021. Dionysus 2.
  9. Mandal. Sayan. 2020. Applications of Persistent Homology and Cycles. The Ohio State University. PHD Thesis.