Graph removal lemma explained
In graph theory, the graph removal lemma states that when a graph contains few copies of a given subgraph, then all of the copies can be eliminated by removing a small number of edges. The special case in which the subgraph is a triangle is known as the triangle removal lemma.
The graph removal lemma can be used to prove Roth's theorem on 3-term arithmetic progressions, and a generalization of it, the hypergraph removal lemma, can be used to prove Szemerédi's theorem. It also has applications to property testing.
Formulation
Let
be a graph with
vertices. The graph removal lemma states that for any
, there exists a constant
\delta=\delta(\epsilon,H)>0
such that for any
-vertex graph
with fewer than
subgraphs isomorphic to
, it is possible to eliminate all copies of
by removing at most
edges from
.
An alternative way to state this is to say that for any
-vertex graph
with
subgraphs isomorphic to
, it is possible to eliminate all copies of
by removing
edges from
. Here, the
indicates the use of
little o notation.
In the case when
is a triangle, resulting lemma is called
triangle removal lemma.
History
The original motivation for the study of triangle removal lemma was the Ruzsa–Szemerédi problem. Its initial formulation due to Imre Z. Ruzsa and Szemerédi from 1978 was slightly weaker than the triangle removal lemma used nowadays and can be roughly stated as follows: every locally linear graph on
vertices contains
edges. This statement can be quickly deduced from a modern triangle removal lemma. Ruzsa and Szemerédi provided also an alternative proof of
Roth's theorem on arithmetic progressions as a simple corollary.
In 1986, during their work on generalizations of the Ruzsa–Szemerédi problem to arbitrary
-uniform graphs,
Erdős,
Frankl, and
Rödl provided a statement for general graphs very close to the modern graph removal lemma: if graph
is a
homomorphic image of
, then any
-free graph
on
vertices can be made
-free by removing
edges.
The modern formulation of the graph removal lemma was first stated by Füredi in 1994. The proof generalized earlier approaches by Ruzsa and Szemerédi and Erdős, Frankl, and Rödl, also using the Szemerédi regularity lemma.
Graph counting lemma
A key component of the proof of the graph removal lemma is the graph counting lemma about counting subgraphs in systems of regular pairs. The graph counting lemma is also very useful on its own. According to Füredi, it is used "in most applications of regularity lemma".
Heuristic argument
Let
be a graph on
vertices, whose vertex set is
and edge set is
. Let
be sets of vertices of some graph
such that, for all
, the pair
is
-regular (in the sense of the regularity lemma). Let also
be the density between the sets
and
. Intuitively, a regular pair
with density
should behave like a random
Erdős–Rényi-like graph, where every pair of vertices
is selected to be an edge independently with probability
. This suggests that the number of copies of
on vertices
such that
should be close to the expected number from the Erdős–Rényi model,
where
and
are the edge set and the vertex set of
.
Precise statement
The straightforward formalization of the above heuristic claim is as follows. Let
be a graph on
vertices, whose vertex set is
and whose edge set is
. Let
be arbitrary. Then there exists
such that for any
as above, satisfying
for all
, the number of
graph homomorphisms from
to
such that vertex
is mapped to
is not smaller than
Blow-up Lemma
One can even find bounded-degree subgraphs of blow-ups of
in a similar setting. The following claim appears in the literature under name of the
blow-up lemma and was first proven by
Komlós,
Sárközy, and Szemerédi. The precise statement here is a slightly simplified version due to Komlós, who referred to it also as the key lemma, as it is used in numerous regularity-based proofs.
Let
be an arbitrary graph and let
. Construct
by replacing each vertex
of
by an independent set
of size
and replacing every edge
of
by the complete
bipartite graph on
. Let
be arbitrary reals, let
be a positive integer, and let
be a subgraph of
with
vertices and maximum degree
. Define
| \Delta/(2+\Delta) |
\epsilon | |
| 0=\delta |
. Finally, let
be a graph and
be disjoint sets of vertices of
such that, whenever
, then
is a
-regular pair with density at least
. Then if
and
, then the number of injective graph homomorphisms from
to
is at least
.
In fact, one can restrict to counting only those homomorphisms such that any vertex
of
with
is mapped to a vertex in
.
Proof
We will provide a proof of the counting lemma in the case when
is a triangle (
triangle counting lemma). The proof of the general case, as well as the proof of the blow-up lemma, are very similar and do not require different techniques.
Take
. Let
be the set of those vertices in
which have at least
neighbors in
and at least
neighbors in
. Note that if there were more than
vertices in
with less than
neighbors in
, then these vertices together with the whole
would witness
-irregularity of the pair
. Repeating this argument for
shows that we must have
. Now take an arbitrary
and define
and
as neighbors of
in
and
, respectively. By definition,
|X2'|\geq(d12-\epsilon)|X2|\geq\epsilon|X2|
and
, so by the regularity of
we obtain existence of at least
triangles containing
. Since
was chosen arbitrarily from the set
of size at least
, we obtain a total of at least
which finishes the proof as
.
Proof
Proof of the triangle removal lemma
To prove the triangle removal lemma, consider an
-regular partition
of the vertex set of
. This exists by the Szemerédi regularity lemma. The idea is to remove all edges between irregular pairs, low-density pairs, and small parts, and prove that if at least one triangle still remains, then many triangles remain. Specifically, remove all edges between parts
and
if This procedure removes at most
edges. If there exists a triangle with vertices in
after these edges are removed, then the triangle counting lemma tells us there are at least
triples in
which form a triangle. Thus, we may take
Proof of the graph removal lemma
The proof of the case of general
is analogous to the triangle case, and uses the graph counting lemma instead of the triangle counting lemma.
Induced Graph Removal Lemma
A natural generalization of the graph removal lemma is to consider induced subgraphs. In property testing, it is often useful to consider how far a graph is from being induced -free. A graph
is considered to contain an induced subgraph
if there is an injective map
such that
is an edge of
if and only if
is an edge of
. Notice that non-edges are considered as well.
is induced
-free if there is no induced subgraph
. We define
as
-far from being induced
-free if we cannot add or delete
edges to make
induced
-free.
Formulation
A version of the graph removal lemma for induced subgraphs was proved by Alon, Fischer, Krivelevich, and Szegedy in 2000. It states that for any graph
with
vertices and
, there exists a constant
such that, if an
-vertex graph
has fewer than
induced subgraphs isomorphic to
, then it is possible to eliminate all induced copies of
by adding or removing fewer than
edges.
The problem can be reformulated as follows: Given a red-blue coloring
of the complete graph
(analogous to the graph
on the same
vertices where non-edges are blue and edges are red), and a constant
, then there exists a constant
such that for any red-blue colorings of
has fewer than
subgraphs isomorphic to
, then it is possible to eliminate all copies of
by changing the colors of fewer than
edges. Notice that our previous "cleaning" process, where we remove all edges between irregular pairs, low-density pairs, and small parts, only involves removing edges. Removing edges only corresponds to changing edge colors from red to blue. However, there are situations in the induced case where the optimal edit distance involves changing edge colors from blue to red as well. Thus, the regularity lemma is insufficient to prove the induced graph removal lemma. The proof of the induced graph removal lemma must take advantage of the
strong regularity lemma.
Proof
Strong Regularity Lemma
The strong regularity lemma is a strengthened version of Szemerédi's regularity lemma. For any infinite sequence of constants
\epsilon0\ge\epsilon1\ge...>0
, there exists an integer
such that for any graph
, we can obtain two (equitable) partitions
and
such that the following properties are satisfied:
refines
; that is, every part of
is the union of some collection of parts in
.
is
-regular and
is
-regular.
q(l{Q})<q(l{P})+\epsilon0
The function
is defined to be the
energy function defined in
Szemerédi regularity lemma. Essentially, we can find a pair of partitions
where
is regular compared to
, and at the same time
are close to each other. This property is captured in the third condition.
Corollary of the Strong Regularity Lemma
The following corollary of the strong regularity lemma is used in the proof of the induced graph removal lemma. For any infinite sequence of constants
\epsilon0\ge\epsilon1\ge...>0
, there exists
such that there exists a partition
and subsets
for each
where the following properties are satisfied:
is
-regular for each pair
|d(Wi,Wj)-d(Vi,Vj)|\le\epsilon0
for all but
pairs
The main idea of the proof of this corollary is to start with two partitions
and
that satisfy the Strong Regularity Lemma where
| 3/8 |
q(l{Q})<q(l{P})+\epsilon | |
| 0 |
. Then for each part
, we uniformly at random choose some part
that is a part in
. The expected number of irregular pairs
is less than 1. Thus, there exists some collection of
such that every pair is
-regular!
The important aspect of this corollary is that pair of
is
-regular! This allows us to consider edges and non-edges when we perform our cleaning argument.
Proof Sketch of the Induced Graph Removal Lemma
With these results, we are able to prove the induced graph removal lemma. Take any graph
with
vertices that has less than
copies of
. The idea is to start with a collection of vertex sets
which satisfy the conditions of the
Corollary of the Strong Regularity Lemma. We then can perform a "cleaning" process where we remove all edges between pairs of parts
with low density, and we can add all edges between pairs of parts
with high density. We choose the density requirements such that we added/deleted at most
edges.
If the new graph has no copies of
, then we are done. Suppose the new graph has a copy of
. Suppose the vertex
is embedded in
. Then if there is an edge connecting
in
, then
does not have low density. (Edges between
were not removed in the cleaning process.) Similarly, if there is not an edge connecting
in
, then
does not have high density. (Edges between
were not added in the cleaning process.)
Thus, by a similar counting argument to the proof of the triangle counting lemma (that is, the graph counting lemma), we can show that
has more than
copies of
.
Generalizations
The graph removal lemma was later extended to directed graphs and to hypergraphs.
Quantitative bounds
The usage of the regularity lemma in the proof of the graph removal lemma forces
to be extremely small, bounded by a
tower function whose height is polynomial in
; that is,
\delta=1/tower(\epsilon-O(1))
(here
is the tower of twos of height
). A tower function of height
is necessary in all regularity proofs, as is implied by results of
Gowers on lower bounds in the regularity lemma. However, in 2011,
Fox provided a new proof of the graph removal lemma which does not use the regularity lemma, improving the bound to
\delta=1/tower(5h2log\epsilon-1)
(here
is the number of vertices of the removed graph
). His proof, however, uses regularity-related ideas such as energy increment, but with a different notion of energy, related to
entropy. This proof can be also rephrased using the Frieze-Kannan weak regularity lemma as noted by Conlon and Fox. In the special case of bipartite
, it was shown that
is sufficient.
There is a large gap between the available upper and lower bounds for
in the general case. The current best result true for all graphs
is due to Alon and states that, for each nonbipartite
, there exists a constant
such that
is necessary for the graph removal lemma to hold, while for bipartite
, the optimal
has polynomial dependence on
, which matches the lower bound. The construction for the nonbipartite case is a consequence of
Behrend construction of large Salem-Spencer sets. Indeed, as the triangle removal lemma implies
Roth's theorem, existence of large Salem-Spencer sets may be translated to an upper bound for
in the triangle removal lemma. This method can be leveraged for arbitrary nonbipartite
to give the aforementioned bound.
Applications
Property testing
See also