Square-free word explained
In combinatorics, a square-free word is a word (a sequence of symbols) that does not contain any squares. A square is a word of the form, where is not empty. Thus, a square-free word can also be defined as a word that avoids the pattern .
Finite square-free words
Binary alphabet
, the only square-free words are the empty word
, and
.
Ternary alphabet
Over a ternary alphabet
, there are infinitely many square-free words. It is possible to count the number
of ternary square-free words of length .01 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| 1 | 3 | 6 | 12 | 18 | 30 | 42 | 60 | 78 | 108 | 144 | 204 | 264 | |
---|
This number is bounded by
, where .[1] The upper bound on
can be found via Fekete's Lemma and approximation by automata. The lower bound can be found by finding a substitution that preserves square-freeness.Alphabet with more than three letters
Since there are infinitely many square-free words over three-letter alphabets, this implies there are also infinitely many square-free words over an alphabet with more than three letters.
The following table shows the exact growth rate of the -ary square-free words, rounded off to 7 digits after the decimal point, for in the range from 4 to 15:
45 | 6 | 7 | 8 | 9 |
growth rate | 2.6215080 | 3.7325386 | 4.7914069 | 5.8284661 | 6.8541173 | 7.8729902 |
---|
alphabet size | 10 | 11 | 12 | 13 | 14 | 15 |
---|
growth rate | 8.8874856 | 9.8989813 | 10.9083279 | 11.9160804 | 12.9226167 | 13.9282035 | |
---|
2-dimensional words
Consider a map
from
to, where is an alphabet and
is called a 2-dimensional word. Let
be the entry
. A word
is a line of
if there exists
such that
, and for
t\ge0,xt=
+{j1t},{i2}+{j2t}}
.
Carpi[2] proves that there exists a 2-dimensional word
over a 16-letter alphabet such that every line of
is square-free. A computer search shows that there are no 2-dimensional words
over a 7-letter alphabet, such that every line of
is square-free.
Generating finite square-free words
Shur[3] proposes an algorithm called R2F (random-t(w)o-free) that can generate a square-free word of length over any alphabet with three or more letters. This algorithm is based on a modification of entropy compression: it randomly selects letters from a k-letter alphabet to generate a -ary square-free word. algorithm R2F is input: alphabet size
, word length
output: a -ary square-free word of length . choose
in
uniformly at random
set
to
followed by all other letters of
in increasing order
set the number of iterations
to 0
while
do choose in
uniformly at random append
to the end of update
shifting the first elements to the right and setting
increment by
if ends with a square of rank
then delete the last letters of
return Every (k+1)-ary square-free word can be the output of Algorithm R2F, because on each iteration it can append any letter except for the last letter of .
The expected number of random k-ary letters used by Algorithm R2F to construct a -ary square-free word of length isNote that there exists an algorithm that can verify the square-freeness of a word of length in
time. Apostolico and Preparata
[4] give an algorithm using suffix trees. Crochemore
[5] uses partitioning in his algorithm. Main and Lorentz
[6] provide an algorithm based on the divide-and-conquer method. A naive implementation may require
time to verify the square-freeness of a word of length .
Infinite square-free words
There exist infinitely long square-free words in any alphabet with three or more letters, as proved by Axel Thue.[7]
Examples
One example of an infinite square-free word over an alphabet of size 3 is the word over the alphabet
obtained by taking the first difference of the
Thue–Morse sequence. That is, from the Thue–Morse sequence
0,1,1,0,1,0,0,1,1,0,0,1,0,1,1,0...
one forms a new sequence in which each term is the difference of two consecutive terms of the Thue–Morse sequence. The resulting square-free word is
1,0,-1,1,-1,0,1,0,-1,0,1,-1,1,0,-1,...
.
Another example found by John Leech[8] is defined recursively over the alphabet
. Let be any square-free word starting with the letter . Define the words
recursively as follows: the word
is obtained from by replacing each in with, each with, and each with . It is possible to prove that the sequence converges to the infinite square-free word
Generating infinite square-free words
Infinite square-free words can be generated by square-free morphism. A morphism is called square-free if the image of every square-free word is square-free. A morphism is called k–square-free if the image of every square-free word of length k is square-free.
Crochemore[9] proves that a uniform morphism is square-free if and only if it is 3-square-free. In other words, is square-free if and only if
is square-free for all square-free of length 3. It is possible to find a square-free morphism by
brute-force search.
algorithm square-free_morphism
is output: a square-free morphism with the lowest possible rank .
set
while True
do set k_sf_words to the list of all square-free words of length over a ternary alphabet
for each
in k_sf_words do for each
in k_sf_words do for each
in k_sf_words do if
then break from the current loop (advance to next
)
if
and
then if
is square-free
for all square-free of length
then return
increment by Over a ternary alphabet, there are exactly 144 uniform square-free morphisms of rank 11 and no uniform square-free morphisms with a lower rank than 11.
To obtain an infinite square-free words, start with any square-free word such as, and successively apply a square-free morphism to it. The resulting words preserve the property of square-freeness. For example, let be a square-free morphism, then as
,
is an infinite square-free word.
Note that, if a morphism over a ternary alphabet is not uniform, then this morphism is square-free if and only if it is 5-square-free.
Letter combinations in square-free words
Avoid two-letter combinations
Over a ternary alphabet, a square-free word of length more than 13 contains all the square-free two-letter combinations.[10]
This can be proved by constructing a square-free word without the two-letter combination . As a result, is the longest square-free word without the combination and its length is equal to 13.
Note that over a more than three-letter alphabet there are square-free words of any length without an arbitrary two-letter combination.
Avoid three-letter combinations
Over a ternary alphabet, a square-free word of length more than 36 contains all the square-free three-letter combinations.
However, there are square-free words of any length without the three-letter combination .
Note that over a more than three-letter alphabet there are square-free words of any length without an arbitrary three-letter combination.
Density of a letter
The density of a letter in a finite word is defined as
where
is the number of occurrences of in
and
is the length of the word. The density of a letter in an infinite word is
where
is the prefix of the word of length .
[11] The minimal density of a letter in an infinite ternary square-free word is equal to
.
The maximum density of a letter in an infinite ternary square-free word is equal to
.
[12] Notes
- Shur. Arseny. 2011. Growth properties of power-free languages. Computer Science Review. 6. 5–6. 28–43. 10.1016/j.cosrev.2012.09.001.
- Carpi. Arturo. 1988. Multidimensional unrepetitive configurations. Theoretical Computer Science. 56. 2. 233–241. 10.1016/0304-3975(88)90080-1. 0304-3975. free.
- Shur. Arseny. 2015. Generating square-free words efficiently. Theoretical Computer Science. 601. 67–72. 10.1016/j.tcs.2015.07.027. free. 10995/92700. free.
- Apostolico. A.. Preparata. F.P.. Feb 1983. Optimal off-line detection of repetitions in a string. Theoretical Computer Science. 22. 3. 297–315. 10.1016/0304-3975(83)90109-3. 0304-3975. free.
- Crochemore. Max. Oct 1981. An optimal algorithm for computing the repetitions in a word. Information Processing Letters. 12. 5. 244–250. 10.1016/0020-0190(81)90024-7. 0020-0190.
- Main. Michael G. Lorentz. Richard J. Sep 1984. An O(n log n) algorithm for finding all repetitions in a string. Journal of Algorithms. 5. 3. 422–432. 10.1016/0196-6774(84)90021-x. 0196-6774.
- Book: Berstel, Jean. Axel Thue's papers on repetitions in words a translation. Départements de mathématiques et d'informatique, Université du Québec à Montréal. 1994. 978-2892761405. 494791187.
- Leech. J.. John Leech (mathematician). 1957. A problem on strings of beads. Math. Gaz.. 41. 277–278. 0079.01101. 10.1017/S0025557200236115. 126406225 .
- Book: Berstel, Jean. April 1984. Some Recent Results on Squarefree Words. Annual Symposium on Theoretical Aspects of Computer Science. 166. 14–25. 10.1007/3-540-12920-0_2. Lecture Notes in Computer Science. 978-3-540-12920-2. http://www.numdam.org/item/PDML_1985___2B_21_0/.
- 1505.00019. Zolotov. Boris. Another Solution to the Thue Problem of Non-Repeating Words. math.CO. 2015.
- Khalyavin. Andrey. 2007. The minimal density of a letter in an infinite ternary square-free word is 883/3215. Journal of Integer Sequences. 10. 2. 3. 2007JIntS..10...65K.
- Ochem. Pascal. 2007. Letter frequency in infinite repetition-free words. Theoretical Computer Science. 380. 3. 388–392. 10.1016/j.tcs.2007.03.027. 0304-3975.
References
- Book: Berstel . Jean . Lauve . Aaron . Reutenauer . Christophe . Saliola . Franco V. . Jean Berstel . Combinatorics on words. Christoffel words and repetitions in words . CRM Monograph Series . 27 . Providence, RI . . 2009 . 978-0-8218-4480-9 . 1161.68043 .
- Book: Lothaire, M. . M. Lothaire . Combinatorics on words . . Cambridge . 1997 . 978-0-521-59924-5 . .
- Book: Lothaire, M. . M. Lothaire . Algebraic combinatorics on words . With preface by Jean Berstel and Dominique Perrin . Reprint of the 2002 hardback . Encyclopedia of Mathematics and Its Applications . 90. . 2011 . 978-0-521-18071-9 . 1221.68183 .
- Book: Pytheas Fogg, N. . Berthé, Valérie. Valérie Berthé. Ferenczi, Sébastien. Mauduit, Christian. Siegel, Anne . Substitutions in dynamics, arithmetics and combinatorics . Lecture Notes in Mathematics . 1794 . Berlin . . 2002 . 978-3-540-44141-0 . 1014.11015 .