Avoiding redundancies in words


Bibliographic Details
Main Author: Badkobeh, Golnaz
Other Authors: Crochemore, Maxime; Iliopoulos, Costas
Published: King's College London (University of London) 2013
Subjects:
004
Online Access: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.669558
Description
Summary: The study of combinatorics on words started at the beginning of the 20th century with the work of the Norwegian mathematician Axel Thue, who published several articles in a relatively unknown journal. His work had primarily theoretical objectives, but many of his results have since been rediscovered independently by other researchers in relation to other problems. Although many questions in the area have been studied and solved, many open questions remain. Among the basic discoveries of Thue are the existence of infinite words with no occurrence of squares (words of the form uu for a nonempty word u) on an alphabet of at least three symbols, and with no occurrence of cubes (and even overlaps) on a binary alphabet. The constraints on repetitions in infinite words have been raised to optimality after Dejean’s conjecture on the repetitive threshold associated with the alphabet size, the last cases of which have been proved recently by Rao after the works of Carpi, Pansiot, Moulin-Ollagnier, Mohammad-Noori and Currie, and Currie and Rampersad. The first case says that the repetitive threshold of the binary alphabet is 2 (infinite binary words can avoid factors of exponent larger than 2 but cannot do better) and the second case, proved by Dejean, states that it is 7/4 for the three-letter alphabet. The constraint studied later on by Fraenkel and Simpson is somewhat orthogonal to the previous notion. Their measure of the complexity of infinite binary words is the number of squares occurring in them, without any restriction on the number of occurrences. The analysis of repetitions in strings is primarily of combinatorial interest in relation to the entropy of sequences. But repetitions or repeats are also a main concern in the domains of text compression and of pattern matching. Knowledge of extreme situations or of the strongest constraints on words helps analyse the behaviour of the corresponding algorithms.
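The notions of square and overlap used above can be made concrete with a small experiment. The following is a minimal sketch, not taken from the thesis: it builds a prefix of the Thue-Morse word (a classical overlap-free binary word) and tests finite words for squares and overlaps by brute force. The function names `thue_morse`, `has_period_run`, `has_square` and `has_overlap` are illustrative assumptions.

```python
from itertools import product


def thue_morse(n):
    """Return the first n letters of the Thue-Morse word as a string of 0s and 1s."""
    # The i-th letter is the parity of the number of 1 bits in i.
    return "".join(str(bin(i).count("1") % 2) for i in range(n))


def has_period_run(word, period, length):
    """Return True if some factor of the given length has the given period."""
    run = 0  # consecutive positions i with word[i] == word[i + period]
    for i in range(len(word) - period):
        run = run + 1 if word[i] == word[i + period] else 0
        # A factor of this length with this period needs (length - period) matches.
        if run >= length - period:
            return True
    return False


def has_square(word):
    """A square uu with |u| = p is a factor of length 2p with period p."""
    return any(has_period_run(word, p, 2 * p) for p in range(1, len(word) // 2 + 1))


def has_overlap(word):
    """An overlap is a factor of length 2p + 1 with period p."""
    return any(has_period_run(word, p, 2 * p + 1) for p in range(1, len(word) // 2 + 1))


if __name__ == "__main__":
    prefix = thue_morse(256)
    print(has_square(prefix))   # the prefix contains squares such as "11"
    print(has_overlap(prefix))  # but no overlap
    # Every binary word of length 4 contains a square, which is why
    # square-free infinite words need at least three symbols.
    print(all(has_square("".join(w)) for w in product("01", repeat=4)))
```

The check is quadratic in the word length, which is enough for small experiments; it only inspects a finite prefix and therefore illustrates, rather than proves, a property of an infinite word.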
In this document, we provide a new proof of the Fraenkel and Simpson result. We also prove that there exists an infinite binary word that contains finitely many squares and simultaneously avoids words of exponent larger than 7/3, which leads us to the concept introduced hereafter. A chapter is dedicated to the new notion of finite-repetition threshold and some results about it. We give some new results on the trade-off between the number of squares and the number of maximal-exponent powers in infinite binary words. This is done in the three cases where the maximal exponent is 7/3, 5/2 and 3, which are the only cases of interest. We show that there exists no infinite 3+-free binary word avoiding squares of odd period. This study also reveals that there exists no infinite binary word simultaneously avoiding cubes and squares of even period. Moreover, we prove that there exists an infinite 3+-free binary word avoiding squares of even period. We investigate the trade-off between the maximal period of the repetitions contained and their number. Similarly, we exhibit a trade-off between the number of cubes and the number of squares occurring in an infinite word avoiding even-period squares. All bounds provided in these cases are shown to be optimal. Repetitions or repeats are also a main concern in the domains of text compression and of pattern matching. Knowledge of extreme situations or of the strongest constraints on words helps analyse the behaviour of the corresponding algorithms. In this document we mostly deal with the combinatorial aspects of the question. The algorithmic part is strongly linked to them and is used to explore the words satisfying constraints on the repetitions they contain.
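The two quantities traded off above, the set of squares occurring in a word and the maximal exponent of its factors, can be computed naively for a finite word. The sketch below is an illustration under stated assumptions and is not the algorithm used in the thesis; `distinct_squares` and `maximal_exponent` are hypothetical names, and the exponent of a factor is taken to be its length divided by its smallest period.

```python
from fractions import Fraction


def distinct_squares(word):
    """Return the set of distinct squares uu (u nonempty) occurring in word."""
    squares = set()
    n = len(word)
    for p in range(1, n // 2 + 1):
        for i in range(n - 2 * p + 1):
            if word[i:i + p] == word[i + p:i + 2 * p]:
                squares.add(word[i:i + 2 * p])
    return squares


def maximal_exponent(word):
    """Return the largest exponent (length / period) over all factors of word."""
    best = Fraction(1)
    n = len(word)
    for p in range(1, n):
        run = 0  # consecutive positions i with word[i] == word[i + p]
        for i in range(n - p):
            if word[i] == word[i + p]:
                run += 1
                # The factor ending at position i + p has length run + p and period p.
                best = max(best, Fraction(run + p, p))
            else:
                run = 0
    return best


if __name__ == "__main__":
    w = "0110100110010110"  # first 16 letters of the Thue-Morse word
    print(sorted(distinct_squares(w)))
    print(maximal_exponent(w))
```

Using `Fraction` keeps exponents such as 7/3 and 5/2 exact, so they can be compared against a threshold without floating-point error.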