Edit distance in substitution systems

Andrew Best, Yuval Peres

Published: 2024/12/31

Abstract

Let $\sigma$ be a primitive substitution on an alphabet $\mathcal{A}$, and let $\mathcal{W}_n$ be the set of words of length $n$ determined by $\sigma$ (i.e., $w \in \mathcal{W}_n$ if $w$ is a subword of $\sigma^k(a)$ for some $a \in \mathcal{A}$ and $k \geq 1$). It is known that the corresponding substitution dynamical system is loosely Kronecker (also known as zero-entropy loosely Bernoulli), so the diameter of $\mathcal{W}_n$ in the edit distance is $o(n)$. We improve this upper bound to $O(n/\sqrt{\log n})$. The main challenge is handling the case where $\sigma$ is non-uniform; a better bound is available for the uniform case. Finally, we show that for the Thue--Morse substitution, the diameter of $\mathcal{W}_n$ is at least $\sqrt{n/6} - 1$.
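
To make the objects in the abstract concrete, the following is a minimal brute-force sketch that enumerates an approximation of $\mathcal{W}_n$ for the Thue--Morse substitution ($0 \mapsto 01$, $1 \mapsto 10$) and computes its diameter for small $n$. It rests on several assumptions that are not taken from the paper: the edit distance is taken to be the unit-cost Levenshtein distance (insertions, deletions, and substitutions), which may differ from the paper's definition; $\mathcal{W}_n$ is approximated by the length-$n$ subwords of $\sigma^k(a)$ for a single fixed $k$, so words can be missed if $k$ is too small; and the names `iterate`, `words_of_length`, `edit_distance`, and `diameter` are hypothetical helpers introduced only for illustration.

```python
from itertools import combinations

# Thue-Morse substitution: 0 -> 01, 1 -> 10.
THUE_MORSE = {"0": "01", "1": "10"}


def iterate(sub, letter, k):
    """Return sigma^k(letter) for a substitution given as a dict."""
    word = letter
    for _ in range(k):
        word = "".join(sub[c] for c in word)
    return word


def words_of_length(sub, n, k=10):
    """Length-n subwords of sigma^k(a) over all letters a.

    Assumption: a fixed k is used, so this is only a finite
    approximation of W_n and may be incomplete when k is small.
    """
    found = set()
    for a in sub:
        w = iterate(sub, a, k)
        found.update(w[i:i + n] for i in range(len(w) - n + 1))
    return found


def edit_distance(u, v):
    """Unit-cost Levenshtein distance (assumed notion of edit distance)."""
    prev = list(range(len(v) + 1))
    for i, cu in enumerate(u, 1):
        cur = [i]
        for j, cv in enumerate(v, 1):
            cur.append(min(prev[j] + 1,                # delete cu
                           cur[j - 1] + 1,             # insert cv
                           prev[j - 1] + (cu != cv)))  # substitute or match
        prev = cur
    return prev[-1]


def diameter(sub, n, k=10):
    """Largest pairwise edit distance among the enumerated words."""
    ws = sorted(words_of_length(sub, n, k))
    return max((edit_distance(u, v) for u, v in combinations(ws, 2)),
               default=0)
```

Calling `diameter(THUE_MORSE, n)` for small `n` gives values that can be compared by hand with the bounds stated above. The pairwise search is quadratic in the number of words and each distance costs $O(n^2)$ time, so this is only practical for small $n$.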