Tight Bounds for the Number of Absent Subsequences

Duncan Adamson, Pamela Fleischmann, Annika Huch, Florin Manea, Paul Sarnighausen-Cahn, Max Wiedenhöft

Published: 2024/7/26

Abstract

A {\em subsequence} of a word $w$ is a word $u$ that can be obtained by deleting some letters from $w$ while maintaining the relative order of the remaining letters; e.g., $\mathtt{lala}$ is a subsequence of $\mathtt{alfalfa}$. A word $w$ over some alphabet $\Sigma$ that has all possible words of length $\iota$ over $\Sigma$ as subsequences is called $\iota$-universal, and the largest $\iota$ for which this holds is called the universality index of $w$, denoted $\iota(w)$. Words that are not subsequences of $w$ are called absent subsequences (AS) of $w$; their investigation was started in (Kosche et al., 2022). In this paper, we present tight bounds on the number of AS of a given length $k$ among all words with the same universality index $\iota$. For both the lower and the upper bound, we construct words that have, respectively, a minimal and a maximal number of absent subsequences of length $k$, and, in the case of the lower bound, we provide the exact number of absent subsequences in closed form. Finally, we present efficient enumeration algorithms for the set of subsequences of a given length of a word: we give a novel, optimal enumeration algorithm with output-linear delay and preprocessing time $O(|w|)$, which is further improved to an incremental enumeration algorithm with $O(1)$ delay, again with preprocessing time $O(|w|)$.
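To make the definitions above concrete, the following is a minimal Python sketch of the three notions (subsequence, universality index, absent subsequences of length $k$). It is not one of the algorithms of the paper: the function names are hypothetical, the universality index is computed with the standard greedy arch factorisation (counting consecutive shortest blocks that contain every letter of $\Sigma$), and the absent subsequences are found by brute force over all $|\Sigma|^k$ candidates, which is only feasible for very small inputs.

```python
from itertools import product


def is_subsequence(u, w):
    """Return True if u can be obtained from w by deleting letters."""
    remaining = iter(w)
    # Each `ch in remaining` consumes w up to and including the next match,
    # so the letters of u are matched left to right, in order.
    return all(ch in remaining for ch in u)


def universality_index(w, alphabet):
    """Universality index of w over `alphabet` (assumed non-empty).

    Assumption: uses the greedy arch factorisation, i.e. it counts how many
    times a left-to-right scan completes the whole alphabet.
    """
    sigma = set(alphabet)
    seen = set()
    arches = 0
    for ch in w:
        if ch in sigma:
            seen.add(ch)
        if seen == sigma:
            arches += 1
            seen = set()
    return arches


def absent_subsequences(w, alphabet, k):
    """Brute force: all words of length k over `alphabet` that are absent from w."""
    letters = sorted(set(alphabet))
    candidates = ("".join(t) for t in product(letters, repeat=k))
    return [u for u in candidates if not is_subsequence(u, w)]
```

For instance, `is_subsequence("lala", "alfalfa")` holds, and `universality_index("alfalfa", "alf")` evaluates to 2, so `absent_subsequences("alfalfa", "alf", 2)` is empty while some words of length 3 are absent.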