Counting Distinct Square Substrings in Sublinear Time

Panagiotis Charalampopoulos, Manal Mohamed, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń, Wiktor Zuba

公開日: 2025/8/5

Abstract

We show that the number of distinct squares in a packed string of length $n$ over an alphabet of size $\sigma$ can be computed in $O(n/\log_\sigma n)$ time in the word-RAM model. This paper is the first to introduce a sublinear-time algorithm for counting squares in the packed setting. The packed representation of a string of length $n$ over an alphabet of size $\sigma$ is given as a sequence of $O(n/\log_\sigma n)$ machine words in the word-RAM model (a machine word consists of $\omega \ge \log_2 n$ bits). Previously, it was known how to count distinct squares in $O(n)$ time [Gusfield and Stoye, JCSS 2004], even for a string over an integer alphabet [Crochemore et al., TCS 2014; Bannai et al., CPM 2017; Charalampopoulos et al., SPIRE 2020]. We use the techniques for extracting squares from runs described by Crochemore et al. [TCS 2014]. However, the packed model requires novel approaches. We need an $O(n/\log_\sigma n)$-sized representation of all long-period runs (runs with period $\Omega(\log_\sigma n)$) which allows for a sublinear-time counting of the -- potentially linearly-many -- implied squares. The long-period runs with a string period that is periodic itself (called layer runs) are an obstacle, since their number can be $\Omega(n)$. The number of all other long-period runs is $O(n/\log_\sigma n)$ and we can construct an implicit representation of all long-period runs in $O(n/\log_\sigma n)$ time by leveraging the insights of Amir et al. [ESA 2019]. We count squares in layer runs by exploiting combinatorial properties of pyramidally-shaped groups of layer runs. Another difficulty lies in computing the locations of Lyndon roots of runs in packed strings, which is needed for grouping runs that may generate equal squares. To overcome this difficulty, we introduce sparse-Lyndon roots which are based on string synchronizers [Kempa and Kociumaka, STOC 2019].