Revealing the building blocks of tree balance: fundamental units of the Sackin and Colless Indices

Linda Knüver, Mareike Fischer

Published: 2025/9/5

Abstract

Over the past decades, more than 25 phylogenetic tree balance indices and several families of such indices, some of which even contain infinitely many members, have been proposed in the literature. It is well established that different indices have different strengths and perform unequally across application scenarios. For example, power analyses have shown that the ability of an index to detect the generative model of a given phylogenetic tree varies significantly between indices. This variation in performance motivates the ongoing search for new and possibly "better" (im)balance indices. An easy way to generate a new index is to construct a compound index, e.g., a linear combination of established indices. Two of the most prominent and widely used imbalance indices in this context are the Sackin index and the Colless index. In this study, we show that these classic indices are themselves compound in nature: they can be decomposed into more elementary components that independently satisfy the defining properties of a tree (im)balance index. We further show that the difference Colless minus Sackin yields another imbalance index, which is minimized by, among others, all Colless-minimal trees. Conversely, the difference Sackin minus Colless forms a balance index. Finally, we compare the building blocks of the Sackin and Colless indices with these indices themselves and with the stairs2 index, another index from the literature. Our results suggest that the elementary building blocks we identify are not only foundational to established indices but also valuable tools for analyzing disagreement among indices when comparing the balance of different trees.
