Intellectual Up-streams of Percentage Scale ($ps$) and Percentage Coefficient ($b_p$) -- Effect Size Analysis (Theory Paper 2)

Xinshu Zhao, Qinru Ruby Ju, Piper Liping Liu, Dianshi Moses Li, Luxi Zhang, Jizhou Francis Ye, Song Harris Ao, Ming Milano Li

Published: 2025/7/18

Abstract

Percentage thinking, i.e., assessing quantities as parts per hundred, spread from Roman tax ledgers to modern algorithms. Building on Simon Stevin's La Thiende (1585) and the 19th-century metrication that institutionalized base-10 measurement (Cajori, 1925), this article traces how base-10 normalization, especially the 0-1 percentage scale, became a shared language for human and machine understanding. We retrace 1980s efforts at UW-Madison and UNC Chapel Hill to "percentize" variables to make regression coefficients interpretable, and relate these experiments to established indices, notably the Pearson (1895) correlation $r$ (range -1 to 1) and the coefficient of determination $r^2$ (Wright, 1920). We also revisit Cohen et al.'s (1999) percent of maximum possible (POMP) metric. The lineage of 0-100 and 0-1 scales includes Roman fiscal practice, early American grading at Yale and Harvard, and recurring analyses of percent (0-100) and percentage (0-1, or -1 to 1) scales that repeatedly reinvent the same indices (Durm, 1993; Schneider and Hutt, 2014). In data mining and machine learning, min-max normalization maps any feature to [0, 1] (i.e., 0-100%), equalizing scale ranges and implied units across percentized variables, which improves comparability of predictors. Under the percentage theory of measurement indices, equality of units is the necessary and sufficient condition for comparing indices (Cohen et al., 1999; Zhao et al., 2024; Zhao and Zhang, 2014). Seen this way, the successes of machine learning and artificial intelligence over the past half century constitute large-scale evidence for the comparability of percentage-based indices, foremost the percentage coefficient ($b_p$).
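As a rough illustration of the min-max normalization described in the abstract, the following Python sketch maps two variables onto the 0-1 percentage scale and then fits an ordinary least-squares slope on the percentized values. The function names (`percentize`, `ols_slope`), the toy data, and the choice of scale endpoints are assumptions made for this example and are not taken from the paper. Reading the resulting slope as a percentage coefficient $b_p$ is likewise our interpretation of "percentizing" variables before regression; the full paper may define $b_p$ with additional conditions.

```python
from typing import Optional, Sequence


def percentize(values: Sequence[float],
               lo: Optional[float] = None,
               hi: Optional[float] = None) -> list:
    """Min-max normalize values onto the 0-1 percentage scale.

    lo and hi are the scale endpoints. If omitted, the observed
    minimum and maximum are used (an assumption of this sketch).
    """
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    if hi == lo:
        raise ValueError("scale range is zero; cannot percentize")
    return [(v - lo) / (hi - lo) for v in values]


def ols_slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical toy data: weekly hours (possible range 0-40) and a
# rating on a 1-5 scale.
hours = [0, 10, 20, 30, 40]
rating = [2.0, 2.5, 3.0, 3.5, 4.0]

# Percentize each variable against its possible range.
hours_ps = percentize(hours, lo=0, hi=40)
rating_ps = percentize(rating, lo=1, hi=5)

raw_slope = ols_slope(hours, rating)        # rating points per hour
bp_slope = ols_slope(hours_ps, rating_ps)   # share of rating range per full hours range

print(raw_slope, bp_slope)
```

In this toy case the raw slope is 0.05 rating points per hour, a number that is hard to judge without knowing both units. The slope on the percentized variables is 0.5, which reads as: moving across the whole 0-40 hours range is associated with a change of half the 1-5 rating range. Passing the possible endpoints rather than the observed ones follows the spirit of POMP; using observed endpoints instead would match the usual machine-learning form of min-max scaling.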

Read Full Paper (arXiv.org)