Dealing with Logs and Zeros in Regression Models
Christophe Bellégo, David Benatia, Louis Pape
Published: 2022/3/22
Abstract
The log transformation is widely used in linear regression, mainly because coefficients are interpretable as proportional effects. Yet this practice has fundamental limitations, most notably that the log is undefined at zero, creating an identification problem. We propose a new estimator, iterated OLS (iOLS), which targets the normalized average treatment effect, preserving the percentage-change interpretation while addressing these limitations. Our procedure is the theoretically justified analogue of the ad-hoc log(1+Y) transformation and delivers a consistent and asymptotically normal estimator of the parameters of the exponential conditional mean model. iOLS is computationally efficient, globally convergent, and free of incidental-parameter bias, and it extends naturally to endogenous regressors through iterated 2SLS. We illustrate the methods with simulations and revisit three influential publications.
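To make the zero problem concrete, here is a minimal simulation sketch. It is not the iOLS procedure itself, which the abstract does not spell out. Everything in it is an illustrative assumption: the Poisson outcome, the true coefficients (0.5 and 1.0), the sample size, and the variable names. It draws data from an exponential conditional mean model, E[Y|x] = exp(0.5 + 1.0·x), shows that log(Y) is undefined for the zero outcomes, and shows that OLS on the ad-hoc log(1+Y) transformation yields a slope noticeably below the true value of 1.0 in this particular design.

```python
import numpy as np

# Illustrative data-generating process (an assumption, not taken from the paper):
# Y | x ~ Poisson(exp(0.5 + 1.0 * x)), so E[Y | x] is an exponential conditional mean.
rng = np.random.default_rng(0)
n = 5000
true_slope = 1.0
x = rng.normal(size=n)
y = rng.poisson(np.exp(0.5 + true_slope * x))

# The log is undefined at zero: np.log returns -inf for every zero outcome.
with np.errstate(divide="ignore"):
    log_y = np.log(y)
n_zeros = int(np.isneginf(log_y).sum())

# Ad-hoc workaround: regress log(1 + Y) on a constant and x by OLS.
X = np.column_stack([np.ones(n), x])
coef_log1p = np.linalg.lstsq(X, np.log1p(y), rcond=None)[0]
slope_log1p = float(coef_log1p[1])

print(f"zero outcomes: {n_zeros} of {n}")
print(f"true slope: {true_slope}, OLS slope on log(1+Y): {slope_log1p:.3f}")
```

In this design the log(1+Y) slope cannot be read as the semi-elasticity of the conditional mean, which is the gap that a theoretically justified estimator of the exponential conditional mean model is meant to close.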