Modelling peaks over thresholds in panel data: a two-level grouped panel generalized Pareto regression

Zefan Liu, Natalia Nolde

Published: 2025/9/18

Abstract

Panel data arise in a wide range of application areas, and developing methods for modelling extreme values in this setting is essential for reliable risk assessment and management. When modelling the marginal distributions of univariate extremes, one may wish to balance flexibility in capturing heterogeneity among margins against efficiency of estimation. This balance can be achieved by combining regression techniques with a latent group structure defined by parameter values, where the group structure itself must be estimated from data. Building on an existing method, we propose a two-level grouped panel generalized Pareto regression framework, which models peaks over high thresholds in panel data. While retaining the wide applicability of the original modelling strategy, which requires little domain knowledge, our new methodology makes fuller use of the information in extreme events and allows the exploration of a broader model space in which parsimony and good model fit can be achieved simultaneously. We also address several estimation challenges associated with high-dimensional optimization and group structure identification. The finite-sample performance of our methodology is evaluated through simulation studies. In an application to summer river flow data from 31 stations in the upper Danube basin, we show that our methodology can improve estimation efficiency while uncovering patterns in tail behavior that may be missed by domain-knowledge-based regionalization and by the existing method.
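For readers unfamiliar with the peaks-over-threshold idea underlying the abstract, the following is a minimal sketch for a single margin only. It is not the grouped panel regression proposed in the paper: it simulates one series, takes the excesses above an empirical 90% quantile, and fits a generalized Pareto distribution by the method of moments (chosen here for simplicity, and valid only when the shape parameter is below 1/2). All names (`simulate_gpd`, `exceedances`, `fit_gpd_moments`, `station_flows`) and the parameter values are illustrative assumptions, and the data are synthetic.

```python
import random
import statistics


def simulate_gpd(n, shape, scale, rng):
    """Draw n generalized Pareto variates by inverse-transform sampling (shape != 0)."""
    return [scale / shape * ((1.0 - rng.random()) ** (-shape) - 1.0) for _ in range(n)]


def exceedances(values, prob):
    """Return an empirical prob-quantile threshold and the excesses above it."""
    ordered = sorted(values)
    threshold = ordered[int(prob * len(ordered))]
    return threshold, [v - threshold for v in values if v > threshold]


def fit_gpd_moments(excesses):
    """Method-of-moments estimates of GPD (shape, scale); requires shape < 1/2."""
    mean = statistics.fmean(excesses)
    var = statistics.variance(excesses)
    ratio = mean * mean / var          # equals 1 - 2 * shape in the population
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * mean * (ratio + 1.0)
    return shape, scale


# Synthetic stand-in for one station's series (hypothetical, not the Danube data).
rng = random.Random(0)
station_flows = simulate_gpd(50_000, shape=0.1, scale=1.0, rng=rng)

threshold, excesses = exceedances(station_flows, prob=0.9)
shape_hat, scale_hat = fit_gpd_moments(excesses)
print(f"threshold={threshold:.3f}, n_excess={len(excesses)}")
print(f"shape_hat={shape_hat:.3f}, scale_hat={scale_hat:.3f}")
```

Because the simulated series is itself generalized Pareto, threshold stability implies that the excesses keep the shape 0.1 while the scale becomes 1 + 0.1 × threshold, so the estimates can be checked against known values. In a panel setting, one such pair of parameters per margin is what a grouped regression would seek to pool or link across margins.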