Spatial trend estimation under potential heterogeneity is an important problem for extracting spatial characteristics of hazards such as criminal activity. Because quantiles carry substantially more information about a distribution than commonly used summary statistics such as the mean, it is often useful to estimate not only the average trend but also high- (or low-) risk trends. In this paper, we propose a Bayesian quantile trend filtering method to estimate non-stationary trends of quantiles on graphs, and apply it to crime data in Tokyo between 2013 and 2017. By modeling multiple observation cases, we can estimate the potential heterogeneity of spatial crime trends over multiple years in the application. To induce locally adaptive Bayesian inference on trends, we introduce general shrinkage priors for graph differences. By combining so-called shadow priors with a multivariate distribution for the local scale parameters and a mixture representation of the asymmetric Laplace distribution, we obtain a simple Gibbs sampling algorithm for generating posterior samples. The numerical performance of the proposed method is demonstrated through simulation studies.
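The Gibbs sampler here presumably builds on the standard location-scale mixture representation of the asymmetric Laplace distribution (Kozumi and Kobayashi, 2011); in one common normalization, for quantile level $\tau$,
$$y = \mu + \theta v + \kappa \sqrt{v}\,\varepsilon, \qquad v \sim \mathrm{Exp}(1), \quad \varepsilon \sim N(0,1), \qquad \theta = \frac{1-2\tau}{\tau(1-\tau)}, \quad \kappa^2 = \frac{2}{\tau(1-\tau)},$$
so that $y \mid v \sim N(\mu + \theta v, \kappa^2 v)$ is conditionally Gaussian, which is what makes conjugate Gibbs updates for the trend and the shrinkage scales available.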
Locally Adaptive Spatial Quantile Smoothing: Application to Monitoring Crime Density in Tokyo
By
27.02.2026 22:07
Intrinsic Gaussian fields are used in many areas of statistics as models for spatial or spatio-temporal dependence, or as priors for latent variables. However, there are two major gaps in the literature: first, the number and flexibility of existing intrinsic models are very limited; second, theory, fast inference, and software are currently underdeveloped for intrinsic fields. We tackle these challenges by introducing the new flexible class of intrinsic Whittle–Matérn Gaussian random fields obtained as the solution to a stochastic partial differential equation (SPDE). Exploiting sparsity resulting from finite-element approximations, we develop fast estimation and simulation methods for these models. We demonstrate the benefits of this intrinsic SPDE approach for the important task of kriging under extrapolation settings. Leveraging the connection of intrinsic fields to spatial extreme value processes, we translate our theory to an SPDE approach for Brown–Resnick processes for sparse modeling of spatial extreme events. This new paradigm paves the way for efficient inference in unprecedented dimensions. To demonstrate the wide applicability of our new methodology, we apply it in two very different areas: a longitudinal study of renal function data, and the modeling of marine heat waves using high-resolution sea surface temperature data.
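For reference, the stationary Whittle–Matérn SPDE underlying this construction is, in the form of Lindgren, Rue and Lindström (2011),
$$(\kappa^2 - \Delta)^{\alpha/2}\, u(s) = \mathcal{W}(s),$$
where $\mathcal{W}$ is Gaussian white noise, the Matérn smoothness is $\nu = \alpha - d/2$, and $\kappa$ controls the correlation range; loosely speaking, the intrinsic case studied here corresponds to $\kappa \to 0$, where the field is defined only up to a polynomial trend.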
Intrinsic Whittle–Matérn fields and sparse spatial extremes
By Bolin, Braunsteins, Engelke et al
27.02.2026 19:16
Bayesian multinomial logistic regression provides a principled, interpretable approach to multiclass classification, but posterior sampling becomes increasingly expensive as the model dimension grows. Prior work has studied scalability in the number of subjects and covariates; in contrast, this paper focuses on how computation changes as the number of outcome categories increases. To improve scalability in settings with numerous categories, we adapt a gamma-augmentation strategy to decouple category-specific coefficient updates, so that each category's coefficients can be updated conditional on a single auxiliary variable per subject, rather than on the full set of other categories' coefficients. Because the resulting coefficient conditionals are non-conjugate, we couple this augmentation with either adaptive Metropolis-Hastings or elliptical slice sampling. Through simulation and a real-data example, we compare effective sample size and effective sampling rate across several standard competitors. We find that the best-performing sampler depends on the dimension and imbalance regime, and that the proposed augmentation provides substantial speedups in scenarios with numerous categories.
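The decoupling described here is presumably based on the elementary gamma (exponential) integral identity $1/A = \int_0^\infty e^{-tA}\,dt$ for $A > 0$: applied to the softmax denominator, it gives
$$\frac{e^{x_i^\top \beta_{y_i}}}{\sum_{k=1}^K e^{x_i^\top \beta_k}} = e^{x_i^\top \beta_{y_i}} \int_0^\infty \exp\!\Big(-t_i \sum_{k=1}^K e^{x_i^\top \beta_k}\Big)\, dt_i,$$
so that, conditional on a single auxiliary $t_i$ per subject, the likelihood factorizes over categories and each $\beta_k$ can be updated independently (the exact variant used in the paper may differ).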
Bayesian Multinomial Logistic Regression for Numerous Categories
By Fisher, McEvoy
27.02.2026 17:24
Data privacy is important in the AI era, and differential privacy (DP) is one of the gold-standard solutions. However, DP is typically applicable only if data have a bounded underlying distribution. We address this limitation by leveraging second-moment information from a small amount of public data. We propose Public-moment-guided Truncation (PMT), which transforms private data using the public second-moment matrix and applies a principled truncation whose radius depends only on non-private quantities: data dimension and sample size. This transformation yields a well-conditioned second-moment matrix, enabling its inversion with a significantly strengthened ability to resist the DP noise. Furthermore, we demonstrate the applicability of PMT to penalized and generalized linear regressions. Specifically, we design new loss functions and algorithms, ensuring that solutions in the transformed space can be mapped back to the original domain. We establish improvements in the models' DP estimation through theoretical error bounds, robustness guarantees, and convergence results, attributing the gains to the conditioning effect of PMT. Experiments on synthetic and real datasets confirm that PMT substantially improves the accuracy and stability of DP models.
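A minimal sketch of the whiten-then-truncate idea described here; the function name and the truncation radius below are hypothetical stand-ins, since the paper's exact radius formula is not given in the abstract.

    import numpy as np

    def pmt_transform(X_priv, M_pub, radius):
        """Hypothetical sketch of Public-moment-guided Truncation (PMT):
        whiten private data with the public second-moment matrix, then
        clip each row to a ball whose radius depends only on non-private
        quantities. The radius formula used below is a placeholder."""
        d = M_pub.shape[1]
        L = np.linalg.cholesky(M_pub + 1e-8 * np.eye(d))
        Z = np.linalg.solve(L, X_priv.T).T        # Z = X_priv @ L^{-T}
        norms = np.linalg.norm(Z, axis=1, keepdims=True)
        Z = Z * np.minimum(1.0, radius / np.maximum(norms, 1e-12))
        return Z, L   # keep L to map DP estimates back to the original scale

    n, d = 500, 10
    rng = np.random.default_rng(0)
    X = rng.standard_normal((n, d)) @ rng.standard_normal((d, d))
    M_pub = X[:50].T @ X[:50] / 50                # stand-in "public" moments
    Z, L = pmt_transform(X, M_pub, radius=np.sqrt(d) + np.log(n))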
Differentially Private Truncation of Unbounded Data via Public Second Moments
By Cao, Bi, Zhang
27.02.2026 17:19
Model selection is a central task in statistics, but standard methods are not robust in misspecified settings where the true data-generating process (DGP) is not in the set of candidate models. The key limitation is that existing methods -- including information criteria and Bayesian posteriors -- do not quantify uncertainty about how well each candidate model approximates the true DGP. In this paper, we introduce a novel approach to model selection based on modeling the likelihood values themselves. Specifically, given $K$ candidate models and $n$ observations, we view the $n\times K$ matrix of negative log-likelihood values as a random data matrix and observe that the expectation of each row is equal to the vector of Kullback--Leibler divergences between the $K$ models and the true DGP, up to an additive constant. We use a multivariate normal model to estimate and quantify uncertainty in this expectation, providing calibrated inferences for robust model selection under misspecification. The procedure is easy to compute, interpretable, and comes with theoretical guarantees, including consistency.
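A small illustration of the core computation, assuming per-observation log-likelihoods are available for each candidate; the function name and the pairwise z-test are illustrative rather than the paper's exact procedure.

    import numpy as np
    from scipy import stats

    def select_model(nll):
        """Treat the (n, K) matrix of per-observation negative
        log-likelihoods as data: its row mean estimates the K-vector of
        KL divergences to the true DGP up to a common constant, and a
        multivariate normal model for the mean gives calibrated
        comparisons (illustrative, not the paper's exact procedure)."""
        n, K = nll.shape
        mean = nll.mean(axis=0)                 # KL vector + constant
        cov = np.cov(nll, rowvar=False) / n     # uncertainty about the mean
        best = int(np.argmin(mean))
        pvals = {}
        for k in range(K):
            if k == best:
                continue
            diff = mean[k] - mean[best]
            se = np.sqrt(cov[k, k] + cov[best, best] - 2 * cov[k, best])
            pvals[k] = 2 * stats.norm.sf(diff / se)   # is `best` really closer?
        return best, pvals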
Robust model selection using likelihood as data
By Choi, Spencer, Miller
27.02.2026 17:17
Quantifying uncertainty in future climate projections is hindered by the prohibitive computational cost of running physical climate models, which severely limits the availability of training data. We propose a data-efficient framework for emulating the internal variability of global climate fields, specifically designed to overcome these sample-size constraints. Inspired by copula modeling, our approach constructs a highly expressive joint distribution via a composite transformation to a multivariate standard normal space. We combine a nonparametric Bayesian transport map for spatial dependence modeling with flexible, spatially varying marginal models, essential for capturing non-Gaussian behavior and heavy-tailed extremes. These marginals are defined by a parametric model followed by a semi-parametric B-spline correction to capture complex distributional features. The marginal parameters are spatially smoothed using Gaussian-process priors with low-rank approximations, rendering the computational cost linear in the spatial dimension. When applied to global log-precipitation-rate fields at more than 50,000 grid locations, our stochastic surrogate achieves high fidelity, accurately quantifying the climate distribution's spatial dependence and marginal characteristics, including the tails. Using only 10 training samples, it outperforms a state-of-the-art competitor trained on 80 samples, effectively octupling the computational budget for climate research. We provide a Python implementation at https://github.com/jobrachem/ppptm .
Data-Efficient Generative Modeling of Non-Gaussian Global Climate Fields via Scalable Composite Transformations
By Brachem, Wiemann, Katzfuss
27.02.2026 17:12
The study of causal effects in the presence of unmeasured spatially varying confounders has garnered increasing attention. However, a general framework for identifiability, which is critical for reliable causal inference from observational data, has yet to be advanced. In this paper, we study a linear model with various parametric model assumptions on the covariance structure between the unmeasured confounder and the exposure of interest. We establish identifiability of the treatment effect for many commonly used spatial models for both discrete and continuous data, under mild conditions on the structure of observation locations and the exposure-confounder association. We also emphasize models or scenarios where identifiability may not hold, under which statistical inference should be conducted with caution.
Identifiability of Treatment Effects with Unobserved Spatially Varying Confounders
By Tang, Li, Li
27.02.2026 17:09
Switchback experiments--alternating treatment and control over time--are widely used when unit-level randomization is infeasible, outcomes are aggregated, or user interference is unavoidable. In practice, experimentation must support fast product cycles, so teams often run studies for limited durations and make decisions with modest samples. At the same time, outcomes in these time-indexed settings exhibit serial dependence, seasonality, and occasional heavy-tailed shocks, and temporal interference (carryover or anticipation) can render standard asymptotics and naive randomization tests unreliable. In this paper, we develop a randomization-test framework that delivers finite-sample valid, distribution-free p-values for several null hypotheses of interest using only the known assignment mechanism, without parametric assumptions on the outcome process. For causal effects of interest, we impose two primitive conditions--non-anticipation and a finite carryover horizon m--and construct conditional randomization tests (CRTs) based on an ex ante pooling of design blocks into "sections," which yields a tractable conditional assignment law and ensures imputability of focal outcomes. We provide diagnostics for learning the carryover window and assessing non-anticipation, and we introduce studentized CRTs for a session-wise weak null that accommodates within-session seasonality with asymptotic validity. Power approximations under distributed-lag effects with AR(1) noise guide design and analysis choices, and simulations demonstrate favorable size and power relative to common alternatives. Our framework extends naturally to other time-indexed designs.
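A stripped-down sketch of a section-based conditional randomization test, assuming the design permits independent re-randomization within sections; the paper's actual CRTs condition on more structure (non-anticipation, the carryover horizon m) than this toy version.

    import numpy as np

    def section_crt(y, w, sections, n_perm=999, seed=0):
        """Toy conditional randomization test: condition on section
        membership and re-randomize the treatment path within each
        section, using a difference in means as the test statistic."""
        rng = np.random.default_rng(seed)
        stat = lambda w_: y[w_ == 1].mean() - y[w_ == 0].mean()
        t_obs, count = stat(w), 1
        for _ in range(n_perm):
            w_perm = w.copy()
            for s in np.unique(sections):
                idx = np.where(sections == s)[0]
                w_perm[idx] = rng.permutation(w_perm[idx])
            count += abs(stat(w_perm)) >= abs(t_obs)
        return count / (n_perm + 1)     # finite-sample valid p-value

    rng = np.random.default_rng(2)
    sections = np.repeat(np.arange(8), 30)        # 8 sections of 30 periods
    w = np.tile(np.array([1, 0] * 15), 8)         # alternating switchbacks
    y = 0.3 * w + rng.standard_normal(240)
    print(section_crt(y, w, sections))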
Randomization Tests in Switchback Experiments
By Liu, Zhong
27.02.2026 17:06
We propose a new and interpretable class of high-dimensional tail dependence models based on latent linear factor structures. Specifically, extremal dependence of an observable vector is assumed to be driven by a lower-dimensional latent $K$-factor model, where $K \ll d$, thereby inducing an explicit form of dimension reduction. Geometrically, this is reflected in the support of the associated spectral dependence measure, whose intrinsic dimension is at most $K-1$. The loading structure may additionally exhibit sparsity, meaning that each component is influenced by only a small number of latent factors, which further enhances interpretability and scalability. Under mild structural assumptions, we establish identifiability of the model parameters and provide a constructive recovery procedure based on a margin-free tail pairwise dependence matrix, which also yields practical rank-based estimation methods. The framework combines naturally with marginal tail models and is particularly well suited to high-dimensional settings. We illustrate its applicability in a spatial wind energy application, where the latent factor structure enables tractable estimation of the risk that a large proportion of turbines simultaneously fall below their cut-in wind speed thresholds.
Dimension Reduction in Multivariate Extremes via Latent Linear Factor Models
By Boulin, Bücher
27.02.2026 17:02
We re-examine the traditional Mean-Squared Error (MSE) forecasting paradigm by formally integrating an accuracy-timeliness trade-off: accuracy is defined by MSE (or target correlation) and timeliness by advancement (or phase excess). While MSE-optimized predictors are accurate in tracking levels, they sacrifice dynamic lead, causing them to lag behind changing targets. To address this, we introduce two `look-ahead' frameworks--Decoupling-from-Present (DFP) and Peak-Correlation-Shifting (PCS)--and provide closed-form solutions for their optimization. Notably, the classical MSE predictor is shown to be a special case within these frameworks. Dually, our methods achieve maximum advancement for any given accuracy level, so our approach reveals the complete efficient frontier of the accuracy-timeliness trade-off, whereas MSE represents only a single point. We also derive a universal upper bound on lead over MSE for any linear predictor under a consistency constraint and prove that our methods hit this ceiling. We validate this approach through applications in forecasting and real-time signal extraction, introducing a leading-indicator criterion and tailored linear benchmarks.
Forecasting on the Accuracy-Timeliness Frontier: Two Novel `Look Ahead' Predictors
By Wildi
27.02.2026 17:01
Sensitivity and specificity evaluated at an optimal diagnostic cut-off are fundamental measures of classification accuracy when continuous biomarkers are used for disease diagnosis. Joint inference for these quantities is challenging because their estimators are evaluated at a common, data-driven threshold estimated from both diseased and healthy samples, inducing statistical dependence. Existing approaches are largely based on parametric assumptions or fully nonparametric procedures, which may be sensitive to model misspecification or lack efficiency in moderate samples. We propose a semiparametric framework for joint inference on sensitivity and specificity at the Youden-optimal cut-off under the density ratio model. Using maximum empirical likelihood, we derive estimators of the optimal threshold and the corresponding sensitivity and specificity, and establish their joint asymptotic normality. This leads to Wald-type and range-preserving logit-transformed confidence regions. Simulation studies show that the proposed method achieves accurate coverage with improved efficiency relative to existing parametric and nonparametric alternatives across a variety of distributional settings. An analysis of COVID-19 antibody data demonstrates the practical advantages of the proposed approach for diagnostic decision-making.
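For concreteness, the empirical Youden-optimal cut-off that the semiparametric estimators refine can be computed as follows; this is the textbook construction, not the paper's density-ratio estimator, and larger biomarker values are assumed to indicate disease.

    import numpy as np

    def youden_cutoff(x_healthy, x_diseased):
        """Empirical Youden-optimal cut-off, maximizing
        J(c) = Se(c) + Sp(c) - 1 over observed thresholds (larger
        biomarker values are assumed to indicate disease)."""
        cands = np.unique(np.concatenate([x_healthy, x_diseased]))
        se = np.array([(x_diseased >= c).mean() for c in cands])
        sp = np.array([(x_healthy < c).mean() for c in cands])
        i = int(np.argmax(se + sp - 1.0))
        return cands[i], se[i], sp[i]

    rng = np.random.default_rng(1)
    c, se, sp = youden_cutoff(rng.normal(0, 1, 200), rng.normal(1.2, 1, 150))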
Semiparametric Joint Inference for Sensitivity and Specificity at the Youden-Optimal Cut-Off
By Liu, Tian, Wang et al
27.02.2026 16:56
We consider hypothesis testing of binary causal queries using observational data. Since the mapping of causal models to the observational distribution that they induce is not one-to-one, in general, causal queries are often only partially identifiable. When binary statistical tests are used for testing partially-identifiable causal queries, their results do not translate in a straightforward manner to the causal hypothesis testing problem. We propose using ternary (three-outcome) statistical tests to test partially-identifiable causal queries. We establish testability requirements that ternary tests must satisfy in terms of uniform consistency and present equivalent topological conditions on the hypotheses. To leverage the existing toolbox of binary tests, we prove that obtaining ternary tests by combining binary tests is complete. Finally, we demonstrate how topological conditions serve as a guide to construct ternary tests for two concrete causal hypothesis testing problems, namely testing the instrumental variable (IV) inequalities and comparing treatment efficacy.
Testing Partially-Identifiable Causal Queries Using Ternary Tests
By Bhadane, Mooij, Boeken et al
27.02.2026 16:52
Permutation procedures are common practice in hypothesis testing when distributional assumptions about the test statistic are not met or unknown. With only few permutations, empirical p-values lie on a coarse grid and may even be zero when the observed test statistic exceeds all permuted values. Such zero p-values are statistically invalid and hinder multiple testing correction. Parametric tail modeling with the Generalized Pareto Distribution (GPD) has been proposed to address this issue, but existing implementations can again yield zero p-values when the estimated shape parameter is negative and the fitted distribution has a finite upper bound.
We introduce a method for accurate and zero-free p-value approximation in permutation testing, embedded in the permApprox workflow and R package. Building on GPD tail modeling, the method enforces a support constraint during parameter estimation to ensure valid extrapolation beyond the observed statistic, thereby strictly avoiding zero p-values. The workflow further integrates robust parameter estimation, data-driven threshold selection, and principled handling of hybrid p-values that are discrete in the bulk and continuous in the extreme tail.
Extensive simulations using two-sample t-tests and Wilcoxon rank-sum tests show that permApprox produces accurate, robust, and zero-free p-value approximations across a wide range of sample and effect sizes. Applications to single-cell RNA-seq and microbiome data demonstrate its practical utility: permApprox yields smooth and interpretable p-value distributions even with few permutations. By resolving the zero-p-value problem while preserving accuracy and computational efficiency, permApprox enables reliable permutation-based inference in high-dimensional and computationally intensive settings.
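The GPD tail approximation being improved upon can be sketched as below; this plain maximum-likelihood version omits permApprox's support constraint, robust estimation, and threshold selection, and can therefore still return a zero p-value when the fitted shape is negative.

    import numpy as np
    from scipy.stats import genpareto

    def gpd_tail_pvalue(t_obs, t_perm, n_exc=250):
        """Plain GPD tail approximation of a permutation p-value: fit a
        GPD to the largest permuted statistics and extrapolate beyond
        t_obs. Without permApprox's support constraint, a negative
        fitted shape can bound the tail below t_obs and yield 0."""
        t_perm = np.sort(t_perm)
        u = t_perm[-n_exc - 1]                    # exceedance threshold
        xi, _, sigma = genpareto.fit(t_perm[-n_exc:] - u, floc=0)
        return (n_exc / len(t_perm)) * genpareto.sf(t_obs - u, xi, scale=sigma)

    p = gpd_tail_pvalue(4.5, np.random.default_rng(3).standard_normal(10_000))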
permApprox: a general framework for accurate permutation p-value approximation
By Peschel, Boulesteix, Mutius et al
27.02.2026 16:48
Improper priors are not allowed for the computation of the Bayesian evidence $Z=p({\bf y})$ (a.k.a., marginal likelihood), since in this case $Z$ is not completely specified due to an arbitrary constant involved in the computation. In this work, however, we remark that they can be employed in a specific type of model selection problem: when we have several (possibly infinitely many) models belonging to the same parametric family (i.e., when tuning parameters of a parametric model). The quantities involved in this type of selection cannot be considered Bayesian evidences, so we suggest the name ``fake evidences'' (or ``areas under the likelihood'' in the case of uniform improper priors). We also show that, in this model selection scenario, using a diffuse prior and letting its scale parameter grow to infinity, we cannot recover the value of the area under the likelihood obtained with a uniform improper prior. We first discuss this from a general point of view. Then we provide, as an illustrative example, all the details for Bayesian regression models with nonlinear bases, considering two cases: the use of a uniform improper prior and the use of a Gaussian prior. A numerical experiment confirming these statements is also provided.
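Concretely, with a uniform improper prior $\pi(\theta) \propto c$ the evidence becomes
$$Z = \int p(\mathbf{y} \mid \theta)\, \pi(\theta)\, d\theta = c \int p(\mathbf{y} \mid \theta)\, d\theta,$$
which is defined only up to the arbitrary constant $c$; but when all candidate models share the same parametric family, the same $c$ multiplies every candidate, so ratios of the areas under the likelihood $\int p(\mathbf{y} \mid \theta)\, d\theta$ remain meaningful for selection.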
A note on the area under the likelihood and the fake evidence for model selection
By Martino, Llorente
27.02.2026 16:44
Statistical analysis of functional data is challenging due to their complex patterns, for which functional depth provides an effective means of reflecting their ordering structure. In this work, we investigate practical aspects of the recently proposed regularized projection depth (RPD), which induces a meaningful ordering of functional data while appropriately accommodating their complex shape features. Specifically, we examine the impact and choice of its tuning parameter, which regulates the degree of effective dimension reduction applied to the data, and propose a random projection-based approach for its efficient computation, supported by theoretical justification. Through comprehensive numerical studies, we explore a wide range of statistical applications of the RPD and demonstrate its particular usefulness in uncovering shape features in functional data analysis. This ability allows the RPD to outperform competing depth-based methods, especially in tasks such as functional outlier detection, classification, and two-sample hypothesis testing.
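As a reference point, classical projection depth with a random-projection approximation looks as follows; the regularization that distinguishes the RPD is omitted here, so this is only the unregularized baseline.

    import numpy as np

    def projection_depth_rp(x, X, n_dir=500, seed=0):
        """Random-projection approximation to classical projection depth
        D(x) = 1 / (1 + O(x)) with outlyingness
        O(x) = sup_u |u'x - med(u'X)| / MAD(u'X); rows of X hold grid or
        basis coefficients of the functional observations."""
        rng = np.random.default_rng(seed)
        U = rng.standard_normal((n_dir, X.shape[1]))
        U /= np.linalg.norm(U, axis=1, keepdims=True)    # unit directions
        proj = X @ U.T                                   # (n, n_dir)
        med = np.median(proj, axis=0)
        mad = np.median(np.abs(proj - med), axis=0) + 1e-12
        out = np.max(np.abs(U @ x - med) / mad)          # sup over directions
        return 1.0 / (1.0 + out)

    X = np.random.default_rng(4).standard_normal((100, 20))
    depth = projection_depth_rp(X[0], X)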
Projection depth for functional data: Practical issues, computation and applications
By Bočinec, Nagy, Yeon
27.02.2026 16:42
We are honoured to have our work read and discussed at such a thorough level by several experts. Words of appreciation and encouragement are gratefully received, while the many supplementary comments, thoughtful reminders, new perspectives and additional themes raised are warmly welcomed and deeply appreciated. Our thanks go also to JASA Editor Francisco Samaniego and his editorial helpers for organising this discussion.
Space does not allow us to answer all of the many worthwhile points raised by our discussants, but in the following we attempt to respond to what we perceive as the major issues. Our responses are organised by themes rather than by discussants. We shall refer to our two articles as `the FMA paper' (Hjort and Claeskens) and `the FIC paper' (Claeskens and Hjort).
Rejoinder to the discussants of the two JASA articles `Frequentist Model Averaging' and `The Focused Information Criterion', by Nils Lid Hjort and Gerda Claeskens
By Hjort, Claeskens
27.02.2026 16:39
Multi-armed bandit (MAB) processes constitute a foundational subclass of reinforcement learning problems and represent a central topic in statistical decision theory, but their use for simultaneous adaptive allocation and sequential testing is limited by the absence of asymptotic theory under non-i.i.d. sequences and sublinear information. To address this open challenge, we propose the Urn Bandit (UNB) process, which integrates the reinforcement mechanism of urn probabilistic models with MAB principles, ensuring almost sure convergence of resource allocation to the optimal arms. We establish a joint functional central limit theorem (FCLT) for consistent estimators of expected rewards under non-i.i.d., non-sub-Gaussian, and sublinear reward samples with pairwise correlations across arms. To overcome the limitations of existing methods that focus mainly on cumulative regret, we establish asymptotic theory alongside adaptive allocation that supports powerful sequential tests, such as arm comparison, A/B testing, and policy evaluation. Simulation studies and real data analysis demonstrate that UNB maintains the statistical testing performance of equal randomization (ER) designs while obtaining higher average rewards, as classical MAB processes do.
Asymptotic Theory and Sequential Test for General Multi-Armed Bandit Process
By Yang, Yan, Jiang
27.02.2026 16:37
Streaming data often exhibit heterogeneity due to heteroscedastic variances or inhomogeneous covariate effects. Online renewable quantile and expectile regression methods provide valuable tools for detecting such heteroscedasticity by combining current data with summary statistics from historical data. However, quantile regression can be computationally demanding because of the non-smooth check function. To address this, we propose a novel online renewable method based on expectile regression, which efficiently updates estimates using both current observations and historical summaries, thereby reducing storage requirements. By exploiting the smoothness of the expectile loss function, our approach achieves superior computational efficiency compared with existing online renewable methods for streaming data with heteroscedastic variances or inhomogeneous covariate effects. We establish the consistency and asymptotic normality of the proposed estimator under mild regularity conditions, demonstrating that it achieves the same statistical efficiency as oracle estimators based on full individual-level data. Numerical experiments and real-data applications demonstrate that our method performs comparably to the oracle estimator while maintaining high computational efficiency and minimal storage costs.
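A rough sketch of a renewable-style expectile update, keeping only aggregated cross-product summaries across batches; this one-step approximation is in the spirit of renewable estimation but is not the paper's exact algorithm.

    import numpy as np

    class RenewableExpectile:
        """Renewable-style expectile regression: only the aggregated
        summaries A = sum X'WX and b = sum X'Wy are stored across
        batches, with asymmetric weights (tau above the fit, 1 - tau
        below) evaluated at the running estimate."""
        def __init__(self, d, tau):
            self.A, self.b = np.zeros((d, d)), np.zeros(d)
            self.beta, self.tau = np.zeros(d), tau

        def update(self, X, y, n_iter=3):
            for _ in range(n_iter):                  # a few IRLS passes
                r = y - X @ self.beta
                w = np.where(r >= 0, self.tau, 1 - self.tau)
                A_new = X.T @ (w[:, None] * X)
                b_new = X.T @ (w * y)
                self.beta = np.linalg.solve(self.A + A_new, self.b + b_new)
            self.A += A_new                          # absorb batch summaries
            self.b += b_new
            return self.beta

    rng = np.random.default_rng(5)
    model = RenewableExpectile(d=3, tau=0.8)
    for _ in range(5):                               # stream five batches
        X = rng.standard_normal((200, 3))
        y = X @ np.array([1.0, -1.0, 0.5]) + rng.standard_normal(200)
        beta = model.update(X, y)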
Renewable estimation in linear expectile regression models with streaming data sets
By Cao, Wanga, Hua
27.02.2026 16:33
Corner kicks are an important event in soccer because they are often the result of strong attacking play and can be of keen interest to sports fans and bettors. Peng, Hu, and Swartz (2024, Computational Statistics) formulate the mixture feature of corner kick times caused by previous corner kicks, frame the commonly available corner kick data as right-censored event times, and explore patterns of corner kicks. This paper extends their modeling to accommodate the potential correlations between corner kicks by the same teams within the same games. We consider a frailty model for event times and apply the Monte Carlo Expectation Maximization (MCEM) algorithm to obtain the maximum likelihood estimates for the model parameters. We compare the proposed model with the model in Peng, Hu, and Swartz (2024) using likelihood ratio tests. The 2019 Chinese Super League (CSL) data are employed throughout the paper for motivation and illustration.
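The frailty extension plausibly takes a standard shared-frailty proportional hazards form: with a team-within-game frailty $u_i$,
$$\lambda_{ij}(t) = u_i\, \lambda_0(t) \exp(x_{ij}^\top \beta), \qquad u_i \sim \mathrm{Gamma}(1/\phi,\, 1/\phi),$$
so that $E(u_i) = 1$ and $\mathrm{Var}(u_i) = \phi$ captures the within-game correlation between corner-kick times, with the unobserved $u_i$ handled by the MCEM algorithm; the paper's exact specification may differ.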
Learning about Corner Kicks in Soccer by Analysis of Event Times Using a Frailty Model
By Isaacs, Hu, Peng et al
27.02.2026 16:29
Emerging applications increasingly demand flexible covariate adaptive randomization (CAR) methods that support unequal targeted allocation ratios. While existing procedures can achieve covariate balance, they often suffer from the shift problem. This occurs when the allocation ratios of some additional covariates deviate from the target. We show that this problem is equivalent to a mismatch between the conditional average allocation ratio and the target among units sharing specific covariate values, revealing a failure of existing procedures in the long run. To address it, we derive a new form of allocation function by requiring that balancing covariates ensures the ratio matches the target. Based on this form, we design a class of parameterized allocation functions. When the parameter roughly matches certain characteristics of the covariate distribution, the resulting procedure can balance covariates. Thus, we propose a feasible randomization procedure that updates the parameter based on collected covariate information, rendering the procedure non-Markovian. To accommodate this, we introduce a CAR framework that allows non-Markovian procedures. We then establish its key theoretical properties, including the boundedness of covariate imbalance in probability and the asymptotic distribution of the imbalance for additional covariates. Ultimately, we conclude that the feasible randomization procedure can achieve covariate balance and eliminate the shift.
A General (Non-Markovian) Framework for Covariate Adaptive Randomization: Achieving Balance While Eliminating the Shift
By Fang, Ma
27.02.2026 16:28
Causal inference in modern large-scale systems faces growing challenges, including high-dimensional covariates, multi-valued treatments, massive observational (OBS) data, and limited randomized controlled trial (RCT) samples due to cost constraints. We formalize treatment-induced structural non-overlap and show that, under this regime, commonly used weighted fusion methods provably fail to satisfy randomized identifying restrictions. To address this issue, we propose a constrained joint estimation framework that minimizes observational risk while enforcing causal validity through orthogonal experimental moment conditions. We further show that structural non-overlap creates a feasibility obstruction for moment enforcement in the original covariate space. We also derive a penalized primal-dual algorithm that jointly learns representations and predictors, and establish oracle inequalities decomposing error into overlap recovery, moment violation, and statistical terms. Extensive synthetic experiments demonstrate robust performance under varying degrees of non-overlap. A large-scale ride-hailing application shows that our method achieves substantial gains over existing baselines, matching the performance of models trained with significantly more RCT data.
Feasible Fusion: Constrained Joint Estimation under Structural Non-Overlap
By Du, Zhang, Li et al
27.02.2026 16:26
Learning low-dimensional latent representations is a central topic in statistics and machine learning, and rotation methods have long been used to obtain sparse and interpretable representations. Despite nearly a century of widespread use across many fields, rigorous guarantees for valid inference for the learned representation remain lacking. In this paper, we identify a surprisingly prevalent phenomenon that suggests a reason for this gap: for a broad class of vintage rotations, the resulting estimators exhibit a non-estimable bias. Because this bias is independent of the data, it fundamentally precludes the development of valid inferential procedures, including the construction of confidence intervals and hypothesis testing. To address this challenge, we propose a novel bias-free rotation method within a general representation learning framework based on latent variables. We establish an oracle inference property for the learned sparse representations: the estimators achieve the same asymptotic variance as in the ideal setting where the latent variables are observed. To bridge the gap between theory and computation, we develop an efficient computational framework and prove that its output estimators retain the same oracle property. Our results provide a rigorous inference procedure for the rotated estimators, yielding statistically valid and interpretable representation learning.
Beyond Vintage Rotation: Bias-Free Sparse Representation Learning with Oracle Inference
By Cui, Chen, Ouyang et al
27.02.2026 16:23
Time-varying covariates in longitudinal studies frequently evolve through reciprocal feedback, undergo role reversal, and reflect unobserved individual heterogeneity. Standard statistical frameworks often assume fixed covariate roles and exogenous predictors, limiting their utility in systems governed by dynamic behavioral or physiological processes. We develop a hierarchical joint modeling framework that unifies three key features of such systems: (i) bidirectional feedback between a binary and a continuous covariate, (ii) role reversal in which these covariates become jointly modeled outcomes at a prespecified decision phase, and (iii) a shared latent trait influencing both intermediate covariates and a final binary endpoint. The model proceeds in three phases: a feedback-driven longitudinal process, a reversal phase in which the two covariates are jointly modeled conditional on the latent trait, and an outcome model linking a binary, decision-relevant endpoint to observed and latent components. Estimation is carried out using maximum likelihood and Bayesian inference, with Hamiltonian Monte Carlo supporting robust posterior estimation for models with latent structure and mixed outcome types. Simulation studies show that the model yields well calibrated coverage, small bias, and improved predictive performance compared to standard generalized linear mixed models, marginal approaches, and models that ignore feedback or latent traits. In an analysis of nationally representative U.S. panel data, the model captures the co-evolution of physical activity and body mass index and their joint influence, moderated by a latent behavioral resilience factor, on income mobility. The framework offers a flexible, practically implementable tool for analyzing longitudinal decision systems in which feedback, covariate role transition, and unmeasured capacity are central to prediction and intervention.
Modeling Covariate Feedback, Reversal, and Latent Traits in Longitudinal Data: A Joint Hierarchical Framework
By Ramezani, Nitiema, Wilson
27.02.2026 16:20
A conventional Bayesian approach to prediction uses the posterior distribution to integrate out parameters in a density for unobserved data conditional on the observed data and parameters. When the true posterior is intractable, it is replaced by an approximation; here we focus on variational approximations. Recent work has explored methods that learn posteriors optimized for predictive accuracy under a chosen scoring rule, while regularizing toward the prior or conventional posterior. Our work builds on an existing predictive variational inference (PVI) framework that improves prediction, but also diagnoses model deficiencies through implicit model expansion. In models where the sampling density depends on the parameters through a linear predictor, we improve the interpretability of existing PVI methods as a diagnostic tool. This is achieved by adopting PVI posteriors of Gaussian mixture form (GM-PVI) and establishing connections with plug-in prediction for mixture-of-experts models. We make three main contributions. First, we show that GM-PVI prediction is equivalent to plug-in prediction for certain mixture-of-experts models with covariate-independent weights in generalized linear models and hierarchical extensions of them. Second, we extend standard PVI by allowing GM-PVI posteriors to vary with the prediction covariate and in this case an equivalence to plug-in prediction for mixtures of experts with covariate-dependent weights is established. Third, we demonstrate the diagnostic value of this approach across several examples, including generalized linear models, linear mixed models, and latent Gaussian process models, demonstrating how the parameters of the original model must vary across the covariate space to achieve improvements in prediction.
Predictive variational inference for flexible regression models
By Kock, Sisson, Rodrigues et al
27.02.2026 16:17
We identify a fundamental pathology in the likelihood for time delay inference which challenges standard inference methods. By analysing the likelihood for time delay inference with Gaussian process light curve models, we show that it generically develops a boundary-driven "W"-shape with a global maximum at the true delay and gradual rises towards the edges of the observation window. This arises because time delay estimation is intrinsically extrapolative. In practice, global samplers such as nested sampling are steered towards spurious edge modes unless strict convergence criteria are adopted. We demonstrate this with simulations and show that the effect strengthens with higher data density over a fixed time span. To ensure convergence, we provide concrete guidance, notably increasing the number of live points. Further, we show that methods implicitly favouring small delays, for example optimisers and local MCMC, induce a bias towards larger $H_0$. Our results clarify failure modes and offer practical remedies for robust fully Bayesian time delay inference.
The global structure of the time delay likelihood
By Kroupa, Handley
27.02.2026 16:14
Joint modeling has become increasingly popular for characterizing the association between one or more longitudinal biomarkers and competing risks time-to-event outcomes. However, semiparametric multivariate joint modeling for large-scale data encounters substantial statistical and computational challenges, primarily due to the high dimensionality of random effects and the complexity of estimating nonparametric baseline hazards. These challenges often lead to prolonged computation time and excessive memory usage, limiting the utility of joint modeling for biobank-scale datasets. In this article, we introduce an efficient implementation of a semiparametric multivariate joint model, supported by a normal approximation and customized linear scan algorithms within an expectation-maximization (EM) framework. Our method significantly reduces computation time and memory consumption, enabling the analysis of data from thousands of subjects. The scalability and estimation accuracy of our approach are demonstrated through two simulation studies. We also present an application to the Primary Biliary Cirrhosis (PBC) dataset involving five longitudinal biomarkers as an illustrative example. A user-friendly R package, FastJM, has been developed for the shared random effects joint model with efficient implementation. The package is publicly available on the Comprehensive R Archive Network: https://CRAN.R-project.org/package=FastJM.
Efficient Implementation of a Semiparametric Joint Model for Multivariate Longitudinal Biomarkers and Competing Risks Time-to-Event Data
By Li, Ouyang, Zhou et al
27.02.2026 03:49
Machine learning classification methods usually assume that all possible classes are sufficiently present within the training set. Due to their inherent rarity, extreme events are always under-represented, and classifiers tailored for predicting extremes need to be carefully designed to handle this under-representation. In this paper, we address the question of how to assess and compare classifiers with respect to their capacity to capture extreme occurrences. This is also related to the topic of scoring rules used in the forecasting literature. In this context, we propose and study a risk function adapted to extremal classifiers. The inferential properties of our empirical risk estimator are derived under the framework of multivariate regular variation and hidden regular variation. A simulation study compares different classifiers and indicates their performance with respect to our risk function. To conclude, we apply our framework to the analysis of extreme river discharges in the Danube river basin. The application compares different predictive algorithms and tests their capacity to forecast river discharges from other river stations.
Evaluation of binary classifiers for asymptotically dependent and independent extremes
By
27.02.2026 01:37
This paper proposes Fourier-based and wavelet-based techniques for analyzing periodic financial time series. Conventional models such as the periodic generalized autoregressive conditional heteroscedasticity (PGARCH) and periodic autoregressive conditional duration (PACD) models often involve many parameters. The methods put forward here result in more parsimonious models with increased forecast efficiency. The effectiveness of these approaches is demonstrated through simulation and data analysis studies.
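The source of the parsimony is presumably the usual Fourier representation of periodically varying parameters: a coefficient $\phi_s$ over seasons $s = 1, \dots, S$ is written as
$$\phi_s = a_0 + \sum_{k=1}^{K} \Big[ a_k \cos\big(\tfrac{2\pi k s}{S}\big) + b_k \sin\big(\tfrac{2\pi k s}{S}\big) \Big], \qquad K \ll S/2,$$
replacing $S$ free parameters by $2K + 1$.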
Parsimonious Modeling of Periodic Time Series Using Fourier and Wavelet Techniques
By Davis, Balakrishna
26.02.2026 22:11
Statistical inference in high-dimensional settings is challenging when standard unregularized methods are employed. In this work, we focus on the case of multiple correlated proportions for which we develop a Bayesian inference framework. For this purpose, we construct an $m$-dimensional Beta distribution from a $2^m$-dimensional Dirichlet distribution, building on work by Olkin and Trikalinos (2015). This readily leads to a multivariate Beta-binomial model for which simple update rules from the common Dirichlet-multinomial model can be adopted. From the frequentist perspective, this approach amounts to adding pseudo-observations to the data and allows a joint shrinkage estimation of mean vector and covariance matrix. For higher dimensions ($m > 10$), the extensive model based on $2^m$ parameters starts to become numerically infeasible. To counter this problem, we utilize a reduced parametrisation which has only $1 + m(m + 1)/2$ parameters describing first and second order moments. A copula model can then be used to approximate the (posterior) multivariate Beta distribution. A natural inference goal is the construction of multivariate credible regions. The properties of different credible regions are assessed in a simulation study in the context of investigating the accuracy of multiple binary classifiers. It is shown that the extensive and copula approaches lead to a (Bayes) coverage probability very close to the target level. In this regard, they outperform credible regions based on a normal approximation of the posterior distribution, in particular for small sample sizes. Additionally, they always lead to credible regions which lie entirely in the parameter space, which is not the case when the normal approximation is used.
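The Olkin–Trikalinos construction described here is easy to simulate from: index the $2^m$ Dirichlet cells by binary outcome patterns and sum the cells whose pattern has a 1 in each position (a minimal sketch; the function name is illustrative).

    import numpy as np
    from itertools import product

    def sample_mv_beta(alpha, size, seed=0):
        """Draw from the m-dimensional Beta distribution built from a
        2^m-cell Dirichlet (Olkin & Trikalinos, 2015): cell j encodes a
        binary outcome pattern, and proportion i sums all cells whose
        pattern has a 1 in position i, so each marginal is a Beta."""
        m = int(np.log2(len(alpha)))
        patterns = np.array(list(product([0, 1], repeat=m)))  # (2^m, m)
        cells = np.random.default_rng(seed).dirichlet(alpha, size=size)
        return cells @ patterns                               # (size, m)

    P = sample_mv_beta(alpha=np.ones(4), size=10_000)   # m = 2 example
    print(P.mean(axis=0))                               # marginal means ~ 0.5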
Simultaneous Inference for Multiple Proportions: A Multivariate Beta-Binomial Model
By Westphal
26.02.2026 19:20
Representing, comparing, and measuring the distance between probability distributions is a key task in computational statistics and machine learning. The choice of representation and the associated distance determine properties of the methods in which they are used: for example, certain distances can allow one to encode robustness or smoothness of the problem. Kernel methods offer flexible and rich Hilbert space representations of distributions that allow the modeller to enforce properties through the choice of kernel, and estimate associated distances at efficient nonparametric rates. In particular, the maximum mean discrepancy (MMD), a kernel-based distance constructed by comparing Hilbert space mean functions, has received significant attention due to its computational tractability and is favoured by practitioners.
In this thesis, we conduct a thorough study of kernel-based distances with a focus on efficient computation, with core contributions in Chapters 3 to 6. Part I of the thesis is focused on the MMD, specifically on improved MMD estimation. In Chapter 3 we propose a theoretically sound, improved estimator for MMD in simulation-based inference. Then, in Chapter 4, we propose an MMD-based estimator for conditional expectations, a ubiquitous task in statistical computation. Closing Part I, in Chapter 5 we study the problem of calibration when MMD is applied to the task of integration.
In Part II, motivated by the recent developments in kernel embeddings beyond the mean, we introduce a family of novel kernel-based discrepancies: kernel quantile discrepancies. These address some of the pitfalls of MMD, and are shown through both theoretical results and an empirical study to offer a competitive alternative to MMD and its fast approximations. We conclude with a discussion on broader lessons and future work emerging from the thesis.
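For reference, the MMD at the centre of Part I is
$$\mathrm{MMD}^2(P, Q) = \mathbb{E}[k(X, X')] + \mathbb{E}[k(Y, Y')] - 2\,\mathbb{E}[k(X, Y)], \qquad X, X' \sim P,\; Y, Y' \sim Q,$$
equal to $\| \mu_P - \mu_Q \|_{\mathcal{H}}^2$ for the kernel mean embeddings $\mu_P = \mathbb{E}[k(\cdot, X)]$; its standard U- and V-statistic estimators are what make the distance computationally tractable in practice.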
Scalable Kernel-Based Distances for Statistical Inference and Integration
By Naslidnyk
26.02.2026 17:21