Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Pages

Posts

publications

Two-part Statistical Model for Identifying Baseline Predictors of Chronic Postsurgical Pain.

Published in Anesthesiology, 2026

A substantial proportion of patients report no pain after surgery, resulting in an excess of zero values that pose challenges for analysis using traditional statistical models. The present study was designed to test the hypothesis that a two-part model, commonly used in healthcare expenditures research, would demonstrate superior performance in predicting postsurgical pain when compared to traditional models, and would secondarily better identify predictors of this clinically important outcome.

Recommended citation: Stephan G Frangakis, Xuran Meng, Mark C Bicket, Vidhya Gunaseelan, Sawsan As Sanie, Andrew Urquhart, Yi Li and Chad M Brummett,
Download Paper

Beyond Consistency: Inference for the Relative Risk Functional in Deep Nonparametric Cox Models.

Published in arXiv, 2026

There remain theoretical gaps in deep neural network estimators for the nonparametric Cox proportional hazards model. In particular, it is unclear how gradient-based optimization error propagates to population risk under partial likelihood, how pointwise bias can be controlled to permit valid inference, and how ensemble-based uncertainty quantification behaves under realistic variance decay regimes. We develop an asymptotic distribution theory for deep Cox estimators that addresses these issues. First, we establish nonasymptotic oracle inequalities for general trained networks that link in-sample optimization error to population risk without requiring the exact empirical risk optimizer. We then construct a structured neural parameterization that achieves infinity-norm approximation rates compatible with the oracle bound, yielding control of the pointwise bias. Under these conditions and using the Hajek–Hoeffding projection, we prove pointwise and multivariate asymptotic normality for subsampled ensemble estimators. We derive a range of subsample sizes that balances bias correction with the requirement that the Hajek–Hoeffding projection remain dominant. This range accommodates decay conditions on the single-overlap covariance, which measures how strongly a single shared observation influences the estimator, and is weaker than those imposed in the subsampling literature. An infinitesimal jackknife representation provides analytic covariance estimation and valid Wald-type inference for relative risk contrasts such as log-hazard ratios. Finally, we illustrate the finite-sample implications of the theory through simulations and a real data application.

Recommended citation: Sattwik Ghosal, Xuran Meng and Yi Li,
Download Paper

Temporal Self-Rewarding Language Models: Decoupling Chosen-Rejected via Past-Future.

Published in International Conference on Machine Learning, 2026

Self-Rewarding Language Models propose an architecture in which a Large Language Model (LLM) both generates responses and evaluates its own outputs via LLM-as-a-Judge prompting, dynamically improving its generative capabilities through iterative Direct Preference Optimization (DPO). However, our analysis reveals a critical limitation in existing Self-Rewarding paradigms: the synchronized improvement of chosen and rejected responses progressively narrows the representational difference between contrasting samples, undermining effective preference learning. We propose Temporal Self-Rewarding Language Models that strategically coordinate past, present, and future model generations to sustain learning signals. Our dual-phase framework introduces: (1) Anchored Rejection, which fixes rejected responses using the past initial model’s outputs, and (2) Future-Guided Chosen, which dynamically curates chosen samples using next-generation model predictions. Extensive experiments across three model families (Llama, Qwen, Mistral) and different model sizes (Llama3B/8B/70B) demonstrate significant improvements when trained with our method compared to Self-Rewarding using the same computational resources. For example, Llama3.1-8B reaches a 29.44 win rate on AlpacaEval 2.0 with our method, outperforming the Self-Rewarding baseline (19.69) by 9.75. Notably, our method also demonstrates superior out-of-distribution generalization across mathematical reasoning (GSM8K), knowledge-based QA (ARC, TruthfulQA), and code generation (HumanEval) tasks, even though we do not specifically collect such training data.

Recommended citation: Yidong Wang et al., "Temporal Self-Rewarding Language Models: Decoupling Chosen-Rejected via Past-Future." arXiv: 2508.06026, 2025.
Download Paper

talks

teaching

Tutor from 2020-2024

Undergraduate/Postgraduate course, University of Hong Kong, Department of Statistics and Actuarial Science, 2020

Stochastic Process, Financial Economics, Bayesian Learning