Empirical Likelihood in Modern AI
Empirical likelihood (EL) is a classical nonparametric framework enabling likelihood-based inference — confidence intervals, likelihood ratio tests — without parametric assumptions. It was later reinterpreted as a way to integrate data across heterogeneous populations via density ratio modeling: a baseline distribution is modeled nonparametrically and related to each population’s distribution via multiplicative tilts.
My research modernizes EL for AI and machine learning, leveraging neural networks to enrich density ratio modeling while using EL to inspire new algorithmic paradigms.
Application 1: Reimagining federated learning — from aggregation to guidance
In federated learning (FL), multiple clients collaboratively train models without sharing raw data. Existing FL methods focus on aggregating client updates into a single global model, which often underutilizes the rich heterogeneity across sites. Hospitals may specialize in different patient subgroups — that diversity should be leveraged rather than suppressed.
We use EL to reimagine FL, shifting the paradigm from aggregation to guidance. Focusing on multi-class classification, we formulate a density ratio model allowing constrained heterogeneity across classes and clients. The resulting loss decomposes naturally into two cross-entropy terms:
- one for predicting the class label of each individual, and
- one for identifying which client they originate from.
Beyond more efficient data pooling, our approach proposes a paradigm shift: the central server acts as an intelligent router, directing new tasks to the client most specialized for them — transforming client diversity from a challenge into an asset.
Figure 1: The FL server as an intelligent router — directing queries to the most specialized client.
Key paper: Wang, Z., Zhang, X., Zhang, X., Liu, Y., and Zhang, Q. (2026). Beyond Aggregation: Guiding Clients in Heterogeneous Federated Learning. ICLR 2026. [Paper] [arXiv] [Code] [Poster]
Application 2: Learning from noisy labels
Label noise is pervasive in real-world machine learning. Studies estimate roughly 5% of labels are incorrect, and in specialized domains such as healthcare, label uncertainty is often unavoidable. This becomes especially critical in high-stakes settings — failing to detect a serious disease is far more costly than a false alarm, and in such settings we need reliable control of specific error types, not just high overall accuracy.
We use EL to develop a principled framework for Neyman–Pearson multiclass classification under label noise. Rather than treating noisy labels as a preprocessing problem, we explicitly model how noisy labels relate to true labels.
Two key insights:
- Feature distributions across classes are linked through a density ratio model.
- Features given a noisy label can be expressed as a convex combination of clean class-conditional distributions.
This structure allows all relevant distributions to be represented as parametric tilts of a common reference distribution. Leveraging EL, we estimate the reference nonparametrically and recover tilting parameters through profiling.
A central challenge is that without additional structure, learning from noisy labels is non-identifiable. Our work shows that the density ratio structure resolves this — under mild conditions, it uniquely recovers the key quantities needed for learning without requiring knowledge of the noise mechanism. To our knowledge, this identifiability result is new in this setting.
Key paper: Zhang, Q., Tian, Q., and Li, P. (2026). Neyman–Pearson Multiclass Classification under Label Noise. Preprint. [arXiv]