Publications
Publications by category in reverse chronological order, generated by jekyll-scholar.
2023
- [ICLR] Addressing Parameter Choice Issues in Unsupervised Domain Adaptation by Aggregation. M.-C. Dinu, M. Holzleitner, M. Beck, D. H. Nguyen, and 6 more authors. The Eleventh International Conference on Learning Representations, 2023.
We study the problem of choosing algorithm hyper-parameters in unsupervised domain adaptation, i.e., with labeled data in a source domain and unlabeled data in a target domain drawn from a different input distribution. We follow the strategy of computing several models with different hyper-parameters and subsequently computing a linear aggregation of these models. While several heuristics follow this strategy, methods that rely on thorough theories for bounding the target error are still missing. To this end, we propose a method that extends weighted least squares to vector-valued functions, e.g., deep neural networks. We show that the target error of the proposed algorithm is asymptotically not worse than twice the error of the unknown optimal aggregation. We also perform a large-scale empirical comparative study on several datasets, including text, images, electroencephalogram, body sensor signals, and signals from mobile phones. Our method outperforms deep embedded validation (DEV) and importance weighted validation (IWV) on all datasets, setting a new state of the art for solving parameter choice issues in unsupervised domain adaptation with theoretical error guarantees. We further study several competitive heuristics, all of which outperform IWV and DEV on at least five datasets. However, our method outperforms each heuristic on at least five of the seven datasets.
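A minimal sketch of the kind of linear aggregation described in this abstract, assuming the candidate models are already trained and importance weights have been estimated elsewhere from the unlabeled target data; all names, shapes, and the concrete closed form below are ours, not the paper's exact algorithm:

```python
import numpy as np

def aggregate_models(preds, y_src, w_src):
    """Linear aggregation of candidate models by weighted least squares.

    preds : (n_models, n_samples, out_dim) predictions of each candidate
            model on labeled source data.
    y_src : (n_samples, out_dim) source labels.
    w_src : (n_samples,) importance weights, assumed to be estimated
            separately from the unlabeled target data.
    Returns aggregation coefficients c with f_hat = sum_j c_j * f_j.
    """
    n_models, n, d = preds.shape
    sqrt_w = np.sqrt(w_src)[:, None]                 # reweight residuals
    A = (preds * sqrt_w).reshape(n_models, -1).T     # (n * d, n_models)
    b = (y_src * sqrt_w).ravel()                     # (n * d,)
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coef

# Usage sketch:
#   coef = aggregate_models(preds, y_src, w_src)
#   y_hat = np.tensordot(coef, preds_new, axes=1)   # aggregate new predictions
```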
- [preprint] On regularized Radon-Nikodym differentiation. Duc Hoan Nguyen, Werner Zellinger, and Sergei V. Pereverzyev. 2023.
We discuss the problem of estimating Radon-Nikodym derivatives. This problem appears in various applications, such as covariate shift adaptation, likelihood-ratio testing, mutual information estimation, and conditional probability estimation. To address this problem, we employ the general regularization scheme in reproducing kernel Hilbert spaces. The convergence rate of the corresponding regularized algorithm is established by taking into account both the smoothness of the derivative and the capacity of the space in which it is estimated. This is done in terms of general source conditions and the regularized Christoffel functions. We also find that Radon-Nikodym derivatives can be reconstructed at any particular point with a high order of accuracy. Our theoretical results are illustrated by numerical simulations.
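As a rough, hedged sketch of the operator-theoretic setup behind such regularization schemes (our notation, not necessarily the paper's): let $\mu$ be the training distribution, $\nu$ the distribution whose derivative $\beta = d\nu/d\mu$ is sought, and $K$ the reproducing kernel of the RKHS $\mathcal{H}$. Then $\beta$ formally solves the ill-posed operator equation

$$
T_\mu \beta = h_\nu, \qquad
T_\mu f = \int K(\cdot, x)\, f(x)\, d\mu(x), \qquad
h_\nu = \int K(\cdot, x)\, d\nu(x),
$$

and a general regularization scheme replaces the unstable inverse by $\beta_\lambda = g_\lambda(T_\mu)\, h_\nu$, where $g_\lambda$ approximates $t \mapsto 1/t$; the Tikhonov choice $g_\lambda(t) = (t + \lambda)^{-1}$ gives $\beta_\lambda = (T_\mu + \lambda I)^{-1} h_\nu$.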
- [preprint] General regularization in covariate shift adaptation. Duc Hoan Nguyen, Sergei V. Pereverzyev, and Werner Zellinger. 2023.
Sample reweighting is one of the most widely used methods for correcting the error of least squares learning algorithms in reproducing kernel Hilbert spaces (RKHS) that is caused by future data distributions differing from the training data distribution. In practical situations, the sample weights are determined by values of the estimated Radon-Nikodym derivative of the future data distribution with respect to the training data distribution. In this work, we review known error bounds for reweighted kernel regression in RKHS and, by combining them, obtain novel results. We show, under weak smoothness conditions, that the number of samples needed to achieve the same order of accuracy as in standard supervised learning without differences in data distributions is smaller than proven by state-of-the-art analyses.
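A minimal sketch of the reweighted least squares estimator this abstract refers to, assuming the kernel matrix and the importance weights (values of an estimated Radon-Nikodym derivative at the training points) are already computed; the function name and the closed form below are ours:

```python
import numpy as np

def weighted_kernel_ridge(K, y, w, lam):
    """Importance-weighted kernel ridge regression (a sketch).

    Minimizes (1/n) * sum_i w_i * (f(x_i) - y_i)**2 + lam * ||f||_H^2
    over the RKHS with kernel matrix K[i, j] = k(x_i, x_j), where w_i are
    importance weights for the training samples. Returns dual coefficients
    alpha such that f(x) = sum_j alpha_j * k(x, x_j).
    """
    n = len(y)
    W = np.diag(w)
    # Stationarity condition of the weighted objective:
    # (W @ K + n * lam * I) @ alpha = W @ y
    alpha = np.linalg.solve(W @ K + n * lam * np.eye(n), w * y)
    return alpha
```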
2022
- [ACHA] On a regularization of unsupervised domain adaptation in RKHS. E. R. Gizewski, L. Mayer, B. A. Moser, D. H. Nguyen, and 4 more authors. Applied and Computational Harmonic Analysis, 2022.
We analyze the use of the so-called general regularization scheme in the scenario of unsupervised domain adaptation under the covariate shift assumption. Learning algorithms arising from this scheme are generalizations of the importance weighted regularized least squares method, which is to date among the most widely used approaches in the covariate shift setting. We explore a link between the considered domain adaptation scenario and the estimation of Radon-Nikodym derivatives in reproducing kernel Hilbert spaces, where the general regularization scheme can also be employed and is a generalization of kernelized unconstrained least-squares importance fitting. We estimate the convergence rates of the corresponding regularized learning algorithms and discuss how to resolve the issue of tuning their regularization parameters. The theoretical results are illustrated by numerical examples, one of which is based on real data collected for automatic stenosis detection in cervical arteries.
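For orientation, the kernelized unconstrained least-squares importance fitting mentioned above is usually written as the following regularized empirical objective (a hedged sketch in our notation, with training samples $x_1,\dots,x_n \sim \mu$ and future/target samples $x'_1,\dots,x'_m \sim \nu$):

$$
\hat\beta_\lambda \;=\; \arg\min_{\beta \in \mathcal{H}}\;
\frac{1}{2n}\sum_{i=1}^{n}\beta(x_i)^2
\;-\;\frac{1}{m}\sum_{j=1}^{m}\beta(x'_j)
\;+\;\frac{\lambda}{2}\,\|\beta\|_{\mathcal{H}}^2 ,
$$

a Tikhonov-regularized least-squares fit of the Radon-Nikodym derivative $d\nu/d\mu$, of which the general regularization scheme discussed in the abstract is a generalization.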
2021
- [NeurIPS] The balancing principle for parameter choice in distance-regularized domain adaptation. W. Zellinger, N. Shepeleva, M. Dinu, H. Eghbalzadeh, and 4 more authors. Advances in Neural Information Processing Systems, 2021.
We address the unsolved algorithm design problem of choosing a justified regularization parameter in unsupervised domain adaptation. This problem is intriguing as no labels are available in the target domain. Our approach starts with the observation that the widely-used method of minimizing the source error, penalized by a distance measure between source and target feature representations, shares characteristics with regularized ill-posed inverse problems. Regularization parameters in inverse problems are optimally chosen by the fundamental principle of balancing approximation and sampling errors. We use this principle to balance learning errors and domain distance in a target error bound. As a result, we obtain a theoretically justified rule for the choice of the regularization parameter. In contrast to the state of the art, our approach allows source and target distributions with disjoint supports. An empirical comparative study on benchmark datasets underpins the performance of our approach.
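The generic form of the balancing argument referred to above, in our notation rather than the paper's exact bound: suppose the target error of the model obtained with regularization parameter $\lambda$ admits a bound $a(\lambda) + b(\lambda)$, where $a$ is non-decreasing in $\lambda$ (the approximation/adaptation term), $b$ is non-increasing (the sampling term), and the two curves cross at some $\lambda_*$ with $a(\lambda_*) = b(\lambda_*)$. Then the balanced choice $\lambda_*$ is near-optimal:

$$
a(\lambda_*) + b(\lambda_*)
\;=\; 2\,\min_{\lambda}\max\{a(\lambda),\, b(\lambda)\}
\;\le\; 2\,\min_{\lambda}\bigl(a(\lambda) + b(\lambda)\bigr),
$$

since for $\lambda \le \lambda_*$ the non-increasing term satisfies $b(\lambda) \ge b(\lambda_*)$, and for $\lambda \ge \lambda_*$ the non-decreasing term satisfies $a(\lambda) \ge a(\lambda_*)$, so no choice of $\lambda$ can push $\max\{a, b\}$ below $a(\lambda_*) = b(\lambda_*)$.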