Research
Overview
My research interests lie at the intersection of information theory, machine learning, and high-dimensional statistics, with ongoing areas of interest including the following:
- Information-theoretic understanding of statistical inference and learning problems
- Adaptive decision-making under uncertainty (e.g., Bayesian optimization, bandits)
- Scalable algorithms for large-scale inference and learning (e.g., group testing, graph learning)
- Robustness considerations in machine learning
If you have been admitted to the NUS PhD program and are looking for a supervisor, feel free to email me to arrange a meeting. Other prospective PhD applicants are also welcome to get in touch, but I apologize in advance that I may not be able to reply to most enquiries. Applications to NUS can be made through the Department of Computer Science, the Department of Mathematics, or the Institute of Data Science.
If you would like to apply for a postdoc or research assistant position, please send me your CV and an outline of your research interests. Applicants should have a strong track record in an area related to my research interests, such as machine learning, information theory, statistics, statistical signal processing, or theoretical computer science.
Research Group
- Arpan Losalka (PhD student)
- Sun Yang (PhD student)
- Zihan Li (PhD student)
- Yan Hao Ling (PhD student)
- Ivan Lau (PhD student)
- Chenkai Ma (PhD student)
- Yuting Pan (Master's student w/ Qianxiao Li)
- Recep Can Yavas (postdoc w/ Vincent Tan)
Former postdocs: Lan Truong (U. Essex), Qiaoqiao Zhou (Southeast U.), Daming Cao (NUIST), Zhaoqiang Liu (UESTC), Thach Bui (VNU-HCM), Prathamesh Mayekar (Propheus)
Former RAs: Anamay Chaturvedi (ISTA), Selwyn Gomes (UCSD), Hangdong Zhao (U. Wisconsin-Madison), Mayank Shrivastava (UIUC)
Research Funding
- (Nov. 2024 - Nov. 2027) Adaptive and Resource-Efficient Sequential Decision-Making Algorithms, AI Visiting Professorship ($2.87M, w/ Kevin Jamieson)
- (Oct. 2024 - Oct. 2029) Statistical Estimation and Learning with 1-bit Feedback, NUS Presidential Young Professorship ($320k)
- (March 2023 - March 2026) Safety and Reliability in Black-Box Optimization, MoE Academic Research Fund (AcRF) Tier 1 ($250k)
- (May 2019 - May 2024) Robust Statistical Model Under Model Uncertainty, Singapore National Research Foundation (NRF) Fellowship ($2.29M)
- (Nov. 2018 - Oct. 2022) Information-Theoretic Methods in Data Science, NUS Early Career Research Award ($500k)
- (Jan. 2018 - Jan. 2021) Theoretical and Algorithmic Advances in Noisy Adaptive Group Testing, NUS Startup Grant ($180k)
Ongoing Research Projects
Some potential research projects that students and postdocs could pursue with me are listed below; this list is far from exhaustive.
1) Information-theoretic limits of statistical inference and learning problems
The field of information theory was introduced as a means for understanding the fundamental limits of data compression and transmission, and has shaped the design of practical communication systems for decades. This project will pursue the emerging perspective that information theory is not only a theory of communication, but a far-reaching theory of data that benefits diverse inference and learning problems such as estimation, prediction, and optimization. This perspective leads to principled mathematical approaches for certifying the near-optimality of practical algorithms and for steering practical research toward where the greatest improvements are possible.
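To give a flavor of how such certifications work (this is the standard Fano-based argument from the information theory literature, not a result from any particular paper below): if a uniformly random quantity V taking one of M values must be identified from data Y, then any estimator satisfies

```latex
% Fano's inequality: V uniform on {1,...,M}, estimated as \hat{V}(Y).
\mathbb{P}\bigl[\hat{V} \neq V\bigr] \;\ge\; 1 - \frac{I(V;Y) + \log 2}{\log M}
```

Reliable recovery is therefore impossible unless the data carries I(V;Y) = Ω(log M) bits of information about V, and an algorithm whose requirements match such a bound is certified as near-optimal.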
Selected relevant publications:
- Limits on Support Recovery with Probabilistic Models: An Information-Theoretic Framework
- Lower Bounds on Regret for Noisy Gaussian Process Bandit Optimization
- Optimal Rates of Teaching and Learning Under Uncertainty
2) Modern methods for high-dimensional estimation and learning
Extensive research over the past decade or two has led to a variety of powerful techniques for high-dimensional estimation and learning, with the prevailing approach being to introduce low-dimensional modeling assumptions such as sparsity, low-rankness, and graphical model structure. Recently, there has been a paradigm shift towards data-driven techniques, including the replacement of explicit modeling assumptions by implicit generative models based on deep neural networks. In comparison to traditional approaches, this line of work remains in its infancy; this project explores this exciting new research avenue from both theoretical and practical perspectives (a minimal sketch of the underlying recovery template follows the publication list below).
Selected relevant publications:
- Theoretical Perspectives on Deep Learning Methods in Inverse Problems
- Generative Principal Component Analysis
- Information-Theoretic Lower Bounds for Compressive Sensing with Generative Models
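The sketch referenced above: a minimal numpy illustration of the recovery template in the generative-model line of work, in which a signal x = G(z) is estimated from noisy linear measurements y = Ax + w by gradient descent on the latent code z. All dimensions and step sizes here are toy choices, and a small random (untrained) ReLU network stands in for a trained generative model.

```python
# Minimal sketch: compressive sensing with a generative prior.
# A small random ReLU network stands in for a trained generative model;
# all parameters are toy choices for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n, m, k, h = 200, 50, 10, 80      # signal dim, measurements, latent dim, hidden width

W1 = rng.normal(size=(h, k)) / np.sqrt(k)     # "generative model" G(z) = W2 relu(W1 z)
W2 = rng.normal(size=(n, h)) / np.sqrt(h)
G = lambda z: W2 @ np.maximum(W1 @ z, 0.0)

z_true = rng.normal(size=k)                   # ground truth lies in the range of G
x_true = G(z_true)
A = rng.normal(size=(m, n)) / np.sqrt(m)      # random measurement matrix
y = A @ x_true + 0.01 * rng.normal(size=m)    # noisy linear measurements

z = rng.normal(size=k)                        # gradient descent on 0.5*||A G(z) - y||^2
for _ in range(5000):
    pre = W1 @ z
    r = A @ (W2 @ np.maximum(pre, 0.0)) - y           # residual
    grad_hidden = (W2.T @ (A.T @ r)) * (pre > 0)      # backprop through the ReLU
    z -= 1e-3 * (W1.T @ grad_hidden)

print("relative error:", np.linalg.norm(G(z) - x_true) / np.linalg.norm(x_true))
```

The objective is nonconvex in z, so this simple scheme carries no a priori guarantee; characterizing when and why such schemes provably succeed is exactly the kind of question this project targets.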
3) Robustness considerations in machine learning
Robustness requirements pose many of the most important unsolved challenges in modern machine learning, arising from sources of uncertainty such as mismatched modeling assumptions, corrupted data, and the presence of adversaries. For instance, large distributed learning systems must tolerate individual node failures, robotics tasks learned in a simulated environment should degrade as little as possible when transferred to a real environment, and deep learning models remain strikingly vulnerable to adversarial attacks. This project will seek to better understand some of the most practically pertinent sources of uncertainty and develop new algorithms that are robust in the face of this uncertainty, backed by rigorous guarantees.
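As a toy illustration of one of these ideas (a hypothetical one-dimensional objective and perturbation radius, not the algorithm of any paper below), robust optimization replaces the raw objective f(x) by its worst-case value over a neighborhood, which can move the reported optimum from a tall-but-narrow peak to a lower-but-flatter one:

```python
# Minimal sketch: robust (worst-case) optimization on a grid.
# The objective and perturbation radius are hypothetical toy choices.
import numpy as np

f = lambda x: np.exp(-(x - 0.3) ** 2 / 0.002) + 0.6 * np.exp(-(x - 0.7) ** 2 / 0.1)
xs = np.linspace(0.0, 1.0, 1001)
eps = 0.05                                   # adversarial perturbation radius

vals = f(xs)
# Worst-case value within distance eps of each candidate point.
robust_vals = np.array([vals[np.abs(xs - x) <= eps].min() for x in xs])

print("standard optimum:", xs[vals.argmax()])         # tall, narrow peak near 0.3
print("robust optimum:  ", xs[robust_vals.argmax()])  # broad, flatter peak near 0.7
```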
Selected relevant publications:
- Adversarially Robust Optimization with Gaussian Processes
- Stochastic Linear Bandits Robust to Adversarial Attacks
- Robust Submodular Maximization: A Non-Uniform Partitioning Approach
4) Theory and algorithms for group testing
Group testing is a classical sparse estimation problem that seeks to identify "defective" items by testing pools of items together, with recent applications including database systems, communication protocols, and COVID-19 testing. A recent line of work has led to significant advances in the theory of group testing, including the development of precise performance limits and practical algorithms for attaining them. This project seeks to push these advances further towards more challenging settings that better account for crucial practical phenomena, including noisy outcomes, testing constraints, prior information, and non-binary measurements.
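As a minimal runnable illustration of the noiseless non-adaptive setting (using the simple and standard COMP decoding rule from the group testing literature, with toy parameter choices), note how few tests are needed relative to testing all items individually:

```python
# Minimal sketch: noiseless non-adaptive group testing with a Bernoulli
# design, decoded by the standard COMP rule. Parameters are toy choices.
import numpy as np

rng = np.random.default_rng(1)
n, k, t = 500, 5, 200            # items, defectives, tests (t << n)

defective = np.zeros(n, dtype=bool)
defective[rng.choice(n, size=k, replace=False)] = True

X = rng.random((t, n)) < 1.0 / k              # each item joins each test w.p. 1/k
outcomes = (X & defective).any(axis=1)        # test is positive iff it contains a defective

# COMP: any item appearing in a negative test is non-defective; declare the rest defective.
estimate = ~X[~outcomes].any(axis=0)

print("exact recovery:", bool(np.array_equal(estimate, defective)))
```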
Selected relevant publications:
- Group Testing: An Information Theory Perspective
- Exact Thresholds for Noisy Non-Adaptive Group Testing
- Noisy Adaptive Group Testing: Bounds and Algorithms
5) Theory and algorithms for Bayesian optimization
Bayesian optimization (BO) has recently emerged as a versatile tool for optimizing "black-box" functions, with particular success in automating machine learning algorithms by tuning their hyperparameters (e.g., as used in the famous AlphaGo program), as well as in other applications such as robotics and materials design. A widespread modeling assumption in BO is that the function is well-approximated by a Gaussian process, whose smoothness properties are dictated by a kernel function. This project seeks to advance the state-of-the-art theory and algorithms for BO, with an emphasis on practical variations that remain less well understood, including model misspecification, adversarial corruptions, and high dimensionality.
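As a point of reference, the classical GP-UCB template on which much of this theory builds can be sketched in a few lines; the objective, kernel, and hyperparameters below are toy choices, and a practical implementation would use a GP library with tuned hyperparameters.

```python
# Minimal sketch: GP-UCB Bayesian optimization of a 1-D function over a grid,
# with a pure-numpy Gaussian process posterior and an RBF kernel.
import numpy as np

rng = np.random.default_rng(2)
f = lambda x: np.sin(3 * x) + 0.5 * np.cos(7 * x)   # "black-box" objective (toy)
xs = np.linspace(0.0, 2.0, 400)                     # finite candidate set
noise, beta, ell = 0.05, 2.0, 0.2                   # noise std, UCB width, lengthscale
kern = lambda a, b: np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * ell ** 2))

X = [xs[rng.integers(len(xs))]]                     # random first query
y = [f(X[0]) + noise * rng.normal()]
for t in range(25):
    Xa, ya = np.array(X), np.array(y)
    K = kern(Xa, Xa) + noise ** 2 * np.eye(len(Xa))
    Ks = kern(Xa, xs)
    v = np.linalg.solve(K, Ks)
    mu = v.T @ ya                                   # posterior mean at candidates
    var = 1.0 - np.sum(Ks * v, axis=0)              # posterior variance (k(x,x) = 1)
    x_next = xs[np.argmax(mu + beta * np.sqrt(np.maximum(var, 0.0)))]  # UCB rule
    X.append(x_next)
    y.append(f(x_next) + noise * rng.normal())

best = int(np.argmax(y))
print(f"best query: x = {X[best]:.3f}, f(x) = {f(X[best]):.3f}")
```

The UCB rule trades off exploitation (high posterior mean) against exploration (high posterior uncertainty); the practical variations named above, such as misspecified kernels and corrupted observations, break the assumptions behind this vanilla template.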
Selected relevant publications: