I am a postdoctoral researcher at the University of Illinois at Urbana-Champaign, where I am fortunate to be advised by Prof. R. Srikant. I am broadly interested in machine learning, including reinforcement learning, online learning (in particular, multi-armed bandits), and representation learning.
Prior to that, I received my Ph.D. from the Institute for Interdisciplinary Information Sciences (headed by Prof. Andrew Chi-Chih Yao), Tsinghua University, in June 2023. During my Ph.D. studies, I was fortunate to be advised by Prof. Longbo Huang and to work closely with Dr. Wei Chen (Director of MSR Asia Theory Center).
I visited Cornell University in person from September to December 2022, where I was lucky to be supervised by Prof. Wen Sun. I was also a research intern at MSR Asia from January to May 2020, supervised by Dr. Wei Chen.
Email: yihandu@illinois.edu
Download my CV here.
Ph.D. in Computer Science, September 2018 - June 2023
Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University
B.E. in Computer Science, September 2014 - June 2018
Xiamen University
Yihan Du, Anna Winnicki, Gal Dalal, Shie Mannor, R. Srikant, “Reinforcement Learning with Segment Feedback,” Preprint, 2024.
Yihan Du, Anna Winnicki, Gal Dalal, Shie Mannor, R. Srikant, “Exploration-Driven Policy Optimization in RLHF: Theoretical Insights on Efficient Data Utilization,” International Conference on Machine Learning (ICML), 2024. [pdf] [arXiv]
Yihan Du, R. Srikant, Wei Chen, “Cascading Reinforcement Learning,” International Conference on Learning Representations (ICLR), 2024 (spotlight, top 5%). [pdf] [arXiv]
Yu Chen#, Yihan Du, Pihe Hu, Siwei Wang, Desheng Wu, Longbo Huang, “Provably Efficient Iterated CVaR Reinforcement Learning with Function Approximation and Human Feedback,” International Conference on Learning Representations (ICLR), 2024 (#graduate student mentored with my Ph.D. advisor). [pdf] [arXiv]
Nuoya Xiong#, Yihan Du, Longbo Huang, “Provably Safe Reinforcement Learning with Step-wise Violation Constraints,” Proceedings of the Conference on Neural Information Processing Systems (NeurIPS), 2023 (#undergraduate student mentored with my Ph.D. advisor). [pdf] [arXiv]
Yihan Du, Longbo Huang, Wen Sun, “Multi-task Representation Learning for Pure Exploration in Linear Bandits,” International Conference on Machine Learning (ICML), 2023. [pdf] [arXiv]
Yihan Du, Siwei Wang, Longbo Huang, “Provably Efficient Risk-Sensitive Reinforcement Learning: Iterated CVaR and Worst Path,” International Conference on Learning Representations (ICLR), 2023. [pdf] [arXiv]
Yihan Du, Wei Chen, Yuko Kuroki, Longbo Huang, “Collaborative Pure Exploration in Kernel Bandit,” International Conference on Learning Representations (ICLR), 2023. [pdf] [arXiv]
Yihan Du, Wei Chen, “Branching Reinforcement Learning,” International Conference on Machine Learning (ICML), 2022. [pdf] [arXiv]
Yihan Du, Siwei Wang, Zhixuan Fang, Longbo Huang, “Continuous Mean-Covariance Bandits,” Proceedings of the Conference on Neural Information Processing Systems (NeurIPS), 2021. [pdf] [arXiv]
Yihan Du, Yuko Kuroki, Wei Chen, “Combinatorial Pure Exploration with Bottleneck Reward Function,” Proceedings of the Conference on Neural Information Processing Systems (NeurIPS), 2021. [pdf] [arXiv]
Yihan Du, Siwei Wang, Longbo Huang, “A One-Size-Fits-All Solution to Conservative Bandit Problems,” Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2021. [pdf] [arXiv]
Yihan Du*, Yuko Kuroki*, Wei Chen, “Combinatorial Pure Exploration with Full-Bandit or Partial Linear Feedback,” Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2021 (*equal contribution). [pdf] [arXiv]
[*alphabetical order] Wei Chen, Yihan Du, Longbo Huang, Haoyu Zhao, “Combinatorial Pure Exploration for Dueling Bandit,” International Conference on Machine Learning (ICML), 2020. [pdf] [arXiv]
Yihan Du, Siwei Wang, Longbo Huang, “Dueling Bandits: From Two-dueling to Multi-dueling,” Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2020. [pdf] [arXiv]
Yihan Du, Yan Yan, Si Chen, Yang Hua, “Object-adaptive LSTM Network for Real-time Visual Tracking with Adversarial Data Augmentation,” Neurocomputing, 2019.
Yihan Du, Yan Yan, Si Chen, Yang Hua, Hanzi Wang, “Object-adaptive LSTM Network for Visual Tracking,” International Conference on Pattern Recognition (ICPR), 2018.
China Computer Federation (CCF) Agent and Multi-Agent System Doctoral Dissertation Award, by CCF Multi-Agent System Committee, June 2024 (the only recipient nationwide)
Tsinghua Outstanding Doctoral Dissertation Award, by Tsinghua University, June 2023 (the only recipient among CS graduates at IIIS, Tsinghua University)
Beijing Outstanding Graduate, by Beijing Municipal Education Commission, June 2023 (the only recipient among CS graduates at IIIS, Tsinghua University)
China National Scholarship for Ph.D. Students, by Ministry of Education of China, October 2022 (the only recipient among CS students at IIIS, Tsinghua University)
Toyota Scholarship, by Toyota and Tsinghua University, October 2021
Huawei Academic Excellence Scholarship, by Huawei and Tsinghua University, October 2020
Wuqing Talent Scholarship, by Tianjin Wuqing District Government and Tsinghua University, October 2020
Outstanding Graduate, by Xiamen University, June 2018
“Why is RLHF Data-Efficient in Policy Optimization,” China Computer Federation (CCF) Agent and Multi-Agent System Seminar, June 2024
“Risk-aware Online Decision Making,” TrustML Young Scientist Seminar, RIKEN AIP, May 2023
“Risk-aware Online Decision Making,” MLOPT Idea Seminar, University of Wisconsin-Madison, April 2023
“Combinatorial Pure Exploration for Dueling Bandit,” CCF Doctoral Forum in Theoretical Computer Science, June 2021 (only 18 Ph.D. students in theoretical computer science were invited nationwide)
Reviewer
Conference: ICML 2021-2024, NeurIPS 2021-2024, ICLR 2022-2025, AAAI 2025, AISTATS 2025, UAI 2024, RLC 2024
Journal: Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Journal of Machine Learning Research (JMLR), Transactions on Networking (ToN), Transactions on Machine Learning Research (TMLR), Transactions on Network Science and Engineering (TNSE)
Technical Program Committee (TPC) Member
INFOCOM 2025, WiOpt 2024
Teaching Assistant
Stochastic Network Optimization (taught in English), graduate course at IIIS, Tsinghua University, Spring 2021
Introduction to Computer Science (taught in English), undergraduate course at Yao Class, Tsinghua University, Fall 2019
Social Activity
President of Graduate Union at IIIS, Tsinghua University, June 2020 - June 2021