Yihan Du 杜伊涵

Postdoctoral Researcher

ECE, UIUC

About me

I am a postdoctoral researcher at the University of Illinois at Urbana-Champaign, where I am fortunate to be advised by Prof. R. Srikant. I am broadly interested in machine learning, including reinforcement learning, online learning (in particular, multi-armed bandits), and representation learning.

Prior to that, I received my Ph.D. from the Institute for Interdisciplinary Information Sciences (headed by Prof. Andrew Chi-Chih Yao), Tsinghua University, in June 2023. During my Ph.D. study, I was fortunate to be advised by Prof. Longbo Huang and to work closely with Dr. Wei Chen (Director of MSR Asia Theory Center).

I visited Cornell University in person during September-December 2022, where I was lucky to be supervised by Prof. Wen Sun. I was also a research intern at MSR Asia during January-May 2020, supervised by Dr. Wei Chen.

Email: yihandu@illinois.edu

Interests
  • Reinforcement Learning
  • Online Learning (in particular, Multi-armed Bandit)
  • Representation Learning
Education
  • Ph.D. in Computer Science, September 2018 - June 2023

    Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University

  • B.E. in Computer Science, September 2014 - June 2018

    Xiamen University

Preprints

Yihan Du, Anna Winnicki, Gal Dalal, Shie Mannor, R. Srikant, “Reinforcement Learning with Segment Feedback,” Preprint, 2024.

Publications

Yihan Du, Anna Winnicki, Gal Dalal, Shie Mannor, R. Srikant, “Exploration-Driven Policy Optimization in RLHF: Theoretical Insights on Efficient Data Utilization,” International Conference on Machine Learning (ICML), 2024. [pdf] [arXiv]

Yihan Du, R. Srikant, Wei Chen, “Cascading Reinforcement Learning,” International Conference on Learning Representations (ICLR), 2024 (spotlight, top 5%). [pdf] [arXiv]

Yu Chen#, Yihan Du, Pihe Hu, Siwei Wang, Desheng Wu, Longbo Huang, “Provably Efficient Iterated CVaR Reinforcement Learning with Function Approximation and Human Feedback,” International Conference on Learning Representations (ICLR), 2024 (#graduate student mentored with my Ph.D. advisor). [pdf] [arXiv]

Nuoya Xiong#, Yihan Du, Longbo Huang, “Provably Safe Reinforcement Learning with Step-wise Violation Constraints,” Proceedings of the Conference on Neural Information Processing Systems (NeurIPS), 2023 (#undergraduate student mentored with my Ph.D. advisor). [pdf] [arXiv]

Yihan Du, Longbo Huang, Wen Sun, “Multi-task Representation Learning for Pure Exploration in Linear Bandits,” International Conference on Machine Learning (ICML), 2023. [pdf] [arXiv]

Yihan Du, Siwei Wang, Longbo Huang, “Provably Efficient Risk-Sensitive Reinforcement Learning: Iterated CVaR and Worst Path,” International Conference on Learning Representations (ICLR), 2023. [pdf] [arXiv]

Yihan Du, Wei Chen, Yuko Kuroki, Longbo Huang, “Collaborative Pure Exploration in Kernel Bandit,” International Conference on Learning Representations (ICLR), 2023. [pdf] [arXiv]

Yihan Du, Wei Chen, “Branching Reinforcement Learning,” International Conference on Machine Learning (ICML), 2022. [pdf] [arXiv]

Yihan Du, Siwei Wang, Zhixuan Fang, Longbo Huang, “Continuous Mean-Covariance Bandits,” Proceedings of the Conference on Neural Information Processing Systems (NeurIPS), 2021. [pdf] [arXiv]

Yihan Du, Yuko Kuroki, Wei Chen, “Combinatorial Pure Exploration with Bottleneck Reward Function,” Proceedings of the Conference on Neural Information Processing Systems (NeurIPS), 2021. [pdf] [arXiv]

Yihan Du, Siwei Wang, Longbo Huang, “A One-Size-Fits-All Solution to Conservative Bandit Problems,” Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2021. [pdf] [arXiv]

Yihan Du*, Yuko Kuroki*, Wei Chen, “Combinatorial Pure Exploration with Full-Bandit or Partial Linear Feedback,” Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2021 (*equal contribution). [pdf] [arXiv]

Wei Chen, Yihan Du, Longbo Huang, Haoyu Zhao, “Combinatorial Pure Exploration for Dueling Bandit,” International Conference on Machine Learning (ICML), 2020 (authors in alphabetical order). [pdf] [arXiv]

Yihan Du, Siwei Wang, Longbo Huang, “Dueling Bandits: From Two-dueling to Multi-dueling,” Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2020. [pdf] [arXiv]

Yihan Du, Yan Yan, Si Chen, Yang Hua, “Object-adaptive LSTM Network for Real-time Visual Tracking with Adversarial Data Augmentation,” Neurocomputing, 2019.

Yihan Du, Yan Yan, Si Chen, Yang Hua, Hanzi Wang, “Object-adaptive LSTM Network for Visual Tracking,” International Conference on Pattern Recognition (ICPR), 2018.

Selected Awards

China Computer Federation (CCF) Agent and Multi-Agent System Doctoral Dissertation Award, by CCF Multi-Agent System Committee, June 2024 (the only recipient nationwide)

Tsinghua Outstanding Doctoral Dissertation Award, by Tsinghua University, June 2023 (the only recipient among CS graduates at IIIS, Tsinghua University)

Beijing Outstanding Graduate, by Beijing Municipal Education Commission, June 2023 (the only recipient among CS graduates at IIIS, Tsinghua University)

China National Scholarship for Ph.D. Students, by Ministry of Education of China, October 2022 (the only recipient among CS students at IIIS, Tsinghua University)

Toyota Scholarship, by Toyota and Tsinghua University, October 2021

Huawei Academic Excellence Scholarship, by Huawei and Tsinghua University, October 2020

Wuqing Talent Scholarship, by Tianjin Wuqing District Government and Tsinghua University, October 2020

Outstanding Graduate, by Xiamen University, June 2018

Invited Talks

“Why is RLHF Data-Efficient in Policy Optimization,” China Computer Federation (CCF) Agent and Multi-Agent System Seminar, June 2024

“Risk-aware Online Decision Making,” TrustML Young Scientist Seminar, RIKEN AIP, May 2023

“Risk-aware Online Decision Making,” MLOPT Idea Seminar, University of Wisconsin-Madison, April 2023

“Combinatorial Pure Exploration for Dueling Bandit,” CCF Doctoral Forum in Theoretical Computer Science, June 2021 (only 18 Ph.D. students in theoretical computer science were invited nationwide)

Academic Service & Activities

Reviewer
Conference: ICML 2021-2024, NeurIPS 2021-2024, ICLR 2022-2025, AAAI 2025, AISTATS 2025, UAI 2024, RLC 2024

Journal: Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Journal of Machine Learning Research (JMLR), Transactions on Networking (ToN), Transactions on Machine Learning Research (TMLR), Transactions on Network Science and Engineering (TNSE)

Technical Program Committee (TPC) Member
INFOCOM 2025, WiOpt 2024

Teaching Assistant
Stochastic Network Optimization (taught in English), graduate course at IIIS, Tsinghua University, Spring 2021
Introduction to Computer Science (taught in English), undergraduate course at Yao Class, Tsinghua University, Fall 2019

Social Activity
President of Graduate Union at IIIS, Tsinghua University, June 2020 - June 2021

Contact

  • yihandu@illinois.edu
  • Coordinated Science Laboratory, 1308 W Main Street, Urbana, IL 61801, United States