Title | Authors | Venue | Cited by | Year |
Provably Efficient Reinforcement Learning with Linear Function Approximation | C Jin, Z Yang, Z Wang, MI Jordan | Mathematics of Operations Research/Annual Conference on Learning Theory | 809 | 2022 |
A Theoretical Analysis of Deep Q-Learning | J Fan, Z Wang, Y Xie, Z Yang | Learning for Dynamics and Control | 767 | 2020 |
Is Pessimism Provably Efficient for Offline RL? | Y Jin, Z Yang, Z Wang | International Conference on Machine Learning | 405 | 2021 |
Provably Efficient Exploration in Policy Optimization | Q Cai, Z Yang, C Jin, Z Wang | International Conference on Machine Learning | 306 | 2020 |
A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic | M Hong, HT Wai, Z Wang, Z Yang | SIAM Journal on Optimization | 270* | 2022 |
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence | L Wang, Q Cai, Z Yang, Z Wang | International Conference on Learning Representations | 256 | 2020 |
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy | B Liu, Q Cai, Z Yang, Z Wang | Advances in Neural Information Processing Systems | 216* | 2019 |
Optimal Computational and Statistical Rates of Convergence for Sparse Nonconvex Learning Problems | Z Wang, H Liu, T Zhang | Annals of Statistics | 206 | 2014 |
Multi-Agent Reinforcement Learning via Double-Averaging Primal-Dual Optimization | HT Wai, Z Yang, Z Wang, M Hong | Advances in Neural Information Processing Systems | 202 | 2018 |
A Strictly Contractive Peaceman-Rachford Splitting Method for Convex Programming | B He, H Liu, Z Wang, X Yuan | SIAM Journal on Optimization | 195 | 2014 |
A Nonconvex Optimization Framework for Low Rank Matrix Estimation | T Zhao, Z Wang, H Liu | Advances in Neural Information Processing Systems | 193* | 2015 |
Provably Efficient Safe Exploration via Primal-Dual Policy Optimization | D Ding, X Wei, Z Yang, Z Wang, MR Jovanović | International Conference on Artificial Intelligence and Statistics | 177 | 2021 |
Neural Temporal-Difference and Q-Learning Provably Converge to Global Optima | Q Cai, Z Yang, JD Lee, Z Wang | Mathematics of Operations Research/Advances in Neural Information Processing … | 152* | 2019 |
Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium | Q Xie, Y Chen, Z Wang, Z Yang | Mathematics of Operations Research/Annual Conference on Learning Theory | 151 | 2022 |
On the Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost | Z Yang, Y Chen, M Hong, Z Wang | Advances in Neural Information Processing Systems | 148 | 2019 |
Bridging Exploration and General Function Approximation in Reinforcement Learning: Provably Efficient Kernel and Neural Value Iterations | Z Yang, C Jin, Z Wang, M Wang, MI Jordan | Advances in Neural Information Processing Systems | 138* | 2020 |
High-Dimensional Expectation-Maximization Algorithm: Statistical Optimization and Asymptotic Normality | Z Wang, Q Gu, Y Ning, H Liu | Advances in Neural Information Processing Systems | 137 | 2015 |
A Near-Optimal Algorithm for Stochastic Bilevel Optimization via Double-Momentum | P Khanduri, S Zeng, M Hong, HT Wai, Z Wang, Z Yang | Advances in Neural Information Processing Systems | 132 | 2021 |
Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning | C Bai, L Wang, Z Yang, Z Deng, A Garg, P Liu, Z Wang | International Conference on Learning Representations | 129 | 2022 |
Convergent Policy Optimization for Safe Reinforcement Learning | M Yu, Z Yang, M Kolar, Z Wang | Advances in Neural Information Processing Systems | 128 | 2019 |