MRPO: Magnitude-Regularized Policy Optimization via L1 Constraints
ICML
Wei, Han and Yuanxing, Liu and Mingda, Li and Ruiyu, Xiao and Weinan, Zhang and Ting, Liu
Wei Han