Notes on Secrets of RLHF in Large Language Models Part I: PPO
Link to paper: https://arxiv.org/abs/2307.04964
Paper published on: 2023-07-11
Paper's authors: Rui Zheng, Shihan Dou, Songyang Gao, Wei Shen, Binghai Wang, Yan Liu, Senjie Jin, Qin Liu, Limao Xiong, Lu Chen, Zhiheng Xi, Yuhao Zhou, Nuo Xu, Wenbin La...
feralmachine.com · 5 min read