Making agents reliable at OpenAI. Previously a staff research scientist at Google DeepMind, working on Gemini post-training. I received my PhD in computer science from UC Berkeley in 2022, advised by Prof. Dawn Song, and my Bachelor's degree with honors in computer science from Peking University in 2018.
Why LLM evaluations may fail exactly when models undergo qualitative phase transitions.