I am a Ph.D. candidate at X-Lance Lab, Shanghai Jiao Tong University, under the supervision of Prof. Kai Yu, majoring in computer science and technology. Before that, I received my B.S. degree from the Department of Automation, Tsinghua University in 2020. In the first four months of 2024, I worked as a research assistant with Prof. Tao Yu at XLANG Lab, the University of Hong Kong.
My research focuses on text-rich visual UI interaction. Currently, I'm working on constructing realistic, complex benchmarks for GUI interaction. I'm also studying how to design smarter GUI agents with reinforcement learning (RL), large language models (LLMs), or a combination of both.
Publications
Large Language Models Are Semi-Parametric Reinforcement Learning Agents
Danyang Zhang, Lu Chen, Situo Zhang, Hongshen Xu, Zihan Zhao, Kai Yu
NeurIPS 2023 | Code
- We designed Rememberer, a novel evolvable LLM-based agent framework, by equipping the LLM with a long-term experience memory, so as to enable the LLM to exploit the interaction experiences to improve performance.
- We proposed Reinforcement Learning with Experience Memory (RLEM) to update the memory, so that the agent can learn from both success and failure, and evolve its capability without fine-tuning LLM parameters.
- Rememberer demonstrates superior performance and robustness on both the WebShop and WikiHow task sets.
Mobile-Env: An Evaluation Platform and Benchmark for Interactive Agents in LLM Era
Danyang Zhang, Lu Chen, Zihan Zhao, Ruisheng Cao, Kai Yu
Project | Task Set
We designed an easily extensible, adaptable, and close-to-reality interaction platform based on Android Mobile.
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao, Ruisheng Cao, Toh Jing Hua, Zhoujun Cheng, Dongchan Shin, Fangyu Lei, Yitao Liu, Yiheng Xu, Shuyan Zhou, Silvio Savarese, Caiming Xiong, Victor Zhong, Tao Yu
Project | Code
We designed a unified interaction benchmark for real-world desktop tasks.
- WebSRC: A Dataset for Web-Based Structural Reading Comprehension
Xingyu Chen, Zihan Zhao, Lu Chen, JiaBao Ji, Danyang Zhang, Ao Luo, Yuxuan Xiong, Kai Yu
EMNLP 2021 | Project
- Rotation-robust Intersection over Union for 3D Object Detection
Yu Zheng, Danyang Zhang, Sinan Xie, Jiwen Lu, Jie Zhou
ECCV 2020
- COIN: A Large-scale Dataset for Comprehensive Instructional Video Analysis
Yansong Tang, Dajun Ding, Yongming Rao, Yu Zheng, Danyang Zhang, Lili Zhao, Jiwen Lu, Jie Zhou
CVPR 2019 | Project
- Technical Report of MoGUI and MoCon. Zichen Zhu, Liangtai Sun, Danyang Zhang, Ziyuan Li, Guangpeng Li, Lu Chen, Kai Yu.
- Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition. Yansong Tang, Xingyu Liu, Xumin Yu, Danyang Zhang, Jiwen Lu, Jie Zhou. TOMM 2022.
- Uncertainty-Aware Score Distribution Learning for Action Quality Assessment. Yansong Tang, Zanlin Ni, Jiahuan Zhou, Danyang Zhang, Jiwen Lu, Ying Wu, Jie Zhou. CVPR 2020.
Education
- 2020.9-(2025.6), Ph.D., SEIEE, Shanghai Jiao Tong University
- 2016.8-2020.6, B.S., Department of Automation, Tsinghua University
Honors and Awards
- 2020.9-2025.6, The 2nd Wu Wenjun AI Honorary Doctoral Program
- 2020.9-2025.6, Zhang Xu Scholarship
- 2018.10, Academic Excellence Scholarship 2017~2018
- 2017.10, Academic Excellence Scholarship 2017~2018