Zhengyan Shi

Hi, welcome to my personal page. I am a Senior Researcher at Microsoft Research (MSR). I obtained my PhD in Computer Science at University College London (UCL). Before that, I completed an MSc in Data Science (Statistics) with Distinction at UCL and a BSc in Mathematics with First Class Honours from the University of Liverpool and Xi'an Jiaotong-Liverpool University. I have also held research internships at Cohere (London) and Amazon (London & Seattle).

My current research at MSR focuses on teaching language models (LMs) to code. I build learning loops in which LMs not only act but also reason within scalable, self-evolving environments. By allowing models to plan, converse, and iterate inside these realistic sandboxes, I explore how LMs can continually refine themselves through interaction. Central to my work is the ambition to leverage LMs efficiently and robustly to solve general tasks. To that end, my existing work can be broadly categorized into the following directions:

Language Model Post-training [ Preprint 2024, NeurIPS 2024, ICLR 2024, NeurIPS 2023, Findings of ACL 2023 ]
Interactive Systems [ ECIR 2024, Findings of EMNLP 2023, AAAI 2022, Findings of NAACL 2022, NeurIPS WS 2022 ]

Google Scholar / Twitter / Github / LinkedIn / Email

Research (Selected)

Understanding Likelihood Over-optimisation in Direct Alignment Algorithms

Zhengyan Shi, Sander Land, Acyr Locatelli, Matthieu Geist, Max Bartolo

Preprint, 2024

Paper

In this work, we identify a critical issue of likelihood over-optimisation in state-of-the-art Direct Alignment Algorithms (DAAs) and explore the relationship between completion likelihood and model performance.

Instruction Tuning With Loss Over Instructions

Zhengyan Shi, Adam X. Yang, Bin Wu, Laurence Aitchison, Emine Yilmaz, Aldo Lipani

Advances in Neural Information Processing Systems (NeurIPS), 2024

Paper / Github / Community Discussion

We show that in certain scenarios, applying loss to instructions rather than outputs only, which we refer to as Instruction Modelling, could largely improve the performance of instruction tuning on both various NLP and open-ended generation benchmarks. Remarkably, in the most advantageous case, our approach boosts model performance on AlpacaEval 1.0 by over 100%.

DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning

Zhengyan Shi, Aldo Lipani

International Conference on Learning Representations (ICLR), 2024

Paper / Github / Trending in Community

Improves efficiency of Prompt Tuning in both time and memory by over 20% (when T5-Base is used as the backbone), with better performance. DePT grows more efficient as the model size increases.

Don't Stop Pretraining? Make Prompt-based Fine-tuning Powerful Learner

Zhengyan Shi, Aldo Lipani

Advances in Neural Information Processing Systems (NeurIPS), 2023

Paper / Github / Trending in Community

Combines the idea of the instruction tuning and language modelling. Represents the first work to perform instruction tuning via unsupervised objectives. Boosts prompt-based fine-tuning performance by over 20% in absolute.

Rethinking Semi-supervised Learning with Language Models

Zhengyan Shi, Francesco Tonolini, Nikolaos Aletras, Emine Yilmaz, Gabriella Kazai, Yunlong Jiao

Association for Computational Linguistics (Findings of ACL), 2023

Paper / Github

Shows Task-adaptive Pre-training (TAPT) as a simple yet effective method for semi-supervised learning (often SoTA performance). Highlights the effectiveness of TAPT even with only a few hundred unlabelled samples (in contrary to the common belief that continued pre-training requires a large amount of unlabelled data).

StepGame: A New Benchmark for Robust Multi-Hop Spatial Reasoning in Texts

Zhengyan Shi, Qiang Zhang, Aldo Lipani

Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2022

Paper / Github / HuggingFace Dataset

Introduces StepGame, a new benchmark for testing multi-hop spatial reasoning in texts. This dataset challenges models to perform robust spatial reasoning across multiple steps, providing a valuable tool for advancing natural language understanding in complex spatial scenarios.

Teaching Activities

Guest Lecturer: Applied Artificial Intelligence
University College London, Academic year 2023/24
Guest Lecturer: Machine Learning for Data Science
University College London, Academic year 2023/24
Teaching Assistant: Statistical Natural Language Processing
University College London, Academic year 2023/24
Teaching Assistant: Machine Learning for Data Science
University College London, Academic years 2020/21 - 2023/24
Teaching Assistant: Geospatial Programming
University College London, Academic years 2020/21 - 2023/24
Co-supervisor: MSc Research Project
University College London, Academic year 2022/23 - 2023/24

Recent News

2024/10 New preprint on likelihood over-optimisation in direct alignment algorithms is now available on arXiv.

2024/09 Paper on "Instruction Tuning With Loss Over Instructions" accepted to NeurIPS 2024!

2024/01 Paper on "DePT: Decomposed Prompt Tuning" accepted to ICLR 2024!

2023/09 Paper accepted to NeurIPS 2023 on powerful prompt-based fine-tuning!

Academic Services

Program Committee: NeurIPS (2023, 2024), ICML (2024), ICLR (2025), AAAI (2023, 2024), COLM (2024), ACL ARR (Feb. 2023 - Jan. 2024), ACL (2023), EMNLP (2022, 2023), EACL (2023), COLING (2023, 2024), ECML/PKDD (2022), KDD (2023), SIGIR (2022, 2023, 2024), ECIR (2024), SDM (2024)