Zhehao Zhang

I am a second-year Master's student in Computer Science at Dartmouth College. Currently, I am a research intern at the Stanford SALT Lab under the supervision of Diyi Yang. Previously, I worked as a Research Intern at Adobe Research and Microsoft Research Lab – Asia. I received my bachelor's degree from the Artificial Intelligence Honor Class at Shanghai Jiao Tong University.

My research interests lie in Natural Language Processing (NLP).

I am actively seeking a PhD position in NLP in the USA for Fall 2025.

Please feel free to contact me by email!

Mail  /  Résumé  /  Google Scholar  /  Github 


News📢

  • 2024 Sept 25th: The DARG paper is accepted to NeurIPS 2024. See you in Vancouver!
  • 2024 Sept 20th: One first-author paper on vision-language models for code generation is accepted to EMNLP 2024. See you in Miami!
  • 2024 Sept 5th: Honored to give a talk on Recent Advances in Synthetic Data for Foundation Models at Stanford SALT lab. Slides are available here.
  • 2024 June 25th: A new preprint is released! Please check out DARG, a dynamic evaluation framework that augments current reasoning benchmarks at the level of reasoning graphs.
  • 2024 Mar 13th: One first-author paper on LLMs for hierarchical table analysis is accepted to NAACL 2024. See you in Mexico City!
  • 2024 Mar: Honored to give a talk on Augmented Language Models at the TRIP Lab at Dartmouth, hosted by Prof. Yaoqing Yang. The recording and slides are available.
  • 2024 Feb: I will join Adobe Research as a Research Intern this summer. See you in San Jose and the Bay Area!
  • 2023 Dec 14th: One first-author paper from my undergraduate studies is accepted to ICASSP 2024.
  • 2023 Oct 27th: The paper titled "Can Large Language Models Transform Computational Social Science?" is accepted to Computational Linguistics.
  • 2023 Oct 7th: Two first-author papers from my undergraduate studies are accepted to EMNLP 2023. See you in Singapore!

Research Interests🔍

My current research interests span multiple areas of NLP, including:

  • Language Agents
  • Synthetic Data and Dynamic Evaluation of LLMs
  • Computational Social Science (NLP for social good)
  • Multi-Modal Large Language Models

I also have a broad interest in other topics in NLP and Machine Learning.

Selected Publications 📚 (View the full publication list on my Google Scholar profile)

DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph
Zhehao Zhang, Jiaao Chen, Diyi Yang
NeurIPS, 2024
Code / Project / Paper

TL;DR: We propose a dynamic evaluation framework named DARG that augments current reasoning benchmarks at the level of reasoning graphs. We evaluate 15 state-of-the-art LLMs and observe a consistent performance decrease across all of them as the complexity level increases, along with increasing biases on some datasets.

VipAct: Visual-Perception Enhancement via Specialized VLM Agent Collaboration and Tool-use
Zhehao Zhang, Ryan A. Rossi, Tong Yu, Franck Dernoncourt, Ruiyi Zhang, Jiuxiang Gu, Sungchul Kim, Xiang Chen, Zichao Wang, Nedim Lipka
arXiv preprint, arXiv:2410.16400
Paper

TL;DR: We introduce VipAct, a multi-agent framework that combines visual-language models (VLMs) with vision expert models to enhance fine-grained visual perception, showing significant improvements over current baselines by leveraging multi-agent collaboration and specialized tools for visual perception tasks.

Is GPT-4V(ision) All You Need for Automating Academic Data Visualization? Exploring Vision-Language Models' Capability in Reproducing Academic Charts
Zhehao Zhang, Weicheng Ma, Soroush Vosoughi
Findings of EMNLP, 2024
Dataset / Paper

TL;DR: We explore the capabilities of vision-language models (VLMs) in generating Python code templates to reproduce academic data visualizations, finding that while closed-source models like GPT-4-Vision show promise in reproducing complex charts, open-source alternatives are less effective, especially for sophisticated visuals.

Personalization of Large Language Models: A Survey
Zhehao Zhang, Ryan A. Rossi, Branislav Kveton, Yijia Shao, Diyi Yang, Hamed Zamani, Franck Dernoncourt, Joe Barrow, Tong Yu, Sungchul Kim, Ruiyi Zhang, Jiuxiang Gu, Tyler Derr, Hongjie Chen, Junda Wu, Xiang Chen, Zichao Wang, Subrata Mitra, Nedim Lipka, Nesreen Ahmed, Yu Wang
arXiv preprint, arXiv:2411.00027
Paper

TL;DR: We provide a comprehensive survey of personalized large language models (LLMs), offering a unifying taxonomy across usage, granularity, techniques, datasets, and evaluation methods, while identifying critical challenges and open research questions in the field.

E5: Zero-shot Hierarchical Table Analysis using Augmented LLMs via Explain, Extract, Execute, Exhibit, and Extrapolate
Zhehao Zhang, Yan Gao, Jian-Guang Lou
NAACL, 2024
Code / Paper

TL;DR: We propose a tool-augmented LLM framework named $E^5$ that comprises five stages to solve the challenging real-life hierarchical table analysis task, achieving an 85.08 exact-match score and a 93.11 GPT-4-Eval score. We also introduce $F^3$, which significantly reduces the token length while retaining the information needed to analyze such large tables.

CRT-QA: A Dataset of Complex Reasoning Question Answering over Tabular Data
Zhehao Zhang, Xitao Li, Yan Gao, Jian-Guang Lou
EMNLP, 2023
Code / Paper

TL;DR: We systematically evaluate LLMs' reasoning ability on tabular data and establish a comprehensive taxonomy of operation and reasoning types for table analysis. We then propose CRT-QA, a dataset of complex reasoning QA over tables, and ARC, a method that effectively utilizes table analysis tools to solve table reasoning tasks without manually annotated exemplars.

Mitigating Biases in Hate Speech Detection from A Causal Perspective
Zhehao Zhang, Jiaao Chen, Diyi Yang
Findings of EMNLP, 2023
Code / Paper

TL;DR: We analyze the generation process of hate speech detection biases from a causal perspective, identify two confounders that cause them, and propose Multi-Task Intervention and Data-Specific Intervention to mitigate these biases.

Can Large Language Models Transform Computational Social Science?
Caleb Ziems, William Held, Omar Shaikh, Jiaao Chen, Zhehao Zhang, Diyi Yang
Computational Linguistics, 2023
Code / Paper

TL;DR: We provide a road map for using LLMs as CSS tools and contribute a set of prompting best practices and an extensive evaluation pipeline to measure the zero-shot performance of 13 language models on 24 representative CSS benchmarks.

Education🎓

Dartmouth College, Master of Science in Computer Science, 2023 - 2025 (expected)

Shanghai Jiao Tong University, B.Eng. in Artificial Intelligence (Honor Class), 2019 - 2023

Experiencesđź› 

Social and Language Technologies (SALT) lab at Stanford, Research Intern

Adobe Research, Research Intern

Data, Knowledge, and Intelligence group at Microsoft Research Lab – Asia, Research Intern

Selected Courses

AI courses: Natural language processing (94 points), Deep learning and application (92.95 points), Computer vision, Reinforcement learning (94 points), Machine learning, Machine learning project, Knowledge representation and reasoning (97 points), A practical course to intelligence perception and cognition (90 points), Brain-Inspired Intelligence (92 points), Artificial intelligence problem solving and practice (95 points), Intelligent speech recognition (92 points), Data mining (91 points), Game theory and multi-agent learning, Programming practices of artificial intelligence, Lab practice (A+)

Other CS courses: Data structure (Honor) (92 points), Thinking and approach of programming (C++) (Honor), Data structure (C++) (Honor), Design and analysis of algorithms, Computer architecture (91 points), Operating system (91 points), Internet of Things (95.5 points)

Math courses: Stochastic process (95 points), Mathematical analysis (Honor), Linear algebra (Honor), Discrete mathematics (Honor), Complex analysis (Honor), Probability and Statistics, Convex and linear optimization, Signals and Systems, Digital signal and image processing


Website template borrowed from Jon Barron's personal page.