Efficient ML Systems and LLM Infrastructure
Building scalable, practical, and trustworthy machine learning systems.
I am an Assistant Professor in the Department of Computer Science at Rutgers University. My research focuses on scalable and efficient machine learning algorithms and systems, with a current emphasis on LLMs.
Background
Previously, I was a Senior Project Scientist in the Machine Learning Department at CMU, working with Eric Xing. I obtained my PhD in Computer Science from UW-Madison, advised by Dimitris Papailiopoulos.
Research Direction
I study efficient training and serving of large-scale machine learning models, especially large language models under real system constraints.
Research focus
Systems for Useful ML
LLM infrastructure
Training, serving, evaluation, and transparency for large models under real system constraints.
Federated and private ML
Algorithms and systems that let models learn across distributed, sensitive, and heterogeneous data.
Efficient optimization
Compression, low-rank methods, model fusion, and communication-efficient distributed training.
K2-V2: A 360-Open, Reasoning-Enhanced LLM
K2 Team, arXiv technical report, 2025.
BibTeX
@article{k2team2025k2v2,
  title = {K2-V2: A 360-Open, Reasoning-Enhanced LLM},
  author = {{K2 Team}},
  journal = {arXiv preprint arXiv:2512.06201},
  year = {2025}
}
LLM360: Towards Fully Transparent Open-Source LLMs
Z. Liu, A. Qiao, W. Neiswanger, H. Wang, B. Tan, T. Tao, J. Li, Y. Wang, S. Sun, O. Pangarkar, R. Fan, Y. Gu, V. Miller, Y. Zhuang, G. He, H. Li, F. Koto, L. Tang, N. Ranjan, Z. Shen, R. Iriondo, C. Mu, Z. Hu, M. Schulze, P. Nakov, T. Baldwin, E. P. Xing, COLM 2024 [arXiv]
Cuttlefish: Low-rank Model Training without All The Tuning
H. Wang, S. Agarwal, P. U-chupala, Y. Tanaka, E. P. Xing, D. Papailiopoulos, MLSys 2023 [link] [arXiv]
Federated Learning with Matched Averaging
H. Wang, M. Yurochkin, Y. Sun, D. Papailiopoulos, Y. Khazaeni, ICLR 2020 (Oral) [link] [blog] [talk]
EduVisAgent accepted to ICLR 2026
Our work on EduVisBench and EduVisAgent was accepted to the Fourteenth International Conference on Learning Representations.
AMD University Program AI and HPC Cluster Allocation Award
Our lab received an AMD University Program AI and HPC Cluster Allocation Award.
K2-V2 and LLM360 K2 technical reports
We added technical reports on K2-V2 and LLM360 K2 to the publications page.
RU CS 671: Recent Advances in Large Language Models
I am teaching RU CS 671, a graduate course on recent advances in large language models.
People
Research Group
Courses
Teaching
Community
Services
Area Chair: NeurIPS 2026, MLSys 2025, CPAL 2026
PC Member: DAC 2024, EuroSys 2024, SOSP 2023 (light PC), MLSys 2023-2026, SIGKDD 2022, AAAI 2021-2022
Reviewer (Journals): JMLR, TMLR, IEEE TNNLS, IEEE IoT-J, IEEE/ACM Transactions on Networking
Reviewer (Conferences): SC 2026, COLM 2026, ICML 2019-2026, NeurIPS 2019-2025, CVPR 2021-2023, ICCV 2021-2022, ICLR 2021-2025, AAAI 2021-2024, SIGKDD 2022-2023
