Publications

Selected and recent work across efficient ML systems, LLM infrastructure, MoE/RL post-training, federated learning, optimization, pedagogical visualization, and trustworthy machine learning.

32 papers, 9 years, 13 venues, 7 technical reports

2026

Peer-reviewed ICLR 2026

From EduVisBench to EduVisAgent: A Benchmark and Multi-Agent Framework for Reasoning-Driven Pedagogical Visualization

H. Ji, S. Qiu, S. Xin, S. Han, Z. Chen, D. Zhang, H. Wang, H. Yao, ICLR 2026 [OpenReview]

OpenReview
AI Agents Data & Evaluation Education Visualization

BibTeX

@inproceedings{ji2026from,
  title={From EduVisBench to EduVisAgent: A Benchmark and Multi-Agent Framework for Reasoning-Driven Pedagogical Visualization},
  author={Haonian Ji and Shi Qiu and Siyang Xin and Siwei Han and Zhaorun Chen and Dake Zhang and Hongyi Wang and Huaxiu Yao},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=FVCpV04ZRe}
}

2024

Peer-reviewed NeurIPS 2024

SHED: Shapley-Based Automated Dataset Refinement for Instruction Fine-Tuning

Y. He, Z. Wang, Z. Shen, G. Sun, Y. Dai, Y. Wu, H. Wang, A. Li, NeurIPS 2024 [arXiv]

LLM Systems Data & Evaluation

Peer-reviewed NeurIPS Datasets and Benchmarks Track 2024

Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild

X. Zhao, G. Sun, R. Cai, Y. Zhou, P. Li, P. Wang, B. Tan, Y. He, L. Chen, Y. Liang, B. Chen, B. Yuan, H. Wang, A. Li, Z. Wang, T. Chen, NeurIPS 2024 Datasets and Benchmarks [link]

LLM Systems Open Models

Peer-reviewed NeurIPS 2024

FLoRA: Federated Fine-Tuning Large Language Models with Heterogeneous Low-Rank Adaptations

Z. Wang, Z. Shen, Y. He, G. Sun, H. Wang, L. Lyu, A. Li, NeurIPS 2024 [arXiv]

LLM Systems Federated Learning

Peer-reviewed ICML 2024

TrustLLM: Trustworthiness in Large Language Models

H. Wang with many colleagues (Position Paper), ICML 2024 [link] [arXiv]

LLM Systems Data & Evaluation

Peer-reviewed ICML 2024

Maestro: Uncovering Low-Rank Structures via Trainable Decomposition

S. Horváth, S. Laskaridis, S. Rajput, H. Wang, ICML 2024 [link] [arXiv]

Optimization Model Compression

Peer-reviewed COLM 2024 Selected

LLM360: Towards Fully Transparent Open-Source LLMs

Z. Liu, A. Qiao, W. Neiswanger, H. Wang, B. Tan, T. Tao, J. Li, Y. Wang, S. Sun, O. Pangarkar, R. Fan, Y. Gu, V. Miller, Y. Zhuang, G. He, H. Li, F. Koto, L. Tang, N. Ranjan, Z. Shen, R. Iriondo, C. Mu, Z. Hu, M. Schulze, P. Nakov, T. Baldwin, E. P. Xing, COLM 2024 [arXiv]

LLM Systems Open Models

Peer-reviewed COLM 2024

Crystal: Illuminating LLM Abilities on Language and Code

T. Tao, J. Li, B. Tan, H. Wang, W. Marshall, B. M. Kanakiya, J. Hestness, N. Vassilieva, Z. Shen, E. P. Xing, Z. Liu, COLM 2024 [arXiv]

LLM Systems Data & Evaluation

Peer-reviewed NAACL Demo 2024 Best Demo Runner-Up

RedCoast: A Lightweight Tool to Automate Distributed Training of LLMs on Any GPU/TPUs

B. Tan, Y. Zhu, L. Liu, H. Wang, Y. Zhuang, J. Chen, E. P. Xing, Z. Hu, NAACL Demo 2024 (Best Demo Runner-Up) [link] [arXiv]

LLM Systems Distributed Training

Peer-reviewed MLSys 2024

Does compressing activations help model parallel training?

S. Bian, D. Li, H. Wang, E. P. Xing, S. Venkataraman, MLSys 2024 [arXiv]

Distributed Training Model Compression

Peer-reviewed ICLR 2024

Fusing Models with Complementary Expertise

H. Wang, F. M. Polo, Y. Sun, S. Kundu, E. P. Xing, M. Yurochkin, ICLR 2024 [link] [arXiv]

Optimization Model Fusion

2023

Peer-reviewed NeurIPS 2023

FedNAR: Federated Optimization with Normalized Annealing Regularization

J. Li, A. Li, C. Tian, Q. Ho, E. Xing, H. Wang, NeurIPS 2023 [link] [arXiv]

Federated Learning Optimization

Peer-reviewed MLSys 2023 Selected

Cuttlefish: Low-rank Model Training without All The Tuning

H. Wang, S. Agarwal, P. U-chupala, Y. Tanaka, E. P. Xing, D. Papailiopoulos, MLSys 2023 [link] [arXiv]

Efficient Training Optimization

Peer-reviewed ICLR 2023 Spotlight

MPCFormer: fast, performant and private Transformer inference with MPC

D. Li*, R. Shao*, H. Wang*, H. Guo, E. P. Xing, H. Zhang, ICLR 2023 (Spotlight) [link]

Privacy & Security Efficient Inference

Peer-reviewed ICLR 2023

Federated Learning as Variational Inference: A Scalable Expectation Propagation Approach

H. Guo, P. Greengard, H. Wang, A. Gelman, E. P. Xing, Y. Kim, ICLR 2023 [link]

Federated Learning Optimization

2022

Peer-reviewed Findings of EMNLP 2022

Efficient Federated Learning on Knowledge Graphs via Privacy-preserving Relation Embedding Aggregation

K. Zhang, Y. Wang, H. Wang, L. Huang, C. Yang, X. Chen, L. Sun, Findings of EMNLP 2022

Federated Learning Privacy & Security

Peer-reviewed NeurIPS 2022

Rare Gems: Finding Lottery Tickets at Initialization

K. Sreenivasan, J. Sohn, L. Yang, M. Grinde, A. Nagle, H. Wang, E. P. Xing, K. Lee, D. Papailiopoulos, NeurIPS 2022 [arXiv]

Optimization Deep Learning

Peer-reviewed NeurIPS 2022

AMP: Automatically Finding Model Parallel Strategies with Heterogeneity Awareness

D. Li, H. Wang, E. P. Xing, H. Zhang, NeurIPS 2022 [arXiv]

Distributed Training Model Parallelism

Peer-reviewed MLSys 2022

On the Utility of Gradient Compression in Distributed Training Systems

S. Agarwal, H. Wang, S. Venkataraman, D. Papailiopoulos, MLSys 2022 [link] [arXiv]

Distributed Training Optimization

2021

Peer-reviewed MLSys 2021

Pufferfish: Communication-efficient Models At No Extra Cost

H. Wang, S. Agarwal, D. Papailiopoulos, MLSys 2021 [arXiv] [link] [talk]

Efficient Training Model Compression

Peer-reviewed MLSys 2021

Accordion: Adaptive Gradient Communication via Critical Learning Regime Identification

S. Agarwal, H. Wang, K. Lee, S. Venkataraman, D. Papailiopoulos, MLSys 2021, [arXiv] [link] [talk]

Distributed Training Optimization

Technical Reports

Preprints, technical reports, and workshop manuscripts that complement the peer-reviewed publications listed above.

Technical report arXiv 2026

PR2: Predictive Routing Replay for MoE-Based LLM Reinforcement Learning

D. Dong, J. Chen, H. Jia, J. Wu, H. Di, J. Liu, J. Wu, Z. Liu, Z. Liu, E. Barsoum, et al., arXiv technical report, 2026.

arXiv
LLM Systems MoE Systems Reinforcement Learning Efficient Training
Abstract

PR2 studies reinforcement learning for MoE-based LLMs, where router drift can create rollout-training mismatch and unstable PPO-style updates. It introduces predictive routing replay, a lightweight router-evolution predictor that anticipates short-horizon routing changes and replays predicted routes to stabilize importance estimation and improve reasoning benchmark performance.

BibTeX

@article{dong2026pr2,
  title={PR2: Predictive Routing Replay for MoE-Based LLM Reinforcement Learning},
  author={Dong, Daize and Chen, Junlin and Jia, Haolong and Wu, Jiawei and Di, Huanwei and Liu, Jiang and Wu, Jialian and Liu, Zhengzhong and Liu, Zicheng and Barsoum, Emad and others},
  journal={arXiv preprint arXiv:2606.00395},
  year={2026},
  url={https://arxiv.org/abs/2606.00395}
}
Technical report arXiv 2025 Selected

K2-V2: A 360-Open, Reasoning-Enhanced LLM

K2 Team, arXiv technical report, 2025.

arXiv
LLM Systems Open Models

BibTeX

@article{k2team2025k2v2,
  title = {K2-V2: A 360-Open, Reasoning-Enhanced LLM},
  author = {{K2 Team}},
  journal = {arXiv preprint arXiv:2512.06201},
  year = {2025}
}
Technical report arXiv 2025

LLM360 K2: Building a 65B 360-Open-Source Large Language Model from Scratch

Z. Liu, B. Tan, H. Wang, et al., arXiv technical report, 2025.

arXiv
LLM Systems Open Models

BibTeX

@article{liu2025llm360k2,
  title = {LLM360 K2: Building a 65B 360-Open-Source Large Language Model from Scratch},
  author = {Liu, Zhengzhong and Tan, Bowen and Wang, Hongyi and Neiswanger, Willie and Tao, Tianhua and Li, Haonan and Koto, Fajri and Wang, Yuqi and Sun, Suqi and Pangarkar, Omkar and Fan, Richard and Gu, Yi and Miller, Victor and Ma, Liqun and Tang, Liping and Ranjan, Nikhil and Zhuang, Yonghao and He, Guowei and Wang, Renxi and Deng, Mingkai and Algayres, Robin and Li, Yuanzhi and Shen, Zhiqiang and Nakov, Preslav and Xing, Eric P.},
  journal = {arXiv preprint arXiv:2501.07124},
  year = {2025}
}
Technical report bioRxiv 2024

Accurate and General DNA Representations Emerge from Genome Foundation Models at Scale

C. N. Ellington, N. Sun, N. Ho, et al., bioRxiv, 2024.

Biological Foundation Models Genomics Life Science AI

BibTeX

@article{ellington2024accurate,
  title = {Accurate and General DNA Representations Emerge from Genome Foundation Models at Scale},
  author = {Ellington, Caleb N. and Sun, Ning and Ho, Nicholas and Tao, Tianhua and Mahbub, Sazan and Li, Dian and Zhuang, Yonghao and Wang, Hongyi and Song, Le and Xing, Eric P.},
  journal = {bioRxiv},
  year = {2024},
  doi = {10.1101/2024.12.01.625444}
}
Technical report bioRxiv 2024

Mixture of Experts Enable Efficient and Effective Protein Understanding and Design

N. Sun, S. Zou, T. Tao, et al., bioRxiv, 2024.

bioRxiv
Biological Foundation Models Protein Models Life Science AI

BibTeX

@article{sun2024mixture,
  title = {Mixture of Experts Enable Efficient and Effective Protein Understanding and Design},
  author = {Sun, Ning and Zou, Shuxian and Tao, Tianhua and Mahbub, Sazan and Li, Dian and Zhuang, Yonghao and Wang, Hongyi and Cheng, Xingyi and Song, Le and Xing, Eric P.},
  journal = {bioRxiv},
  year = {2024},
  doi = {10.1101/2024.11.29.625425}
}
Technical report bioRxiv 2024

A Large-Scale Foundation Model for RNA Function and Structure Prediction

S. Zou, T. Tao, S. Mahbub, et al., bioRxiv, 2024.

bioRxiv
Biological Foundation Models RNA Models Life Science AI

BibTeX

@article{zou2024large,
  title = {A Large-Scale Foundation Model for RNA Function and Structure Prediction},
  author = {Zou, Shuxian and Tao, Tianhua and Mahbub, Sazan and Ellington, Caleb N. and Algayres, Robin and Li, Dian and Zhuang, Yonghao and Wang, Hongyi and Song, Le and Xing, Eric P.},
  journal = {bioRxiv},
  year = {2024},
  doi = {10.1101/2024.11.28.625345}
}
Technical report bioRxiv 2024

Scaling Dense Representations for Single Cell with Transcriptome-Scale Context

N. Ho, C. N. Ellington, J. Hou, et al., bioRxiv, 2024.

bioRxiv
Biological Foundation Models Single Cell Life Science AI

BibTeX

@article{ho2024scaling,
  title = {Scaling Dense Representations for Single Cell with Transcriptome-Scale Context},
  author = {Ho, Nicholas and Ellington, Caleb N. and Hou, Jinyu and Addagudi, Sohan and Mo, Shentong and Tao, Tianhua and Li, Dian and Zhuang, Yonghao and Wang, Hongyi and Cheng, Xingyi and Song, Le and Xing, Eric P.},
  journal = {bioRxiv},
  year = {2024},
  doi = {10.1101/2024.11.28.625303}
}
Earlier Publications

Peer-reviewed work from before the five most recent publication years.

2020

Peer-reviewed NeurIPS 2020 SpicyFL workshop 2020 Best Paper Award

FedML: A Research Library and Benchmark for Federated Machine Learning

C. He, S. Li, J. So, M. Zhang, H. Wang, X. Wang, P. Vepakomma, A. Singh, H. Qiu, L. Shen, P. Zhao, Y. Kang, Y. Liu, R. Raskar, Q. Yang, M. Annavaram, S. Avestimehr, NeurIPS 2020 SpicyFL workshop (Baidu Best Paper Award) [arXiv]

Federated Learning ML Systems

Peer-reviewed NeurIPS 2020

Attack of the Tails: Yes, You Really Can Backdoor Federated Learning

H. Wang, K. Sreenivasan, S. Rajput, H. Vishwakarma, S. Agarwal, J. Sohn, K. Lee, D. Papailiopoulos, NeurIPS 2020, [link]

Federated Learning Privacy & Security

Peer-reviewed ICLR 2020 Selected Oral

Federated Learning with Matched Averaging

H. Wang, M. Yurochkin, Y. Sun, D. Papailiopoulos, Y. Khazaeni, ICLR 2020 (Oral) [link] [blog] [talk]

Federated Learning Model Fusion

2019

Peer-reviewed NeurIPS 2019

DETOX: A Redundancy-based Framework for Faster and More Robust Gradient Aggregation

S. Rajput*, H. Wang*, Z. Charles, D. Papailiopoulos, NeurIPS 2019, [link]

Distributed Training Robustness

Peer-reviewed ACM SIGMOD, demo track 2019

Demonstration of Nimbus: Model-based Pricing for Machine Learning in a Data Marketplace

L. Chen, H. Wang, L. Chen, P. Koutris, A. Kumar, ACM SIGMOD 2019 demo track, [link]

ML Systems Data Markets

Preprint arXiv 2019

ErasureHead: Distributed Gradient Descent without Delays Using Approximate Gradient Coding

H. Wang, Z. Charles, D. Papailiopoulos [arXiv]

Distributed Training Gradient Coding

2018

Peer-reviewed NeurIPS 2018

The Effect of Network Width on the Performance of Large-batch Training

L. Chen, H. Wang, J. Zhao, D. Papailiopoulos, P. Koutris, NeurIPS 2018, [link]

Optimization Deep Learning

Peer-reviewed NeurIPS 2018

ATOMO: Communication-efficient Learning via Atomic Sparsification

H. Wang*, S. Sievert*, Z. Charles, S. Wright, D. Papailiopoulos, NeurIPS 2018, [link]

Distributed Training Optimization

Peer-reviewed ICML 2018

DRACO: Robust Distributed Training via Redundant Gradients

L. Chen, H. Wang, Z. Charles, D. Papailiopoulos, ICML 2018, [link]

Distributed Training Robustness

Peer-reviewed SysML 2018

Draco: Robust Distributed Training against Adversaries

L. Chen, H. Wang, D. Papailiopoulos, SysML 2018, [link]

Distributed Training Robustness

2017

Peer-reviewed IROS 2017

Recognizing Actions during Tactile Manipulations through Force Sensing

G. Subramani, D. Rakita, H. Wang, J. Black, M. Zinn, M. Gleicher, IROS 2017, [link]

Robotics Sensing