Publications
Selected and recent work across efficient ML systems, LLM infrastructure, MoE/RL post-training, federated learning, optimization, pedagogical visualization, and trustworthy machine learning.
2026
From EduVisBench to EduVisAgent: A Benchmark and Multi-Agent Framework for Reasoning-Driven Pedagogical Visualization
H. Ji, S. Qiu, S. Xin, S. Han, Z. Chen, D. Zhang, H. Wang, H. Yao, ICLR 2026 [OpenReview]
BibTeX
@inproceedings{ji2026from,
title={From EduVisBench to EduVisAgent: A Benchmark and Multi-Agent Framework for Reasoning-Driven Pedagogical Visualization},
author={Haonian Ji and Shi Qiu and Siyang Xin and Siwei Han and Zhaorun Chen and Dake Zhang and Hongyi Wang and Huaxiu Yao},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=FVCpV04ZRe}
}
2024
SHED: Shapley-Based Automated Dataset Refinement for Instruction Fine-Tuning
Y. He, Z. Wang, Z. Shen, G. Sun, Y. Dai, Y. Wu, H. Wang, A. Li, NeurIPS 2024 [arXiv]
Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild
X. Zhao, G. Sun, R. Cai, Y. Zhou, P. Li, P. Wang, B. Tan, Y. He, L. Chen, Y. Liang, B. Chen, B. Yuan, H. Wang, A. Li, Z. Wang, T. Chen, NeurIPS 2024 Datasets and Benchmarks [link]
FLoRA: Federated Fine-Tuning Large Language Models with Heterogeneous Low-Rank Adaptations
Z. Wang, Z. Shen, Y. He, G. Sun, H. Wang, L. Lyu, A. Li, NeurIPS 2024 [arXiv]
TrustLLM: Trustworthiness in Large Language Models
Maestro: Uncovering Low-Rank Structures via Trainable Decomposition
LLM360: Towards Fully Transparent Open-Source LLMs
Z. Liu, A. Qiao, W. Neiswanger, H. Wang, B. Tan, T. Tao, J. Li, Y. Wang, S. Sun, O. Pangarkar, R. Fan, Y. Gu, V. Miller, Y. Zhuang, G. He, H. Li, F. Koto, L. Tang, N. Ranjan, Z. Shen, R. Iriondo, C. Mu, Z. Hu, M. Schulze, P. Nakov, T. Baldwin, E. P. Xing, COLM 2024 [arXiv]
Crystal: Illuminating LLM Abilities on Language and Code
T. Tao, J. Li, B. Tan, H. Wang, W. Marshall, B. M. Kanakiya, J. Hestness, N. Vassilieva, Z. Shen, E. P. Xing, Z. Liu, COLM 2024 [arXiv]
RedCoast: A Lightweight Tool to Automate Distributed Training of LLMs on Any GPU/TPUs
B. Tan, Y. Zhu, L. Liu, H. Wang, Y. Zhuang, J. Chen, E. P. Xing, Z. Hu, NAACL Demo 2024 (Best Demo Runner-Up) [link] [arXiv]
Does compressing activations help model parallel training?
S. Bian, D. Li, H. Wang, E. P. Xing, S. Venkataraman, MLSys 2024 [arXiv]
Fusing Models with Complementary Expertise
2023
FedNAR: Federated Optimization with Normalized Annealing Regularization
Cuttlefish: Low-rank Model Training without All The Tuning
H. Wang, S. Agarwal, P. U-chupala, Y. Tanaka, E. P. Xing, D. Papailiopoulos, MLSys 2023 [link] [arXiv]
MPCFormer: fast, performant and private Transformer inference with MPC
D. Li*, R. Shao*, H. Wang*, H. Guo, E. P. Xing, H. Zhang, ICLR 2023 (Spotlight) [link]
Federated Learning as Variational Inference: A Scalable Expectation Propagation Approach
H. Guo, P. Greengard, H. Wang, A. Gelman, E. P. Xing, Y. Kim, ICLR 2023 [link]
2022
Efficient Federated Learning on Knowledge Graphs via Privacy-preserving Relation Embedding Aggregation
K. Zhang, Y. Wang, H. Wang, L. Huang, C. Yang, X. Chen, L. Sun, Findings of EMNLP 2022
Rare Gems: Finding Lottery Tickets at Initialization
K. Sreenivasan, J. Sohn, L. Yang, M. Grinde, A. Nagle, H. Wang, E. P. Xing, K. Lee, D. Papailiopoulos, NeurIPS 2022 [arXiv]
AMP: Automatically Finding Model Parallel Strategies with Heterogeneity Awareness
D. Li, H. Wang, E. P. Xing, H. Zhang, NeurIPS 2022 [arXiv]
2021
Pufferfish: Communication-efficient Models At No Extra Cost
Accordion: Adaptive Gradient Communication via Critical Learning Regime Identification
Technical Reports
Solid preprints, technical reports, and workshop manuscripts that complement the peer-reviewed publication list above.
PR2: Predictive Routing Replay for MoE-Based LLM Reinforcement Learning
D. Dong, J. Chen, H. Jia, J. Wu, H. Di, J. Liu, J. Wu, Z. Liu, Z. Liu, E. Barsoum, et al., arXiv technical report, 2026.
Abstract
PR2 studies reinforcement learning for MoE-based LLMs, where router drift can create rollout-training mismatch and unstable PPO-style updates. It introduces predictive routing replay, a lightweight router-evolution predictor that anticipates short-horizon routing changes and replays predicted routes to stabilize importance estimation and improve reasoning benchmark performance.
BibTeX
@article{dong2026pr2,
title={PR2: Predictive Routing Replay for MoE-Based LLM Reinforcement Learning},
author={Dong, Daize and Chen, Junlin and Jia, Haolong and Wu, Jiawei and Di, Huanwei and Liu, Jiang and Wu, Jialian and Liu, Zhengzhong and Liu, Zicheng and Barsoum, Emad and others},
journal={arXiv preprint arXiv:2606.00395},
year={2026},
url={https://arxiv.org/abs/2606.00395}
}
K2-V2: A 360-Open, Reasoning-Enhanced LLM
K2 Team, arXiv technical report, 2025.
BibTeX
@article{k2team2025k2v2,
title = {K2-V2: A 360-Open, Reasoning-Enhanced LLM},
author = {{K2 Team}},
journal = {arXiv preprint arXiv:2512.06201},
year = {2025}
}
LLM360 K2: Building a 65B 360-Open-Source Large Language Model from Scratch
Z. Liu, B. Tan, H. Wang, et al., arXiv technical report, 2025.
BibTeX
@article{liu2025llm360k2,
title = {LLM360 K2: Building a 65B 360-Open-Source Large Language Model from Scratch},
author = {Liu, Zhengzhong and Tan, Bowen and Wang, Hongyi and Neiswanger, Willie and Tao, Tianhua and Li, Haonan and Koto, Fajri and Wang, Yuqi and Sun, Suqi and Pangarkar, Omkar and Fan, Richard and Gu, Yi and Miller, Victor and Ma, Liqun and Tang, Liping and Ranjan, Nikhil and Zhuang, Yonghao and He, Guowei and Wang, Renxi and Deng, Mingkai and Algayres, Robin and Li, Yuanzhi and Shen, Zhiqiang and Nakov, Preslav and Xing, Eric P.},
journal = {arXiv preprint arXiv:2501.07124},
year = {2025}
}
Accurate and General DNA Representations Emerge from Genome Foundation Models at Scale
C. N. Ellington, N. Sun, N. Ho, et al., bioRxiv, 2024.
BibTeX
@article{ellington2024accurate,
title = {Accurate and General DNA Representations Emerge from Genome Foundation Models at Scale},
author = {Ellington, Caleb N. and Sun, Ning and Ho, Nicholas and Tao, Tianhua and Mahbub, Sazan and Li, Dian and Zhuang, Yonghao and Wang, Hongyi and Song, Le and Xing, Eric P.},
journal = {bioRxiv},
year = {2024},
doi = {10.1101/2024.12.01.625444}
}
Mixture of Experts Enable Efficient and Effective Protein Understanding and Design
N. Sun, S. Zou, T. Tao, et al., bioRxiv, 2024.
BibTeX
@article{sun2024mixture,
title = {Mixture of Experts Enable Efficient and Effective Protein Understanding and Design},
author = {Sun, Ning and Zou, Shuxian and Tao, Tianhua and Mahbub, Sazan and Li, Dian and Zhuang, Yonghao and Wang, Hongyi and Cheng, Xingyi and Song, Le and Xing, Eric P.},
journal = {bioRxiv},
year = {2024},
doi = {10.1101/2024.11.29.625425}
}
A Large-Scale Foundation Model for RNA Function and Structure Prediction
S. Zou, T. Tao, S. Mahbub, et al., bioRxiv, 2024.
BibTeX
@article{zou2024large,
title = {A Large-Scale Foundation Model for RNA Function and Structure Prediction},
author = {Zou, Shuxian and Tao, Tianhua and Mahbub, Sazan and Ellington, Caleb N. and Algayres, Robin and Li, Dian and Zhuang, Yonghao and Wang, Hongyi and Song, Le and Xing, Eric P.},
journal = {bioRxiv},
year = {2024},
doi = {10.1101/2024.11.28.625345}
}
Scaling Dense Representations for Single Cell with Transcriptome-Scale Context
N. Ho, C. N. Ellington, J. Hou, et al., bioRxiv, 2024.
BibTeX
@article{ho2024scaling,
title = {Scaling Dense Representations for Single Cell with Transcriptome-Scale Context},
author = {Ho, Nicholas and Ellington, Caleb N. and Hou, Jinyu and Addagudi, Sohan and Mo, Shentong and Tao, Tianhua and Li, Dian and Zhuang, Yonghao and Wang, Hongyi and Cheng, Xingyi and Song, Le and Xing, Eric P.},
journal = {bioRxiv},
year = {2024},
doi = {10.1101/2024.11.28.625303}
}
Earlier Publications
Peer-reviewed work from before the five most recent publication years.
2020
FedML: A Research Library and Benchmark for Federated Machine Learning
C. He, S. Li, J. So, M. Zhang, H. Wang, X. Wang, P. Vepakomma, A. Singh, H. Qiu, L. Shen, P. Zhao, Y. Kang, Y. Liu, R. Raskar, Q. Yang, M. Annavaram, S. Avestimehr, NeurIPS 2020 SpicyFL workshop (Baidu Best Paper Award) [arXiv]
Attack of the Tails: Yes, You Really Can Backdoor Federated Learning
H. Wang, K. Sreenivasan, S. Rajput, H. Vishwakarma, S. Agarwal, J. Sohn, K. Lee, D. Papailiopoulos, NeurIPS 2020, [link]
Federated Learning with Matched Averaging
H. Wang, M. Yurochkin, Y. Sun, D. Papailiopoulos, Y. Khazaeni, ICLR 2020 (Oral) [link] [blog] [talk]
2019
DETOX: A Redundancy-based Framework for Faster and More Robust Gradient Aggregation
S. Rajput*, H. Wang*, Z. Charles, D. Papailiopoulos, NeurIPS 2019, [link]
Demonstration of Nimbus: Model-based Pricing for Machine Learning in a Data Marketplace
L. Chen, H. Wang, L. Chen, P. Koutris, A. Kumar, ACM SIGMOD 2019 demo track, [link]
ErasureHead: Distributed Gradient Descent without Delays Using Approximate Gradient Coding
H. Wang, Z. Charles, D. Papailiopoulos [arXiv]
2018
The Effect of Network Width on the Performance of Large-batch Training
L. Chen, H. Wang, J. Zhao, D. Papailiopoulos, P. Koutris, NeurIPS 2018, [link]
ATOMO: Communication-efficient Learning via Atomic Sparsification
H. Wang*, S. Sievert*, Z. Charles, S. Wright, D. Papailiopoulos, NeurIPS 2018, [link]
DRACO: Robust Distributed Training via Redundant Gradients
L. Chen, H. Wang, Z. Charles, D. Papailiopoulos, ICML 2018, [link]
Draco: Robust Distributed Training against Adversaries
L. Chen, H. Wang, D. Papailiopoulos, SysML 2018, [link]
2017
Recognizing Actions during Tactile Manipulations through Force Sensing
G. Subramani, D. Rakita, H. Wang, J. Black, M. Zinn, M. Gleicher, IROS 2017, [link]
