Research themes

Selected projects across efficient, open, and trustworthy machine learning systems.

RAISL studies the systems, algorithms, and data workflows needed to make modern machine learning practical at scale, from open LLM infrastructure to mixture-of-experts (MoE) and reinforcement learning (RL) post-training and evaluation. This page organizes representative projects by research theme rather than by publication year.

5 themes · 5 selected projects · 7 technical reports

MoE and LLM RL

Predictive Routing Replay

Stabilizing MoE-based LLM reinforcement learning by predicting router evolution and replaying consistent expert routes.

See related theme

LLM Infrastructure

360-Open LLMs

Building transparent large language models with open artifacts across data, training, evaluation, and reasoning behavior.

See related theme

Open Models

Fully Transparent Open-Source LLMs

Making the full lifecycle of large model development inspectable and reproducible for the research community.

See related theme

Efficient ML Systems

Low-Rank Efficient Training

Reducing training cost through practical low-rank methods that avoid brittle hand-tuning.

See related theme

Federated Learning

Layer-Wise Federated Learning

Matching and merging neural components across clients to support personalized and communication-efficient federation.

See related theme

LLM Infrastructure and Open Models

We build infrastructure for training, post-training, evaluating, serving, and opening large language models. The goal is not only larger models, but systems whose routing behavior, reasoning behavior, and development process can be inspected, reproduced, and improved.

Questions we ask

How can open models expose enough artifacts to support real scientific scrutiny?
How should model training, RL post-training, evaluation, and serving systems adapt to reasoning-heavy workloads?
How can MoE routing remain stable when rollout and training phases drift?
What tooling makes distributed LLM development usable across heterogeneous compute?

Representative work

Technical report arXiv 2026

PR2: Predictive Routing Replay for MoE-Based LLM Reinforcement Learning

D. Dong, J. Chen, H. Jia, J. Wu, H. Di, J. Liu, J. Wu, Z. Liu, Z. Liu, E. Barsoum, et al., arXiv technical report, 2026.

arXiv

LLM Systems MoE Systems Reinforcement Learning Efficient Training

Abstract

PR2 studies reinforcement learning for MoE-based LLMs, where router drift can create rollout-training mismatch and unstable PPO-style updates. It introduces predictive routing replay, a lightweight router-evolution predictor that anticipates short-horizon routing changes and replays predicted routes to stabilize importance estimation and improve reasoning benchmark performance.
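
The rollout-training mismatch described in the abstract can be illustrated with a toy sketch. The code below is not the PR2 method and omits its router-evolution predictor. It only shows, under made-up assumptions, how a drifting top-k router changes expert assignments between the rollout and training phases, and how replaying the recorded routes removes that mismatch. The Gaussian drift model, the sizes, and every function name are hypothetical.

```python
import random


def router_logits(weights, token):
    """One logit per expert: dot product of each expert's row with the token."""
    return [sum(w * x for w, x in zip(row, token)) for row in weights]


def top_k_route(weights, token, k):
    """Return the indices of the k highest-scoring experts as a sorted tuple."""
    logits = router_logits(weights, token)
    ranked = sorted(range(len(logits)), key=lambda e: logits[e], reverse=True)
    return tuple(sorted(ranked[:k]))


def route_mismatch_rate(routes_a, routes_b):
    """Fraction of tokens whose expert sets differ between two routing passes."""
    differing = sum(1 for a, b in zip(routes_a, routes_b) if a != b)
    return differing / len(routes_a)


def make_toy_setup(num_experts=8, dim=16, num_tokens=200, drift=0.3, seed=0):
    """Build a rollout router, a drifted training router, and random tokens."""
    rng = random.Random(seed)
    rollout_router = [[rng.gauss(0, 1) for _ in range(dim)]
                      for _ in range(num_experts)]
    # Hypothetical drift model: the training-time router is the rollout
    # router plus Gaussian noise scaled by `drift`.
    train_router = [[w + drift * rng.gauss(0, 1) for w in row]
                    for row in rollout_router]
    tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_tokens)]
    return rollout_router, train_router, tokens


rollout_router, train_router, tokens = make_toy_setup()
k = 2

# Routes chosen while generating rollouts, and routes the drifted router
# would choose if training recomputed them from scratch.
rollout_routes = [top_k_route(rollout_router, t, k) for t in tokens]
fresh_routes = [top_k_route(train_router, t, k) for t in tokens]

# Plain routing replay: training reuses the routes recorded at rollout time.
replayed_routes = list(rollout_routes)

print("mismatch without replay:", route_mismatch_rate(rollout_routes, fresh_routes))
print("mismatch with replay:   ", route_mismatch_rate(rollout_routes, replayed_routes))
```

With replay the mismatch is zero by construction, which is the property that keeps importance ratios computed over the same experts in both phases. The sketch says nothing about how routes should be predicted when the router is expected to keep changing, which is the question the report addresses.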

BibTeX

@article{dong2026pr2,
  title={PR2: Predictive Routing Replay for MoE-Based LLM Reinforcement Learning},
  author={Dong, Daize and Chen, Junlin and Jia, Haolong and Wu, Jiawei and Di, Huanwei and Liu, Jiang and Wu, Jialian and Liu, Zhengzhong and Liu, Zicheng and Barsoum, Emad and others},
  journal={arXiv preprint arXiv:2606.00395},
  year={2026},
  url={https://arxiv.org/abs/2606.00395}
}

Technical report arXiv 2025 Selected

K2-V2: A 360-Open, Reasoning-Enhanced LLM

K2 Team, arXiv technical report, 2025.

arXiv

LLM Systems Open Models

BibTeX

@article{k2team2025k2v2,
  title = {K2-V2: A 360-Open, Reasoning-Enhanced LLM},
  author = {{K2 Team}},
  journal = {arXiv preprint arXiv:2512.06201},
  year = {2025}
}

Technical report arXiv 2025

LLM360 K2: Building a 65B 360-Open-Source Large Language Model from Scratch

Z. Liu, B. Tan, H. Wang, et al., arXiv technical report, 2025.

arXiv

LLM Systems Open Models

BibTeX

@article{liu2025llm360k2,
  title = {LLM360 K2: Building a 65B 360-Open-Source Large Language Model from Scratch},
  author = {Liu, Zhengzhong and Tan, Bowen and Wang, Hongyi and Neiswanger, Willie and Tao, Tianhua and Li, Haonan and Koto, Fajri and Wang, Yuqi and Sun, Suqi and Pangarkar, Omkar and Fan, Richard and Gu, Yi and Miller, Victor and Ma, Liqun and Tang, Liping and Ranjan, Nikhil and Zhuang, Yonghao and He, Guowei and Wang, Renxi and Deng, Mingkai and Algayres, Robin and Li, Yuanzhi and Shen, Zhiqiang and Nakov, Preslav and Xing, Eric P.},
  journal = {arXiv preprint arXiv:2501.07124},
  year = {2025}
}

Peer-reviewed COLM 2024 Selected

LLM360: Towards Fully Transparent Open-Source LLMs

Z. Liu, A. Qiao, W. Neiswanger, H. Wang, B. Tan, T. Tao, J. Li, Y. Wang, S. Sun, O. Pangarkar, R. Fan, Y. Gu, V. Miller, Y. Zhuang, G. He, H. Li, F. Koto, L. Tang, N. Ranjan, Z. Shen, R. Iriondo, C. Mu, Z. Hu, M. Schulze, P. Nakov, T. Baldwin, E. P. Xing, COLM 2024 [arXiv]

arXiv

LLM Systems Open Models

Peer-reviewed NeurIPS Datasets and Benchmarks Track 2024

Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild

X. Zhao, G. Sun, R. Cai, Y. Zhou, P. Li, P. Wang, B. Tan, Y. He, L. Chen, Y. Liang, B. Chen, B. Yuan, H. Wang, A. Li, Z. Wang, T. Chen, NeurIPS 2024 Datasets and Benchmarks [link]

Paper

LLM Systems Open Models

Peer-reviewed NAACL Demo 2024 Best Demo Runner-Up

RedCoast: A Lightweight Tool to Automate Distributed Training of LLMs on Any GPU/TPUs

B. Tan, Y. Zhu, L. Liu, H. Wang, Y. Zhuang, J. Chen, E. P. Xing, Z. Hu, NAACL Demo 2024 (Best Demo Runner-Up) [link] [arXiv]

OpenReview arXiv

LLM Systems Distributed Training

Efficient ML Systems and Optimization

We design methods that reduce the cost of training and deploying machine learning models while preserving practical performance. This includes compression, low-rank training, communication efficiency, and model fusion.
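
As a rough illustration of why low-rank structure reduces cost, the sketch below counts parameters when a dense m × n weight matrix is replaced by two factors of shapes m × r and r × n. This is generic arithmetic, not the procedure of any specific paper listed here; the layer size and rank are arbitrary example values and the function names are hypothetical.

```python
def dense_params(m, n):
    """Parameter count of a dense m x n weight matrix."""
    return m * n


def low_rank_params(m, n, r):
    """Parameter count of the factorization U (m x r) times V (r x n)."""
    return r * (m + n)


def compression_ratio(m, n, r):
    """How many times fewer parameters the factorized layer stores."""
    return dense_params(m, n) / low_rank_params(m, n, r)


# Example values only: a square 4096 x 4096 layer factorized at rank 64.
m, n, r = 4096, 4096, 64
print(dense_params(m, n))          # 16777216
print(low_rank_params(m, n, r))    # 524288
print(compression_ratio(m, n, r))  # 32.0
```

The factorization only saves parameters when r < mn / (m + n), which is 2048 for this example layer. Choosing r per layer is the kind of tuning that the low-rank training work in this theme aims to reduce.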

Questions we ask

When do compression and low-rank structure help real training systems rather than just benchmarks?
How can distributed training communicate less while preserving convergence and utility?
How can independently trained models be fused or reused instead of retrained from scratch?

Representative work

Peer-reviewed MLSys 2023 Selected

Cuttlefish: Low-rank Model Training without All The Tuning

H. Wang, S. Agarwal, P. U-chupala, Y. Tanaka, E. P. Xing, D. Papailiopoulos, MLSys 2023 [link] [arXiv]

Paper arXiv

Efficient Training Optimization

Peer-reviewed ICML 2024

Maestro: Uncovering Low-Rank Structures via Trainable Decomposition

S. Horváth, S. Laskaridis, S. Rajput, H. Wang, ICML 2024 [link] [arXiv]

OpenReview arXiv

Optimization Model Compression

Peer-reviewed ICLR 2024

Fusing Models with Complementary Expertise

H. Wang, F. M. Polo, Y. Sun, S. Kundu, E. P. Xing, M. Yurochkin, ICLR 2024 [link] [arXiv]

OpenReview arXiv

Optimization Model Fusion

Peer-reviewed MLSys 2024

Does compressing activations help model parallel training?

S. Bian, D. Li, H. Wang, E. P. Xing, S. Venkataraman, MLSys 2024 [arXiv]

arXiv

Distributed Training Model Compression

Peer-reviewed MLSys 2022

On the Utility of Gradient Compression in Distributed Training Systems

S. Agarwal, H. Wang, S. Venkataraman, D. Papailiopoulos, MLSys 2022 [link] [arXiv]

Paper arXiv

Distributed Training Optimization

Peer-reviewed MLSys 2021

Pufferfish: Communication-efficient Models At No Extra Cost

H. Wang, S. Agarwal, D. Papailiopoulos, MLSys 2021 [arXiv] [link] [talk]

arXiv Paper Talk

Efficient Training Model Compression

Federated, Private, and Distributed ML

We study learning systems that operate across distributed and sensitive data. The focus is on algorithms that are scalable, robust, privacy-aware, and compatible with heterogeneous real-world deployments.

Questions we ask

How should models be aggregated when client data and architectures are heterogeneous?
What optimization principles make federated learning stable at scale?
How can privacy-preserving and secure inference systems remain usable?
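
For context on the aggregation question above, here is a minimal sketch of the standard baseline: coordinate-wise averaging of client parameters weighted by local sample counts, as in federated averaging. It assumes every client shares the same architecture and parameter ordering, which is exactly the assumption that heterogeneous settings break. The function name and toy values are hypothetical, and this is not the matched-averaging or expectation-propagation method from the papers listed below.

```python
def federated_average(client_weights, client_sizes):
    """Sample-weighted, coordinate-wise average of flat parameter vectors."""
    if len(client_weights) != len(client_sizes):
        raise ValueError("one sample count is required per client")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two toy clients; the second holds three times as much data.
clients = [[1.0, 2.0], [3.0, 4.0]]
sizes = [1, 3]
print(federated_average(clients, sizes))  # [2.5, 3.5]
```

Because the average is taken coordinate by coordinate, it implicitly treats parameter i on one client as the same unit as parameter i on every other client.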

Representative work

Peer-reviewed NeurIPS 2024

FLoRA: Federated Fine-Tuning Large Language Models with Heterogeneous Low-Rank Adaptations

Z. Wang, Z. Shen, Y. He, G. Sun, H. Wang, L. Lyu, A. Li, NeurIPS 2024 [arXiv]

arXiv

LLM Systems Federated Learning

Peer-reviewed ICLR 2020 Selected Oral

Federated Learning with Matched Averaging

H. Wang, M. Yurochkin, Y. Sun, D. Papailiopoulos, Y. Khazaeni, ICLR 2020 (Oral) [link] [blog] [talk]

OpenReview Blog Talk

Federated Learning Model Fusion

Peer-reviewed ICLR 2023

Federated Learning as Variational Inference: A Scalable Expectation Propagation Approach

H. Guo, P. Greengard, H. Wang, A. Gelman, E. P. Xing, Y. Kim, ICLR 2023 [link]

OpenReview

Federated Learning Optimization

Peer-reviewed NeurIPS 2023

FedNAR: Federated Optimization with Normalized Annealing Regularization

J. Li, A. Li, C. Tian, Q. Ho, E. Xing, H. Wang, NeurIPS 2023 [link] [arXiv]

OpenReview arXiv

Federated Learning Optimization

Peer-reviewed NeurIPS 2020

Attack of the Tails: Yes, You Really Can Backdoor Federated Learning

H. Wang, K. Sreenivasan, S. Rajput, H. Vishwakarma, S. Agarwal, J. Sohn, K. Lee, D. Papailiopoulos, NeurIPS 2020 [link]

Paper

Federated Learning Privacy & Security

Peer-reviewed ICLR 2023 Spotlight

MPCFormer: fast, performant and private Transformer inference with MPC

D. Li*, R. Shao*, H. Wang*, H. Guo, E. P. Xing, H. Zhang, ICLR 2023 (Spotlight) [link]

OpenReview

Privacy & Security Efficient Inference

Trustworthy Data, Evaluation, and Agents

We develop benchmarks, datasets, and workflows that make model behavior easier to evaluate and improve. This includes trustworthiness, data refinement, educational visualization, and agentic evaluation frameworks.

Questions we ask

How do we evaluate reasoning, pedagogy, and trustworthiness beyond static leaderboards?
What data refinement workflows improve instruction tuning without obscuring failure modes?
How can agents help with visualization and evaluation while remaining inspectable?

Representative work

Peer-reviewed ICLR 2026

From EduVisBench to EduVisAgent: A Benchmark and Multi-Agent Framework for Reasoning-Driven Pedagogical Visualization

H. Ji, S. Qiu, S. Xin, S. Han, Z. Chen, D. Zhang, H. Wang, H. Yao, ICLR 2026 [OpenReview]

OpenReview

AI Agents Data & Evaluation Education Visualization

BibTeX

@inproceedings{ji2026from,
  title={From EduVisBench to EduVisAgent: A Benchmark and Multi-Agent Framework for Reasoning-Driven Pedagogical Visualization},
  author={Haonian Ji and Shi Qiu and Siyang Xin and Siwei Han and Zhaorun Chen and Dake Zhang and Hongyi Wang and Huaxiu Yao},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=FVCpV04ZRe}
}

Peer-reviewed ICML 2024

TrustLLM: Trustworthiness in Large Language Models

H. Wang with many colleagues (Position Paper), ICML 2024 [link] [arXiv]

Hugging Face arXiv

LLM Systems Data & Evaluation

Peer-reviewed COLM 2024

Crystal: Illuminating LLM Abilities on Language and Code

T. Tao, J. Li, B. Tan, H. Wang, W. Marshall, B. M. Kanakiya, J. Hestness, N. Vassilieva, Z. Shen, E. P. Xing, Z. Liu, COLM 2024 [arXiv]

arXiv

LLM Systems Data & Evaluation

Peer-reviewed NeurIPS 2024

SHED: Shapley-Based Automated Dataset Refinement for Instruction Fine-Tuning

Y. He, Z. Wang, Z. Shen, G. Sun, Y. Dai, Y. Wu, H. Wang, A. Li, NeurIPS 2024 [arXiv]

arXiv

LLM Systems Data & Evaluation

Foundation Models for Science

We also explore foundation-model systems for scientific data, especially biological sequences and single-cell data, through technical reports and collaborations that connect representation learning with domain-specific evaluation.

Questions we ask

How should dense representations scale for DNA, RNA, protein, and single-cell modalities?
What evaluation signals show whether scientific foundation models are useful beyond pretraining metrics?
How can systems methods support foundation models with specialized scientific constraints?

Representative work

Technical report bioRxiv 2024

Accurate and General DNA Representations Emerge from Genome Foundation Models at Scale

C. N. Ellington, N. Sun, N. Ho, et al., bioRxiv, 2024.

bioRxiv OpenReview

Biological Foundation Models Genomics Life Science AI

BibTeX

@article{ellington2024accurate,
  title = {Accurate and General DNA Representations Emerge from Genome Foundation Models at Scale},
  author = {Ellington, Caleb N. and Sun, Ning and Ho, Nicholas and Tao, Tianhua and Mahbub, Sazan and Li, Dian and Zhuang, Yonghao and Wang, Hongyi and Song, Le and Xing, Eric P.},
  journal = {bioRxiv},
  year = {2024},
  doi = {10.1101/2024.12.01.625444}
}

Technical report bioRxiv 2024

Scaling Dense Representations for Single Cell with Transcriptome-Scale Context

N. Ho, C. N. Ellington, J. Hou, et al., bioRxiv, 2024.

bioRxiv

Biological Foundation Models Single Cell Life Science AI

BibTeX

@article{ho2024scaling,
  title = {Scaling Dense Representations for Single Cell with Transcriptome-Scale Context},
  author = {Ho, Nicholas and Ellington, Caleb N. and Hou, Jinyu and Addagudi, Sohan and Mo, Shentong and Tao, Tianhua and Li, Dian and Zhuang, Yonghao and Wang, Hongyi and Cheng, Xingyi and Song, Le and Xing, Eric P.},
  journal = {bioRxiv},
  year = {2024},
  doi = {10.1101/2024.11.28.625303}
}

Technical report bioRxiv 2024

Mixture of Experts Enable Efficient and Effective Protein Understanding and Design

N. Sun, S. Zou, T. Tao, et al., bioRxiv, 2024.

bioRxiv

Biological Foundation Models Protein Models Life Science AI

BibTeX

@article{sun2024mixture,
  title = {Mixture of Experts Enable Efficient and Effective Protein Understanding and Design},
  author = {Sun, Ning and Zou, Shuxian and Tao, Tianhua and Mahbub, Sazan and Li, Dian and Zhuang, Yonghao and Wang, Hongyi and Cheng, Xingyi and Song, Le and Xing, Eric P.},
  journal = {bioRxiv},
  year = {2024},
  doi = {10.1101/2024.11.29.625425}
}

Technical report bioRxiv 2024

A Large-Scale Foundation Model for RNA Function and Structure Prediction

S. Zou, T. Tao, S. Mahbub, et al., bioRxiv, 2024.

bioRxiv

Biological Foundation Models RNA Models Life Science AI

BibTeX

@article{zou2024large,
  title = {A Large-Scale Foundation Model for RNA Function and Structure Prediction},
  author = {Zou, Shuxian and Tao, Tianhua and Mahbub, Sazan and Ellington, Caleb N. and Algayres, Robin and Li, Dian and Zhuang, Yonghao and Wang, Hongyi and Song, Le and Xing, Eric P.},
  journal = {bioRxiv},
  year = {2024},
  doi = {10.1101/2024.11.28.625345}
}
