Research themes
Selected projects across efficient, open, and trustworthy machine learning systems.
RAISL studies the systems, algorithms, and data workflows needed to make modern machine learning practical at scale. This page organizes representative projects by research theme rather than by publication year.
Selected projects
Project-Level Entry Points
These projects are good starting points for understanding the group's research trajectory.
360-Open LLMs
Building transparent large language models with open artifacts across data, training, evaluation, and reasoning behavior.
See related theme
Open Models
Fully Transparent Open-Source LLMs
Making the full lifecycle of large model development inspectable and reproducible for the research community.
See related theme
Efficient ML Systems
Low-Rank Efficient Training
Reducing training cost through practical low-rank methods that avoid brittle hand-tuning.
See related theme
Federated Learning
Layer-Wise Federated Learning
Matching and merging neural components across clients to support personalized and communication-efficient federation.
See related theme
Research map
Themes and Representative Work
Each theme highlights current questions and representative papers or technical reports.
01
LLM Infrastructure and Open Models
We build infrastructure for training, evaluating, serving, and opening large language models. The goal is not only larger models, but models whose development process can be inspected, reproduced, and improved.
Questions we ask
- How can open models expose enough artifacts to support real scientific scrutiny?
- How should model training, evaluation, and serving systems adapt to reasoning-heavy workloads?
- What tooling makes distributed LLM development usable across heterogeneous compute?
Representative work
K2-V2: A 360-Open, Reasoning-Enhanced LLM
K2 Team, arXiv technical report, 2025.
BibTeX
@article{k2team2025k2v2,
title = {K2-V2: A 360-Open, Reasoning-Enhanced LLM},
author = {{K2 Team}},
journal = {arXiv preprint arXiv:2512.06201},
year = {2025}
}
LLM360 K2: Building a 65B 360-Open-Source Large Language Model from Scratch
Z. Liu, B. Tan, H. Wang, et al., arXiv technical report, 2025.
BibTeX
@article{liu2025llm360k2,
title = {LLM360 K2: Building a 65B 360-Open-Source Large Language Model from Scratch},
author = {Liu, Zhengzhong and Tan, Bowen and Wang, Hongyi and Neiswanger, Willie and Tao, Tianhua and Li, Haonan and Koto, Fajri and Wang, Yuqi and Sun, Suqi and Pangarkar, Omkar and Fan, Richard and Gu, Yi and Miller, Victor and Ma, Liqun and Tang, Liping and Ranjan, Nikhil and Zhuang, Yonghao and He, Guowei and Wang, Renxi and Deng, Mingkai and Algayres, Robin and Li, Yuanzhi and Shen, Zhiqiang and Nakov, Preslav and Xing, Eric P.},
journal = {arXiv preprint arXiv:2501.07124},
year = {2025}
}
LLM360: Towards Fully Transparent Open-Source LLMs
Z. Liu, A. Qiao, W. Neiswanger, H. Wang, B. Tan, T. Tao, J. Li, Y. Wang, S. Sun, O. Pangarkar, R. Fan, Y. Gu, V. Miller, Y. Zhuang, G. He, H. Li, F. Koto, L. Tang, N. Ranjan, Z. Shen, R. Iriondo, C. Mu, Z. Hu, M. Schulze, P. Nakov, T. Baldwin, E. P. Xing, COLM 2024 [arXiv]
Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild
X. Zhao, G. Sun, R. Cai, Y. Zhou, P. Li, P. Wang, B. Tan, Y. He, L. Chen, Y. Liang, B. Chen, B. Yuan, H. Wang, A. Li, Z. Wang, T. Chen, NeurIPS 2024 Datasets and Benchmarks [link]
RedCoast: A Lightweight Tool to Automate Distributed Training of LLMs on Any GPU/TPUs
B. Tan, Y. Zhu, L. Liu, H. Wang, Y. Zhuang, J. Chen, E. P. Xing, Z. Hu, NAACL Demo 2024 (Best Demo Runner-Up) [link] [arXiv]
02
Efficient ML Systems and Optimization
We design methods that reduce the cost of training and deploying machine learning models while preserving practical performance. This includes compression, low-rank training, communication efficiency, and model fusion.
Questions we ask
- When do compression and low-rank structure help real training systems rather than just benchmarks?
- How can distributed training communicate less while preserving convergence and utility?
- How can independently trained models be fused or reused instead of retrained from scratch?
Representative work
Cuttlefish: Low-rank Model Training without All The Tuning
H. Wang, S. Agarwal, P. U-chupala, Y. Tanaka, E. P. Xing, D. Papailiopoulos, MLSys 2023 [link] [arXiv]
Maestro: Uncovering Low-Rank Structures via Trainable Decomposition
Fusing Models with Complementary Expertise
Does compressing activations help model parallel training?
S. Bian, D. Li, H. Wang, E. P. Xing, S. Venkataraman, MLSys 2024 [arXiv]
On the Utility of Gradient Compression in Distributed Training Systems
03
Federated, Private, and Distributed ML
We study learning systems that operate across distributed and sensitive data. The focus is on algorithms that are scalable, robust, privacy-aware, and compatible with heterogeneous real-world deployments.
Questions we ask
- How should models be aggregated when client data and architectures are heterogeneous?
- What optimization principles make federated learning stable at scale?
- How can privacy-preserving and secure inference systems remain usable?
Representative work
FLoRA: Federated Fine-Tuning Large Language Models with Heterogeneous Low-Rank Adaptations
Z. Wang, Z. Shen, Y. He, G. Sun, H. Wang, L. Lyu, A. Li, NeurIPS 2024 [arXiv]
Federated Learning with Matched Averaging
H. Wang, M. Yurochkin, Y. Sun, D. Papailiopoulos, Y. Khazaeni, ICLR 2020 (Oral) [link] [blog] [talk]
Federated Learning as Variational Inference: A Scalable Expectation Propagation Approach
H. Guo, P. Greengard, H. Wang, A. Gelman, E. P. Xing, Y. Kim, ICLR 2023 [link]
FedNAR: Federated Optimization with Normalized Annealing Regularization
Attack of the Tails: Yes, You Really Can Backdoor Federated Learning
H. Wang, K. Sreenivasan, S. Rajput, H. Vishwakarma, S. Agarwal, J. Sohn, K. Lee, D. Papailiopoulos, NeurIPS 2020 [link]
MPCFormer: fast, performant and private Transformer inference with MPC
D. Li*, R. Shao*, H. Wang*, H. Guo, E. P. Xing, H. Zhang, ICLR 2023 (Spotlight) [link]
04
Trustworthy Data, Evaluation, and Agents
We develop benchmarks, datasets, and workflows that make model behavior easier to evaluate and improve. This includes trustworthiness, data refinement, educational visualization, and agentic evaluation frameworks.
Questions we ask
- How do we evaluate reasoning, pedagogy, and trustworthiness beyond static leaderboards?
- What data refinement workflows improve instruction tuning without obscuring failure modes?
- How can agents help with visualization and evaluation while remaining inspectable?
Representative work
From EduVisBench to EduVisAgent: A Benchmark and Multi-Agent Framework for Reasoning-Driven Pedagogical Visualization
H. Ji, S. Qiu, S. Xin, S. Han, Z. Chen, D. Zhang, H. Wang, H. Yao, ICLR 2026 [OpenReview]
BibTeX
@inproceedings{ji2026from,
title={From EduVisBench to EduVisAgent: A Benchmark and Multi-Agent Framework for Reasoning-Driven Pedagogical Visualization},
author={Haonian Ji and Shi Qiu and Siyang Xin and Siwei Han and Zhaorun Chen and Dake Zhang and Hongyi Wang and Huaxiu Yao},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=FVCpV04ZRe}
}
TrustLLM: Trustworthiness in Large Language Models
Crystal: Illuminating LLM Abilities on Language and Code
T. Tao, J. Li, B. Tan, H. Wang, W. Marshall, B. M. Kanakiya, J. Hestness, N. Vassilieva, Z. Shen, E. P. Xing, Z. Liu, COLM 2024 [arXiv]
SHED: Shapley-Based Automated Dataset Refinement for Instruction Fine-Tuning
Y. He, Z. Wang, Z. Shen, G. Sun, Y. Dai, Y. Wu, H. Wang, A. Li, NeurIPS 2024 [arXiv]
05
Foundation Models for Science
We also explore foundation-model systems for scientific data, especially biological sequences and single-cell data, through technical reports and collaborations that connect representation learning with domain-specific evaluation.
Questions we ask
- How should dense representations scale for DNA, RNA, protein, and single-cell modalities?
- What evaluation signals show whether scientific foundation models are useful beyond pretraining metrics?
- How can systems methods support foundation models with specialized scientific constraints?
Representative work
Accurate and General DNA Representations Emerge from Genome Foundation Models at Scale
C. N. Ellington, N. Sun, N. Ho, et al., bioRxiv, 2024.
BibTeX
@article{ellington2024accurate,
title = {Accurate and General DNA Representations Emerge from Genome Foundation Models at Scale},
author = {Ellington, Caleb N. and Sun, Ning and Ho, Nicholas and Tao, Tianhua and Mahbub, Sazan and Li, Dian and Zhuang, Yonghao and Wang, Hongyi and Song, Le and Xing, Eric P.},
journal = {bioRxiv},
year = {2024},
doi = {10.1101/2024.12.01.625444}
}
Scaling Dense Representations for Single Cell with Transcriptome-Scale Context
N. Ho, C. N. Ellington, J. Hou, et al., bioRxiv, 2024.
BibTeX
@article{ho2024scaling,
title = {Scaling Dense Representations for Single Cell with Transcriptome-Scale Context},
author = {Ho, Nicholas and Ellington, Caleb N. and Hou, Jinyu and Addagudi, Sohan and Mo, Shentong and Tao, Tianhua and Li, Dian and Zhuang, Yonghao and Wang, Hongyi and Cheng, Xingyi and Song, Le and Xing, Eric P.},
journal = {bioRxiv},
year = {2024},
doi = {10.1101/2024.11.28.625303}
}
Mixture of Experts Enable Efficient and Effective Protein Understanding and Design
N. Sun, S. Zou, T. Tao, et al., bioRxiv, 2024.
BibTeX
@article{sun2024mixture,
title = {Mixture of Experts Enable Efficient and Effective Protein Understanding and Design},
author = {Sun, Ning and Zou, Shuxian and Tao, Tianhua and Mahbub, Sazan and Li, Dian and Zhuang, Yonghao and Wang, Hongyi and Cheng, Xingyi and Song, Le and Xing, Eric P.},
journal = {bioRxiv},
year = {2024},
doi = {10.1101/2024.11.29.625425}
}
A Large-Scale Foundation Model for RNA Function and Structure Prediction
S. Zou, T. Tao, S. Mahbub, et al., bioRxiv, 2024.
BibTeX
@article{zou2024large,
title = {A Large-Scale Foundation Model for RNA Function and Structure Prediction},
author = {Zou, Shuxian and Tao, Tianhua and Mahbub, Sazan and Ellington, Caleb N. and Algayres, Robin and Li, Dian and Zhuang, Yonghao and Wang, Hongyi and Song, Le and Xing, Eric P.},
journal = {bioRxiv},
year = {2024},
doi = {10.1101/2024.11.28.625345}
}