Does compressing activations help model parallel training?
Published in MLSys, 2024