Model Checkpointing

DeepSpeed provides routines for checkpointing model state during training.

Loading Training Checkpoints
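
A minimal sketch of resuming training, assuming `engine` is the object returned by `deepspeed.initialize` and that the checkpoint was saved with a `step` entry in its client state. The helper name `resume_from_checkpoint` and the `step` key are hypothetical. `load_checkpoint` returns the loaded path and the client state, and the path is `None` when no checkpoint could be loaded.

```python
def resume_from_checkpoint(engine, load_dir, tag=None):
    """Load engine state and return the step to resume from (0 if nothing was loaded)."""
    # tag=None lets DeepSpeed pick the tag recorded in the `latest` file in load_dir.
    load_path, client_state = engine.load_checkpoint(load_dir, tag=tag)
    if load_path is None:
        # No checkpoint found: start from scratch.
        return 0
    # `step` is a hypothetical key that the training script stored at save time.
    return (client_state or {}).get("step", 0)
```

The returned step can then be used to skip batches that were already consumed or to restore a logging counter.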

Saving Training Checkpoints
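
A minimal sketch of saving a checkpoint, assuming `engine` is the object returned by `deepspeed.initialize`. The helper name `save_training_checkpoint`, the `global_step{N}` tag scheme, and the `step` key in the client state are illustrative choices, not requirements of the API.

```python
def save_training_checkpoint(engine, save_dir, step):
    """Save engine state plus the current step under a step-based tag."""
    tag = f"global_step{step}"      # hypothetical naming scheme for the tag
    client_state = {"step": step}   # extra training state chosen by the script
    # Call this on every rank, not only rank 0: each process saves its own
    # master weights and optimizer/scheduler states.
    engine.save_checkpoint(save_dir, tag=tag, client_state=client_state)
    return tag
```

Anything placed in `client_state` is handed back by `load_checkpoint`, which makes it a convenient place for counters the engine does not track itself.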

ZeRO Checkpoint fp32 Weights Recovery

DeepSpeed provides routines for extracting fp32 weights from the saved ZeRO checkpoint’s optimizer states.
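
A minimal sketch of offline recovery, assuming `checkpoint_dir` is the directory that was passed to `save_checkpoint` and `model` is an uninitialized-for-training instance of the same architecture. The helper name `load_fp32_weights` is hypothetical; `get_fp32_state_dict_from_zero_checkpoint` is the extraction routine in `deepspeed.utils.zero_to_fp32`. The consolidation happens in host memory, so the machine needs enough RAM to hold the full fp32 model.

```python
def load_fp32_weights(model, checkpoint_dir, tag=None):
    """Consolidate ZeRO shards into one fp32 state dict and load it into `model`."""
    # Imported inside the function so the helper can be defined without DeepSpeed.
    from deepspeed.utils.zero_to_fp32 import get_fp32_state_dict_from_zero_checkpoint

    # tag=None uses the tag recorded in the `latest` file of checkpoint_dir.
    state_dict = get_fp32_state_dict_from_zero_checkpoint(checkpoint_dir, tag=tag)
    model.load_state_dict(state_dict)
    return model
```

The resulting model holds plain fp32 parameters and can be saved with the usual `torch.save(model.state_dict(), path)` for use outside DeepSpeed.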