Monitoring

DeepSpeed’s Monitor module can log training details to a TensorBoard-compatible file, to WandB, or to simple CSV files. Below is an overview of what DeepSpeed will log automatically.

TensorBoard

class deepspeed.monitor.config.TensorBoardConfig

Sets parameters for the TensorBoard monitor.

enabled: bool = False

Whether logging to TensorBoard is enabled. Requires the tensorboard package to be installed.

output_path: str = ''

Path where the TensorBoard logs will be written. If not provided, the output path is set under the training script’s launching path.

job_name: str = 'DeepSpeedJobName'

Name for the current job. This will become a new directory inside output_path.
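As a minimal sketch of how these fields fit together, the snippet below builds a TensorBoard section as a plain Python dictionary and computes the directory that job_name would create inside output_path. The section key "tensorboard" is an assumption based on DeepSpeed's JSON config conventions and is not stated on this page. The path and job name are hypothetical, and tensorboard_log_dir is an illustrative helper, not a DeepSpeed function.

```python
import json
import os

# Hypothetical values for illustration. The "tensorboard" section name is an
# assumption; check it against the DeepSpeed version you are using.
ds_config = {
    "tensorboard": {
        "enabled": True,
        "output_path": "output/ds_logs",
        "job_name": "train_bert",
    }
}


def tensorboard_log_dir(config: dict) -> str:
    """Return the directory that job_name forms inside output_path."""
    tb = config["tensorboard"]
    return os.path.join(tb["output_path"], tb["job_name"])


# Show the section as it would appear in a JSON config file.
print(json.dumps(ds_config, indent=2))
print(tensorboard_log_dir(ds_config))
```

A dictionary like this would typically be supplied as part of the overall DeepSpeed configuration, alongside the other training settings.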

WandB

class deepspeed.monitor.config.WandbConfig

Sets parameters for the WandB monitor.

enabled: bool = False

Whether logging to WandB is enabled. Requires the wandb package to be installed.

group: str = None

Name for the WandB group. This can be used to group together runs.

team: str = None

Name for the WandB team.

project: str = 'deepspeed'

Name for the WandB project.
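The following sketch shows one way a WandB section might be assembled, leaving out group and team when they stay at their default of None. The section key "wandb" is assumed from DeepSpeed's JSON config conventions. The helper wandb_section and the group and team names are hypothetical.

```python
import json


def wandb_section(project="deepspeed", group=None, team=None, enabled=True):
    """Build a WandB monitor section, omitting fields left at None."""
    section = {"enabled": enabled, "project": project}
    if group is not None:
        section["group"] = group
    if team is not None:
        section["team"] = team
    return section


# Hypothetical group and team names; the "wandb" key is an assumption.
ds_config = {"wandb": wandb_section(group="lr_sweep", team="my_team")}
print(json.dumps(ds_config, indent=2))
```

Runs that share the same group value can then be viewed together in WandB, which is the purpose of the group field described above.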

CSV Monitor

class deepspeed.monitor.config.CSVConfig

Sets parameters for the CSV monitor.

enabled: bool = False

Whether logging to local CSV files is enabled.

output_path: str = ''

Path where the CSV files will be written. If not provided, the output path is set under the training script’s launching path.

job_name: str = 'DeepSpeedJobName'

Name for the current job. This will become a new directory inside output_path.
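As a rough illustration, the CSV monitor can be added to an existing configuration dictionary without modifying the original. The section key "csv_monitor" is assumed from DeepSpeed's JSON config conventions rather than taken from this page. The helper with_csv_monitor, the output path, and the job name are hypothetical.

```python
import json


def with_csv_monitor(config: dict, output_path: str, job_name: str) -> dict:
    """Return a copy of config with a CSV monitor section added."""
    updated = dict(config)  # shallow copy so the caller's dict is untouched
    updated["csv_monitor"] = {
        "enabled": True,
        "output_path": output_path,
        "job_name": job_name,
    }
    return updated


# Hypothetical base config; "csv_monitor" is an assumed section name.
base_config = {"train_batch_size": 16}
ds_config = with_csv_monitor(base_config, "output/ds_logs", "train_bert")
print(json.dumps(ds_config, indent=2))
```

Because output_path and job_name carry the same meaning as in the TensorBoard settings, using the same values for both monitors keeps the outputs of one job under a common directory.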