Monitoring
DeepSpeed’s Monitor module can log training details to a TensorBoard-compatible file, to WandB, or to simple CSV files. Below is an overview of what DeepSpeed will log automatically.
TensorBoard
- class deepspeed.monitor.config.TensorBoardConfig
Sets parameters for the TensorBoard monitor.
- enabled: bool = False
Whether logging to TensorBoard is enabled. Requires the tensorboard package to be installed.
- output_path: str = ''
Path where the TensorBoard logs will be written. If not provided, the output path is set under the directory from which the training script is launched.
- job_name: str = 'DeepSpeedJobName'
Name for the current job. This will become a new directory inside output_path.
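As a minimal sketch, the fields above might be set from Python by building a DeepSpeed config dictionary. The top-level key `tensorboard` is assumed here to be the config section that maps onto `TensorBoardConfig`; the path and job name are placeholder values, and `expected_log_dir` is a hypothetical helper that only illustrates the documented "new directory inside output_path" layout.

```python
import os

# Assumption: the "tensorboard" section of the DeepSpeed config maps onto
# TensorBoardConfig. The path and job name below are placeholders.
ds_config = {
    "tensorboard": {
        "enabled": True,
        "output_path": "output/ds_logs/",
        "job_name": "train_bert",
    }
}


def expected_log_dir(section: dict) -> str:
    """Hypothetical helper: join output_path and job_name as the field docs describe."""
    return os.path.join(section["output_path"], section["job_name"])


log_dir = expected_log_dir(ds_config["tensorboard"])
print(log_dir)
```

Giving each run a distinct `job_name` keeps its logs in a separate subdirectory of the same `output_path`.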
WandB
- class deepspeed.monitor.config.WandbConfig
Sets parameters for the WandB monitor.
- enabled: bool = False
Whether logging to WandB is enabled. Requires the wandb package to be installed.
- group: str = None
Name for the WandB group. This can be used to group runs together.
- team: str = None
Name for the WandB team.
- project: str = 'deepspeed'
Name for the WandB project.
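The defaults listed above mean that a WandB section only needs to name the fields it overrides. The sketch below illustrates this with a plain dictionary merge; it is not DeepSpeed's own parsing code. `WANDB_DEFAULTS` and `resolve_wandb` are hypothetical names, the `wandb` config key is an assumption, and the group name is a placeholder.

```python
# Defaults copied from the WandbConfig field list above.
WANDB_DEFAULTS = {
    "enabled": False,
    "group": None,
    "team": None,
    "project": "deepspeed",
}


def resolve_wandb(user_section: dict) -> dict:
    """Hypothetical helper: overlay user-supplied fields on the documented defaults."""
    return {**WANDB_DEFAULTS, **user_section}


# Assumption: "wandb" is the config section that maps onto WandbConfig.
ds_config = {"wandb": {"enabled": True, "group": "ablation-runs"}}

resolved = resolve_wandb(ds_config["wandb"])
print(resolved)
```

Here `team` stays unset and `project` falls back to `'deepspeed'`, since neither was overridden.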
CSV Monitor
- class deepspeed.monitor.config.CSVConfig
Sets parameters for the CSV monitor.
- enabled: bool = False
Whether logging to local CSV files is enabled.
- output_path: str = ''
Path where the CSV files will be written. If not provided, the output path is set under the directory from which the training script is launched.
- job_name: str = 'DeepSpeedJobName'
Name for the current job. This will become a new directory inside output_path.
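A CSV monitor section could be sketched the same way and serialized to JSON for use as a config file. The `csv_monitor` key is assumed to be the config section that maps onto `CSVConfig`, and the path and job name are placeholders.

```python
import json

# Assumption: the "csv_monitor" section of the DeepSpeed config maps onto
# CSVConfig. The path and job name below are placeholders.
ds_config = {
    "csv_monitor": {
        "enabled": True,
        "output_path": "output/ds_logs/",
        "job_name": "train_bert",
    }
}

# Serialize to JSON text, e.g. to save as a config file.
config_json = json.dumps(ds_config, indent=2)
print(config_json)
```

A dictionary like this, or the path to the saved JSON file, is typically what gets passed as the `config` argument of `deepspeed.initialize`.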