DeepSpeed
Index
A | B | C | D | E | F | G | I | J | K | L | M | N | O | P | Q | R | S | T | U | W | Z
A
activation (deepspeed.inference.config.QuantizationConfig attribute)
allgather_bucket_size (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
allgather_partitions (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
allgather_sequential (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
allreduce_tied_weight_gradients() (deepspeed.pipe.PipelineModule method)
api_key (deepspeed.monitor.config.CometConfig attribute)
autotuner (in module deepspeed.autotuning)
B
BackwardPass (class in deepspeed.runtime.pipe.schedule)
base_dir (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
    (deepspeed.inference.config.InferenceCheckpointConfig attribute)
buffer_count (deepspeed.runtime.zero.config.DeepSpeedZeroOffloadOptimizerConfig attribute)
    (deepspeed.runtime.zero.config.DeepSpeedZeroOffloadParamConfig attribute)
buffer_size (deepspeed.runtime.zero.config.DeepSpeedZeroOffloadParamConfig attribute)
BufferOpInstruction (class in deepspeed.runtime.pipe.schedule)
build() (deepspeed.pipe.LayerSpec method)
C
checkpoint (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
checkpoint_config (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
checkpoint_dir (deepspeed.inference.config.InferenceCheckpointConfig attribute)
ckpt_layer_path() (deepspeed.pipe.PipelineModule method)
ckpt_layer_path_list() (deepspeed.pipe.PipelineModule method)
ckpt_prefix() (deepspeed.pipe.PipelineModule method)
clone_tensors_for_torch_save() (in module deepspeed.checkpoint.utils)
compile() (deepspeed.pipe.PipelineModule method)
config (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
contiguous_gradients (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
convert_zero_checkpoint_to_fp32_state_dict() (in module deepspeed.utils.zero_to_fp32)
cpu_offload (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
cpu_offload_param (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
cpu_offload_use_pin_memory (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
cpuadam_cores_perc (deepspeed.runtime.zero.config.DeepSpeedZeroOffloadOptimizerConfig attribute)
D
DataParallelSchedule (class in deepspeed.runtime.pipe.schedule)
deepspeed.profiling.flops_profiler.profiler
    module
deepspeed.runtime.pipe.schedule
    module
DeepSpeedCPUAdam (class in deepspeed.ops.adam)
device (deepspeed.runtime.zero.config.DeepSpeedZeroOffloadOptimizerConfig attribute)
    (deepspeed.runtime.zero.config.DeepSpeedZeroOffloadParamConfig attribute)
dtype (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
E
elastic_checkpoint (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
enable_cuda_graph (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
enable_sanity_checks (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
enabled (deepspeed.inference.config.DeepSpeedMoEConfig attribute)
    (deepspeed.inference.config.DeepSpeedTPConfig attribute)
    (deepspeed.inference.config.QuantizationConfig attribute)
    (deepspeed.monitor.config.CometConfig attribute)
    (deepspeed.monitor.config.CSVConfig attribute)
    (deepspeed.monitor.config.TensorBoardConfig attribute)
    (deepspeed.monitor.config.WandbConfig attribute)
end_profile() (deepspeed.profiling.flops_profiler.profiler.FlopsProfiler method)
ep_group (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
    (deepspeed.inference.config.DeepSpeedMoEConfig attribute)
ep_mp_group (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
    (deepspeed.inference.config.DeepSpeedMoEConfig attribute)
ep_size (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
    (deepspeed.inference.config.DeepSpeedMoEConfig attribute)
experiment_key (deepspeed.monitor.config.CometConfig attribute)
experiment_name (deepspeed.monitor.config.CometConfig attribute)
F
fast_init (deepspeed.runtime.zero.config.DeepSpeedZeroOffloadOptimizerConfig attribute)
filter_match() (deepspeed.runtime.pipe.ProcessTopology method)
FlopsProfiler (class in deepspeed.profiling.flops_profiler.profiler)
forward() (deepspeed.moe.layer.MoE method)
    (deepspeed.pipe.PipelineModule method)
ForwardPass (class in deepspeed.runtime.pipe.schedule)
FusedAdam (class in deepspeed.ops.adam)
FusedLamb (class in deepspeed.ops.lamb)
G
gather_16bit_weights_on_model_save (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
get_additional_losses() (deepspeed.pipe.PipelineModule method)
get_axis_comm_lists() (deepspeed.runtime.pipe.ProcessTopology method)
get_axis_list() (deepspeed.runtime.pipe.ProcessTopology method)
get_axis_names() (deepspeed.runtime.pipe.ProcessTopology method)
get_coord() (deepspeed.runtime.pipe.ProcessTopology method)
get_default_autocast_lower_precision_modules() (in module deepspeed.runtime.torch_autocast)
get_dim() (deepspeed.runtime.pipe.ProcessTopology method)
get_fp32_state_dict_from_zero_checkpoint() (in module deepspeed.utils.zero_to_fp32)
get_model_profile() (in module deepspeed.profiling.flops_profiler.profiler)
get_rank() (deepspeed.runtime.pipe.ProcessTopology method)
get_rank_repr() (deepspeed.runtime.pipe.ProcessTopology method)
get_total_duration() (deepspeed.profiling.flops_profiler.profiler.FlopsProfiler method)
get_total_flops() (deepspeed.profiling.flops_profiler.profiler.FlopsProfiler method)
get_total_macs() (deepspeed.profiling.flops_profiler.profiler.FlopsProfiler method)
get_total_params() (deepspeed.profiling.flops_profiler.profiler.FlopsProfiler method)
group (deepspeed.monitor.config.WandbConfig attribute)
I
ignore_unused_parameters (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
InferenceSchedule (class in deepspeed.runtime.pipe.schedule)
init_autocast_params() (in module deepspeed.runtime.torch_autocast)
injection_policy (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
injection_policy_tuple (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
is_autocast_initialized() (in module deepspeed.runtime.torch_autocast)
is_first_stage (deepspeed.runtime.pipe.schedule.PipeSchedule property)
is_last_stage (deepspeed.runtime.pipe.schedule.PipeSchedule property)
J
job_name (deepspeed.monitor.config.CSVConfig attribute)
    (deepspeed.monitor.config.TensorBoardConfig attribute)
K
keep_module_on_host (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
L
LayerSpec (class in deepspeed.pipe)
leaf_module (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
legacy_stage1 (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
load_from_fp32_weights (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
load_state_dict_from_zero_checkpoint() (in module deepspeed.utils.zero_to_fp32)
LoadMicroBatch (class in deepspeed.runtime.pipe.schedule)
log_trace_cache_warnings (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
LRRangeTest (class in deepspeed.runtime.lr_schedules)
M
max_in_cpu (deepspeed.runtime.zero.config.DeepSpeedZeroOffloadParamConfig attribute)
max_live_parameters (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
max_out_tokens (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
max_reuse_distance (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
memory_efficient_linear (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
mics_hierarchical_params_gather (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
mics_shard_size (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
min_out_tokens (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
mode (deepspeed.monitor.config.CometConfig attribute)
model_persistence_threshold (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
module
    deepspeed.profiling.flops_profiler.profiler
    deepspeed.runtime.pipe.schedule
module_granularity_threshold (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
MoE (class in deepspeed.moe.layer)
moe (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
moe_experts (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
    (deepspeed.inference.config.DeepSpeedMoEConfig attribute)
moe_type (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
mp_size (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
mpu (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
    (deepspeed.inference.config.DeepSpeedTPConfig attribute)
N
num_micro_batches (deepspeed.runtime.pipe.schedule.PipeSchedule property)
num_pipe_buffers() (deepspeed.runtime.pipe.schedule.DataParallelSchedule method)
    (deepspeed.runtime.pipe.schedule.InferenceSchedule method)
    (deepspeed.runtime.pipe.schedule.PipeSchedule method)
    (deepspeed.runtime.pipe.schedule.TrainSchedule method)
num_stages (deepspeed.runtime.pipe.schedule.PipeSchedule property)
nvme_path (deepspeed.runtime.zero.config.DeepSpeedZeroOffloadOptimizerConfig attribute)
    (deepspeed.runtime.zero.config.DeepSpeedZeroOffloadParamConfig attribute)
O
offload_optimizer (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
offload_param (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
OneCycle (class in deepspeed.runtime.lr_schedules)
online (deepspeed.monitor.config.CometConfig attribute)
OptimizerStep (class in deepspeed.runtime.pipe.schedule)
output_path (deepspeed.monitor.config.CSVConfig attribute)
    (deepspeed.monitor.config.TensorBoardConfig attribute)
overlap_comm (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
override_module_apply (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
P
param_persistence_threshold (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
pin_memory (deepspeed.runtime.zero.config.DeepSpeedZeroOffloadOptimizerConfig attribute)
    (deepspeed.runtime.zero.config.DeepSpeedZeroOffloadParamConfig attribute)
PipeInstruction (class in deepspeed.runtime.pipe.schedule)
pipeline_loading_checkpoint (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
pipeline_read (deepspeed.runtime.zero.config.DeepSpeedZeroOffloadOptimizerConfig attribute)
pipeline_write (deepspeed.runtime.zero.config.DeepSpeedZeroOffloadOptimizerConfig attribute)
PipelineModule (class in deepspeed.pipe)
PipeSchedule (class in deepspeed.runtime.pipe.schedule)
prefetch_bucket_size (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
print_model_aggregated_profile() (deepspeed.profiling.flops_profiler.profiler.FlopsProfiler method)
print_model_profile() (deepspeed.profiling.flops_profiler.profiler.FlopsProfiler method)
ProcessTopology (class in deepspeed.runtime.pipe)
project (deepspeed.monitor.config.CometConfig attribute)
    (deepspeed.monitor.config.WandbConfig attribute)
Q
qkv (deepspeed.inference.config.QuantizationConfig attribute)
quant (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
R
ratio (deepspeed.runtime.zero.config.DeepSpeedZeroOffloadOptimizerConfig attribute)
RecvActivation (class in deepspeed.runtime.pipe.schedule)
RecvGrad (class in deepspeed.runtime.pipe.schedule)
reduce_bucket_size (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
reduce_scatter (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
ReduceGrads (class in deepspeed.runtime.pipe.schedule)
ReduceTiedGrads (class in deepspeed.runtime.pipe.schedule)
replace_method (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
replace_with_kernel_inject (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
reset_profile() (deepspeed.profiling.flops_profiler.profiler.FlopsProfiler method)
return_tuple (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
round_robin_gradients (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
S
safe_get_full_fp32_param() (in module deepspeed.utils)
safe_get_full_grad() (in module deepspeed.utils)
safe_get_full_optimizer_state() (in module deepspeed.utils)
safe_get_local_fp32_param() (in module deepspeed.utils)
safe_get_local_grad() (in module deepspeed.utils)
safe_get_local_optimizer_state() (in module deepspeed.utils)
safe_set_full_fp32_param() (in module deepspeed.utils)
safe_set_full_grad() (in module deepspeed.utils)
safe_set_full_optimizer_state() (in module deepspeed.utils)
safe_set_local_fp32_param() (in module deepspeed.utils)
safe_set_local_grad() (in module deepspeed.utils)
safe_set_local_optimizer_state() (in module deepspeed.utils)
safe_update_full_grad_vectorized() (in module deepspeed.utils)
samples_log_interval (deepspeed.monitor.config.CometConfig attribute)
save_mp_checkpoint_path (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
    (deepspeed.inference.config.InferenceCheckpointConfig attribute)
save_muon_momentum_buffer_in_memory (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
SendActivation (class in deepspeed.runtime.pipe.schedule)
SendGrad (class in deepspeed.runtime.pipe.schedule)
set_empty_params (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
stage (deepspeed.runtime.pipe.schedule.PipeSchedule property)
    (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
stage3_gather_fp16_weights_on_model_save (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
start_profile() (deepspeed.profiling.flops_profiler.profiler.FlopsProfiler method)
steps() (deepspeed.runtime.pipe.schedule.PipeSchedule method)
stop_profile() (deepspeed.profiling.flops_profiler.profiler.FlopsProfiler method)
sub_group_size (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
super_offload (deepspeed.runtime.zero.config.DeepSpeedZeroOffloadOptimizerConfig attribute)
T
team (deepspeed.monitor.config.WandbConfig attribute)
tensor_parallel (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
TiedLayerSpec (class in deepspeed.pipe)
topology() (deepspeed.pipe.PipelineModule method)
tp_grain_size (deepspeed.inference.config.DeepSpeedTPConfig attribute)
tp_group (deepspeed.inference.config.DeepSpeedTPConfig attribute)
tp_size (deepspeed.inference.config.DeepSpeedTPConfig attribute)
training_mp_size (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
TrainSchedule (class in deepspeed.runtime.pipe.schedule)
transposed_mode (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
triangular_masking (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
triton_autotune (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
type (deepspeed.inference.config.DeepSpeedMoEConfig attribute)
U
use_all_reduce_for_fetch_params (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
use_multi_rank_bucket_allreduce (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
use_triton (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
W
WarmupCosineLR (class in deepspeed.runtime.lr_schedules)
WarmupDecayLR (class in deepspeed.runtime.lr_schedules)
WarmupLR (class in deepspeed.runtime.lr_schedules)
weight (deepspeed.inference.config.QuantizationConfig attribute)
workspace (deepspeed.monitor.config.CometConfig attribute)
Z
zenflow (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
zero (deepspeed.inference.config.DeepSpeedInferenceConfig attribute)
zero_hpz_partition_size (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
zero_quantized_gradients (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
zero_quantized_nontrainable_weights (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
zero_quantized_weights (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)
zeropp_loco_param (deepspeed.runtime.zero.config.DeepSpeedZeroConfig attribute)