Fused Optimizers and Custom Ops for Intel Gaudi
The Intel® Gaudi® AI accelerator provides its own implementations of complex PyTorch ops, customized for Gaudi devices. Replacing these ops with their custom Gaudi versions enhances model performance.
Fused Optimizers¶
The following fused optimizers are currently supported on Gaudi devices:
- FusedAdagrad - refer to torch.optim.Adagrad
- FusedAdamW - refer to AdamW from Hugging Face
- FusedLamb - refer to the LAMB optimizer paper
- FusedSGD - refer to torch.optim.SGD
- Functional FusedAdamW - a functional version of FusedAdamW based on torch.distributed.optim._FunctionalAdamW. It can be enabled with habana_frameworks.torch.hpex.optimizers.distributed.FusedAdamW.
The following code snippet shows how to import the FusedLamb optimizer:

```
try:
    from habana_frameworks.torch.hpex.optimizers import FusedLamb
except ImportError:
    raise ImportError("Please install habana_torch package")

optimizer = FusedLamb(model.parameters(), lr=args.learning_rate)
```
Note

For models using Lazy mode execution, mark_step() must be added right after loss.backward() and optimizer.step().
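The mark_step() placement described above can be sketched as the training loop below. This is a hedged example: on a non-Gaudi machine the habana_frameworks import fails, so mark_step() degrades to a no-op and the loop runs on CPU; the model, data, and learning rate are placeholders.

```python
import torch

# habana_frameworks is only available on Gaudi systems; degrade gracefully
# so the same loop runs elsewhere for illustration.
try:
    import habana_frameworks.torch.core as htcore

    def mark_step():
        htcore.mark_step()

    device = "hpu"
except ImportError:
    def mark_step():
        pass  # no-op off-Gaudi; required only for Lazy mode execution

    device = "cpu"

model = torch.nn.Linear(4, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(8, 4, device=device)
y = torch.randn(8, 1, device=device)

for _ in range(3):
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    mark_step()      # Lazy mode: trigger graph execution after backward
    optimizer.step()
    mark_step()      # Lazy mode: trigger graph execution after the update

final_loss = float(loss)
```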
Custom Ops¶
FusedClipNorm is supported on Gaudi devices. Refer to torch.nn.utils.clip_grad_norm_ for more details.
The following code snippet shows how to import FusedClipNorm:

```
try:
    from habana_frameworks.torch.hpex.normalization import FusedClipNorm
except ImportError:
    raise ImportError("Please install habana_torch package")

FusedNorm = FusedClipNorm(model.parameters(), args.max_grad_norm)
FusedNorm.clip_norm(model.parameters())
```
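To show where gradient clipping sits in a training step, here is a hedged sketch: it uses FusedClipNorm when the Habana package is present and otherwise falls back to the stock torch.nn.utils.clip_grad_norm_, which clips gradients in place to the same effect. The model, loss scale, and max_grad_norm value are illustrative.

```python
import torch

model = torch.nn.Linear(4, 1)
# Scale the loss so gradients are large enough that clipping takes effect
loss = (model(torch.randn(16, 4)) * 100).pow(2).mean()
loss.backward()

max_grad_norm = 1.0
try:
    from habana_frameworks.torch.hpex.normalization import FusedClipNorm
    fused_norm = FusedClipNorm(model.parameters(), max_grad_norm)
    fused_norm.clip_norm(model.parameters())
except ImportError:
    # Off-Gaudi fallback: the stock in-place clipping utility
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)

# Global gradient norm after clipping should not exceed max_grad_norm
total_norm = float(torch.norm(torch.stack(
    [p.grad.norm() for p in model.parameters() if p.grad is not None])))
```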