Fused Optimizers and Custom Ops for Intel Gaudi

The Intel® Gaudi® AI accelerator provides its own implementations of complex PyTorch ops, customized for Gaudi devices. Replacing the stock ops with these custom Gaudi versions improves model performance.

Fused Optimizers

The following fused optimizers are currently supported on Gaudi devices:

- FusedAdagrad
- FusedAdamW
- FusedLamb
- FusedSGD

The following code snippet shows how to import and instantiate the FusedLamb optimizer:

try:
    from habana_frameworks.torch.hpex.optimizers import FusedLamb
except ImportError:
    raise ImportError("Please install habana_torch package")

# FusedLamb is constructed like a standard torch.optim optimizer
optimizer = FusedLamb(model.parameters(), lr=args.learning_rate)

Note

For models using Lazy mode execution, htcore.mark_step() must be added right after loss.backward() and right after optimizer.step().
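
A minimal sketch of where these calls fit in a Lazy mode training loop; the dataloader, model, and loss_fn names here are illustrative assumptions, not part of the Gaudi API:

import habana_frameworks.torch.core as htcore

for inputs, targets in dataloader:
    # Assumed: inputs/targets are tensors moved to the Gaudi (HPU) device
    inputs, targets = inputs.to("hpu"), targets.to("hpu")
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    htcore.mark_step()  # flush the accumulated lazy graph after the backward pass
    optimizer.step()
    htcore.mark_step()  # flush again after the optimizer update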

Custom Ops

FusedClipNorm is supported on Gaudi devices. Refer to torch.nn.utils.clip_grad_norm_ for more details on the underlying operation.
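
For reference, the stock PyTorch call that FusedClipNorm replaces would look like this, assuming the same model and args objects used in the snippets below:

import torch

# Standard (non-fused) gradient norm clipping
torch.nn.utils.clip_grad_norm_(model.parameters(), args.max_grad_norm)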

The following code snippet shows how to import FusedClipNorm:

try:
    from habana_frameworks.torch.hpex.normalization import FusedClipNorm
except ImportError:
    raise ImportError("Please install habana_torch package")

# Construct FusedClipNorm once with the parameters and the maximum gradient norm
FusedNorm = FusedClipNorm(model.parameters(), args.max_grad_norm)

# Then clip the gradients by calling clip_norm on the parameters
FusedNorm.clip_norm(model.parameters())
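
As a minimal sketch of FusedClipNorm in a training step, assuming the Lazy mode loop and the FusedNorm object shown above, clip_norm is called between the backward pass and the optimizer step, where clip_grad_norm_ would otherwise go:

loss.backward()
htcore.mark_step()
FusedNorm.clip_norm(model.parameters())  # fused gradient clipping on the HPU
optimizer.step()
htcore.mark_step()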