Habana Media Loader

The Habana Media Loader loads and pre-processes inputs for deep learning frameworks. It consists of pre-enabled dataloaders for commonly used datasets (currently only ImageNet is supported) and building blocks for assembling a generic data loader. The loader decides internally whether part of the operations can be offloaded to the Habana accelerator. If the offload cannot be executed, it falls back to the alternative dataloader function that was passed to it.

Habana Media Loader can operate in different modes. The optimal one is selected based on the underlying hardware:

  • On first-gen Gaudi, it uses either the framework default dataloader or the AEON-based dataloader, depending on the use case. Both run on the host CPU.

  • On Gaudi2, the dataloader uses hardware-based decoders for acceleration, which lowers the load on the host CPU.
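
The fallback behavior described above can be pictured with a small, generic sketch. It is not the Media Loader's internal implementation: the function name build_loader_with_fallback, the factory arguments, and the set of exceptions treated as "offload not possible" are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Tuple


def build_loader_with_fallback(
    accelerated_factory: Callable[[], Iterable],
    fallback_factory: Callable[[], Iterable],
) -> Tuple[Iterable, str]:
    """Try the accelerated loader first; use the fallback if it cannot be built.

    Hypothetical helper: both factories are supplied by the caller, and the
    exception types below are an assumed stand-in for "offload not possible".
    """
    try:
        return accelerated_factory(), "accelerated"
    except (ImportError, RuntimeError, ValueError) as exc:
        # The accelerated path is unavailable, so build the alternative loader.
        print(f"accelerated loader unavailable ({exc}); using fallback")
        return fallback_factory(), "fallback"


def _unsupported() -> Iterable:
    # Stand-in for a loader that cannot offload work on this machine.
    raise RuntimeError("hardware decode not supported")


loader, kind = build_loader_with_fallback(_unsupported, lambda: [1, 2, 3])
```

In a real training script, the two factories would typically wrap the accelerated dataloader and the framework default dataloader, so the rest of the training loop does not need to know which one was selected.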

Setting Up the Environment

To install Habana Media Loader, run the following command:

pip install hpu_media_loader-1.5.0-610-py3-none-any.whl

Note

If you are using a Habana Docker image, skip this step; Habana Media Loader is pre-installed.
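
To find out whether the package is already present before installing it, a check along the following lines may help. It is a minimal sketch that assumes the installed distribution is named hpu_media_loader, as the wheel filename above suggests; the helper name installed_version is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of `distribution`, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Assumption: the distribution name matches the wheel filename prefix.
version = installed_version("hpu_media_loader")
if version is None:
    print("hpu_media_loader is not installed; run the pip command above")
else:
    print(f"hpu_media_loader {version} is already installed")
```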

Using Media Loader with PyTorch

Import the Habana ImageNet dataset object.

The SynapseAI software selects the dataloader based on the underlying hardware:

  • On first-gen Gaudi, it uses the AEON dataloader.

  • On Gaudi2, it uses the Media Loader for ImageNet-based models.

import habana_dataloader

Note

  • Models that are not based on ImageNet datasets will use the PyTorch dataloader.

  • When running on Gaudi2, you must set the PT_HPU_POOL_STRATEGY=3 environment variable to enable Habana Media Loader. For more details, refer to the example on the Computer Vision Model Reference GitHub page.
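
The variable can be exported in the shell before launching training, or set from Python. The sketch below shows the Python route. The helper name enable_media_loader_env is hypothetical, and calling it at the top of the script, before any framework imports, is a conservative assumption rather than a documented requirement.

```python
import os
from typing import MutableMapping


def enable_media_loader_env(env: MutableMapping[str, str] = os.environ) -> str:
    """Set PT_HPU_POOL_STRATEGY to "3" in `env` and return the stored value."""
    # Environment variable values are always strings.
    env["PT_HPU_POOL_STRATEGY"] = "3"
    return env["PT_HPU_POOL_STRATEGY"]


# Assumed placement: call this before the framework modules are imported.
enable_media_loader_env()
```

Passing a plain dictionary instead of os.environ makes it possible to try the helper without touching the real process environment.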