MediaPipe

The Intel® Gaudi® AI accelerator provides hardware-based acceleration for pre-processing inputs in deep learning frameworks. It also provides highly optimized operators, based on Tensor Processing Cores, that are generally used for data augmentation and for image and video manipulation during input media processing.

MediaPipe provides an interface for implementing media operations on input elements such as images and videos. Its purpose is to prepare batches of processed and augmented images or videos, together with their labels, to be fed into training or inference models. Input data may include additional information such as class labels or bounding boxes. Most of the operations are implemented on the HPU, enabling accelerated execution compared to the same operations on the CPU. See the Operators section for more details.

Highlights:

Supports Gaudi 3 and Gaudi 2.
Supports images (JPEG format) and videos (H.264 and HEVC formats).
Scalable across multiple cards.
Supports the PyTorch framework.
Accelerates image classification (ResNet-50).
Supports multiple functions such as reading and decoding, as well as various data transformations, including cropping and flipping of images and videos.
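
As a conceptual illustration of what such a pipeline produces, the sketch below crops, optionally flips, and batches labeled images in plain Python on the CPU. It does not use the MediaPipe API or the HPU. All names in it (center_crop, horizontal_flip, make_batches, and the nested-list image representation) are hypothetical stand-ins chosen for this example.

```python
import random
from typing import Iterator, List, Tuple

# Hypothetical stand-in: a grayscale image as rows of pixel values.
Image = List[List[int]]


def center_crop(img: Image, out_h: int, out_w: int) -> Image:
    """Return the centered out_h x out_w window of img."""
    h, w = len(img), len(img[0])
    top, left = (h - out_h) // 2, (w - out_w) // 2
    return [row[left:left + out_w] for row in img[top:top + out_h]]


def horizontal_flip(img: Image) -> Image:
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def make_batches(
    samples: List[Tuple[Image, int]],
    batch_size: int,
    crop: Tuple[int, int],
    flip_prob: float = 0.5,
    seed: int = 0,
) -> Iterator[Tuple[List[Image], List[int]]]:
    """Yield (images, labels) batches of cropped, randomly flipped images."""
    rng = random.Random(seed)  # seeded so the augmentation is repeatable
    for start in range(0, len(samples), batch_size):
        images, labels = [], []
        for img, label in samples[start:start + batch_size]:
            out = center_crop(img, *crop)
            if rng.random() < flip_prob:
                out = horizontal_flip(out)
            images.append(out)
            labels.append(label)
        yield images, labels
```

Each yielded pair keeps the labels aligned with their augmented images, which is the batch shape a training or inference loop would consume. In MediaPipe the equivalent steps are expressed through its own operators and run mostly on the HPU.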

For further details, see the sections below:
