habana_frameworks.mediapipe.fn.CropMirrorNorm

Class:
  • habana_frameworks.mediapipe.fn.CropMirrorNorm(**kwargs)

Define graph call:
  • __call__(input, mean, inv_std)

Parameters:

  • input - Input tensor to operator. Supported dimensions: minimum = 4, maximum = 4. Supported data types: INT8, UINT8, UINT16.

  • mean - Mean tensor for image normalization. Supported dimensions: minimum = 1, maximum = 1. Supported data types: FLOAT32.

  • inv_std - Inverse standard deviation tensor for image normalization. Supported dimensions: minimum = 1, maximum = 1. Supported data types: FLOAT32.

Description:

This operator performs fused cropping, mirroring, normalization, and type casting. It crops images using the specified crop window dimensions and position. Normalization produces output using the formula output = (input - mean) * inv_std. If the input is an RGB24 image with normalized mean = [0.485, 0.456, 0.406], then the input mean tensor = [(0.485 * 255), (0.456 * 255), (0.406 * 255)]. Similarly, for an RGB24 input with normalized standard deviation = [0.229, 0.224, 0.225], the inverse standard deviation tensor = [1 / (0.229 * 255), 1 / (0.224 * 255), 1 / (0.225 * 255)].
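As an illustrative sketch (plain NumPy, not the MediaPipe API), the 255-scaled mean and inverse-standard-deviation tensors and the normalization formula above can be reproduced as follows:

```python
import numpy as np

# Normalized ImageNet statistics from the description above
norm_mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
norm_std = np.array([0.229, 0.224, 0.225], dtype=np.float32)

# Tensors as the operator expects them for RGB24 (0-255) input
mean = norm_mean * 255.0
inv_std = 1.0 / (norm_std * 255.0)

# One RGB pixel, normalized with the operator's formula:
# output = (input - mean) * inv_std
pixel = np.array([124.0, 116.0, 104.0], dtype=np.float32)
out = (pixel - mean) * inv_std
```

Since this pixel is close to the scaled mean, the normalized values come out near zero, as expected.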

Supported backend:
  • HPU

Keyword Arguments:

kwargs

Description

mirror

Flag for horizontal flip. 0 means do not perform horizontal flip. 1 means perform horizontal flip.

  • Type: int

  • Default: 0

  • Optional: yes

crop_w

Width of the crop window in pixels. crop_w must be a non-zero value less than or equal to the input tensor width.

  • Type: int

  • Default: 100

  • Optional: yes

crop_h

Height of the crop window in pixels. crop_h must be a non-zero value less than or equal to the input tensor height.

  • Type: int

  • Default: 100

  • Optional: yes

crop_d

Depth of the crop window in pixels. Cropping along the depth axis is optional and applies only to volumetric data; set crop_d to 0 (the default) when no depth cropping is needed. For volumetric data, crop_d must be a non-zero value less than or equal to the input tensor depth.

  • Type: int

  • Default: 0

  • Optional: yes

crop_pos_x

Normalized (0.0 - 1.0) position of the cropping window along width. Actual position is calculated as crop_x = crop_pos_x * (w - crop_w), where crop_pos_x is the normalized position, w is the width of the input tensor and crop_w is the width of the cropping window.

  • Type: float

  • Default: 0.0

  • Optional: yes

crop_pos_y

Normalized (0.0 - 1.0) position of the cropping window along height. Actual position is calculated as crop_y = crop_pos_y * (h - crop_h), where crop_pos_y is the normalized position, h is the height of the input tensor and crop_h is the height of the cropping window.

  • Type: float

  • Default: 0.0

  • Optional: yes

crop_pos_z

Only for volumetric data, normalized (0.0 - 1.0) position of the cropping window along depth. Actual position is calculated as crop_z = crop_pos_z * (d - crop_d), where crop_pos_z is the normalized position, d is the depth of the input tensor and crop_d is the depth of the cropping window.

  • Type: float

  • Default: 0.0

  • Optional: yes
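The origin computation described for crop_pos_x, crop_pos_y, and crop_pos_z can be sketched in plain Python (crop_origin is a hypothetical helper for illustration, not part of the MediaPipe API):

```python
def crop_origin(crop_pos: float, full_size: int, crop_size: int) -> int:
    """Map a normalized position (0.0 - 1.0) to a pixel offset,
    per the documented formula: crop = crop_pos * (size - crop_size)."""
    return int(crop_pos * (full_size - crop_size))

# crop_pos_x = 0.5 roughly centers a 125-pixel-wide window in a 200-pixel-wide image
x0 = crop_origin(0.5, 200, 125)   # 0.5 * (200 - 125) = 37.5, truncated to 37
x1 = crop_origin(0.0, 200, 125)   # window starts at the left edge: 0
x2 = crop_origin(1.0, 200, 125)   # window flush with the right edge: 75
```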

dtype

Output data type.

  • Type: habana_frameworks.mediapipe.media_types.dtype

  • Default: UINT8

  • Optional: yes

  • Warning: dtype should be set explicitly to match the input data type. Otherwise, the output tensor's data type falls back to the default UINT8.

  • Supported data types:

    • UINT8

    • FLOAT32

Example: CropMirrorNorm Operator

The following code snippet shows the use of the CropMirrorNorm operator with pre-computed normalized mean = [0.485, 0.456, 0.406] and normalized std = [0.229, 0.224, 0.225]. The decoder resizes output images to 200x200, and CropMirrorNorm crops them to crop_w=125, crop_h=125. To quantize data from FLOAT32 to UINT8, output_scale=0.03125 and output_zerop=128 are used.

from habana_frameworks.mediapipe import fn
from habana_frameworks.mediapipe.mediapipe import MediaPipe
from habana_frameworks.mediapipe.media_types import imgtype as it
from habana_frameworks.mediapipe.media_types import dtype as dt
import matplotlib.pyplot as plt
import numpy as np

class myMediaPipe(MediaPipe):
    def __init__(self, device, dir, queue_depth, batch_size, img_h, img_w):
        super().__init__(
            device,
            queue_depth,
            batch_size,
            self.__class__.__name__)

        self.input = fn.ReadImageDatasetFromDir(shuffle=False,
                                                dir=dir,
                                                format="jpg")

        mean_data = np.array([(0.485 * 255), (0.456 * 255), (0.406 * 255)],
                             dtype=dt.FLOAT32)

        std_data = np.array([1 / (0.229 * 255), 1 / (0.224 * 255), 1 / (0.225 * 255)],
                            dtype=dt.FLOAT32)

        self.std_node = fn.MediaConst(data=std_data,
                                      shape=[1, 1, 3],
                                      dtype=dt.FLOAT32)

        self.mean_node = fn.MediaConst(data=mean_data,
                                       shape=[1, 1, 3],
                                       dtype=dt.FLOAT32)

        # WHCN
        self.decode = fn.ImageDecoder(device="hpu",
                                      output_format=it.RGB_P,
                                      resize=[img_w, img_h])

        self.cmn = fn.CropMirrorNorm(crop_w=125,
                                     crop_h=125,
                                     output_scale=0.03125,
                                     output_zerop=128)

        # WHCN -> CWHN
        self.transpose = fn.Transpose(permutation=[2, 0, 1, 3],
                                      tensorDim=4,
                                      dtype=dt.UINT8)

    def definegraph(self):
        images, labels = self.input()
        std = self.std_node()
        mean = self.mean_node()
        images = self.decode(images)
        images = self.cmn(images, mean, std)
        images = self.transpose(images)
        return images, labels

def display_images(images, batch_size, cols):
    rows = (batch_size + 1) // cols
    plt.figure(figsize=(10, 10))
    for i in range(batch_size):
        ax = plt.subplot(rows, cols, i + 1)
        plt.imshow(images[i])
        plt.axis("off")
    plt.show()

def main():
    batch_size = 6
    img_width = 200
    img_height = 200
    img_dir = "/path/to/images"
    queue_depth = 2
    columns = 3

    # Create media pipeline object
    pipe = myMediaPipe('hpu', img_dir, queue_depth, batch_size,
                       img_height, img_width)

    # Build media pipeline
    pipe.build()

    # Initialize media pipeline iterator
    pipe.iter_init()

    # Run media pipeline
    images, labels = pipe.run()

    # Copy data to host from device as numpy array
    images = images.as_cpu().as_nparray()
    labels = labels.as_cpu().as_nparray()

    # Display images
    display_images(images, batch_size, columns)


if __name__ == "__main__":
    main()

Output Images from CropMirrorNorm Operation

(Six output images of the CropMirrorNorm operation are shown in the original document.)

Licensed under a CC BY SA 4.0 license. The images used here are taken from https://data.caltech.edu/records/mzrjq-6wc02.