habana_frameworks.mediapipe.fn.CropMirrorNorm

Class:
  • habana_frameworks.mediapipe.fn.CropMirrorNorm(**kwargs)

Define graph call:
  • __call__(input, mean, inv_std)

Parameters:

  • input - Input tensor to operator. Supported dimensions: minimum = 4, maximum = 4. Supported data types: INT8, UINT8, UINT16.

  • mean - Mean tensor for image normalization. Supported dimensions: minimum = 4, maximum = 4. Supported data types: FLOAT32.

  • inv_std - Inverse standard deviation tensor for image normalization. Supported dimensions: minimum = 4, maximum = 4. Supported data types: FLOAT32.

Description:

This operator performs fused cropping, mirroring, normalization, and type casting. It crops images to the specified crop window dimensions and position, then normalizes the result using the formula output = (input - mean) * inv_std. For an RGB24 input with normalized mean = [0.485, 0.456, 0.406], the input mean tensor is [(0.485 * 255), (0.456 * 255), (0.406 * 255)]. Similarly, for an RGB24 input with normalized standard deviation = [0.229, 0.224, 0.225], the inverse standard deviation tensor is [1 / (0.229 * 255), 1 / (0.224 * 255), 1 / (0.225 * 255)].
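
As a concrete illustration, the following NumPy sketch pre-computes the mean and inverse-standard-deviation tensors from the normalized ImageNet statistics quoted above:

import numpy as np

# Normalized per-channel statistics (values from the description above)
norm_mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
norm_std = np.array([0.229, 0.224, 0.225], dtype=np.float32)

# Scale to the 0-255 range of RGB24 data and invert the std, so the
# operator can compute output = (input - mean) * inv_std directly
mean = norm_mean * 255.0
inv_std = 1.0 / (norm_std * 255.0)

print(mean)     # ~[123.675, 116.28, 103.53]
print(inv_std)  # ~[0.01712, 0.01751, 0.01743]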

Supported backend:
  • HPU

Keyword Arguments:

mirror

Flag for horizontal flip: 0 means do not flip, 1 means perform a horizontal flip.

  • Type: int

  • Default: 0

  • Optional: yes

crop_w

Width of the crop window in pixels. crop_w must be a non-zero value less than or equal to the input tensor width.

  • Type: int

  • Default: 100

  • Optional: yes

crop_h

Height of the crop window in pixels. crop_h must be a non-zero value less than or equal to the input tensor height.

  • Type: int

  • Default: 100

  • Optional: yes

crop_d

Depth of the crop window in pixels. Cropping along the depth axis applies only to volumetric data, in which case crop_d must be a non-zero value less than or equal to the input tensor depth. Leave crop_d at its default of 0 when there is no cropping along the depth axis.

  • Type: int

  • Default: 0

  • Optional: yes

crop_pos_x

Normalized (0.0 - 1.0) position of the cropping window along the width. The actual position is calculated as crop_x = crop_pos_x * (w - crop_w), where crop_pos_x is the normalized position, w is the width of the input tensor, and crop_w is the width of the cropping window (see the worked example after this list).

  • Type: float

  • Default: 0.0

  • Optional: yes

crop_pos_y

Normalized (0.0 - 1.0) position of the cropping window along the height. The actual position is calculated as crop_y = crop_pos_y * (h - crop_h), where crop_pos_y is the normalized position, h is the height of the input tensor, and crop_h is the height of the cropping window.

  • Type: float

  • Default: 0.0

  • Optional: yes

crop_pos_z

Only for volumetric data: normalized (0.0 - 1.0) position of the cropping window along the depth. The actual position is calculated as crop_z = crop_pos_z * (d - crop_d), where crop_pos_z is the normalized position, d is the depth of the input tensor, and crop_d is the depth of the cropping window.

  • Type: float

  • Default: 0.0

  • Optional: yes

dtype

Output data type.

  • Type: habana_frameworks.mediapipe.media_types.dtype

  • Default: UINT8

  • Optional: yes

  • Warning: Set dtype explicitly to match the input data type; otherwise the output tensor's data type falls back to the default UINT8.

  • Supported data types:

    • UINT8

    • FLOAT32

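The crop-window placement formulas above can be checked with plain arithmetic. A small worked example, using the 200x200 input and 125x125 crop window from the example below:

# Input resized to 200x200, crop window of 125x125 (values from the example below)
w, h = 200, 200
crop_w, crop_h = 125, 125

# crop_pos_* = 0.0 anchors the window at the top-left corner
crop_x = 0.0 * (w - crop_w)  # 0.0
crop_y = 0.0 * (h - crop_h)  # 0.0

# crop_pos_* = 0.5 centers the window
crop_x = 0.5 * (w - crop_w)  # 37.5
crop_y = 0.5 * (h - crop_h)  # 37.5

# crop_pos_* = 1.0 pushes the window to the bottom-right corner
crop_x = 1.0 * (w - crop_w)  # 75.0
crop_y = 1.0 * (h - crop_h)  # 75.0
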
Example: CropMirrorNorm Operator

The following code snippet shows the use of the CropMirrorNorm operator with pre-computed normalized mean = [0.485, 0.456, 0.406] and normalized std = [0.229, 0.224, 0.225]. The decoder resizes output images to 200x200, and CropMirrorNorm crops them to 125x125 (crop_w=125, crop_h=125). To quantize data from FLOAT32 to UINT8, output_scale=0.03125 and output_zerop=128 are used.

from habana_frameworks.mediapipe import fn
from habana_frameworks.mediapipe.mediapipe import MediaPipe
from habana_frameworks.mediapipe.media_types import imgtype as it
from habana_frameworks.mediapipe.media_types import dtype as dt
import matplotlib.pyplot as plt
import numpy as np
import os

g_display_timeout = int(os.getenv("DISPLAY_TIMEOUT") or 5)


class myMediaPipe(MediaPipe):
    def __init__(self, device, queue_depth, batch_size, num_threads, op_device, dir, img_h, img_w):
        super(myMediaPipe, self).__init__(device,
                                          queue_depth,
                                          batch_size,
                                          num_threads,
                                          self.__class__.__name__)

        self.input = fn.ReadImageDatasetFromDir(shuffle=False,
                                                dir=dir,
                                                format="jpg",
                                                device="cpu")

        mean_data = np.array([(0.485 * 255), (0.456 * 255), (0.406 * 255)],
                             dtype=dt.FLOAT32)

        std_data = np.array([1 / (0.229 * 255), 1 / (0.224 * 255), 1 / (0.225 * 255)],
                            dtype=dt.FLOAT32)

        # Batch broadcast is true, so these 3-channel constants behave as 4D tensors
        self.std_node = fn.MediaConst(data=std_data,
                                      shape=[1, 1, 3],
                                      dtype=dt.FLOAT32,
                                      device="cpu")

        self.mean_node = fn.MediaConst(data=mean_data,
                                       shape=[1, 1, 3],
                                       dtype=dt.FLOAT32,
                                       device="cpu")

        # WHCN
        self.decode = fn.ImageDecoder(device="hpu",
                                      output_format=it.RGB_P,
                                      resize=[img_w, img_h])

        # dtype set explicitly, per the warning above, to match the UINT8 input
        self.cmn = fn.CropMirrorNorm(crop_w=125,
                                     crop_h=125,
                                     dtype=dt.UINT8,
                                     output_scale=0.03125,
                                     output_zerop=128,
                                     device=op_device)

        # WHCN -> CWHN
        self.transpose = fn.Transpose(permutation=[2, 0, 1, 3],
                                      tensorDim=4,
                                      dtype=dt.UINT8,
                                      device=op_device)

    def definegraph(self):
        images, labels = self.input()
        std = self.std_node()
        mean = self.mean_node()
        images = self.decode(images)
        # Keep a transposed (CWHN) copy of the decoded input for reference
        inp = self.transpose(images)
        images = self.cmn(images, mean, std)
        # WHCN -> CWHN for display
        images = self.transpose(images)
        return inp, images, labels


def display_images(images, batch_size, cols):
    # Ceiling division so every image in the batch gets a subplot
    rows = (batch_size + cols - 1) // cols
    plt.figure(figsize=(10, 10))
    for i in range(batch_size):
        plt.subplot(rows, cols, i + 1)
        plt.imshow(images[i])
        plt.axis("off")
    plt.show(block=False)
    plt.pause(g_display_timeout)
    plt.close()


def run(device, op_device):
    batch_size = 6
    queue_depth = 2
    num_threads = 1
    img_width = 200
    img_height = 200
    base_dir = os.environ['DATASET_DIR']
    dir = base_dir + "/img_data/"
    columns = 3

    # Create MediaPipe object
    pipe = myMediaPipe(device, queue_depth, batch_size,
                       num_threads, op_device, dir,
                       img_height, img_width)

    # Build MediaPipe
    pipe.build()

    # Initialize MediaPipe iterator
    pipe.iter_init()

    # Run MediaPipe
    inp, images, labels = pipe.run()

    def as_cpu(tensor):
        if callable(getattr(tensor, "as_cpu", None)):
            tensor = tensor.as_cpu()
        return tensor

    # Copy data to host from device as numpy array
    inp = as_cpu(inp).as_nparray()
    images = as_cpu(images).as_nparray()
    labels = as_cpu(labels).as_nparray()

    del pipe

    # Display images
    display_images(images, batch_size, columns)


if __name__ == "__main__":
    dev_opdev = {'mixed': ['hpu'],
                 'legacy': ['hpu']}

    for dev in dev_opdev:
        for op_dev in dev_opdev[dev]:
            run(dev, op_dev)
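
The example quantizes the FLOAT32 normalization result back to UINT8 via output_scale and output_zerop. As a sketch of what such an affine quantization typically looks like (an assumption for illustration; the operator's exact rounding behavior may differ):

import numpy as np

def quantize_uint8(x, scale=0.03125, zero_point=128):
    # Hypothetical affine quantization: q = round(x / scale) + zero_point,
    # clipped to the UINT8 range. With scale = 0.03125 (= 1/32) and
    # zero_point = 128, normalized values in roughly [-4.0, 4.0) map to [0, 255].
    q = np.round(x / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)

print(quantize_uint8(np.float32(0.0)))  # a normalized value of 0.0 maps to 128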

Output Images from CropMirrorNorm Operation

[Six cropped and normalized output images produced by the pipeline.]

Licensed under a CC BY SA 4.0 license. The images used here are taken from https://data.caltech.edu/records/mzrjq-6wc02.