habana_frameworks.mediapipe.fn.Resize

Class:
  • habana_frameworks.mediapipe.fn.Resize(**kwargs)

Define graph call:
  • __call__(input)

Parameter:

  • input - Input tensor to the operator, containing image data. Supported dimensions: minimum = 3, maximum = 4 (a 3D tensor, or a 4D tensor whose last dimension has size 1). Supported data types: INT8, UINT8, UINT16. The image data must be in (N*C)xWxHx1 data layout, with (N*C) being the fastest-changing dimension.

Description:

Resizes an image by the given scale factors, or to an explicit output size, using nearest-neighbor, bilinear, or bicubic interpolation.

Supported backend:
  • HPU

Keyword Arguments:

mode

Interpolation mode to be selected.

  • Type: ResizeInterpolationMode_t

  • Default: 1 (RESIZE_INTER_LINEAR)

  • Optional: yes

  • Supported modes:

    • RESIZE_INTER_NEAREST: 0

      • Does nearest neighbor interpolation for 1D tensor and N-neighbor interpolation for N-D tensor

    • RESIZE_INTER_LINEAR: 1

      • Does linear interpolation for 1D tensor and N-linear interpolation for N-D tensor

    • RESIZE_INTER_CUBIC: 2

      • Does cubic interpolation for 1D tensor and N-cubic interpolation for N-D tensor

cubicCoeffA

Valid only if mode is cubic. The coefficient 'a' used in cubic interpolation. Two common choices are -0.5 and -0.75. See https://ieeexplore.ieee.org/document/1163711 for details.

  • Type: float

  • Default: -0.75

  • Optional: yes
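
The role of the coefficient 'a' can be illustrated with the standard cubic convolution kernel. The following is a minimal sketch, assuming the operator follows that standard kernel; the function name cubic_weights is hypothetical and not part of the MediaPipe API.

```python
def cubic_weights(t, a=-0.75):
    """Return the four cubic-convolution weights for a fractional offset t in [0, 1).

    The weights apply to the input samples at offsets -1, 0, +1 and +2
    relative to floor(x_original). Assumption: standard cubic convolution
    kernel; the HPU implementation may differ in detail.
    """
    def kernel(d):
        d = abs(d)
        if d <= 1.0:
            return (a + 2.0) * d**3 - (a + 3.0) * d**2 + 1.0
        if d < 2.0:
            return a * d**3 - 5.0 * a * d**2 + 8.0 * a * d - 4.0 * a
        return 0.0

    return [kernel(t + 1.0), kernel(t), kernel(1.0 - t), kernel(2.0 - t)]


# Compare the two common choices of 'a' halfway between two pixels.
for coeff in (-0.5, -0.75):
    print(coeff, cubic_weights(0.5, a=coeff))
```

For any value of 'a' the four weights sum to 1; a more negative 'a' gives larger negative side lobes, which sharpens edges at the cost of more overshoot.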

scaleDim1

Scaling along width axis.

  • Type: float

  • Default: 0.0

  • Optional: yes

scaleDim2

Scaling along height axis.

  • Type: float

  • Default: 0.0

  • Optional: yes

scaleDim3

Scaling along batch axis.

  • Type: float

  • Default: 0.0

  • Optional: yes

coordTransMode

Describes how to transform the coordinate in the resized tensor to the coordinate in the original tensor.

  • Type: ResizeCoordinateTransformationMode_t

  • Default: HALF_PIXEL_MODE

  • Optional: yes

  • Supported modes:

    • if HALF_PIXEL_MODE then x_original = (x_resized + 0.5) / scale - 0.5

    • if PYTORCH_HALF_PIXEL_MODE then x_original = length_resized > 1 ? (x_resized + 0.5) / scale - 0.5 : 0

    • if ALIGN_CORNERS_MODE then x_original = x_resized * (length_original - 1) / (length_resized - 1)

    • if ASYMMETRIC_MODE then x_original = x_resized / scale

    • if TF_HALF_PIXEL_FOR_NN_MODE then x_original = (x_resized + 0.5) / scale
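
The formulas above can be sketched in plain Python to compare the modes on a single axis. This is an illustration only: the function name to_original_coord is hypothetical, the modes are passed as strings rather than as ResizeCoordinateTransformationMode_t values, and scale is assumed to be length_resized / length_original.

```python
def to_original_coord(x_resized, length_original, length_resized,
                      mode="HALF_PIXEL_MODE"):
    """Map a coordinate in the resized tensor back to the original tensor."""
    # Assumption: scale is the ratio of output length to input length.
    scale = length_resized / length_original
    if mode == "HALF_PIXEL_MODE":
        return (x_resized + 0.5) / scale - 0.5
    if mode == "PYTORCH_HALF_PIXEL_MODE":
        return (x_resized + 0.5) / scale - 0.5 if length_resized > 1 else 0.0
    if mode == "ALIGN_CORNERS_MODE":
        # Undefined for length_resized == 1 (division by zero), as in the formula.
        return x_resized * (length_original - 1) / (length_resized - 1)
    if mode == "ASYMMETRIC_MODE":
        return x_resized / scale
    if mode == "TF_HALF_PIXEL_FOR_NN_MODE":
        return (x_resized + 0.5) / scale
    raise ValueError(f"unknown mode: {mode}")


# Upscaling an axis from 200 to 300 samples: where does output index 3 come from?
for m in ("HALF_PIXEL_MODE", "ALIGN_CORNERS_MODE", "ASYMMETRIC_MODE"):
    print(m, to_original_coord(3, 200, 300, mode=m))
```

With ALIGN_CORNERS_MODE the first and last output samples map exactly onto the first and last input samples; the half-pixel modes instead treat each sample as the center of a pixel.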

nearestMode

Valid only if mode is nearest-neighbor interpolation. Indicates how the nearest pixel in the input tensor is selected.

  • Type: ResizeNearestMode_t

  • Default: ROUND_PREFER_FLOOR

  • Optional: yes

  • Supported modes:

    • ROUND_PREFER_FLOOR (also known as round half down)

    • ROUND_PREFER_CEIL (also known as round half up)

    • FLOOR

    • CEIL
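
The four modes differ only in how a fractional source coordinate is rounded to an integer index. A minimal sketch follows; the function name nearest_index is hypothetical, the modes are passed as strings for illustration, and clamping the result to the valid index range is left out.

```python
import math


def nearest_index(x_original, mode="ROUND_PREFER_FLOOR"):
    """Round a fractional input coordinate to an integer pixel index."""
    if mode == "ROUND_PREFER_FLOOR":
        # Round half down: 2.5 -> 2
        return math.ceil(x_original - 0.5)
    if mode == "ROUND_PREFER_CEIL":
        # Round half up: 2.5 -> 3
        return math.floor(x_original + 0.5)
    if mode == "FLOOR":
        return math.floor(x_original)
    if mode == "CEIL":
        return math.ceil(x_original)
    raise ValueError(f"unknown mode: {mode}")


# The modes only disagree on ties and on the direction of truncation.
for m in ("ROUND_PREFER_FLOOR", "ROUND_PREFER_CEIL", "FLOOR", "CEIL"):
    print(m, [nearest_index(x, m) for x in (2.4, 2.5, 2.6)])
```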

excludeOutside

If set to True, the weights of sampling locations outside the tensor are set to 0 and the remaining weights are re-normalized so that their sum is 1.0.

  • Type: bool

  • Default: False

  • Optional: yes
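
The re-normalization described above can be sketched for one axis as follows. The function name renormalize_weights and its arguments are hypothetical and serve only to illustrate the behavior; they are not part of the MediaPipe API.

```python
def renormalize_weights(indices, weights, length, exclude_outside=True):
    """Zero the weights of out-of-range sample indices and rescale the rest.

    indices: integer sample positions along one axis
    weights: interpolation weight for each position
    length:  size of the input tensor along that axis
    """
    if not exclude_outside:
        return list(weights)
    kept = [w if 0 <= i < length else 0.0 for i, w in zip(indices, weights)]
    total = sum(kept)
    if total == 0.0:
        # Nothing inside the tensor; leave the zeroed weights unchanged.
        return kept
    return [w / total for w in kept]


# Cubic weights at the left border: the sample at index -1 lies outside.
print(renormalize_weights([-1, 0, 1, 2],
                          [-0.09375, 0.59375, 0.59375, -0.09375],
                          length=200))
```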

useScales

Determines whether scaling is applied to each dimension. If False, the output tensor size is determined by size1, size2, and size3.

  • Type: bool

  • Default: False

  • Optional: yes

size1

Width of the output tensor. Used only when useScales is False.

  • Type: int

  • Default: 0

  • Optional: yes

size2

Height of the output tensor. Used only when useScales is False.

  • Type: int

  • Default: 0

  • Optional: yes

size3

Batch size of the output tensor. Used only when useScales is False.

  • Type: int

  • Default: 1

  • Optional: yes
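
Because the output size can be given either explicitly (size1, size2, size3) or through scale factors (scaleDim1, scaleDim2, scaleDim3), it can help to build the keyword arguments in one place. The helper below is a hypothetical sketch, not part of the MediaPipe API. Keeping the third dimension unchanged with scaleDim3=1.0 is an assumption; size3=1 follows the example in this document.

```python
def resize_kwargs(width=None, height=None, scale_w=None, scale_h=None):
    """Build keyword arguments for fn.Resize from a size or from scale factors."""
    if width is not None and height is not None:
        # Explicit output size: useScales must be False.
        return {"useScales": False, "size1": width, "size2": height, "size3": 1}
    if scale_w is not None and scale_h is not None:
        # Scale factors: useScales must be True.
        # Assumption: 1.0 leaves the third dimension unchanged.
        return {"useScales": True, "scaleDim1": scale_w,
                "scaleDim2": scale_h, "scaleDim3": 1.0}
    raise ValueError("give either width and height, or scale_w and scale_h")


print(resize_kwargs(width=300, height=300))
print(resize_kwargs(scale_w=1.5, scale_h=1.5))
```

The resulting dictionary would then be unpacked into the operator, for example fn.Resize(mode=1, dtype=dt.UINT8, **resize_kwargs(width=300, height=300)).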

dtype

Output data type.

  • Type: habana_frameworks.mediapipe.media_types.dtype

  • Default: UINT8

  • Optional: yes

  • Supported data types:

    • INT8

    • UINT8

    • UINT16

Note

  1. Input and output tensors must all have the same data type.

  2. Input/Output tensor should be in (N*C)xWxH data layout. (N*C) being the fastest changing dimension.

  3. Resize linear mode supports only bilinear interpolation along dimensions 1 (Width) and 2 (Height).

  4. Resize nearest neighbor mode supports simultaneous scaling of the three non-fastest-changing dimensions.

  5. Resize cubic mode supports only bicubic interpolation along dimensions 1 (Width) and 2 (Height).
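
To make "fastest changing dimension" in note 2 concrete, the sketch below computes the flat memory offset of one element in an (N*C)xWxH tensor. The function name flat_offset is hypothetical, and a densely packed buffer without padding is assumed.

```python
def flat_offset(nc, w, h, num_nc, width):
    """Flat element offset for an (N*C)xWxH layout, with (N*C) varying fastest.

    nc:     index into the merged batch*channel dimension
    w, h:   pixel coordinates
    num_nc: size of the merged (N*C) dimension
    width:  size of the W dimension
    Assumption: dense storage, no padding between rows or planes.
    """
    return nc + num_nc * (w + width * h)


# Batch of 6 RGB images, 200 pixels wide: N*C = 18.
print(flat_offset(1, 0, 0, num_nc=18, width=200))  # next plane entry, same pixel
print(flat_offset(0, 1, 0, num_nc=18, width=200))  # next pixel along the width
print(flat_offset(0, 0, 1, num_nc=18, width=200))  # next row
```

Neighboring elements in memory therefore belong to the same pixel position in different image planes, while moving one pixel along the width skips N*C elements.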

Example: Resize Operator

The following code snippet shows the usage of the Resize operator. Reshape and transpose operations are performed before the resize operation to arrange the tensor layout as (N*C)WH1, as required by the resize operation. The reshape and transpose operations after the resize convert the tensor layout to CWHN.

from habana_frameworks.mediapipe import fn
from habana_frameworks.mediapipe.mediapipe import MediaPipe
from habana_frameworks.mediapipe.media_types import imgtype as it
from habana_frameworks.mediapipe.media_types import dtype as dt
import matplotlib.pyplot as plt

# Create media pipeline derived class
class myMediaPipe(MediaPipe):
    def __init__(self, device, dir, queue_depth, batch_size, channel, img_h, img_w, resize_h, resize_w):
        super().__init__(device,
                         queue_depth,
                         batch_size,
                         self.__class__.__name__)

        self.input = fn.ReadImageDatasetFromDir(shuffle=False,
                                                dir=dir,
                                                format="jpg")

        # WHCN
        self.decode = fn.ImageDecoder(device="hpu",
                                      output_format=it.RGB_P,
                                      resize=[img_w, img_h])

        # WHCN -> NCWH
        self.pre_transpose = fn.Transpose(permutation=[3, 2, 0, 1],
                                          tensorDim=4,
                                          dtype=dt.UINT8)

        # NCWH -> (N*C)WH1
        self.pre_reshape = fn.Reshape(size=[batch_size*channel, img_w, img_h, 1],
                                      tensorDim=4,
                                      layout='',
                                      dtype=dt.UINT8)

        self.resize = fn.Resize(mode=1,
                                size1=resize_w,
                                size2=resize_h,
                                size3=1,
                                dtype=dt.UINT8)

        # (N*C)WH1 -> NCWH
        self.post_reshape = fn.Reshape(size=[batch_size, channel, resize_w, resize_h],
                                       tensorDim=4,
                                       layout='',
                                       dtype=dt.UINT8)

        # NCWH -> CWHN
        self.post_transpose = fn.Transpose(permutation=[1, 2, 3, 0],
                                           tensorDim=4,
                                           dtype=dt.UINT8)

    def definegraph(self):
        images, labels = self.input()
        images = self.decode(images)
        images = self.pre_transpose(images)
        images = self.pre_reshape(images)
        images = self.resize(images)
        images = self.post_reshape(images)
        images = self.post_transpose(images)
        return images, labels

def display_images(images, batch_size, cols):
    # Ceiling division so that every image gets a subplot slot
    rows = (batch_size + cols - 1) // cols
    plt.figure(figsize=(10, 10))
    for i in range(batch_size):
        ax = plt.subplot(rows, cols, i + 1)
        plt.imshow(images[i])
        plt.axis("off")
    plt.show()


def main():
    batch_size = 6
    channels = 3
    img_width = 200
    img_height = 200
    resize_width = 300
    resize_height = 300
    img_dir = "/path/to/images"
    queue_depth = 2
    columns = 3

    # Create media pipeline object
    # Argument order must match __init__: (..., img_h, img_w, resize_h, resize_w)
    pipe = myMediaPipe('hpu', img_dir, queue_depth, batch_size,
                        channels, img_height, img_width, resize_height, resize_width)

    # Build media pipeline
    pipe.build()

    # Initialize media pipeline iterator
    pipe.iter_init()

    # Run media pipeline
    images, labels = pipe.run()

    # Copy data to host from device as numpy array
    images = images.as_cpu().as_nparray()
    labels = labels.as_cpu().as_nparray()

    # Display images
    display_images(images, batch_size, columns)

if __name__ == "__main__":
    main()

Resized Images [1]

[1] Licensed under a CC BY SA 4.0 license. The images used here are taken from https://data.caltech.edu/records/mzrjq-6wc02.