habana_frameworks.mediapipe.fn.SSDCropWindowGen

Class:
  • habana_frameworks.mediapipe.fn.SSDCropWindowGen(**kwargs)

Define graph call:
  • __call__(sizes, boxes, labels, lengths)

Parameter:

  • sizes - Input tensor of image sizes. size=[batch, 2]. Supported dimensions: minimum = 2, maximum = 2. Supported data types: UINT32.

  • boxes - Input tensor of bounding boxes (each bbox should be in [left, top, right, bottom] format). size=[batch, 200, 4]. Supported dimensions: minimum = 3, maximum = 3. Supported data types: FLOAT32.

  • labels - Input tensor of image labels for each bounding box. size=[batch, 200]. Supported dimensions: minimum = 2, maximum = 2. Supported data types: UINT32.

  • lengths - Input tensor of number of bounding boxes per image. size=[batch]. Supported dimensions: minimum = 1, maximum = 1. Supported data types: UINT32.

Description:

The SSDCropWindowGen operator takes the metadata output of the COCO reader and generates a crop window such that:
  • The center of every ground truth box falls within the crop window.

  • The IoU between every ground truth box and the crop window is greater than min_iou.

For every image, min_iou is randomly selected from the values [0.1, 0.3, 0.5, 0.7, 0.9]. If a ground truth bounding box extends partly outside the crop window, the box is clipped to the window.
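The two conditions above can be sketched in plain Python. This is only an illustration of the stated rules, not the operator's actual implementation: the helper names `iou`, `is_valid_window`, and `clip_box` are hypothetical, and boxes and windows are assumed to be normalized [left, top, right, bottom] lists.

```python
def iou(box_a, box_b):
    """Intersection over union of two [left, top, right, bottom] boxes."""
    left = max(box_a[0], box_b[0])
    top = max(box_a[1], box_b[1])
    right = min(box_a[2], box_b[2])
    bottom = min(box_a[3], box_b[3])
    inter = max(0.0, right - left) * max(0.0, bottom - top)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_valid_window(window, boxes, min_iou):
    """Hypothetical check: every box center lies inside the window and
    every box has IoU with the window greater than min_iou."""
    for box in boxes:
        cx = (box[0] + box[2]) / 2
        cy = (box[1] + box[3]) / 2
        center_inside = (window[0] <= cx <= window[2]
                         and window[1] <= cy <= window[3])
        if not center_inside or iou(box, window) <= min_iou:
            return False
    return True


def clip_box(box, window):
    """Clip a box to the window (no re-normalization is attempted here)."""
    return [max(box[0], window[0]), max(box[1], window[1]),
            min(box[2], window[2]), min(box[3], window[3])]
```

For example, with `window = [0.0, 0.0, 0.5, 0.5]` and a single box `[0.1, 0.1, 0.6, 0.4]`, the box center lies inside the window and the IoU is roughly 0.43, so the window passes for min_iou = 0.3 but not for min_iou = 0.5.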

Supported backend:
  • CPU

Keyword Arguments

kwargs

Description

min_width

Minimum width of the crop window. It should be a normalized value (in the range 0.0 to 1.0).

  • Type: float

  • Default: 0.3

  • Optional: yes

max_width

Maximum width of the crop window. It should be a normalized value (in the range 0.0 to 1.0).

  • Type: float

  • Default: 1.0

  • Optional: yes

min_height

Minimum height of the crop window. It should be a normalized value (in the range 0.0 to 1.0).

  • Type: float

  • Default: 0.3

  • Optional: yes

max_height

Maximum height of the crop window. It should be a normalized value (in the range 0.0 to 1.0).

  • Type: float

  • Default: 1.0

  • Optional: yes

num_iterations

Number of iterations to try when searching for a valid crop window. If no valid crop window is found within the given number of iterations, no cropping is done.

  • Type: int

  • Default: 1

  • Optional: yes

seed

Seed to be used for SSD crop randomization.

  • Type: int

  • Default: -1

  • Optional: yes
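To make the interaction of these keyword arguments concrete, here is a rough sketch of how a search loop driven by them might look. It is an assumption-laden illustration, not the operator's real algorithm: the function `sample_crop_window`, the uniform sampling strategy, the `is_valid` callback, and the treatment of a negative seed as "unseeded" are all hypothetical. Only the parameter names, their defaults, the min_iou value list, and the "no cropping" fallback come from the description above.

```python
import random

# Candidate min_iou values listed in the operator description.
MIN_IOU_CHOICES = [0.1, 0.3, 0.5, 0.7, 0.9]


def sample_crop_window(is_valid, min_width=0.3, max_width=1.0,
                       min_height=0.3, max_height=1.0,
                       num_iterations=1, seed=-1):
    """Try up to num_iterations random windows; fall back to the full image.

    is_valid(window, min_iou) is a caller-supplied predicate (hypothetical).
    Windows are normalized [left, top, right, bottom] lists.
    """
    # Assumption: a negative seed means the generator is not fixed.
    rng = random.Random(seed if seed >= 0 else None)
    min_iou = rng.choice(MIN_IOU_CHOICES)
    for _ in range(num_iterations):
        width = rng.uniform(min_width, max_width)
        height = rng.uniform(min_height, max_height)
        left = rng.uniform(0.0, 1.0 - width)
        top = rng.uniform(0.0, 1.0 - height)
        window = [left, top, left + width, top + height]
        if is_valid(window, min_iou):
            return window
    # No valid window found within num_iterations: no cropping is done.
    return [0.0, 0.0, 1.0, 1.0]
```

In this sketch a larger num_iterations simply gives the search more chances to find a window that satisfies the constraints before it falls back to the uncropped image.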

Output:

Output Value

Description

sizes

Image size after crop.

boxes

List of bounding boxes for every image in [left, top, right, bottom] format. A box is clipped if it extends outside the selected crop window.

labels

List of labels for every encoded bounding box.

lengths

Number of ground truth boxes per image.

windows

Crop window coordinates.

Example: SSDCropWindowGen Operator

The following code snippet shows usage of SSDCropWindowGen operator:

from habana_frameworks.mediapipe import fn
from habana_frameworks.mediapipe.mediapipe import MediaPipe
from habana_frameworks.mediapipe.media_types import imgtype as it
from habana_frameworks.mediapipe.media_types import dtype as dt
import matplotlib.pyplot as plt
import os


# Environment variables are strings; convert so plt.pause() receives a number
g_display_timeout = float(os.getenv("DISPLAY_TIMEOUT") or 5)

class myMediaPipe(MediaPipe):
    def __init__(self, device, queue_depth, batch_size, num_threads,
                op_device, dir, ann_file, img_h, img_w):
        super(
            myMediaPipe,
            self).__init__(
            device,
            queue_depth,
            batch_size,
            num_threads,
            self.__class__.__name__)

        self.input = fn.CocoReader(root=dir,
                                  annfile=ann_file,
                                  seed=1234,
                                  shuffle=False,
                                  drop_remainder=True,
                                  num_slices=1,
                                  slice_index=0,
                                  partial_batch=False,
                                  device='cpu')

        self.reshape_ids = fn.Reshape(size=[batch_size],
                                      tensorDim=1,
                                      layout='',
                                      dtype=dt.UINT32, device='hpu')  # [batch_size]

        self.ssd_crop_win_gen = fn.SSDCropWindowGen(num_iterations=1,
                                                    seed=1234,
                                                    device='cpu')

        self.decode = fn.ImageDecoder(device="hpu",
                                      output_format=it.RGB_P,
                                      resize=[img_w, img_h])

        # WHCN -> CWHN
        self.transpose = fn.Transpose(permutation=[2, 0, 1, 3],
                                      tensorDim=4,
                                      dtype=dt.UINT8)


    def definegraph(self):
        # Train pipe
        jpegs, ids, sizes, boxes, labels, lengths, batch = self.input()

        # ssd crop window generation
        sizes, boxes, labels, lengths, windows = self.ssd_crop_win_gen(
            sizes, boxes, labels, lengths)

        # perform crop after decode
        images = self.decode(jpegs, windows)

        images = self.transpose(images)
        return images, sizes, boxes, labels, lengths, windows


def display_images(images, batch_size, cols):
    rows = (batch_size + cols - 1) // cols  # ceiling division
    plt.figure(figsize=(10, 10))
    for i in range(batch_size):
        ax = plt.subplot(rows, cols, i + 1)
        plt.imshow(images[i])
        plt.axis("off")
    plt.show(block=False)
    plt.pause(g_display_timeout)
    plt.close()

def run(device, op_device):
    batch_size = 6
    img_width = 300
    img_height = 300
    num_threads = 1
    queue_depth = 2

    base_dir = os.environ['DATASET_DIR']
    base_dir = base_dir + "/coco_data/"
    dir = base_dir + "imgs/"
    ann_file = base_dir + "annotation.json"

    # Create MediaPipe object
    pipe = myMediaPipe(device, queue_depth, batch_size, num_threads,
                      op_device, dir, ann_file, img_height, img_width)

    # Build MediaPipe
    pipe.build()

    # Initialize MediaPipe iterator
    pipe.iter_init()

    # Run MediaPipe
    images, sizes, boxes, labels, lengths, windows = pipe.run()

    # Copy data to host from device as numpy array
    images = images.as_cpu().as_nparray()
    sizes = sizes.as_nparray()
    boxes = boxes.as_nparray()
    labels = labels.as_nparray()
    lengths = lengths.as_nparray()
    windows = windows.as_nparray()

    del pipe

    # Display images, shape, dtype
    print('images dtype:', images.dtype)
    print('images shape:', images.shape)

    print('sizes dtype:', sizes.dtype)
    print('sizes:', sizes)

    print('boxes dtype:', boxes.dtype)
    print('boxes:', boxes)

    print('labels dtype:', labels.dtype)
    print('labels:', labels)

    print('lengths dtype:', lengths.dtype)
    print('lengths:', lengths)

    print('crop windows dtype:', windows.dtype)
    print('crop windows:', windows)


    display_images(images, batch_size, 3)

if __name__ == "__main__":
    dev_opdev = {'mixed': ['cpu']}

    for dev in dev_opdev.keys():
        for op_dev in dev_opdev[dev]:
            run(dev, op_dev)

SSDCropWindowGen Output Images

[Figure: six cropped output images, Image1 through Image6]

The following is the output of the SSDCropWindowGen operator:

images dtype: uint8
images shape: (6, 300, 300, 3)
sizes dtype: int32
sizes: [[220 160]
[200 268]
[192 200]
[300 291]
[148 128]
[ 96 204]]
boxes dtype: float32
boxes: [[[0.35       0.         0.99375    0.26363638]
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]
  ...
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]]

[[0.0597015  0.1        0.43283582 0.8       ]
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]
  ...
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]]

[[0.405      0.08333335 0.90500003 0.6041667 ]
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]
  ...
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]]

[[0.5142611  0.         0.8407216  0.33333334]
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]
  ...
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]]

[[0.         0.         0.98437494 0.90761214]
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]
  ...
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]]

[[0.0882353  0.         1.         0.99999994]
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]
  ...
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]]]
labels dtype: int32
labels: [[1 0 0 ... 0 0 0]
[2 0 0 ... 0 0 0]
[1 0 0 ... 0 0 0]
[1 0 0 ... 0 0 0]
[2 0 0 ... 0 0 0]
[2 0 0 ... 0 0 0]]
lengths dtype: int32
lengths: [[1]
[1]
[1]
[1]
[1]
[1]]
crop windows dtype: int32
crop windows: [[ 68  32 160 220]
[ 24   0 268 200]
[ 20  84 200 192]
[  0   0 291 300]
[124  44 128 148]
[ 32  44 204  96]]
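In the output above, boxes and labels are padded with zeros up to 200 entries per image, and lengths gives the number of real ground truth boxes. The following is a minimal sketch of stripping that padding on the host with NumPy. The helper name `valid_boxes_per_image` is hypothetical, and it assumes the arrays have already been converted with as_nparray() as in the example.

```python
import numpy as np


def valid_boxes_per_image(boxes, labels, lengths):
    """Return a list of (boxes, labels) pairs with zero padding removed.

    boxes:   array of shape [batch, 200, 4]
    labels:  array of shape [batch, 200]
    lengths: array of shape [batch] or [batch, 1]
    """
    boxes = np.asarray(boxes)
    labels = np.asarray(labels)
    # Flatten so both [batch] and [batch, 1] layouts are accepted.
    counts = np.asarray(lengths).reshape(-1)
    result = []
    for i, n in enumerate(counts):
        n = int(n)
        result.append((boxes[i, :n], labels[i, :n]))
    return result
```

Applied to the batch printed above, each image would yield one box and one label, since every entry of lengths is 1.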