habana_frameworks.mediapipe.fn.SSDCropWindowGen

Class:
  • habana_frameworks.mediapipe.fn.SSDCropWindowGen(**kwargs)

Define graph call:
  • __call__(sizes, boxes, labels, lengths)

Parameters:

  • sizes - Input tensor of image sizes. size=[batch, 2]. Supported dimensions: minimum = 2, maximum = 2. Supported data types: UINT32.

  • boxes - Input tensor of bounding boxes (each bbox should be in [left, top, right, bottom] format). size=[batch, 200, 4]. Supported dimensions: minimum = 3, maximum = 3. Supported data types: FLOAT32.

  • labels - Input tensor of image labels for each bounding box. size=[batch, 200]. Supported dimensions: minimum = 2, maximum = 2. Supported data types: UINT32.

  • lengths - Input tensor of number of bounding boxes per image. size=[batch]. Supported dimensions: minimum = 1, maximum = 1. Supported data types: UINT32.

Description:

The SSDCropWindowGen operator takes the metadata output of the Coco reader and generates a crop window such that:
  • The center of every ground truth box falls within the crop window.

  • The IoU between every ground truth box and the crop window is greater than min_iou.

For every image, min_iou is randomly selected from the values [0.1, 0.3, 0.5, 0.7, 0.9]. If a ground truth bounding box extends partly outside the crop window, the box is cropped to the window.
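The selection rule described above can be sketched in NumPy. This is only an illustration of the stated conditions, not the operator's actual implementation: the helper names (`iou_with_window`, `sample_crop_window`), the uniform sampling of window size and position, and the use of normalized box coordinates are all assumptions made for the example.

```python
import numpy as np

MIN_IOU_CHOICES = (0.1, 0.3, 0.5, 0.7, 0.9)  # values listed in the description


def iou_with_window(boxes, window):
    """IoU of each [left, top, right, bottom] box with a single window."""
    left = np.maximum(boxes[:, 0], window[0])
    top = np.maximum(boxes[:, 1], window[1])
    right = np.minimum(boxes[:, 2], window[2])
    bottom = np.minimum(boxes[:, 3], window[3])
    inter = np.clip(right - left, 0, None) * np.clip(bottom - top, 0, None)
    box_area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    win_area = (window[2] - window[0]) * (window[3] - window[1])
    return inter / (box_area + win_area - inter)


def sample_crop_window(boxes, rng, min_width=0.3, max_width=1.0,
                       min_height=0.3, max_height=1.0, num_iterations=1):
    """Hypothetical sketch: try up to num_iterations random windows."""
    min_iou = rng.choice(MIN_IOU_CHOICES)
    cx = (boxes[:, 0] + boxes[:, 2]) / 2
    cy = (boxes[:, 1] + boxes[:, 3]) / 2
    for _ in range(num_iterations):
        # Assumption: size and position are drawn uniformly.
        w = rng.uniform(min_width, max_width)
        h = rng.uniform(min_height, max_height)
        left = rng.uniform(0.0, 1.0 - w)
        top = rng.uniform(0.0, 1.0 - h)
        window = np.array([left, top, left + w, top + h])
        centers_inside = np.all((cx > window[0]) & (cx < window[2]) &
                                (cy > window[1]) & (cy < window[3]))
        if centers_inside and np.all(iou_with_window(boxes, window) > min_iou):
            return window
    # No valid window found in the given iterations: no cropping.
    return np.array([0.0, 0.0, 1.0, 1.0])


rng = np.random.default_rng(1234)
gt_boxes = np.array([[0.2, 0.2, 0.8, 0.8]], dtype=np.float32)
print(sample_crop_window(gt_boxes, rng, num_iterations=10))
```

With num_iterations=1 a single random candidate is tested, so the full-image fallback is more likely when min_iou is drawn high; raising num_iterations gives the search more chances to find a valid window.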

Supported backend:
  • CPU

Keyword Arguments:

  • min_width - Minimum width of the crop window, as a normalized value in the range 0.0 to 1.0. Type: float. Default: 0.3. Optional: yes.

  • max_width - Maximum width of the crop window, as a normalized value in the range 0.0 to 1.0. Type: float. Default: 1.0. Optional: yes.

  • min_height - Minimum height of the crop window, as a normalized value in the range 0.0 to 1.0. Type: float. Default: 0.3. Optional: yes.

  • max_height - Maximum height of the crop window, as a normalized value in the range 0.0 to 1.0. Type: float. Default: 1.0. Optional: yes.

  • num_iterations - Number of iterations used to search for a valid crop window. If no valid crop window is found within the given iterations, no cropping is done. Type: int. Default: 1. Optional: yes.

  • seed - Seed used for SSD crop randomization. Type: int. Default: -1. Optional: yes.
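The ranges documented for the keyword arguments can be checked before the operator is constructed. The helper below is a hypothetical convenience, not part of habana_frameworks. The checks that each minimum does not exceed its maximum and that num_iterations is at least 1 are assumptions that go beyond what the documentation states.

```python
def check_crop_kwargs(min_width=0.3, max_width=1.0, min_height=0.3,
                      max_height=1.0, num_iterations=1, seed=-1):
    """Hypothetical pre-check; returns the kwargs as a dict or raises ValueError."""
    dims = {"min_width": min_width, "max_width": max_width,
            "min_height": min_height, "max_height": max_height}
    for name, value in dims.items():
        # Documented: each value is normalized to the range 0.0 to 1.0.
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0.0, 1.0], got {value}")
    # Assumption: a minimum larger than its maximum is not meaningful.
    if min_width > max_width or min_height > max_height:
        raise ValueError("minimum crop dimension exceeds its maximum")
    # Assumption: at least one iteration is needed to try a window.
    if num_iterations < 1:
        raise ValueError("num_iterations must be at least 1")
    return dict(dims, num_iterations=num_iterations, seed=seed)


kwargs = check_crop_kwargs(min_width=0.5, num_iterations=5, seed=1234)
print(kwargs)
```

The returned dict could then be unpacked into the operator call, for example fn.SSDCropWindowGen(**kwargs, device='cpu').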

Output:

  • sizes - Image size after crop.

  • boxes - List of bounding boxes for every image in [left, top, right, bottom] format. A box is cropped if it falls outside the selected crop window.

  • labels - List of labels for every encoded bounding box.

  • lengths - Number of ground truth boxes per image.

  • windows - Crop window coordinates.

Example: SSDCropWindowGen Operator

The following code snippet shows the usage of the SSDCropWindowGen operator:

from habana_frameworks.mediapipe import fn  # NOQA
from habana_frameworks.mediapipe.mediapipe import MediaPipe  # NOQA
from habana_frameworks.mediapipe.media_types import imgtype as it  # NOQA
from habana_frameworks.mediapipe.media_types import dtype as dt  # NOQA

class myMediaPipe(MediaPipe):
    def __init__(self, device, queue_depth, batch_size, img_h, img_w, dataset_path, annotation_file, num_threads=1):

        super().__init__(device,
                         queue_depth,
                         batch_size,
                         num_threads,
                         self.__class__.__name__)

        self.input = fn.CocoReader(root=dataset_path,
                                    annfile=annotation_file,
                                    seed=1234,
                                    shuffle=False,
                                    drop_remainder=True,
                                    num_slices=1,
                                    slice_index=0,
                                    partial_batch=False,
                                    device='cpu')

        self.reshape_ids = fn.Reshape(size=[batch_size],
                                        tensorDim=1,
                                        layout='',
                                        dtype=dt.UINT32, device='hpu')  # [batch_size]

        self.ssd_crop_win_gen = fn.SSDCropWindowGen(
            num_iterations=1, seed=1234, device='cpu')

        self.decode = fn.ImageDecoder(device="hpu",
                                      output_format=it.RGB_P,
                                      resize=[img_w, img_h])

    def definegraph(self):
        # Train pipe
        jpegs, ids, sizes, boxes, labels, lengths, batch = self.input()

        # ssd crop window generation
        sizes, boxes, labels, lengths, windows = self.ssd_crop_win_gen(
            sizes, boxes, labels, lengths)

        # Decode the images and crop them using the generated windows
        images = self.decode(jpegs, windows)

        return images, sizes, boxes, labels, lengths, windows

def main():
    batch_size = 2
    img_width = 300
    img_height = 300

    img_dir = "/path/to/images"
    ann_file = "/path/to/annotationfile"
    queue_depth = 2

    # Create MediaPipe object
    pipe = myMediaPipe('cpu', queue_depth, batch_size,
                        img_height, img_width, img_dir, ann_file)

    # Build MediaPipe
    pipe.build()

    # Initialize MediaPipe iterator
    pipe.iter_init()

    # Run MediaPipe
    images, sizes, boxes, labels, lengths, windows = pipe.run()

    # Copy data to host from device as numpy array
    images = images.as_cpu().as_nparray()
    sizes = sizes.as_nparray()
    boxes = boxes.as_nparray()
    labels = labels.as_nparray()
    lengths = lengths.as_nparray()
    windows = windows.as_nparray()

    # Display images, shape, dtype
    print('images dtype:', images.dtype)
    print('images shape:', images.shape)

    print('sizes dtype:', sizes.dtype)
    print('sizes:', sizes)

    print('boxes dtype:', boxes.dtype)
    print('boxes:', boxes)

    print('labels dtype:', labels.dtype)
    print('labels:', labels)

    print('lengths dtype:', lengths.dtype)
    print('lengths:', lengths)

    print('crop windows dtype:', windows.dtype)
    print('crop windows:', windows)

if __name__ == "__main__":
    main()

The following is the output of the SSDCropWindowGen operator example:

images dtype: uint8
images shape:
(2, 3, 300, 300)
sizes dtype: uint32
sizes:
[[264 244]
[424 568]]
boxes dtype: float32
boxes:
[[[0.         0.         1.         0.9999999 ]
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]
  ...
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]]

[[0.58015853 0.14158018 0.9586268  0.8424292 ]
  [0.         0.8407783  0.22718309 0.97094333]
  [0.         0.         0.         0.        ]
  ...
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]]]
labels dtype: uint32
labels:
[[46  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0]
[24 24  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0]]
lengths dtype: uint32
lengths:
[[1]
[2]]
crop windows dtype: uint32
crop windows:
[[  4 108 244 264]
[ 56   0 568 424]]
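
In the output above, boxes and labels are padded with zeros up to 200 entries per image, and lengths gives the number of valid entries. A minimal sketch of trimming the padding on the host with NumPy follows. The helper name `valid_annotations` is hypothetical, and the arrays are small stand-ins shaped like the printed output rather than real operator results.

```python
import numpy as np


def valid_annotations(boxes, labels, lengths):
    """Return per-image (boxes, labels) pairs with the zero padding removed."""
    counts = np.asarray(lengths).reshape(-1)  # accepts [batch] or [batch, 1]
    return [(boxes[i, :n], labels[i, :n]) for i, n in enumerate(counts)]


# Stand-in batch mirroring the sample output (padded to 200 boxes per image).
boxes = np.zeros((2, 200, 4), dtype=np.float32)
labels = np.zeros((2, 200), dtype=np.uint32)
boxes[0, 0] = [0.0, 0.0, 1.0, 0.9999999]
boxes[1, :2] = [[0.58015853, 0.14158018, 0.9586268, 0.8424292],
                [0.0, 0.8407783, 0.22718309, 0.97094333]]
labels[0, 0] = 46
labels[1, :2] = [24, 24]
lengths = np.array([[1], [2]], dtype=np.uint32)

for image_boxes, image_labels in valid_annotations(boxes, labels, lengths):
    print(image_boxes.shape, image_labels.tolist())
```

Relying on lengths rather than scanning for all-zero rows avoids ambiguity between padding and genuine values.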