habana_frameworks.mediapipe.fn.SSDBBoxFlip

Class:
  • habana_frameworks.mediapipe.fn.SSDBBoxFlip(**kwargs)

Define graph call:
  • __call__(is_Flip, boxes, lengths)

Parameters:

  • is_Flip - Input tensor indicating, per image, whether to flip the input bounding boxes (1 = flip). size=[batch]. Supported data types: INT8.

  • boxes - Input tensor of bounding boxes, each in [l, t, r, b] (left, top, right, bottom) format. size=[batch, 200, 4]. Supported dimensions: minimum = 3, maximum = 3. Supported data types: FLOAT32.

  • lengths - Input tensor of number of bounding boxes per image. size=[batch]. Supported dimensions: minimum = 1, maximum = 1. Supported data types: UINT32.

Description:

The SSDBBoxFlip operator takes a tensor of bounding boxes and flips the boxes of an image horizontally when the corresponding value in the is_Flip tensor is 1.
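The behavior described above can be sketched in NumPy. This is an illustrative reference only, not the operator's implementation. It assumes the box coordinates are normalized to [0, 1] and that only the first lengths[i] boxes of each image are flipped while padding rows are left untouched; both assumptions are inferred from the sample output further down this page. The function name flip_boxes_horizontal is hypothetical.

```python
import numpy as np


def flip_boxes_horizontal(is_flip, boxes, lengths):
    """Illustrative sketch (hypothetical helper, not the Habana implementation).

    is_flip: int8 array, shape [batch]; 1 means flip that image's boxes.
    boxes:   float32 array, shape [batch, 200, 4], rows are [l, t, r, b].
    lengths: uint32 array, shape [batch]; number of valid boxes per image.
    Assumes coordinates are normalized to [0, 1].
    """
    out = boxes.copy()
    for i in range(boxes.shape[0]):
        if is_flip[i] == 1:
            n = int(lengths[i])
            # Mirror around the vertical axis: new left = 1 - old right,
            # new right = 1 - old left. Top and bottom are unchanged.
            out[i, :n, 0] = 1.0 - boxes[i, :n, 2]
            out[i, :n, 2] = 1.0 - boxes[i, :n, 0]
    return out


# Small usage example: flip the first image only.
is_flip = np.array([1, 0], dtype=np.int8)
boxes = np.zeros((2, 200, 4), dtype=np.float32)
boxes[0, 0] = [0.1, 0.2, 0.4, 0.8]
boxes[1, 0] = [0.1, 0.2, 0.4, 0.8]
lengths = np.array([1, 1], dtype=np.uint32)

flipped = flip_boxes_horizontal(is_flip, boxes, lengths)
print(flipped[0, 0])  # left/right mirrored
print(flipped[1, 0])  # unchanged, since is_flip[1] == 0
```

Reading the old left and right values from the untouched input array avoids overwriting one before the other is used.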

Supported backend:
  • CPU

Output:

  • boxes - List of bounding boxes for every image in [left, top, right, bottom] format.

Example: SSDBBoxFlip Operator

The following code snippet shows how to use the SSDBBoxFlip operator:

    from habana_frameworks.mediapipe import fn  # NOQA
    from habana_frameworks.mediapipe.mediapipe import MediaPipe  # NOQA
    from habana_frameworks.mediapipe.media_types import imgtype as it  # NOQA
    from habana_frameworks.mediapipe.media_types import dtype as dt  # NOQA

    flip_prob = 1.0

    class myMediaPipe(MediaPipe):
        def __init__(self, device, queue_depth, batch_size, img_h, img_w, dataset_path, annotation_file, num_threads=1):

            super(
                myMediaPipe,
                self).__init__(
                device,
                queue_depth,
                batch_size,
                num_threads,
                self.__class__.__name__)

            self.input = fn.CocoReader(root=dataset_path,
                                        annfile=annotation_file,
                                        seed=1234,
                                        shuffle=False,
                                        drop_remainder=True,
                                        num_slices=1,
                                        slice_index=0,
                                        partial_batch=False,
                                        device='cpu')

            self.reshape_ids = fn.Reshape(size=[batch_size],
                                            tensorDim=1,
                                            layout='',
                                            dtype=dt.UINT32, device='hpu')  # [batch_size]

            self.ssd_crop_win_gen = fn.SSDCropWindowGen(
                num_iterations=1, seed=1234, device='cpu')

            self.bbox_flip_prob = fn.Constant(
                constant=flip_prob, dtype=dt.FLOAT32, device='cpu')

            self.is_bbox_flip = fn.CoinFlip(
                seed=1234, dtype=dt.INT8, device='cpu')

            self.ssd_bbox_flip = fn.SSDBBoxFlip(device='cpu')

            self.decode = fn.ImageDecoder(device="hpu",
                                        output_format=it.RGB_P,
                                        resize=[img_w, img_h])

            # image flip - Horizontal
            self.reshape_is_flip = fn.Reshape(size=[batch_size],
                                                tensorDim=1,
                                                layout='',
                                                dtype=dt.UINT8, device='hpu')
            self.random_flip = fn.RandomFlip(horizontal=1, device='hpu')

        def definegraph(self):
            # Train pipe
            jpegs, ids, sizes, boxes, labels, lengths, batch = self.input()

            # ssd crop window generation
            sizes, boxes, labels, lengths, windows = self.ssd_crop_win_gen(
                sizes, boxes, labels, lengths)

            images = self.decode(jpegs, windows)

            # ssd Bounding box flip
            bb_flip_prob = self.bbox_flip_prob()
            is_Flip = self.is_bbox_flip(bb_flip_prob)
            boxes_fliped = self.ssd_bbox_flip(is_Flip, boxes, lengths)

            # image flip
            is_Flip = self.reshape_is_flip(is_Flip)
            images = self.random_flip(images, is_Flip)

            return images, boxes, is_Flip, boxes_fliped

    def main():
        batch_size = 2
        img_width = 300
        img_height = 300

        img_dir = "/path/to/images"
        ann_file = "/path/to/annotationfile"
        queue_depth = 2

        # Create MediaPipe object
        pipe = myMediaPipe('cpu', queue_depth, batch_size,
                            img_height, img_width, img_dir, ann_file)

        # Build MediaPipe
        pipe.build()

        # Initialize MediaPipe iterator
        pipe.iter_init()

        # Run MediaPipe
        images, boxes, is_Flip, boxes_fliped = pipe.run()

        # Copy data to host from device as numpy array
        images = images.as_cpu().as_nparray()
        boxes = boxes.as_nparray()
        is_Flip = is_Flip.as_cpu().as_nparray()
        boxes_fliped = boxes_fliped.as_nparray()

        # Display images, shape, dtype
        print('images dtype:', images.dtype)
        print('images:', images.shape)

        print('boxes dtype:', boxes.dtype)
        print('boxes:', boxes)

        print('is_Flip dtype:', is_Flip.dtype)
        print('is_Flip:', is_Flip)

        print('boxes_fliped dtype:', boxes_fliped.dtype)
        print('boxes_fliped:', boxes_fliped)

    if __name__ == "__main__":
        main()

The following is the output of the SSDBBoxFlip operator:

images dtype: uint8
images:
(2, 3, 300, 300)
boxes dtype: float32
boxes:
[[[0.         0.         1.         0.9999999 ]
[0.         0.         0.         0.        ]
[0.         0.         0.         0.        ]
...
[0.         0.         0.         0.        ]
[0.         0.         0.         0.        ]
[0.         0.         0.         0.        ]]

[[0.58015853 0.14158018 0.9586268  0.8424292 ]
[0.         0.8407783  0.22718309 0.97094333]
[0.         0.         0.         0.        ]
...
[0.         0.         0.         0.        ]
[0.         0.         0.         0.        ]
[0.         0.         0.         0.        ]]]
is_Flip dtype: uint8
is_Flip:
[1 1]
boxes_fliped dtype: float32
boxes_fliped:
[[[0.         0.         1.         0.9999999 ]
[0.         0.         0.         0.        ]
[0.         0.         0.         0.        ]
...
[0.         0.         0.         0.        ]
[0.         0.         0.         0.        ]
[0.         0.         0.         0.        ]]

[[0.04137319 0.14158018 0.41984147 0.8424292 ]
[0.7728169  0.8407783  1.         0.97094333]
[0.         0.         0.         0.        ]
...
[0.         0.         0.         0.        ]
[0.         0.         0.         0.        ]
[0.         0.         0.         0.        ]]]
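
As a sanity check on the printed values, the short sketch below recomputes the flip for the two valid boxes of the second image. It assumes normalized coordinates, so a horizontal flip maps left to 1 - right and right to 1 - left; that interpretation is inferred from the numbers above rather than stated by the operator description.

```python
import numpy as np

# Valid boxes of the second image, copied from the printed output above.
boxes = np.array([[0.58015853, 0.14158018, 0.9586268, 0.8424292],
                  [0.0, 0.8407783, 0.22718309, 0.97094333]],
                 dtype=np.float32)
boxes_fliped = np.array([[0.04137319, 0.14158018, 0.41984147, 0.8424292],
                         [0.7728169, 0.8407783, 1.0, 0.97094333]],
                        dtype=np.float32)

# Assumed flip rule for normalized [l, t, r, b] boxes.
expected = boxes.copy()
expected[:, 0] = 1.0 - boxes[:, 2]  # new left  = 1 - old right
expected[:, 2] = 1.0 - boxes[:, 0]  # new right = 1 - old left

matches = bool(np.allclose(expected, boxes_fliped, atol=1e-6))
print("printed values match the 1 - x mirroring:", matches)
```

The padding rows stay at zero in both tensors, which suggests that only the boxes counted by lengths are flipped.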