habana_frameworks.mediapipe.fn.SSDMetadata

Class:
  • habana_frameworks.mediapipe.fn.SSDMetadata(**kwargs)

Define graph call:
  • __call__(ids, sizes, boxes, labels, lengths, flip)

Parameters:

  • ids - Input tensor of image ids. size=[batch]. Supported dimensions: minimum = 1, maximum = 1. Supported data types: UINT32.

  • sizes - Input tensor of image sizes. size=[batch, 2]. Supported dimensions: minimum = 2, maximum = 2. Supported data types: UINT32.

  • boxes - Input tensor of bounding boxes for each image. size=[batch, 200, 4]. Supported dimensions: minimum = 3, maximum = 3. Supported data types: FLOAT32.

  • labels - Input tensor of labels for each bounding box. size=[batch, 200]. Supported dimensions: minimum = 2, maximum = 2. Supported data types: UINT32.

  • lengths - Input tensor containing the number of bounding boxes per image. size=[batch]. Supported dimensions: minimum = 1, maximum = 1. Supported data types: UINT32.

  • (optional) flip - Input tensor with predicate information for flip. Supported dimensions: minimum = 1, maximum = 1. Supported data types: UINT8.

Description:

The SSDMetadata operator takes the metadata output of the CocoReader operator and performs operations such as cropping, flipping, and encoding of bounding boxes. The crop and flip operations are optional entries in the serialize parameter. If crop is not included, the SSDMetadata operator does not produce crop windows in the output. If flip is not included, the flip input tensor is not needed and no flipping is performed by the SSDMetadata operator. If crop is included, it must be the first operation in the list, and encode must always be the last operation in the list.
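For instance, a pipeline without random cropping can drop MetadataOps.crop from serialize, and a pipeline without flipping can drop MetadataOps.flip and omit the flip input tensor. The fragment below is a minimal sketch of both constructions, intended to sit inside a MediaPipe subclass as in the full example further down; the exact output tuple when crop is omitted is an assumption here, so the graph calls are shown only as comments.

from habana_frameworks.mediapipe import fn
from media_pipe_api import MetadataOps

# Crop + flip + encode: crop must come first and encode last;
# the flip predicate tensor is required in the graph call.
ssd_metadata_full = fn.SSDMetadata(
    serialize=[MetadataOps.crop, MetadataOps.flip, MetadataOps.encode])
# windows, ids, sizes, boxes, labels, lengths = ssd_metadata_full(
#     ids, sizes, boxes, labels, lengths, flip)

# Encode only: no crop windows are produced and no flip tensor is passed.
ssd_metadata_encode = fn.SSDMetadata(serialize=[MetadataOps.encode])
# ids, sizes, boxes, labels, lengths = ssd_metadata_encode(
#     ids, sizes, boxes, labels, lengths)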

Supported backend:
  • CPU

Keyword Arguments:

  • workers - Number of threads used for SSD metadata processing.
      • Type: int
      • Default: 2
      • Optional: yes

  • serialize - Metadata operations to be performed on a batch.
      • Type: list[]
      • Default: [MetadataOps.crop, MetadataOps.flip, MetadataOps.encode]
      • Optional: no

  • cropping_iterations - Number of iterations used to find a valid crop window. If no valid crop window is found within the given number of iterations, no cropping is done.
      • Type: int
      • Default: 1
      • Optional: yes

  • seed - Seed used for SSD crop randomization.
      • Type: int
      • Default: -1
      • Optional: yes
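
Putting these arguments together, a construction that spells out every keyword argument might look like the following sketch; the values chosen here are illustrative, not recommendations, and the snippet is meant to sit inside a MediaPipe subclass as in the full example below.

from habana_frameworks.mediapipe import fn
from media_pipe_api import MetadataOps

ssd_metadata = fn.SSDMetadata(
    workers=2,                    # threads for metadata processing (default: 2)
    serialize=[MetadataOps.crop,  # crop first, encode last
               MetadataOps.flip,
               MetadataOps.encode],
    cropping_iterations=1,        # attempts to find a valid crop window (default: 1)
    seed=0)                       # crop randomization seed (default: -1)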

Output:

  • (optional) windows - Crop window coordinates.

  • ids - Image id from the annotation file.

  • sizes - Image size after crop.

  • boxes - List of encoded bounding boxes (with respect to the 8732 default anchors) for every image, as [x_start, y_start, width, height].

  • labels - List of labels for every encoded bounding box.

  • lengths - Number of ground truth boxes per image.

  • batch - Number of valid images in a batch.
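
As a quick sanity check, the host-side arrays returned by the full example below can be validated against the shapes implied by this list. The sketch assumes an SSD300-style setup with 8732 default anchors; the exact output shapes are an illustrative assumption and are not taken verbatim from the operator reference.

# Hypothetical shape check for the numpy arrays copied to the host with
# .as_cpu().as_nparray(); the 8732 anchor count matches the SSD300 example,
# everything else is an illustrative assumption.
def check_ssd_metadata_shapes(ids, sizes, boxes, labels, lengths, batch_size):
    assert ids.shape == (batch_size,)            # one image id per image
    assert sizes.shape == (batch_size, 2)        # image size after crop
    assert boxes.shape == (batch_size, 8732, 4)  # one encoded box per anchor
    assert labels.shape == (batch_size, 8732)    # one label per anchor
    assert lengths.shape == (batch_size,)        # ground truth boxes per image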

Example: SSDMetadata Operator

The following code snippet shows the usage of the SSDMetadata operator:

from habana_frameworks.mediapipe import fn
from habana_frameworks.mediapipe.mediapipe import MediaPipe
from habana_frameworks.mediapipe.media_types import imgtype as it
from habana_frameworks.mediapipe.media_types import dtype as dt
from habana_frameworks.mediapipe.operators.cpu_nodes.cpu_nodes import media_function
from media_pipe_api import MetadataOps
import numpy as np

# Create media pipeline derived class
class myMediaPipe(MediaPipe):
    def __init__(self, device, queue_depth, batch_size, img_h, img_w, dir, annfile):
        super(
            myMediaPipe,
            self).__init__(
            device,
            queue_depth,
            batch_size,
            self.__class__.__name__)

        self.input = fn.CocoReader(root=dir,
                                   annfile=annfile,
                                   seed=0,
                                   shuffle=False,
                                   drop_remainder=True,
                                   num_slices=1,
                                   slice_index=0,
                                   partial_batch=False)

        self.random_flip_input = fn.MediaFunc(func=random_flip_func,
                                              shape=[batch_size],
                                              dtype=dt.UINT8)

        self.ssd_metadata = fn.SSDMetadata(workers=1,
                                           serialize=[MetadataOps.crop, MetadataOps.flip, MetadataOps.encode])

        self.decode = fn.ImageDecoder(device="hpu",
                                      output_format=it.RGB_P,
                                      resize=[img_w, img_h])

        self.random_flip = fn.RandomFlip(horizontal=1)

        self.transpose = fn.Transpose(permutation=[2, 0, 1, 3],
                                      tensorDim=4,
                                      dtype=dt.UINT8)

    def definegraph(self):
        flip = self.random_flip_input()
        jpegs, ids, sizes, boxes, labels, lengths, batch = self.input()
        windows, ids, sizes, boxes, labels, num_boxes = self.ssd_metadata(ids, sizes, boxes, labels, lengths, flip)
        images = self.decode(jpegs, windows)
        images = self.random_flip(images, flip)
        images = self.transpose(images)
        return images, ids, sizes, boxes, labels, num_boxes, batch

class random_flip_func(media_function):
    def __init__(self, params):
        self.p = 0.5
        self.np_shape = params['shape'][::-1]
        self.np_dtype = params['dtype']
        self.seed = params['seed']
        self.rng = np.random.default_rng(self.seed)

    def __call__(self):
        a = self.rng.choice(a=[0, 1], size=(
            self.np_shape), p=[self.p, 1-self.p])
        a = np.array(a, dtype=self.np_dtype)
        return a

def main():
    batch_size = 2
    img_width = 300
    img_height = 300
    img_channel = 3
    img_dir = "/path/to/images"
    ann_file = "/path/to/annotationfile"
    queue_depth = 2

    # Create media pipeline object
    pipe = myMediaPipe('hpu', queue_depth, batch_size,
                        img_height, img_width, img_dir, ann_file)

    # Build media pipeline
    pipe.build()

    # Initialize media pipeline iterator
    pipe.iter_init()

    # Run media pipeline
    images, ids, sizes, boxes, labels, num_boxes, batch = pipe.run()

    # Copy data to host from device as numpy array
    images = images.as_cpu().as_nparray()
    ids = ids.as_cpu().as_nparray()
    sizes = sizes.as_cpu().as_nparray()
    boxes = boxes.as_cpu().as_nparray()
    labels = labels.as_cpu().as_nparray()
    num_boxes = num_boxes.as_cpu().as_nparray()
    batch = batch.as_cpu().as_nparray()

    # Display images, shape, dtype
    print('coco ids dtype:', ids.dtype)
    print('coco ids:\n', ids)

    print('coco sizes dtype:', sizes.dtype)
    print('coco sizes:\n', sizes)

    print('coco boxes dtype:', boxes.dtype)
    print('coco boxes:\n', boxes)

    print('coco labels dtype:', labels.dtype)
    print('coco labels:\n', labels)

    print('coco num_boxes dtype:', num_boxes.dtype)
    print('coco num_boxes:\n', num_boxes)

    print('coco batch dtype:', batch.dtype)
    print('coco batch:\n', batch)

if __name__ == "__main__":
    main()

The following is the output of the SSDMetadata operator:

coco ids dtype: uint32
coco ids:
[391895 522418]
coco sizes dtype: uint32
coco sizes:
[[360 640]
 [480 640]]
coco boxes dtype: float32
coco boxes:
[[[0.01333333 0.01333333 0.07       0.07      ]
  [0.04       0.01333333 0.07       0.07      ]
  [0.06666667 0.01333333 0.07       0.07      ]
  ...
  [0.5        0.5        0.9557719  0.9557719 ]
  [0.5        0.5        1.         0.6151829 ]
  [0.5        0.5        0.6151829  1.        ]]

 [[0.01333333 0.01333333 0.07       0.07      ]
  [0.04       0.01333333 0.07       0.07      ]
  [0.06666667 0.01333333 0.07       0.07      ]
  ...
  [0.5        0.5        0.9557719  0.9557719 ]
  [0.5        0.5        1.         0.6151829 ]
  [0.5        0.5        0.6151829  1.        ]]]
coco labels dtype: uint32
coco labels:
[[0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]]
coco num_boxes dtype: uint32
coco num_boxes:
[4 4]
coco batch dtype: uint32
coco batch:
[2]