habana_frameworks.mediapipe.fn.SSDMetadata

Class:
  • habana_frameworks.mediapipe.fn.SSDMetadata(**kwargs)

Define graph call:
  • __call__(ids, sizes, boxes, labels, lengths, flip)

Parameters:

  • ids - Input tensor of image ids. size=[batch]. Supported dimensions: minimum = 1, maximum = 1. Supported data types: UINT32.

  • sizes - Input tensor of image sizes. size=[batch, 2]. Supported dimensions: minimum = 2, maximum = 2. Supported data types: UINT32.

  • boxes - Input tensor of bounding boxes for each image. size=[batch, 200, 4]. Supported dimensions: minimum = 3, maximum = 3. Supported data types: FLOAT32.

  • labels - Input tensor of labels for each bounding box. size=[batch, 200]. Supported dimensions: minimum = 2, maximum = 2. Supported data types: UINT32.

  • lengths - Input tensor containing the number of bounding boxes per image. size=[batch]. Supported dimensions: minimum = 1, maximum = 1. Supported data types: UINT32.

  • (optional) flip - Input tensor with predicate information for flip. Supported dimensions: minimum = 1, maximum = 1. Supported data types: UINT8.

Description:

The SSDMetadata operator takes the metadata output of the CocoReader operator and performs operations such as cropping, flipping, and encoding of bounding boxes. The crop and flip operations are optional entries in the serialize parameter. If crop is not included, the SSDMetadata operator does not produce crop windows in the output. If flip is not included, the flip input tensor is not needed and no flipping is performed by the SSDMetadata operator. If crop is included, it must be the first operation in the list, and encode must always be the last operation in the list.
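For instance, a pipeline without random cropping can drop MetadataOps.crop from serialize, and a pipeline without flipping can drop MetadataOps.flip and omit the flip input tensor. The fragment below is a minimal sketch of both constructions, intended to sit inside a MediaPipe subclass as in the full example further down; the exact output tuple when crop is omitted is an assumption here, so the graph calls are shown only as comments.

from habana_frameworks.mediapipe import fn
from media_pipe_api import MetadataOps

# Crop + flip + encode: crop must come first and encode last;
# the flip predicate tensor is required in the graph call.
ssd_metadata_full = fn.SSDMetadata(
    serialize=[MetadataOps.crop, MetadataOps.flip, MetadataOps.encode])
# windows, ids, sizes, boxes, labels, lengths = ssd_metadata_full(
#     ids, sizes, boxes, labels, lengths, flip)

# Encode only: no crop windows are produced and no flip tensor is passed.
ssd_metadata_encode = fn.SSDMetadata(serialize=[MetadataOps.encode])
# ids, sizes, boxes, labels, lengths = ssd_metadata_encode(
#     ids, sizes, boxes, labels, lengths)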

Supported backend:
  • CPU

Keyword Arguments:

  • workers - Number of threads used for SSD metadata processing.
      • Type: int
      • Default: 2
      • Optional: yes

  • serialize - Metadata operations to be performed on a batch.
      • Type: list[]
      • Default: [MetadataOps.crop, MetadataOps.flip, MetadataOps.encode]
      • Optional: no

  • cropping_iterations - Number of iterations used to find a valid crop window. If no valid crop window is found within the given number of iterations, no cropping is done.
      • Type: int
      • Default: 1
      • Optional: yes

  • seed - Seed used for SSD crop randomization.
      • Type: int
      • Default: -1
      • Optional: yes
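
Putting these arguments together, a construction that spells out every keyword argument might look like the following sketch; the values chosen here are illustrative, not recommendations, and the snippet is meant to sit inside a MediaPipe subclass as in the full example below.

from habana_frameworks.mediapipe import fn
from media_pipe_api import MetadataOps

ssd_metadata = fn.SSDMetadata(
    workers=2,                    # threads for metadata processing (default: 2)
    serialize=[MetadataOps.crop,  # crop first, encode last
               MetadataOps.flip,
               MetadataOps.encode],
    cropping_iterations=1,        # attempts to find a valid crop window (default: 1)
    seed=0)                       # crop randomization seed (default: -1)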

Output:

  • (optional) windows - Crop window coordinates.

  • ids - Image id from the annotation file.

  • sizes - Image size after crop.

  • boxes - List of encoded bounding boxes (with respect to the 8732 default anchors) for every image, as [x_start, y_start, width, height].

  • labels - List of labels for every encoded bounding box.

  • lengths - Number of ground truth boxes per image.

  • batch - Number of valid images in a batch.
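
As a quick sanity check, the host-side arrays returned by the full example below can be validated against the shapes implied by this list. The sketch assumes an SSD300-style setup with 8732 default anchors; the exact output shapes are an illustrative assumption and are not taken verbatim from the operator reference.

# Hypothetical shape check for the numpy arrays copied to the host with
# .as_cpu().as_nparray(); the 8732 anchor count matches the SSD300 example,
# everything else is an illustrative assumption.
def check_ssd_metadata_shapes(ids, sizes, boxes, labels, lengths, batch_size):
    assert ids.shape == (batch_size,)            # one image id per image
    assert sizes.shape == (batch_size, 2)        # image size after crop
    assert boxes.shape == (batch_size, 8732, 4)  # one encoded box per anchor
    assert labels.shape == (batch_size, 8732)    # one label per anchor
    assert lengths.shape == (batch_size,)        # ground truth boxes per image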

Example: SSDMetadata Operator

The following code snippet shows the usage of the SSDMetadata operator:

from habana_frameworks.mediapipe import fn
from habana_frameworks.mediapipe.mediapipe import MediaPipe
from habana_frameworks.mediapipe.media_types import imgtype as it
from habana_frameworks.mediapipe.media_types import dtype as dt
from habana_frameworks.mediapipe.operators.cpu_nodes.cpu_nodes import media_function
from media_pipe_api import MetadataOps
import numpy as np

# Create media pipeline derived class
class myMediaPipe(MediaPipe):
    def __init__(self, device, queue_depth, batch_size, img_h, img_w, dir, annfile):
        super(
            myMediaPipe,
            self).__init__(
            device,
            queue_depth,
            batch_size,
            self.__class__.__name__)

        self.input = fn.CocoReader(root=dir,
                                   annfile=annfile,
                                   seed=0,
                                   shuffle=False,
                                   drop_remainder=True,
                                   num_slices=1,
                                   slice_index=0,
                                   partial_batch=False)

        self.random_flip_input = fn.MediaFunc(func=random_flip_func,
                                              shape=[batch_size],
                                              dtype=dt.UINT8)

        self.ssd_metadata = fn.SSDMetadata(workers=1,
                                           serialize=[MetadataOps.crop, MetadataOps.flip, MetadataOps.encode])

        self.decode = fn.ImageDecoder(device="hpu",
                                      output_format=it.RGB_P,
                                      resize=[img_w, img_h])

        self.random_flip = fn.RandomFlip(horizontal=1)

        self.transpose = fn.Transpose(permutation=[2, 0, 1, 3],
                                      tensorDim=4,
                                      dtype=dt.UINT8)

    def definegraph(self):
        flip = self.random_flip_input()
        jpegs, ids, sizes, boxes, labels, lengths, batch = self.input()
        windows, ids, sizes, boxes, labels, num_boxes = self.ssd_metadata(ids, sizes, boxes, labels, lengths, flip)
        images = self.decode(jpegs, windows)
        images = self.random_flip(images, flip)
        images = self.transpose(images)
        return images, ids, sizes, boxes, labels, num_boxes, batch

class random_flip_func(media_function):
    def __init__(self, params):
        self.p = 0.5
        self.np_shape = params['shape'][::-1]
        self.np_dtype = params['dtype']
        self.seed = params['seed']
        self.rng = np.random.default_rng(self.seed)

    def __call__(self):
        a = self.rng.choice(a=[0, 1], size=(
            self.np_shape), p=[self.p, 1-self.p])
        a = np.array(a, dtype=self.np_dtype)
        return a

def main():
    batch_size = 2
    img_width = 300
    img_height = 300
    img_channel = 3
    img_dir = "/path/to/images"
    ann_file = "/path/to/annotationfile"
    queue_depth = 2

    # Create media pipeline object
    pipe = myMediaPipe('hpu', queue_depth, batch_size,
                        img_height, img_width, img_dir, ann_file)

    # Build media pipeline
    pipe.build()

    # Initialize media pipeline iterator
    pipe.iter_init()

    # Run media pipeline
    images, ids, sizes, boxes, labels, num_boxes, batch = pipe.run()

    # Copy data to host from device as numpy array
    images = images.as_cpu().as_nparray()
    ids = ids.as_cpu().as_nparray()
    sizes = sizes.as_cpu().as_nparray()
    boxes = boxes.as_cpu().as_nparray()
    labels = labels.as_cpu().as_nparray()
    num_boxes = num_boxes.as_cpu().as_nparray()
    batch = batch.as_cpu().as_nparray()

    # Display images, shape, dtype
    print('coco ids dtype:', ids.dtype)
    print('coco ids:\n', ids)

    print('coco sizes dtype:', sizes.dtype)
    print('coco sizes:\n', sizes)

    print('coco boxes dtype:', boxes.dtype)
    print('coco boxes:\n', boxes)

    print('coco labels dtype:', labels.dtype)
    print('coco labels:\n', labels)

    print('coco num_boxes dtype:', num_boxes.dtype)
    print('coco num_boxes:\n', num_boxes)

    print('coco batch dtype:', batch.dtype)
    print('coco batch:\n', batch)

if __name__ == "__main__":
    main()

The following is the output of the SSDMetadata operator:

coco ids dtype: uint32
coco ids:
[391895 522418]
coco sizes dtype: uint32
coco sizes:
[[360 640]
 [480 640]]
coco boxes dtype: float32
coco boxes:
[[[0.01333333 0.01333333 0.07       0.07      ]
  [0.04       0.01333333 0.07       0.07      ]
  [0.06666667 0.01333333 0.07       0.07      ]
  ...
  [0.5        0.5        0.9557719  0.9557719 ]
  [0.5        0.5        1.         0.6151829 ]
  [0.5        0.5        0.6151829  1.        ]]

 [[0.01333333 0.01333333 0.07       0.07      ]
  [0.04       0.01333333 0.07       0.07      ]
  [0.06666667 0.01333333 0.07       0.07      ]
  ...
  [0.5        0.5        0.9557719  0.9557719 ]
  [0.5        0.5        1.         0.6151829 ]
  [0.5        0.5        0.6151829  1.        ]]]
coco labels dtype: uint32
coco labels:
[[0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]]
coco num_boxes dtype: uint32
coco num_boxes:
[4 4]
coco batch dtype: uint32
coco batch:
[2]