habana_frameworks.mediapipe.fn.ReadVideoDatasetFromDirGen

Class:

habana_frameworks.mediapipe.fn.ReadVideoDatasetFromDirGen(
    dir="/path/to/dataset/",
    format="mp4",
    seed=0,
    label_dtype=dt.UINT32,
    num_slices=1,
    slice_index=0,
    file_list=[],
    class_list=[],
    file_classes=[],
    frames_per_clip=1,
    stride=1,
    clips_per_video=1,
    step_between_clips=1,
    start_frame_index=0,
    sampler=cs.CONTIGUOUS_SAMPLER,
    last_batch_strategy=lbs.CYCLIC,
    slice_once=True,
    is_modulo_slice=True
)
Define graph call:
  • __call__()

Parameter:
  • None

Description:

This reader is designed for video classification tasks. There are two ways to provide input to ReadVideoDatasetFromDirGen:

  • By specifying the input directory path. The name of each subdirectory is considered the class label (class_label) for all the videos it contains.

  • By providing file_list, class_list and file_classes to the reader (see the sketch below).

The reader returns batches of video paths, ground truth labels, and a resample list. The output ground truth labels are a list of integers representing the class labels of the videos.
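
A minimal sketch of the two input modes (the paths and class names below are placeholders, not part of the library):

from habana_frameworks.mediapipe import fn

# Mode 1: point the reader at a directory; sub-directory names become class labels.
reader = fn.ReadVideoDatasetFromDirGen(dir="/path/to/dataset/", format="mp4")

# Mode 2: pass the lists explicitly; class_list acts as the label look-up table
# and file_classes gives the class name of every entry in file_list.
reader = fn.ReadVideoDatasetFromDirGen(
    file_list=["/path/to/dataset/class_1/vid_c1_0.mp4",
               "/path/to/dataset/class_2/vid_c2_0.mp4"],
    class_list=["class_1", "class_2"],
    file_classes=["class_1", "class_2"],
    format="mp4")

In either mode, the reader is called inside a MediaPipe graph definition, as shown in Example #1 below.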

Supported backend:

  • CPU

Keyword Arguments:

dir

Input video directory path. Reads videos from all subdirectories within the specified directory. The name of each subdirectory is treated as the class_label for all the videos included.

  • Type: str

  • Default: None

  • Optional: yes (either provide dir or provide file_list)

  • Note: Arrange input files as dir_path/class_label/<video files>. For example, if the training pipeline input videos are /user/home/videos/train/<class_label>/<video_name>.mp4, set dir="/user/home/videos/train/". There may be multiple subdirectories in /user/home/videos/train/, one for each class_label.

format

Format (extension) of the video files. The reader lists all videos with this extension in the subdirectories of “dir”.

  • Type: str

  • Default: “mp4”

  • Optional: no

  • Note: The supported video file extensions are “mp4”, “h264”, “hevc”.

seed

Seed for randomization. If not provided, it will be generated internally.

  • Type: int

  • Default: None

  • Optional: yes

label_dtype

Required data type of the output ground truth labels. The reader returns a batch of video file paths and ground truth labels. The output ground truth labels are a list of integers, each of which is the index of the respective video’s class label in the sorted list of all class labels. label_dtype specifies the data type of these integers.

  • Type: habana_frameworks.mediapipe.media_types.dtype

  • Default: UINT32

  • Optional: yes

num_slices

Indicates the number of cards in multi-card training. Clips are divided into num_slices slices, i.e. one slice per card. The default value of 1 indicates single-card training.

  • Type: int

  • Default: 1

  • Optional: yes

slice_index

In multi-card training, indicates the index of the card.

  • Type: int

  • Default: 0

  • Optional: yes (if num_slices=1; otherwise the user must provide it)

  • Note: The default value is zero for single-card training. For multi-card training it must be between 0 and num_slices - 1.
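
A hedged sketch of how a multi-card setup might set these two arguments (the rank and world-size values are placeholders for whatever the launcher provides):

from habana_frameworks.mediapipe import fn

world_size = 8   # number of cards (placeholder)
rank = 3         # index of this card, in the range 0 .. world_size - 1 (placeholder)

reader = fn.ReadVideoDatasetFromDirGen(dir="/path/to/dataset/",
                                       format="mp4",
                                       num_slices=world_size,
                                       slice_index=rank)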

file_list

Instead of providing dir (the input video directory path), the user can provide a list of files to the reader.

  • Type: list

  • Default: None

  • Optional: yes

  • Note: file_list must be ordered by class, i.e. list all the files of class_1, followed by all the files of class_2, and so on.

  • Note: file_list should contain the full path of every file. Example: [“/path/to/dataset/class_1/vid_c1_0.mp4”, “/path/to/dataset/class_1/vid_c1_1.mp4”, “/path/to/dataset/class_2/vid_c2_0.mp4”, “/path/to/dataset/class_2/vid_c2_1.mp4”, …..]

class_list

List of unique class labels; must be provided along with file_list. It is used as a look-up table to generate the output ground truth labels.

  • Type: list

  • Default: None

  • Optional: yes (If file_list is provided, class_list is not optional)

  • Note: The output ground truth label of each video is the index of that video’s class_label in this class_list, i.e. class_list acts as a look-up table for generating the output ground truth labels.

file_classes

List of class names, one for every file in file_list. It is used to generate the output ground truth labels. If not provided, the last subdirectory name of each path in file_list is used to generate file_classes.

  • Type: list

  • Default: None

  • Optional: yes

  • Note: For example, if file_list is provided but file_classes is not, with file_list=[“/path/to/dataset/class_1/vid_c1_0.mp4”, “/path/to/dataset/class_1/vid_c1_1.mp4”, “/path/to/dataset/class_2/vid_c2_0.mp4”, “/path/to/dataset/class_2/vid_c2_1.mp4”, …..]

    the reader generates file_classes = [“class_1”, “class_1”, “class_2”, “class_2”, …] from the last subdirectory name of every video in file_list.
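
The look-up described above can be pictured with a short, hypothetical sketch in plain Python (this illustrates the documented behavior, not the reader internals):

import os

file_list = ["/path/to/dataset/class_1/vid_c1_0.mp4",
             "/path/to/dataset/class_1/vid_c1_1.mp4",
             "/path/to/dataset/class_2/vid_c2_0.mp4"]
class_list = ["class_1", "class_2"]

# When file_classes is omitted, the last sub-directory name of each path is used.
file_classes = [os.path.basename(os.path.dirname(p)) for p in file_list]
# ["class_1", "class_1", "class_2"]

# The ground truth label of each video is the index of its class name in class_list.
labels = [class_list.index(c) for c in file_classes]
# [0, 0, 1]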

frames_per_clip

Number of frames in a clip.

  • Type: int

  • Default: 1

  • Optional: no

stride

Stride, in frames, between consecutive frames of a clip (stride=1 selects consecutive frames).

  • Type: int

  • Default: 1

  • Optional: yes

clips_per_video

Number of clips to generate from each video file.

  • Type: int

  • Default: 1

  • Optional: yes

step_between_clips

Step size between clips in terms of frames. It defines how many frames to skip between consecutive clips.

  • Type: int

  • Default: 1

  • Optional: yes

start_frame_index

Each video is used to generate clips starting from frame index start_frame_index.

  • Type: int

  • Default: 0

  • Optional: yes

sampler

Type of sampler to use. It determines the strategy for sampling clips from each video.

  • Type: habana_frameworks.mediapipe.media_types.clipSampler

  • Default: CONTIGUOUS_SAMPLER

  • Optional: yes

  • Note: The supported sampler types are RANDOM_SAMPLER, UNIFORM_SAMPLER, CONTIGUOUS_SAMPLER, and CONTIGUOUS_RANDOM_SAMPLER.

  • RANDOM_SAMPLER : Sample (at most) clips_per_video clips from each video at random.

  • UNIFORM_SAMPLER : Sample clips_per_video equally spaced clips from each video.

  • CONTIGUOUS_SAMPLER : Sample (at most) clips_per_video consecutive clips from each video.

  • CONTIGUOUS_RANDOM_SAMPLER: Sample (at most) clips_per_video consecutive clips from each video, and then shuffle them.
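
As a rough illustration of how frames_per_clip, stride, step_between_clips and start_frame_index combine under a contiguous sampler, the following plain-Python sketch derives candidate frame indices. It is a reading of the parameter definitions above, not the reader's actual implementation:

def clip_frame_indices(start_frame_index, clips_per_video,
                       frames_per_clip, stride, step_between_clips):
    # Clip c starts step_between_clips frames after clip c - 1; within a clip,
    # consecutive frames are `stride` frames apart.
    clips = []
    for c in range(clips_per_video):
        first = start_frame_index + c * step_between_clips
        clips.append([first + f * stride for f in range(frames_per_clip)])
    return clips

# With frames_per_clip=2, stride=3, step_between_clips=2, clips_per_video=2
# (the values used in Example #1 below):
print(clip_frame_indices(0, 2, 2, 3, 2))   # [[0, 3], [2, 5]]

Note that a clip of frames_per_clip frames at the given stride spans (frames_per_clip - 1) * stride + 1 decoded frames, which is what get_dec_max_frame_gen computes in Example #1 below.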

last_batch_strategy

Strategy for handling a partial last batch.

  • Type: habana_frameworks.mediapipe.media_types.lastBatchStrategy

  • Default: CYCLIC

  • Optional: yes

  • Note: The supported last batch strategies are DROP, PAD, and CYCLIC.

  • DROP : Drop the clips of the partial batch.

  • PAD : Repeat the last clip of the partial batch to fill it.

  • CYCLIC: Repeat the clips starting from the first clip of the partial batch.
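
As a hypothetical illustration in plain Python (one literal reading of the strategies above, not the reader internals; the actual fill order may differ), consider 6 clips with batch_size=4, which leaves a partial last batch of 2 clips:

clips = ["c0", "c1", "c2", "c3", "c4", "c5"]
batch_size = 4
partial = clips[4:]                                            # ["c4", "c5"]

# DROP: discard the partial batch entirely.
drop = []
# PAD: repeat the last clip of the partial batch until the batch is full.
pad = partial + [partial[-1]] * (batch_size - len(partial))    # ["c4", "c5", "c5", "c5"]
# CYCLIC: repeat the clips cyclically, starting from the first clip of the partial batch.
cyclic = (partial * batch_size)[:batch_size]                   # ["c4", "c5", "c4", "c5"]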

slice_once

Whether the reader on each device should slice clips once (True) or re-slice in every epoch (False).

  • Type: bool

  • Default: True

  • Optional: yes

is_modulo_slice

Whether the reader should slice clips in a modulo pattern (True) or slice-wise (False).

  • Type: bool

  • Default: True

  • Optional: yes
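
A hypothetical sketch of the two slicing patterns for num_slices=2 over 8 clip indices (plain Python, not the reader internals):

clip_indices = list(range(8))       # [0, 1, 2, 3, 4, 5, 6, 7]
num_slices, slice_index = 2, 0

# Modulo pattern (is_modulo_slice=True): card k takes indices k, k + num_slices, ...
modulo_slice = clip_indices[slice_index::num_slices]                      # [0, 2, 4, 6]

# Slice-wise pattern (is_modulo_slice=False): card k takes a contiguous chunk.
chunk = len(clip_indices) // num_slices
slice_wise = clip_indices[slice_index * chunk:(slice_index + 1) * chunk]  # [0, 1, 2, 3]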

Example #1: Use ReadVideoDatasetFromDirGen by Providing Input Directory

The following code snippet shows the use of ReadVideoDatasetFromDirGen by providing the input video directory path. The input mp4 videos are present in subdirectories of “/path/to/dataset/”. For example:

  • “/path/to/dataset/class_1/vid_c1_0.mp4”

  • “/path/to/dataset/class_1/vid_c1_1.mp4”

  • “/path/to/dataset/class_2/vid_c2_0.mp4”

  • “/path/to/dataset/class_3/vid_c3_1.mp4”

fn.ReadVideoDatasetFromDirGen(dir="/path/to/dataset/", format="mp4")

Since format="mp4", the reader processes all “mp4” files in the subdirectories of dir. The name of each subdirectory is considered the class_label for all the videos in it. The reader internally creates class_list, a sorted list of unique class_labels (i.e. a sorted list of unique subdirectory names), which is used as a look-up table to generate the output ground truth labels. The output ground truth label of every video is the index, in class_list, of the subdirectory name the video belongs to. In the example below, the reader returns the ground truth label of every video, which is displayed as the title of that video's frames.

import os
import matplotlib.pyplot as plt
from habana_frameworks.mediapipe import fn
from habana_frameworks.mediapipe.mediapipe import MediaPipe
from habana_frameworks.mediapipe.media_types import imgtype as it
from habana_frameworks.mediapipe.media_types import dtype as dt
from habana_frameworks.mediapipe.media_types import clipSampler as cs
from habana_frameworks.mediapipe.media_types import lastBatchStrategy as lbs

g_stride = 3
g_step_between_clips = 2
g_clips_per_video = 2


def get_dec_max_frame_gen(frame_per_clip, stride):
    # A clip of frame_per_clip frames at the given stride spans
    # (frame_per_clip - 1) * stride + 1 decoded frames, which is the
    # maximum number of frames the video decoder must handle per clip.
    dec_max_frame = ((frame_per_clip - 1) * stride) + 1
    return dec_max_frame


class myMediaPipe(MediaPipe):

    def __init__(self,
                device,
                queue_depth,
                batch_size,
                num_threads,
                dir,
                resize_w,
                resize_h,
                frame_per_clip):

        super(myMediaPipe, self).__init__(device,
                                          queue_depth,
                                          batch_size,
                                          num_threads,
                                          self.__class__.__name__)

        self.input = fn.ReadVideoDatasetFromDirGen(dir=dir,
                                                  format="mp4",
                                                  frames_per_clip=frame_per_clip,
                                                  clips_per_video=g_clips_per_video,
                                                  stride=g_stride,
                                                  step_between_clips=g_step_between_clips,
                                                  last_batch_strategy=lbs.CYCLIC,
                                                  sampler=cs.CONTIGUOUS_SAMPLER)

        dec_max_frame = get_dec_max_frame_gen(frame_per_clip, g_stride)
        print("VideoDecoder max_frame_vid: {} resize: w {} h {}".format(dec_max_frame,
                                                                        resize_w,
                                                                        resize_h))

        self.decode = fn.VideoDecoder(device="hpu",
                                      output_format=it.RGB_I,
                                      resize=[resize_w, resize_h],
                                      frames_per_clip=frame_per_clip,
                                      max_frame_vid=dec_max_frame)

    def definegraph(self):
        # The reader outputs video file paths, ground truth labels and a resample
        # list; the decoder consumes the paths together with the resample list.
        videos, labels, resample = self.input()
        videos = self.decode(videos, resample)
        return videos, labels


def display_videos(videos, labels, batch_size, frame_per_clip, cols):
    rows = (batch_size * frame_per_clip) // cols
    plt.figure(figsize=(10, 10))
    frame_index = 0
    for i in range(batch_size):
        frm_num = 0
        for j in range(frame_per_clip):
            frm_num += 1
            ax = plt.subplot(rows, cols, frame_index + 1)
            plt.imshow(videos[i][j])
            plt.title("Label: " + str(labels[i]) + " Frame: " + str(frm_num))
            plt.axis("off")
            frame_index += 1
    plt.show()


def main():
    batch_size = 4
    img_width = 200
    img_height = 200
    queue_depth = 3
    frame_per_clip = 2
    num_threads = 1
    base_dir = os.environ['DATASET_DIR']
    dir = base_dir + "/vid_data/"

    pipe = myMediaPipe("legacy",
                      queue_depth,
                      batch_size,
                      num_threads,
                      dir,
                      img_width,
                      img_height,
                      frame_per_clip)
    pipe.build()
    pipe.iter_init()

    bcnt = 0
    while bcnt < 2:
        try:
            videos, labels = pipe.run()
        except StopIteration:
            break
        videos = videos.as_cpu().as_nparray()
        labels = labels.as_cpu().as_nparray()

        display_videos(videos, labels, batch_size, frame_per_clip, cols=4)
        bcnt = bcnt + 1


if __name__ == "__main__":
    main()

Example #1: Output Videos

Displaying 2 decoded frames of each video in a batch. Batch size here is 4.

(Output figure: a grid of decoded frames, each titled with its clip’s ground truth label and frame number, e.g. “Label: 0 Frame: 1”.)

Licensed under a CC BY SA 4.0 license. The videos used here are generated using images from https://data.caltech.edu/records/mzrjq-6wc02.