habana_frameworks.mediapipe.fn.ReadVideoDatasetFromDir
Class:
habana_frameworks.mediapipe.fn.ReadVideoDatasetFromDir(dir="/path/to/dataset/", format="mp4", seed=0, drop_remainder=False, pad_remainder=False, label_dtype=dt.UINT64, num_slices=1, slice_index=0, file_list=[], class_list=[], file_classes=[], frames_per_clip=1, clips_per_video=1, target_frame_rate=0, step_between_clips=1, sampler=rs.RANDOM_SAMPLER, fixed_clip_mode=False, start_frame_index=0, shuffle=True)
- Define graph call:
__call__()
- Parameter:
None
Description:
This reader is designed for video classification tasks. There are two ways to provide input to ReadVideoDatasetFromDir:
- By specifying the input directory path. Each sub-directory name is treated as the class_label for all the videos inside it.
- By providing file_list, class_list, and file_classes to the reader.
The reader returns batches of video paths, ground truth labels, a resample list, and a video offset list. The output ground truth labels are a list of integers representing the class labels of the videos.
- Supported backend:
CPU
Keyword Arguments:
kwargs | Description
---|---
dir | Input video directory path. The reader reads videos from all sub-directories of dir. The name of each sub-directory is treated as the class label for all the videos inside it.
format | Format (extension) of the video files. Used to list all the videos in the sub-directories of dir.
seed | Seed for randomization. If not provided, it is generated internally. It is used for shuffling the dataset, and also for randomly selecting the videos that pad the last partial batch when pad_remainder is False.
drop_remainder | If True, the reader drops the last partial batch of the epoch. If False, the padding mechanism is controlled by pad_remainder.
pad_remainder | If True, the reader replicates the last video of the partial batch. If False, the partial batch is filled with videos randomly selected from the dataset.
label_dtype | Data type of the output ground truth labels. The reader returns a batch of video file paths and ground truth labels. Output ground truth labels are a list of integers, each specifying the index of the respective video's class label in the sorted list of all class labels.
num_slices | Number of cards in multi-card training. Before the first epoch, the input data is divided into num_slices slices, one per card.
slice_index | In multi-card training, the index of the card; must be in the range [0, num_slices - 1].
file_list | Instead of providing dir (input video directory path), the user can provide a list of files to the reader.
class_list | List of unique class labels. Must be provided along with file_list.
file_classes | List of class names, one for every file in file_list.
frames_per_clip | Number of frames in a clip.
clips_per_video | Number of clips to generate from each video file.
target_frame_rate | If specified (i.e. not 0), resample the videos so that they all have the frame rate target_frame_rate.
step_between_clips | Step size between consecutive clips, in frames. It defines how many frames to skip between consecutive clips.
sampler | Type of sampler to use for selecting clips. It determines the strategy for sampling clips from each video.
fixed_clip_mode | If True, clips are extracted starting from a fixed frame index (start_frame_index).
start_frame_index | Frame index from which each video's clip is generated; applicable when fixed_clip_mode is True.
shuffle | If set to True, the reader shuffles the entire dataset before each epoch. In multi-card training, it first creates random slices/chunks of files for every card and then shuffles.
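The num_slices/slice_index pair shards the dataset across cards. The following is a minimal sketch of per-card reader construction; the WORLD_SIZE and RANK environment variables are hypothetical launcher-provided values, not part of this API, and in practice the node would be created inside a MediaPipe subclass's __init__, as in the examples below:

from habana_frameworks.mediapipe import fn
import os

# Hypothetical launcher-provided values; adapt to your setup
world_size = int(os.environ.get("WORLD_SIZE", "1"))  # total number of cards
rank = int(os.environ.get("RANK", "0"))              # this card's index

reader = fn.ReadVideoDatasetFromDir(dir="/path/to/dataset/",
                                    format="mp4",
                                    num_slices=world_size,  # one dataset slice per card
                                    slice_index=rank,       # read only this card's slice
                                    shuffle=True)           # shuffle within the slice each epoch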
Example #1: Use ReadVideoDatasetFromDir by Providing Input Directory
The following code snippet shows the use of ReadVideoDatasetFromDir by providing the input video directory path. The input mp4 videos are present in sub-directories of “/path/to/dataset/”. For example:
“/path/to/dataset/class_1/vid_c1_0.mp4”
“/path/to/dataset/class_1/vid_c1_1.mp4”
“/path/to/dataset/class_2/vid_c2_0.mp4”
“/path/to/dataset/class_3/vid_c3_1.mp4”
fn.ReadVideoDatasetFromDir(shuffle=False, dir="/path/to/dataset/", format="mp4")
Since format="mp4", the reader will process all “mp4” files in the sub-directories of dir.
The names of the sub-directories are considered as the class_label for all the videos in them.
The reader internally creates class_list, which is a sorted list of unique class_labels (i.e. a sorted list of unique sub-directory names).
class_list is used as a dictionary to generate the output ground truth labels. The output ground truth label of every video is the index of the video's class_label in class_list (i.e. the index, in class_list, of the sub-directory in which that video is present).
In the example below, the reader returns a ground truth label for every video, which is displayed as the title of that video.
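A minimal pure-Python sketch of this label mapping (an illustration of the behavior described above, not the reader's actual implementation):

import os

dataset_dir = "/path/to/dataset/"
# class_list is the sorted list of unique sub-directory names,
# e.g. ["class_1", "class_2", "class_3"]
class_list = sorted(e.name for e in os.scandir(dataset_dir) if e.is_dir())
# The ground truth label of a video is the index of its sub-directory in class_list
label = class_list.index("class_2")  # -> 1 for any video under /path/to/dataset/class_2/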
from habana_frameworks.mediapipe import fn
from habana_frameworks.mediapipe.mediapipe import MediaPipe
from habana_frameworks.mediapipe.media_types import imgtype as it
from habana_frameworks.mediapipe.media_types import dtype as dt
import matplotlib.pyplot as plt
import os

class myMediaPipe(MediaPipe):
    def __init__(self, device, queue_depth, batch_size, num_threads,
                 channel, dir, width, height, frame_per_clip):
        super(
            myMediaPipe,
            self).__init__(
            device,
            queue_depth,
            batch_size,
            num_threads,
            self.__class__.__name__)

        # Read mp4 videos from all sub-directories of "dir";
        # each sub-directory name becomes the class label
        self.input = fn.ReadVideoDatasetFromDir(shuffle=False,
                                                dir=dir,
                                                format="mp4",
                                                frames_per_clip=frame_per_clip,
                                                fixed_clip_mode=True)

        # Decode the clips on HPU and resize every frame
        self.decode = fn.VideoDecoder(device="hpu",
                                      output_format=it.RGB_I,
                                      resize=[width, height],
                                      dtype=dt.UINT8,
                                      frames_per_clip=frame_per_clip,
                                      max_frame_vid=frame_per_clip)

    def definegraph(self):
        # Reader outputs: video paths, labels, resample list, video offsets
        videos, labels, resample, video_offset = self.input()
        videos = self.decode(videos, video_offset)
        return videos, labels

def display_videos(videos, labels, batch_size, frame_per_clip, cols):
    # Show every decoded frame of every clip, one subplot per frame
    rows = (batch_size * frame_per_clip) // cols
    plt.figure(figsize=(10, 10))
    for i in range(batch_size):
        for j in range(frame_per_clip):
            ax = plt.subplot(rows, cols, i * frame_per_clip + j + 1)
            plt.imshow(videos[i][j])
            plt.title("label:" + str(labels[i]))
            plt.axis("off")
    plt.show()

def main():
    batch_size = 4
    img_width = 200
    img_height = 200
    frame_per_clip = 2
    channels = 3
    queue_depth = 3
    num_threads = 1
    base_dir = os.environ['DATASET_DIR']
    dir = base_dir + "/vid_data/"

    # Create, build and initialize the media pipe
    pipe = myMediaPipe('legacy', queue_depth, batch_size, num_threads,
                       channels, dir, img_width, img_height, frame_per_clip)
    pipe.build()
    pipe.iter_init()

    bcnt = 0
    while bcnt < 1:
        try:
            videos, labels = pipe.run()
        except StopIteration:
            break
        # Copy device tensors to host numpy arrays for display
        videos = videos.as_cpu().as_nparray()
        labels = labels.as_cpu().as_nparray()
        display_videos(videos, labels, batch_size, frame_per_clip, cols=2)
        bcnt = bcnt + 1

if __name__ == "__main__":
    main()
Example #1: Output Videos [1]
Displaying 2 decoded frames of each video in a batch. Batch size here is 4.
[1] Licensed under a CC BY SA 4.0 license. The videos used here are generated using images from https://data.caltech.edu/records/mzrjq-6wc02.
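Example #1 sets fixed_clip_mode=True, so every clip starts at the same frame index. As a rough pure-Python illustration of the clip-selection parameters described in the table above (assumed logic; the actual selection among candidates is governed by sampler and is not shown here):

def clip_start_frames(num_frames, frames_per_clip, clips_per_video,
                      step_between_clips, fixed_clip_mode, start_frame_index):
    # Illustrative sketch only: approximates the parameter semantics
    # described in the keyword-argument table above.
    if fixed_clip_mode:
        # Every clip begins at the same fixed frame index
        return [start_frame_index] * clips_per_video
    # Otherwise, candidate clip starts are spaced step_between_clips frames apart
    last_start = num_frames - frames_per_clip
    candidates = list(range(0, last_start + 1, step_between_clips))
    return candidates[:clips_per_video]

# e.g. a 10-frame video, 2-frame clips, 3 clips, stride 2:
print(clip_start_frames(10, 2, 3, 2, False, 0))  # -> [0, 2, 4]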
Example #2: Use ReadVideoDatasetFromDir by Providing file_list
The following code snippet shows an alternative way of using ReadVideoDatasetFromDir, by providing file_list and class_list:
- file_list - Specifies the list of files to process.
- class_list - List of unique class labels. Must be provided along with file_list.
- file_classes - List of class labels for every video in file_list.
The output ground truth labels are the indices of each video's class label in class_list, shown as the title of the output videos.
import matplotlib.pyplot as plt
from habana_frameworks.mediapipe import fn
from habana_frameworks.mediapipe.mediapipe import MediaPipe
from habana_frameworks.mediapipe.media_types import imgtype as it
from habana_frameworks.mediapipe.media_types import dtype as dt
import os

class myMediaPipe(MediaPipe):
    def __init__(self, device, queue_depth, batch_size, num_threads,
                 channel, width, height, frame_per_clip):
        super(
            myMediaPipe,
            self).__init__(
            device,
            queue_depth,
            batch_size,
            num_threads,
            self.__class__.__name__)

        base_dir = os.environ['DATASET_DIR']
        dir = base_dir + "/vid_data/"

        # Explicit file list instead of a directory scan
        file_list = [dir + "/butterfly/vid_01.mp4",
                     dir + "/butterfly/vid_02.mp4",
                     dir + "/dolphin/vid_03.mp4",
                     dir + "/sunflower/vid_04.mp4",
                     ]
        # Unique class labels, plus one class name per file
        class_list = ["butterfly", "dolphin", "sunflower"]
        file_classes = ["butterfly", "butterfly", "dolphin", "sunflower"]

        self.input = fn.ReadVideoDatasetFromDir(shuffle=False,
                                                file_list=file_list,
                                                class_list=class_list,
                                                file_classes=file_classes,
                                                format="mp4",
                                                frames_per_clip=frame_per_clip,
                                                fixed_clip_mode=True)

        # Decode the clips on HPU and resize every frame
        self.decode = fn.VideoDecoder(device="hpu",
                                      output_format=it.RGB_I,
                                      resize=[width, height],
                                      dtype=dt.UINT8,
                                      frames_per_clip=frame_per_clip,
                                      max_frame_vid=frame_per_clip)

    def definegraph(self):
        # Reader outputs: video paths, labels, resample list, video offsets
        videos, labels, resample, video_offset = self.input()
        videos = self.decode(videos, video_offset)
        return videos, labels

def display_videos(videos, labels, batch_size, frame_per_clip, cols):
    # Show every decoded frame of every clip, one subplot per frame
    rows = (batch_size * frame_per_clip) // cols
    plt.figure(figsize=(10, 10))
    for i in range(batch_size):
        for j in range(frame_per_clip):
            ax = plt.subplot(rows, cols, i * frame_per_clip + j + 1)
            plt.imshow(videos[i][j])
            plt.title("label:" + str(labels[i]))
            plt.axis("off")
    plt.show()

def main():
    batch_size = 4
    img_width = 200
    img_height = 200
    channels = 3
    queue_depth = 3
    num_threads = 1
    frames_per_clip = 2

    # Create, build and initialize the media pipe
    pipe = myMediaPipe('legacy', queue_depth, batch_size, num_threads,
                       channels, img_width, img_height, frames_per_clip)
    pipe.build()
    pipe.iter_init()

    bcnt = 0
    while bcnt < 1:
        try:
            videos, labels = pipe.run()
        except StopIteration:
            break
        # Copy device tensors to host numpy arrays for display
        videos = videos.as_cpu().as_nparray()
        labels = labels.as_cpu().as_nparray()
        display_videos(videos, labels, batch_size, frames_per_clip, cols=2)
        bcnt = bcnt + 1

if __name__ == "__main__":
    main()
Example #2: Output Videos [2]
[2] Licensed under a CC BY SA 4.0 license. The videos used here are generated using images from https://data.caltech.edu/records/mzrjq-6wc02.