habana_frameworks.mediapipe.fn.VideoDecoder

Class:

habana_frameworks.mediapipe.fn.VideoDecoder(
    output_format=it.RGB_I,
    resize=[0, 0],
    crop_after_resize=[0, 0, 0, 0],
    resampling_mode=ft.BI_LINEAR,
    random_crop_type=rct.NO_RANDOM_CROP,
    frames_per_clip=1,
    max_frame_vid=1,
    dpb_size=16
)

Define graph call:

__call__(input, resample_idx, random_crop)

Parameters:

input - List of video files.
resample_idx - Indices of the decoded frames that the decoder returns for each video. It must be a NumPy array of shape [batch_size, frames_per_clip]. Supported data types: INT32.
(Optional) random_crop - Tensor containing the crop coordinates for each video in the batch, shape [batch_size, 4]. Supported dimensions: minimum = 2, maximum = 2. Supported data types: UINT32.
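The following is a minimal sketch of one way to build the index table for resample_idx, assuming evenly spaced frames are wanted from each video. The helper name uniform_resample_idx and its frame_counts argument are hypothetical and not part of the MediaPipe API. It uses plain Python lists so that it runs without any Gaudi software installed.

```python
def uniform_resample_idx(frame_counts, frames_per_clip):
    """Return one row of evenly spaced frame indices per video.

    frame_counts (hypothetical input): number of decodable frames in each
    video of the batch. The result has shape [batch_size, frames_per_clip].
    """
    if frames_per_clip < 1:
        raise ValueError("frames_per_clip must be at least 1")
    batch = []
    for count in frame_counts:
        if count < frames_per_clip:
            raise ValueError("video has fewer frames than frames_per_clip")
        # Pick k maps to floor(k * count / frames_per_clip), which spreads
        # the selected frames across the whole video.
        row = [(k * count) // frames_per_clip for k in range(frames_per_clip)]
        batch.append(row)
    return batch


idx = uniform_resample_idx(frame_counts=[8, 30], frames_per_clip=4)
print(idx)  # [[0, 2, 4, 6], [0, 7, 15, 22]]
```

In a real pipeline the nested list would still need to be converted to an INT32 NumPy array, for example with np.asarray(idx, dtype=np.int32), before being supplied as resample_idx.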

Output:

Returns an HPUTensor of shape (N, F, C, H, W) or (N, F, H, W, C), depending on output_format, where W or C, respectively, is the FCD (fastest changing dimension), N is the batch size, and F is the number of frames per clip.
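As an illustration of the two layouts, the sketch below computes the expected output shape. It assumes that RGB_I (interleaved) corresponds to the channel-last layout (N, F, H, W, C) and RGB_P (planar) to (N, F, C, H, W), which follows from C or W being the fastest changing dimension. The function name and the use of plain strings in place of the imgtype values are hypothetical.

```python
def expected_output_shape(batch_size, frames_per_clip, height, width,
                          output_format="RGB_I"):
    """Return the expected decoder output shape for a given layout.

    output_format is a plain string standing in for the imgtype value
    (an assumption made so the sketch runs without the Gaudi packages).
    """
    channels = 3  # RGB output
    if output_format == "RGB_I":
        # Interleaved: channel is the fastest changing dimension.
        return (batch_size, frames_per_clip, height, width, channels)
    if output_format == "RGB_P":
        # Planar: width is the fastest changing dimension.
        return (batch_size, frames_per_clip, channels, height, width)
    raise ValueError(f"unsupported output_format: {output_format}")


print(expected_output_shape(2, 4, 224, 224, "RGB_I"))  # (2, 4, 224, 224, 3)
print(expected_output_shape(2, 4, 224, 224, "RGB_P"))  # (2, 4, 3, 224, 224)
```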

Description:

Decodes and resizes a batch of videos. Supported formats: MP4 (container format) with the H.264 or HEVC codec.

Supported backend:

Legacy

Keyword Arguments:

kwargs	Description
output_format	Output color format produced by the decoder. Type: habana_frameworks.mediapipe.media_types.imgtype; Default: RGB_I; Optional: yes; Supported types: RGB_I (interleaved), RGB_P (planar)
resize	Dimensions to resize the video to after decoding, as [width, height]. Type: list[int]; Default: [0, 0]; Optional: no
crop_after_resize	Crop applied to the video after decoding and resizing, as [left, top, width, height]. Type: list[int]; Default: [0, 0, 0, 0]; Optional: yes
resampling_mode	Resampling mode selection. Type: habana_frameworks.mediapipe.media_types.ftype; Default: BI_LINEAR; Optional: yes; Supported types: LINEAR, LANCZOS, NEAREST, BI_LINEAR, BICUBIC, SPLINE, BOX
random_crop_type	Random crop mode selection. Type: habana_frameworks.mediapipe.media_types.randomCropType; Default: NO_RANDOM_CROP; Optional: yes; Note: Only NO_RANDOM_CROP is supported for video.
frames_per_clip	Number of frames to output per clip. Type: int; Default: 1; Optional: no
max_frame_vid	Maximum number of frames to decode for any video, taking into account that a few frames will be dropped because of FPS resampling. Type: int; Default: 1; Optional: yes
dpb_size	Determines the number of output frames the decoder allocates. The value can be derived from the SPS (sequence parameter set) of the video. Needed only when `crop_after_resize` is not enabled. Type: int; Default: 16; Optional: yes; Note: Change this value only if you need finer control over the number of decoder output frames allocated; otherwise the default is sufficient.
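Because crop_after_resize is expressed relative to the resized frame, it can be useful to check the rectangle before building the pipeline. The sketch below is a hypothetical helper, not part of the MediaPipe API. It assumes that an all-zero crop means cropping is disabled and that a valid crop must lie entirely inside the [width, height] given by resize.

```python
def crop_fits(resize, crop_after_resize):
    """Return True if the crop rectangle lies inside the resized frame.

    resize is [width, height]; crop_after_resize is
    [left, top, width, height], matching the keyword argument order.
    """
    width, height = resize
    left, top, crop_w, crop_h = crop_after_resize
    if not any(crop_after_resize):
        return True  # assumed: all zeros leaves cropping disabled
    if min(left, top) < 0 or crop_w <= 0 or crop_h <= 0:
        return False
    # The right and bottom edges must not pass the resized frame.
    return left + crop_w <= width and top + crop_h <= height


print(crop_fits([256, 256], [16, 16, 224, 224]))  # True
print(crop_fits([256, 256], [64, 64, 224, 224]))  # False
```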

Note

Performance considerations:

Maximal performance is achieved when the scale factor is up to 9x.

Cropping to less than 48x48 pixels results in performance degradation.
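These two limits can be checked ahead of time. The sketch below is a hypothetical helper that makes two assumptions not stated above: the scale factor is taken per axis as the ratio between source and target size in either direction, and a crop counts as "less than 48x48" when either side is below 48 pixels.

```python
def performance_warnings(src_size, resize, crop_after_resize=(0, 0, 0, 0)):
    """Return a list of notes for settings likely to reduce performance.

    src_size and resize are (width, height);
    crop_after_resize is (left, top, width, height).
    """
    notes = []
    for axis, src, dst in zip(("width", "height"), src_size, resize):
        if src <= 0 or dst <= 0:
            continue  # assumed: 0 means this axis is not resized
        factor = max(src / dst, dst / src)
        if factor > 9:
            notes.append(f"{axis} scale factor {factor:.1f}x exceeds 9x")
    _, _, crop_w, crop_h = crop_after_resize
    if any(crop_after_resize) and (crop_w < 48 or crop_h < 48):
        notes.append(f"crop {crop_w}x{crop_h} is smaller than 48x48")
    return notes


print(performance_warnings((1920, 1080), (224, 224)))  # []
print(performance_warnings((3840, 2160), (224, 224)))
```

The second call reports both axes, since 3840/224 and 2160/224 are each above 9.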

Gaudi Documentation 1.21.1 documentation
