habana_frameworks.mediapipe.fn.CropMirrorNorm
- Class:
habana_frameworks.mediapipe.fn.CropMirrorNorm(**kwargs)
- Define graph call:
__call__(input, mean, inv_std)
Parameter:
input - Input tensor to operator. Supported dimensions: minimum = 4, maximum = 4. Supported data types: INT8, UINT8, UINT16.
mean - Mean tensor for image normalization. Supported dimensions: minimum = 1, maximum = 1. Supported data types: FLOAT32.
inv_std - Inverse standard deviation tensor for image normalization. Supported dimensions: minimum = 1, maximum = 1. Supported data types: FLOAT32.
Description:
This operator performs fused cropping, mirroring, normalization, and type casting. It crops each image to a window of the specified dimensions at the specified position. Normalization computes output = (input - mean) * inv_std. The mean and inv_std tensors must be expressed in the value range of the input. For an RGB24 input with normalized mean = [0.485, 0.456, 0.406], the mean tensor is [(0.485 * 255), (0.456 * 255), (0.406 * 255)]. Similarly, for an RGB24 input with normalized standard deviation = [0.229, 0.224, 0.225], the inverse standard deviation tensor is [1 / (0.229 * 255), 1 / (0.224 * 255), 1 / (0.225 * 255)].
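The scaling of the mean and inverse standard deviation tensors can be checked with a short sketch. It is plain Python and does not use the Habana libraries; the helper names `to_mean_tensor`, `to_inv_std_tensor`, and `normalize_pixel` are hypothetical and only illustrate the arithmetic, assuming 8-bit channels in the range 0 to 255.

```python
# Illustrative arithmetic only. These helpers are hypothetical and are not
# part of habana_frameworks.mediapipe.

NORM_MEAN = [0.485, 0.456, 0.406]  # per-channel mean on a 0..1 scale
NORM_STD = [0.229, 0.224, 0.225]   # per-channel std on a 0..1 scale
PIXEL_MAX = 255                    # assumption: 8-bit channels (RGB24)


def to_mean_tensor(norm_mean, pixel_max=PIXEL_MAX):
    """Scale normalized means to the value range of the input pixels."""
    return [m * pixel_max for m in norm_mean]


def to_inv_std_tensor(norm_std, pixel_max=PIXEL_MAX):
    """Scale normalized stds to the pixel range and invert them."""
    return [1.0 / (s * pixel_max) for s in norm_std]


def normalize_pixel(pixel, mean, inv_std):
    """Apply output = (input - mean) * inv_std to each channel."""
    return [(p - m) * k for p, m, k in zip(pixel, mean, inv_std)]


mean = to_mean_tensor(NORM_MEAN)
inv_std = to_inv_std_tensor(NORM_STD)

pixel = [128, 64, 255]  # one RGB pixel with 8-bit channel values
out = normalize_pixel(pixel, mean, inv_std)

# The same result computed the conventional way: scale to 0..1 first,
# then subtract the normalized mean and divide by the normalized std.
reference = [(p / PIXEL_MAX - m) / s
             for p, m, s in zip(pixel, NORM_MEAN, NORM_STD)]
```

Both routes give the same values up to floating-point rounding, which is why the operator can take raw 8-bit pixels as long as the two tensors are pre-scaled by 255.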
- Supported backend:
HPU
Keyword Arguments:
| kwargs | Description |
|---|---|
| mirror | Flag for horizontal flip. 0 means do not flip; 1 means flip horizontally. |
| crop_w | Width of the crop window in pixels. crop_w must be non-zero and less than or equal to the input tensor width. |
| crop_h | Height of the crop window in pixels. crop_h must be non-zero and less than or equal to the input tensor height. |
| crop_d | Depth of the crop window in pixels. Cropping along the depth axis is optional: crop_d defaults to 0 and should stay 0 when there is no cropping along depth. Only for volumetric data should crop_d be non-zero, and it must then be less than or equal to the input tensor depth. |
| crop_pos_x | Normalized (0.0 - 1.0) position of the cropping window along width. The actual position is calculated as crop_x = crop_pos_x * (w - crop_w), where crop_pos_x is the normalized position, w is the width of the input tensor, and crop_w is the width of the cropping window. |
| crop_pos_y | Normalized (0.0 - 1.0) position of the cropping window along height. The actual position is calculated as crop_y = crop_pos_y * (h - crop_h), where crop_pos_y is the normalized position, h is the height of the input tensor, and crop_h is the height of the cropping window. |
| crop_pos_z | Only for volumetric data: normalized (0.0 - 1.0) position of the cropping window along depth. The actual position is calculated as crop_z = crop_pos_z * (d - crop_d), where crop_pos_z is the normalized position, d is the depth of the input tensor, and crop_d is the depth of the cropping window. |
| dtype | Output data type. |
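The crop position formulas and the mirror flag can be illustrated with a small host-side sketch. It is plain Python operating on a 2D list of pixel values and is not the operator's implementation; `crop_origin` and `crop_and_mirror` are hypothetical names, and truncating the computed offset to an integer is an assumption, since the rounding rule is not stated here.

```python
# Hypothetical reference helpers; not part of habana_frameworks.mediapipe.

def crop_origin(pos, size, crop):
    """Return the crop offset along one axis: pos * (size - crop).

    pos is the normalized position (0.0 - 1.0), size is the input extent,
    and crop is the crop window extent along the same axis.
    """
    if not 0.0 <= pos <= 1.0:
        raise ValueError("normalized position must be within 0.0 - 1.0")
    if not 0 < crop <= size:
        raise ValueError("crop extent must be non-zero and <= input extent")
    # Assumption: the fractional offset is truncated toward zero.
    return int(pos * (size - crop))


def crop_and_mirror(image, crop_w, crop_h, crop_pos_x, crop_pos_y, mirror):
    """Crop a 2D image (a list of rows), then optionally flip it horizontally."""
    h, w = len(image), len(image[0])
    x0 = crop_origin(crop_pos_x, w, crop_w)
    y0 = crop_origin(crop_pos_y, h, crop_h)
    window = [row[x0:x0 + crop_w] for row in image[y0:y0 + crop_h]]
    if mirror:
        window = [row[::-1] for row in window]
    return window


# 4 rows x 6 columns; each value encodes its position as row * 10 + column.
image = [[r * 10 + c for c in range(6)] for r in range(4)]

plain = crop_and_mirror(image, crop_w=2, crop_h=2,
                        crop_pos_x=0.5, crop_pos_y=1.0, mirror=0)
flipped = crop_and_mirror(image, crop_w=2, crop_h=2,
                          crop_pos_x=0.5, crop_pos_y=1.0, mirror=1)
```

With a position of 0.0 the window starts at the first pixel along that axis, and with 1.0 it ends at the last pixel, so the window always stays inside the input.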
Example: CropMirrorNorm Operator
The following code snippet shows the CropMirrorNorm operator used with normalized mean = [0.485, 0.456, 0.406] and normalized std = [0.229, 0.224, 0.225], both scaled to the 0-255 pixel range. The decoder resizes images to 200x200 (img_width and img_height in main()), and CropMirrorNorm crops them to crop_w=125, crop_h=125. output_scale=0.03125 and output_zerop=128 are used to quantize the data from FLOAT32 to UINT8.
from habana_frameworks.mediapipe import fn
from habana_frameworks.mediapipe.mediapipe import MediaPipe
from habana_frameworks.mediapipe.media_types import imgtype as it
from habana_frameworks.mediapipe.media_types import dtype as dt
import matplotlib.pyplot as plt
import numpy as np


class myMediaPipe(MediaPipe):
    def __init__(self, device, dir, queue_depth, batch_size, img_h, img_w):
        super().__init__(device,
                         queue_depth,
                         batch_size,
                         self.__class__.__name__)

        self.input = fn.ReadImageDatasetFromDir(shuffle=False,
                                                dir=dir,
                                                format="jpg")

        # Normalized mean scaled to the 0-255 pixel range
        mean_data = np.array([(0.485 * 255), (0.456 * 255), (0.406 * 255)],
                             dtype=dt.FLOAT32)

        # Inverse of the normalized std scaled to the 0-255 pixel range
        std_data = np.array([1 / (0.229 * 255), 1 / (0.224 * 255), 1 / (0.225 * 255)],
                            dtype=dt.FLOAT32)

        self.std_node = fn.MediaConst(data=std_data,
                                      shape=[1, 1, 3],
                                      dtype=dt.FLOAT32)

        self.mean_node = fn.MediaConst(data=mean_data,
                                       shape=[1, 1, 3],
                                       dtype=dt.FLOAT32)

        # WHCN
        self.decode = fn.ImageDecoder(device="hpu",
                                      output_format=it.RGB_P,
                                      resize=[img_w, img_h])

        self.cmn = fn.CropMirrorNorm(crop_w=125,
                                     crop_h=125,
                                     output_scale=0.03125,
                                     output_zerop=128)

        # WHCN -> CWHN
        self.transpose = fn.Transpose(permutation=[2, 0, 1, 3],
                                      tensorDim=4,
                                      dtype=dt.UINT8)

    def definegraph(self):
        images, labels = self.input()
        std = self.std_node()
        mean = self.mean_node()
        images = self.decode(images)
        # std holds the inverse standard deviation (the inv_std argument)
        images = self.cmn(images, mean, std)
        images = self.transpose(images)
        return images, labels


def display_images(images, batch_size, cols):
    # Round up so that every image in the batch gets a subplot
    rows = (batch_size + cols - 1) // cols
    plt.figure(figsize=(10, 10))
    for i in range(batch_size):
        plt.subplot(rows, cols, i + 1)
        plt.imshow(images[i])
        plt.axis("off")
    plt.show()


def main():
    batch_size = 6
    img_width = 200
    img_height = 200
    img_dir = "/path/to/images"
    queue_depth = 2
    columns = 3

    # Create media pipeline object
    pipe = myMediaPipe('hpu', img_dir, queue_depth, batch_size,
                       img_height, img_width)

    # Build media pipeline
    pipe.build()

    # Initialize media pipeline iterator
    pipe.iter_init()

    # Run media pipeline
    images, labels = pipe.run()

    # Copy data to host from device as numpy array
    images = images.as_cpu().as_nparray()
    labels = labels.as_cpu().as_nparray()

    # Display images
    display_images(images, batch_size, columns)


if __name__ == "__main__":
    main()
Output Images from CropMirrorNorm Operation 1
1. Licensed under a CC BY SA 4.0 license. The images used here are taken from https://data.caltech.edu/records/mzrjq-6wc02.
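The example passes output_scale=0.03125 and output_zerop=128 to quantize the normalized FLOAT32 values to UINT8. The exact rounding and clamping used by the operator are not stated on this page; the sketch below assumes the common affine scheme q = clamp(round(x / scale) + zero_point, 0, 255). The function names are hypothetical and the code runs on the host in plain Python.

```python
# Assumed affine quantization scheme, for illustration only. These helpers
# are hypothetical and are not part of habana_frameworks.mediapipe.

OUTPUT_SCALE = 0.03125  # 1 / 32, as in the example above
OUTPUT_ZEROP = 128      # UINT8 code that represents 0.0


def quantize_uint8(value, scale, zero_point):
    """Map a float to a UINT8 code: clamp(round(value / scale) + zero_point)."""
    q = round(value / scale) + zero_point
    return max(0, min(255, q))


def dequantize_uint8(q, scale, zero_point):
    """Recover the approximate float value from a UINT8 code."""
    return (q - zero_point) * scale


samples = [-10.0, -2.0, 0.0, 1.0, 10.0]
quantized = [quantize_uint8(v, OUTPUT_SCALE, OUTPUT_ZEROP) for v in samples]
restored = [dequantize_uint8(q, OUTPUT_SCALE, OUTPUT_ZEROP) for q in quantized]
```

Under this assumption, a scale of 1/32 with a zero point of 128 represents values from -4.0 to 3.96875 in steps of 0.03125; values outside that range saturate at 0 or 255.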