Pull Prebuilt Containers

Prebuilt containers are provided in:

  • Habana Vault

  • Amazon ECR Public Library

  • AWS Deep Learning Containers (DLC)

Pull and Launch Docker Image - Habana Vault

Note

Before running docker, make sure to map the dataset as detailed in Map Dataset to Docker.

To pull and run the Habana Docker images, use the code examples below. Update the parameters listed in the following table to run the desired configuration.

Parameter     Description                   Values
$OS           Operating system of image     [ubuntu20.04, ubuntu22.04, amzn2, rhel8.6]
$TF_VERSION   Desired TensorFlow version    [2.13.1]
$PT_VERSION   Desired PyTorch version       [2.1.0]

Note

  • Include --ipc=host in the docker run command for PyTorch Docker images. This is required for distributed training using the Habana Collective Communication Library (HCCL), as it allows reuse of host shared memory for best performance.

  • To run the Docker image with only a subset of the available Gaudi devices, make sure to set the Device-to-Module mapping correctly. See Multiple Dockers Each with a Single Workload for further details.
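To make the note above concrete, the sketch below builds a run command for a container restricted to two Gaudi modules. The HABANA_VISIBLE_MODULES variable and the module IDs used here are assumptions for illustration; verify them against the Device-to-Module mapping described in Multiple Dockers Each with a Single Workload.

```shell
# Sketch: restrict one container to two Gaudi modules (env var name and
# module IDs are assumptions; confirm the mapping for your system first).
MODULES="0,1"
CMD="docker run -it --runtime=habana -e HABANA_VISIBLE_MODULES=${MODULES} --cap-add=sys_nice --net=host --ipc=host <image>"
echo "${CMD}"
```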

    docker pull vault.habana.ai/gaudi-docker/1.13.0/${OS}/habanalabs/tensorflow-installer-tf-cpu-${TF_VERSION}:latest
    docker run -it --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host vault.habana.ai/gaudi-docker/1.13.0/${OS}/habanalabs/tensorflow-installer-tf-cpu-${TF_VERSION}:latest
    docker pull vault.habana.ai/gaudi-docker/1.13.0/${OS}/habanalabs/pytorch-installer-${PT_VERSION}:latest
    docker run -it --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host vault.habana.ai/gaudi-docker/1.13.0/${OS}/habanalabs/pytorch-installer-${PT_VERSION}:latest
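As a concrete illustration of the parameter substitution, the sketch below resolves the TensorFlow image URI using one valid combination from the table (ubuntu22.04 and 2.13.1 are example choices, not requirements):

```shell
# Example substitution: OS and TF_VERSION are sample values from the
# parameter table above; pick the combination that matches your setup.
OS=ubuntu22.04
TF_VERSION=2.13.1
IMAGE="vault.habana.ai/gaudi-docker/1.13.0/${OS}/habanalabs/tensorflow-installer-tf-cpu-${TF_VERSION}:latest"
echo "${IMAGE}"
```

The resolved string is what docker pull and docker run receive after shell expansion.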

AWS Deep Learning Containers

To set up and use AWS Deep Learning Containers, follow the instructions detailed in AWS Available Deep Learning Containers Images.