Getting Started Guide Launch EC2 with Habana

This document explains how to set up a Habana Deep Learning AMI on Amazon EC2 and provides release notes for the Habana image.

On Amazon Web Services (AWS), Habana provides several virtual machine images (VMIs), known within the AWS ecosystem as Amazon Machine Images (AMIs). These are deep learning optimized AMIs for AWS instances with Habana Gaudi training cards.

There are two core types of Habana AMIs:

AMI Type | Description
Deep Learning AMI (DLAMI) | Preconfigured with Habana SynapseAI software and AWS’s modified machine learning frameworks TensorFlow and PyTorch.
Deep Learning Base AMI | A lightweight AMI preconfigured with Habana SynapseAI software and Docker engine.

For a list of available Habana AMIs, refer to AWS Habana Marketplace.

AWS supplies Habana’s Gaudi through DL1 instance types. Combining a Habana AMI with a DL1 instance type provides a platform for accelerated machine learning and development.

For those familiar with the AWS platform, launching the instance is as simple as logging in, selecting the Habana AMI of choice, configuring settings as needed, and launching the instance. After launch, you can SSH into the instance and start building deep learning, machine learning, and data science applications that use the Gaudi hardware for accelerated training and development.

Prerequisites

These instructions assume the following:

Getting Started

Perform these preliminary setup steps before creating an EC2 instance.

DL1 Availability Zones

DL1 instance types are available in the US East (N. Virginia) and US West (Oregon) regions. Within those regions, DL1 instances are supported only in specific Availability Zones. Follow the steps below to find the supported Availability Zones:

  1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/v2/home.

  2. In the navigation bar at the top of the screen, the current Region is displayed (for example, US East (Ohio)). Choose either US East (N. Virginia) or US West (Oregon) region.

  3. Select Instance Types.

../../_images/EC2_dl1_instance_type.png
  4. Search for dl1.

../../_images/EC2_dl1_availability_zone.png
  5. Click on the dl1.24xlarge instance type.

  6. Find Availability Zones in the Networking section and note the listed zones. The listed Availability Zones differ depending on the currently selected region.

../../_images/EC2_dl1_availability_zone_networking.png
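
The console lookup above can also be scripted. The following is a minimal sketch, assuming the boto3 library is installed and AWS credentials are configured. The helper name dl1_availability_zones is hypothetical, and the sketch ignores result pagination, which should not matter for a single instance type.

```python
def dl1_availability_zones(ec2_client, instance_type="dl1.24xlarge"):
    """Return the sorted Availability Zones that offer instance_type.

    ec2_client is expected to behave like boto3.client("ec2") created
    for the region of interest (for example us-east-1 or us-west-2).
    """
    response = ec2_client.describe_instance_type_offerings(
        LocationType="availability-zone",
        Filters=[{"Name": "instance-type", "Values": [instance_type]}],
    )
    # Each offering carries the zone name in its "Location" field.
    return sorted(o["Location"] for o in response["InstanceTypeOfferings"])


# Example usage (needs boto3 and valid AWS credentials):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   print(dl1_availability_zones(ec2))
```

The client is passed in as a parameter so that the same function can be pointed at either supported region.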

Network Security

The following steps set up a VPC and a security group, which provide a basic level of network security against unwanted access.

  1. Create a Virtual Private Cloud (Get started with Amazon VPC)

    1. Follow only Step 1: Creating the VPC

    2. Follow Create a Subnet in your VPC, using the Availability Zone found in DL1 Availability Zones above

  2. Create a Security Group for the VPC (Create a Security Group)

    1. Follow Authorize inbound traffic for your Linux instances to get access to the EC2 Instance
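
For readers who prefer scripting these steps, here is a rough sketch using boto3-style EC2 client calls. The function names, the group name "dl1-ssh", and the CIDR defaults are illustrative assumptions, not values from this guide. The sketch also omits the internet gateway and route table that a publicly reachable instance needs, which the linked AWS guides cover.

```python
def ssh_ingress_rule(cidr):
    """Build an inbound rule that allows SSH (TCP port 22) from cidr."""
    return {
        "IpProtocol": "tcp",
        "FromPort": 22,
        "ToPort": 22,
        "IpRanges": [{"CidrIp": cidr}],
    }


def create_dl1_network(ec2_client, availability_zone, ssh_cidr,
                       vpc_cidr="10.0.0.0/16", subnet_cidr="10.0.0.0/24"):
    """Create a VPC, a subnet in the given zone, and an SSH security group."""
    vpc_id = ec2_client.create_vpc(CidrBlock=vpc_cidr)["Vpc"]["VpcId"]
    # The subnet must live in an Availability Zone that supports dl1.
    subnet_id = ec2_client.create_subnet(
        VpcId=vpc_id,
        CidrBlock=subnet_cidr,
        AvailabilityZone=availability_zone,
    )["Subnet"]["SubnetId"]
    group_id = ec2_client.create_security_group(
        GroupName="dl1-ssh",  # hypothetical name
        Description="SSH access to DL1 instances",
        VpcId=vpc_id,
    )["GroupId"]
    ec2_client.authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=[ssh_ingress_rule(ssh_cidr)],
    )
    return {"vpc_id": vpc_id, "subnet_id": subnet_id,
            "security_group_id": group_id}
```

Restricting ssh_cidr to your own address range, rather than 0.0.0.0/0, keeps the inbound rule as narrow as possible.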

Create an EC2 Instance

Follow the step-by-step instructions below to launch a Habana EC2 instance. For more information on launching EC2 instances, refer to Get Started with Amazon EC2 Linux.

Initiate Instance Launch

  1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.

  2. From the Amazon EC2 console dashboard, choose Launch instance.

Choose an Amazon Machine Image (AMI)

Choose a Habana AMI to launch:

Habana Deep Learning Base AMI

This is a lightweight AMI that can be used as a base for custom AMIs. It contains the essential setup needed to run accelerated deep learning training on Gaudi.

The major difference between the Base AMI and the DLAMI is that the Base AMI does not have any machine learning frameworks, such as TensorFlow or PyTorch, installed. To perform training, you can either develop on Habana Prebuilt Containers or AWS Deep Learning Containers (see Pull Prebuilt Containers), or install the desired machine learning framework (see Install Native Frameworks).

Search for “Habana Deep Learning Base AMI” and choose the desired operating system. These AMIs are located under AWS Marketplace AMIs.

../../_images/step_1_choose_base_AMI.png

Deep Learning AMI Habana

This Habana AMI is built by AWS and comes preconfigured with Habana SynapseAI software and AWS’s modified machine learning frameworks TensorFlow and PyTorch. Deep learning training can be performed directly on the instance without the use of Docker.

Search for “Deep Learning AMI Habana” and choose the desired machine learning framework, SynapseAI version, and operating system. These AMIs are located in the Community AMIs section.

Note

AMI names look like “Deep Learning AMI Habana [TensorFlow, PyTorch] * SynapseAI *”.

../../_images/step_1_choose_DLAMI.png
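
The naming pattern in the note above can double as a search filter. The sketch below assumes boto3 and uses the EC2 describe_images call, whose name filter accepts * as a wildcard. The helper names are hypothetical, and the sketch does not restrict the image owner, so review the results before choosing an AMI.

```python
def habana_dlami_name_filter(framework):
    """Build a describe_images filter from the documented AMI name pattern."""
    pattern = f"Deep Learning AMI Habana {framework} * SynapseAI *"
    return [{"Name": "name", "Values": [pattern]}]


def find_habana_dlamis(ec2_client, framework="PyTorch"):
    """Return (image_id, name) pairs for matching AMIs, newest first."""
    response = ec2_client.describe_images(
        Filters=habana_dlami_name_filter(framework)
    )
    # CreationDate is an ISO 8601 string, so string order is date order.
    images = sorted(response["Images"],
                    key=lambda image: image["CreationDate"],
                    reverse=True)
    return [(image["ImageId"], image["Name"]) for image in images]


# Example usage (needs boto3 and valid AWS credentials):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   for image_id, name in find_habana_dlamis(ec2, "TensorFlow"):
#       print(image_id, name)
```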

Choose Instance Type

Choose the dl1.24xlarge instance type to run on Habana Gaudi.

../../_images/step_2_choose_dl1.png

Configure Instance Details

  • Choose a valid Network and Subnet for secure connections.

  • Ensure that the dl1.24xlarge is available in the Availability Zone of the selected subnet.

Add Storage

Modify storage to suit your needs. The dl1.24xlarge comes with four local NVMe drives of 1 TB each. However, these drives are local to the allocated instance, and their contents do not persist when the instance is stopped and started.

Add Tags

Add any needed tags for the EC2 instance.

Configure Security Group

Choose the desired security group for your EC2 instance.

Review

  1. Verify your configuration is correct and proceed to Launch.

  2. Choose or create a key pair, then launch the instance.

You have now launched an EC2 Instance.
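
The console steps above map onto a single EC2 run_instances request. The following sketch, assuming boto3, shows one way to assemble it. The helper names and the "habana-dl1" tag value are hypothetical, and the AMI ID, key pair, subnet, and security group must come from your own account.

```python
def dl1_launch_params(image_id, key_name, subnet_id, security_group_id,
                      name_tag="habana-dl1"):
    """Assemble run_instances arguments for one dl1.24xlarge instance."""
    return {
        "ImageId": image_id,                      # chosen Habana AMI
        "InstanceType": "dl1.24xlarge",           # Gaudi instance type
        "KeyName": key_name,                      # key pair used for SSH
        "SubnetId": subnet_id,                    # subnet in a dl1 zone
        "SecurityGroupIds": [security_group_id],
        "MinCount": 1,
        "MaxCount": 1,
        "TagSpecifications": [{
            "ResourceType": "instance",
            "Tags": [{"Key": "Name", "Value": name_tag}],
        }],
    }


def launch_dl1(ec2_client, **settings):
    """Launch the instance and return its instance ID."""
    response = ec2_client.run_instances(**dl1_launch_params(**settings))
    return response["Instances"][0]["InstanceId"]
```

Keeping the parameter assembly separate from the API call makes it easy to inspect the request before anything is launched.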

Connect to your Instance

Follow the Connect to your Linux instance using an SSH Client guide.

Once you establish an SSH connection, run the command below to verify that the dl1.24xlarge instance is working:

hl-smi

Now you have the power of Habana’s Gaudi for AI applications.
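
The hl-smi check can also be run from a script, for example as part of instance bring-up automation. This is a minimal sketch using only the Python standard library. The function name check_gaudi is hypothetical, and the sketch only assumes that hl-smi exits with status 0 when it can report on the devices.

```python
import shutil
import subprocess


def check_gaudi(command="hl-smi", run=subprocess.run, which=shutil.which):
    """Run the given command and return (ok, output).

    ok is True only if the command is on PATH and exits with status 0.
    run and which are parameters so they can be replaced in tests.
    """
    if which(command) is None:
        return False, f"{command} not found on PATH"
    result = run([command], capture_output=True, text=True)
    return result.returncode == 0, result.stdout or result.stderr


# Example usage on the instance:
#   ok, output = check_gaudi()
#   print(output)
```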

Pull Habana Docker Images

Follow the instructions outlined in the Installation Guide to train on Habana TensorFlow or PyTorch containers.

Setup Complete

Refer to TensorFlow User Guide for instructions on developing with TensorFlow.

Refer to PyTorch User Guide for instructions on developing with PyTorch.

Release Notes for Habana Amazon Machine Images on AWS

Habana Deep Learning Base AMI provides a foundational platform for deep learning on Amazon EC2 instances with Habana® Gaudi® and Docker. The Habana Gaudi processor is designed to maximize training throughput and efficiency, while providing developers with optimized software and tools that scale to many workloads and systems.

This AMI is suitable for deploying your own custom deep learning environment at scale.

The Habana Deep Learning Base AMI is provided at no additional charge to Amazon EC2 users.

Below are the core components of Habana Deep Learning Base AMI:

  • Habana SynapseAI®

  • Containerization platforms including Docker and habanalabs-container-runtime to run Gaudi accelerated Docker containers.

For an in-depth guide to getting started with Gaudi, follow the guides located at https://developer.habana.ai/resources/getting-started-with-gaudi/

Versioning

Refer to the Release Notes of the Gaudi Architecture and Software.

Image Name

Available Habana AMIs can be found at AWS Marketplace Habana

Habana Deep Learning Base AMI Packages

The following packages are installed in the Habana Deep Learning Base AMI:

  • habanalabs-dkms

  • habanalabs-firmware

  • habanalabs-firmware-tools

  • habanalabs-container-runtime

  • habanalabs-thunk

  • habanalabs-graph

  • Docker Software