Python Package (habana_frameworks.tensorflow)

This package provides the Python-level interface of the TensorFlow bridge for training on Gaudi. Its most significant module is library_loader, which contains the load_habana_module() function. This function is also exposed directly on the package after import, and calling it initializes the Habana module. See Loading the Habana Module for more information about the loading function.

Example:

import habana_frameworks.tensorflow as htf
htf.load_habana_module()

The following sections provide a brief description of each module.

Note

Any modules or namespaces that are part of habana-tensorflow but are not listed below are internal.

distribute

The distribute module contains the HPUStrategy class, a drop-in replacement for the tensorflow.distribute.MultiWorkerMirroredStrategy class. See Distributed Training with TensorFlow.
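Because HPUStrategy is described as a drop-in replacement, a script can create it when the package is installed and otherwise continue without a distribution strategy. The following is a minimal sketch under that assumption. The helper name make_hpu_strategy is hypothetical, and the no-argument HPUStrategy() call is assumed to mirror the default MultiWorkerMirroredStrategy() constructor. Cluster configuration is not shown; see Distributed Training with TensorFlow.

```python
def make_hpu_strategy():
    """Return an HPUStrategy instance, or None when habana-tensorflow is absent."""
    try:
        import habana_frameworks.tensorflow as htf
        from habana_frameworks.tensorflow.distribute import HPUStrategy
    except ImportError:
        # Package (or TensorFlow itself) is not installed in this environment.
        return None
    # The bridge must be loaded before HPU devices can be used.
    htf.load_habana_module()
    # Assumption: the default constructor needs no arguments.
    return HPUStrategy()


strategy = make_hpu_strategy()
if strategy is None:
    print("habana-tensorflow not installed; no HPUStrategy created")
else:
    with strategy.scope():
        pass  # build and compile the model here
```

Returning None instead of raising keeps the calling script usable on machines without Gaudi, at the cost of an explicit check by the caller.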

grads

The grads module contains gradients for the public Ops implemented inside the TensorFlow bridge library. These gradients are registered in TensorFlow automatically when load_habana_module() is called.

All functions are private.

habana_device

The habana_device module contains the Python interface for extra features of the habana_device library. It also contains support for handling custom events. The API functions are listed below:

  • get_hw_capabilities() - Reads HW device capabilities (useful to distinguish different generations). Example:

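import tensorflow as tf
import habana_frameworks.tensorflow as htf
import habana_frameworks.tensorflow.habana_device as hd

htf.load_habana_module()  # register the HPU device before using it
habana_device = "/device:HPU:0"  # name of the first HPU device

with tf.device(habana_device):
    caps = hd.get_hw_capabilities()
    print("hw_capabilities = {}".format(caps))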
  • get_type() - Returns, as text, the name of the Gaudi generation present in the platform, e.g. GAUDI2.

  • enable_synapse_logger() - Enables logging of Synapse and HCCL API calls to a JSON file. This is a private API and should not be used. For tracing, use the mechanisms described in Profiling with TensorFlow.

  • enable_synapse_api() - Forwards API calls to Synapse normally. This is the default state. If run after enable_synapse_logger(), it disables logging of Synapse and HCCL API calls.

  • enable_null_hw() - Replaces Synapse API calls with stubs, so nothing is executed on HW. Used only for internal measurements.

  • enable_allocator_stats() - Enables gathering of allocator statistics for HPU. To obtain the statistics, run tf.config.experimental.get_memory_info() with the HPU device name as an argument. Gathering statistics requires a mutex, so it may affect performance. If the statistics are still empty after calling this function (i.e. they always show 0 allocations), try calling it earlier in the lifetime of your program. You can use the following code:

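import tensorflow as tf
import habana_frameworks.tensorflow as htf

htf.load_habana_module()
logical_devices = tf.config.list_logical_devices()
for device in logical_devices:
    if device.device_type == "HPU":
        with tf.device(device.name):
            # Run a trivial op to force HPU initialization.
            tf.constant(0, dtype=tf.int32, name="forced_hpu_initialization").numpy()
            htf.habana_device.enable_allocator_stats()
        # Read the statistics gathered for this device.
        print(tf.config.experimental.get_memory_info(device.name))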
  • log.info(message: str) - Passes message as an info entry to the internal TF logging system. The message can be printed on screen if a sufficient log level is set.

  • log.warning(message: str) - Passes message as a warning to the internal TF logging system. The message can be printed on screen if a sufficient log level is set.

  • log.error(message: str) - Passes message as an error to the internal TF logging system. The message can be printed on screen if a sufficient log level is set.

  • log.fatal(message: str) - Passes message as a fatal error to the internal TF logging system and stops further script execution. The message can be printed on screen if a sufficient log level is set.

  • event_log.enable_event_log(log_output_directory) - Enables logging of TF bridge events into an event file.

    • Parameter: log_output_directory - Destination directory.

  • event_log.create_graph_compilation_tuple_from_event(event) - Converts a graph compilation event protobuf message into a GraphCompilationEvent tuple.

  • event_log.set_event_dispatcher(func) - Configures a function to be called on each event as it is produced by the Habana device.

  • event_log.log_custom_event(message: str) - Logs a custom TF event in the event log enabled via enable_event_log(). It creates a TF event with the custom_event tag and stores message inside it as a string tensor value.
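The event_log functions above can be combined so that a script records events to a directory and also reacts to them in Python. The sketch below is illustrative only. EventCollector and attach_collector are hypothetical names, the import path habana_frameworks.tensorflow.habana_device.event_log is an assumption based on the module layout described in this section, and load_habana_module() is assumed to have been called beforehand. The structure of the event object passed to the callback is not specified here, so the collector stores events without inspecting them.

```python
import threading


class EventCollector:
    """Callable that stores every event it receives."""

    def __init__(self):
        self._lock = threading.Lock()
        self.events = []

    def __call__(self, event):
        # Lock in case the callback is invoked from another thread.
        with self._lock:
            self.events.append(event)


def attach_collector(collector, log_dir=None):
    """Register collector as the event dispatcher.

    Returns False when habana-tensorflow is not installed.
    """
    try:
        # Assumed import path; not confirmed by this section.
        from habana_frameworks.tensorflow.habana_device import event_log
    except ImportError:
        return False
    if log_dir is not None:
        event_log.enable_event_log(log_dir)
    event_log.set_event_dispatcher(collector)
    return True


collector = EventCollector()
attached = attach_collector(collector)
print("dispatcher attached:", attached)

# Direct call on a separate instance, only to show the callback contract.
demo = EventCollector()
demo("example-event")
print(len(demo.events))
```

Using a callable object rather than a bare function keeps the collected events accessible to the rest of the script after training finishes.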

habana_estimator

The habana_estimator module provides a custom tf.estimator.Estimator that allows data pre-fetching to Gaudi. For more information about tf.estimator.Estimator, see https://www.tensorflow.org/api_docs/python/tf/estimator/Estimator .

library_loader

The library_loader module is the main entry point to the Python package. It contains the load_habana_module() function, which must be called to initialize the TensorFlow bridge library and enable training on Gaudi. It also provides the load_op_library() function for use with the TensorFlow CustomOp API. The API functions are listed below:

  • is_loaded() - Checks whether the Habana TensorFlow bridge binary has already been loaded.

  • is_op_override_allowed() - Checks whether overriding subgraph-based ops with their Habana-optimized versions is allowed.

  • set_op_override_allowed(allow_op_override) - Set allow_op_override to True to allow overriding subgraph-based ops with their Habana-optimized versions. Set it to False to prevent the override. The override may be required for explicit device placement in some cases (e.g. LayerNorm).

  • load_habana_module(allow_op_override: bool = None) - Searches for the Habana TensorFlow bridge binary compatible with the current TF version and loads it.

    • Parameter: allow_op_override - Allows overriding subgraph-based ops with their Habana-optimized versions. Enabled by default. May be required for explicit device placement in some cases (e.g. LayerNorm). If None is passed, the current setting is left unchanged. A bool value overrides the previous setting.

  • load_op_library(lib_path) - Loads a Habana custom ops library. To ensure proper initialization of TensorFlow and Habana-TensorFlow, custom ops libraries must be loaded with this function. load_habana_module() must be called before this function.

    • Parameter: lib_path - Path to the custom ops library to be loaded, including the filename. The library must be a compiled .so file.
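The ordering constraints above (load the bridge first, then any custom ops library) can be wrapped in one helper. This is a minimal sketch, not the package's own initialization routine. The name init_habana is hypothetical, and skipping load_habana_module() when is_loaded() already returns True is a design choice of this example rather than a documented requirement.

```python
def init_habana(custom_op_lib=None, allow_op_override=None):
    """Load the bridge once, then an optional custom ops library.

    Returns True when the bridge is loaded, False when the
    habana-tensorflow package is not installed.
    """
    try:
        from habana_frameworks.tensorflow import library_loader
    except ImportError:
        return False

    if not library_loader.is_loaded():
        # None leaves the override setting unchanged; a bool replaces it.
        library_loader.load_habana_module(allow_op_override=allow_op_override)
    elif allow_op_override is not None:
        # Bridge already loaded: only update the override setting.
        library_loader.set_op_override_allowed(allow_op_override)

    if custom_op_lib is not None:
        # load_habana_module() must have run before load_op_library().
        library_loader.load_op_library(custom_op_lib)
    return True


# Example call without a custom ops library. A real call would pass the
# path to a compiled .so file as custom_op_lib.
print("bridge loaded:", init_habana())
```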

lib_utils

The lib_utils module validates the environment of the installed habana-tensorflow package and searches for the libraries needed for initialization. The API functions are listed below:

  • get_includes_location() - Returns the location of the API headers.

multinode_helpers

The multinode_helpers module initializes the multinode environment during the call to load_habana_module().

ops

The ops module contains the public custom Ops implemented inside the TensorFlow bridge library. The following lists the Ops:

profiling

The profiling module contains utility profiling functions that can be enabled via environment variables. All APIs are private.

py_synapse_logger

The py_synapse_logger module contains a Python wrapper for the synapse logger library. All APIs are private.

synapse_logger_helpers

The synapse_logger_helpers module contains helper functions for using the py_synapse_logger module. All APIs are private.

sysconfig

The sysconfig module, similar to tf.sysconfig, contains __version__ and functions that retrieve the information needed to compile custom ops:

  • Library location

  • Include location

  • Compiler flags

  • Linker flags

For more details on the CustomOp API, see TensorFlow CustomOp API.
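To show how the four pieces of information listed above fit together, the sketch below assembles a compiler command line for a custom op library. The names of the Habana sysconfig functions are not given in this section, so the hypothetical helper build_custom_op_command takes the flags as plain lists instead of calling them. For TensorFlow itself, tf.sysconfig.get_compile_flags() and tf.sysconfig.get_link_flags() return lists of this kind. All paths and flags in the example call are placeholders.

```python
import shlex


def build_custom_op_command(sources, output, compile_flags, link_flags,
                            compiler="g++"):
    """Assemble a shared-library build command for a custom op."""
    return [compiler, "-shared", "-fPIC", *sources, "-o", output,
            *compile_flags, *link_flags]


# Placeholder values for illustration only. In practice the include and
# compiler flags, and the library and linker flags, come from sysconfig.
example_cmd = build_custom_op_command(
    sources=["my_op.cpp"],
    output="libmy_op.so",
    compile_flags=["-I/hypothetical/include", "-O2"],
    link_flags=["-L/hypothetical/lib"],
)
print(shlex.join(example_cmd))
```

Returning the command as a list makes it safe to pass to subprocess.run() without shell quoting. The resulting .so file is the kind of library that load_op_library() expects.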