Runtime Environment Variables

The following table describes runtime flags that are set in the environment to change runtime behavior and to enable or disable certain features.

Table 1 Runtime Environment Variables

Flag

Default

Description

Consumer

PT_HPU_LAZY_MODE

1

Controls the execution mode:

  • 0x1 - Lazy mode

  • 0x2 - Eager mode

Habana PyTorch Bridge Modules
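For example, to switch a workload to eager mode, export the flag before launching the workload (a minimal sketch; the bridge reads the value at startup):

```shell
# Select eager mode; the default is lazy mode (PT_HPU_LAZY_MODE=1).
export PT_HPU_LAZY_MODE=2
echo "PT_HPU_LAZY_MODE=$PT_HPU_LAZY_MODE"
```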

GRAPH_VISUALIZATION

False

Creates graph visualization files. The output dump graphs are placed in the ./.graph_dumps folder.

SynapseAI

PT_RECIPE_CACHE_PATH

Unset

Path (directory) where compiled graph recipes are stored to accelerate a scale-up scenario. Only one process compiles each recipe; the other processes read it from disk.

If unset (default), compiled graph recipes are not stored on disk (recipe disk caching is disabled).

Note: If the recipe cache is shared among several processes (scale up), it must be stored on a local physical disk. Avoid remote drives (such as NFS) where file locks are not supported, as they may lead to instability and unpredictable behavior.

Habana PyTorch Bridge Modules

PT_CACHE_FOLDER_DELETE

False

By default (PT_CACHE_FOLDER_DELETE=False), if recipe caching is enabled (PT_RECIPE_CACHE_PATH is set), cached recipes from previous executions of the workload will be used if graphs remain the same. If set to True, the directory provided in PT_RECIPE_CACHE_PATH will be cleared when the workload starts.

Habana PyTorch Bridge Modules
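The two recipe-cache flags are typically used together. A minimal sketch, assuming /var/tmp is a local physical disk (the path is only an example):

```shell
# Store compiled recipes on a local disk (not NFS) shared by the
# processes on this node; the path below is only an example.
export PT_RECIPE_CACHE_PATH=/var/tmp/hpu_recipe_cache
mkdir -p "$PT_RECIPE_CACHE_PATH"

# Optionally clear stale recipes when the workload starts:
export PT_CACHE_FOLDER_DELETE=True
```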

PT_HPU_MAX_COMPOUND_OP_SIZE

INT64_MAX

Limits the internal graph size to the specified number of ops, which reduces the lazy mode memory overhead. This will be improved in future releases.

Note: This may affect performance.

Habana PyTorch Bridge Modules

HABANA_PGM_LRU_MAX

30000

If cache evictions cause performance degradation, increasing the cache size may improve performance. Note: A boost in performance may cause an increase in host memory consumption.

Habana PyTorch Bridge Modules

PT_HPU_LAZY_ACC_PAR_MODE

1

This flag turns on host time optimization of lazy ops accumulation. It offloads ops accumulation to a separate thread, thus reducing computation time of the main thread.

Habana PyTorch Bridge Modules

PT_HPU_METRICS_FILE

Unset

Path (file) where the collected metrics are stored. Metrics are stored in a file only when this flag is set.

Habana PyTorch Bridge Modules

PT_HPU_METRICS_DUMP_TRIGGER

process_exit

Specifies when the Habana PyTorch Bridge stores the collected metrics to the file. Takes effect only when PT_HPU_METRICS_FILE is set.

Supported values:

  • process_exit - stores metrics in a file at process exit.

  • mark_step - stores metrics in a file during mark_step.

  • metric_change - stores the metrics modified during the execution of the training script.

Multiple triggers can be enabled together by separating them with a comma, for example: PT_HPU_METRICS_DUMP_TRIGGER=process_exit,metric_change

Habana PyTorch Bridge Modules

PT_HPU_METRICS_FILE_FORMAT

json

Metrics file format. Both JSON and TEXT formats are supported:

  • json

  • text

Habana PyTorch Bridge Modules
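The three metrics flags are typically combined. A minimal sketch; the file path is only an example, and the flag names are as listed in the table above:

```shell
# Write metrics in JSON format, both at process exit and whenever
# a metric changes during the run.
export PT_HPU_METRICS_FILE=/tmp/hpu_metrics.json
export PT_HPU_METRICS_DUMP_TRIGGER=process_exit,metric_change
export PT_HPU_METRICS_FILE_FORMAT=json
```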

The following table describes runtime flags that are set in the environment to obtain SynapseAI and Habana PyTorch Bridge level logs.

Table 2 Runtime Environment Variables for Logging Mechanism

Flag

Default

Description

Consumer

PT_FORCED_TRACING_MASK

0

A bitmask specifying the components inside the Habana PyTorch Bridge module that are allowed to use profilers. Note that certain profilers may require additional environment variables to be set.

  • PT_DEVICE - 0x1

  • PT_KERNEL - 0x2

  • PT_BRIDGE - 0x4

  • PT_SYNHELPER - 0x8

  • PT_DISTRIBUTED - 0x10

  • PT_LAZY - 0x20

  • PT_TRACE - 0x40

  • PT_FALLBACK - 0x80

  • PT_STATS - 0x100

  • PT_TEST - 0x200

  • PT_DYNAMIC_SHAPE - 0x400

  • PT_DEVMEM - 0x800

  • PT_HABHELPER - 0x1000

  • PT_IRGRAPH - 0x2000

  • PT_VIEWTABLE - 0x4000

  • PT_REFINEMENT - 0x8000

  • PT_HOSTSTAT - 0x10000

  • PT_LAYOUTS - 0x20000

  • PT_PARALLEL_ACC - 0x40000

  • PT_LAZY_EAGER - 0x80000

  • PT_MEMLOG - 0x100000

  • PT_EXEC_THREAD - 0x200000

  • PT_EAGER - 0x400000

  • PT_RECIPE_STATS - 0x800000

Habana PyTorch Bridge Modules
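Component bits can be combined with a bitwise OR. For example, to allow profiling for PT_BRIDGE (0x4) and PT_LAZY (0x20):

```shell
# 0x4 | 0x20 = 0x24
export PT_FORCED_TRACING_MASK=$(printf '0x%x' $(( 0x4 | 0x20 )))
echo "$PT_FORCED_TRACING_MASK"   # 0x24
```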

ENABLE_CONSOLE

False

If set to true, enables printing SynapseAI and Habana PyTorch Bridge logs to the console.

SynapseAI and Habana PyTorch Bridge

LOG_LEVEL_ALL

5

Logging level from SynapseAI, perf_lib and Habana PyTorch Bridge.

  • 6 is no logs

  • 0 is verbose

By default, logs are printed to the console (if ENABLE_CONSOLE=true); otherwise they are stored under ~/.habana_logs/.

SynapseAI and Habana PyTorch Bridge
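For example, to print logs to the console at a reduced verbosity (the value 2 is only an example; the table defines 0 as verbose and 6 as no logs):

```shell
export ENABLE_CONSOLE=true
export LOG_LEVEL_ALL=2
```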

LOG_LEVEL_ALL_PT

5

Logging level for Habana PyTorch Bridge.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge
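The per-component LOG_LEVEL_PT_* flags override LOG_LEVEL_ALL_PT, so the bridge can be silenced overall while one component stays verbose. A minimal sketch:

```shell
# Silence bridge logging overall, but keep one component verbose.
export LOG_LEVEL_ALL_PT=6
export LOG_LEVEL_PT_FALLBACK=0   # overrides LOG_LEVEL_ALL_PT for PT_FALLBACK
```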

LOG_LEVEL_PT_DEVICE

5

Logging level for PT_DEVICE component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_KERNEL

5

Logging level for PT_KERNEL component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_BRIDGE

5

Logging level for PT_BRIDGE component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_SYNHELPER

5

Logging level for PT_SYNHELPER component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_DISTRIBUTED

5

Logging level for PT_DISTRIBUTED component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_LAZY

5

Logging level for PT_LAZY component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_TRACE

0

Logging level for PT_TRACE component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_FALLBACK

5

Logging level for PT_FALLBACK component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_STATS

5

Logging level for PT_STATS component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_TEST

5

Logging level for PT_TEST component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_DYNAMIC_SHAPE

5

Logging level for PT_DYNAMIC_SHAPE component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_DEVMEM

5

Logging level for PT_DEVMEM component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_HABHELPER

5

Logging level for PT_HABHELPER component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_IRGRAPH

5

Logging level for PT_IRGRAPH component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_VIEWTABLE

5

Logging level for PT_VIEWTABLE component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_REFINEMENT

5

Logging level for PT_REFINEMENT component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_HOSTSTAT

5

Logging level for PT_HOSTSTAT component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_LAYOUTS

5

Logging level for PT_LAYOUTS component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_PARALLEL_ACC

5

Logging level for PT_PARALLEL_ACC component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_LAZY_EAGER

5

Logging level for PT_LAZY_EAGER component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_MEMLOG

5

Logging level for PT_MEMLOG component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_EXEC_THREAD

5

Logging level for PT_EXEC_THREAD component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_EAGER

5

Logging level for PT_EAGER component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge

LOG_LEVEL_PT_RECIPE_STATS

5

Logging level for PT_RECIPE_STATS component of Habana PyTorch Bridge. If unset, LOG_LEVEL_ALL_PT will be used.

  • 6 is no logs

  • 0 is verbose

Habana PyTorch Bridge