# Debugging Possible Model Errors

This section suggests courses of action for debugging general TensorFlow model issues you may encounter.

## Generate Logs

If you encounter problems while training a model on Gaudi, it is frequently useful to generate and inspect your log files. By inspecting log files, you can pinpoint where a model failure is occurring, and alter your model or training script to resolve or work around defects.

The generation of logging information and the location of logged information are controlled by environment variables. For example, if you set the following environment variables before training your model, a large amount of information will be generated under `~/.habana_logs/`:

```bash
$ export HABANA_LOGS=~/.habana_logs
$ export LOG_LEVEL_ALL=0
$ # Train your model as usual
```

The sections below detail the various logging environment variables and describe their values.

### Location of Log Files

`ENABLE_CONSOLE=true` outputs the logs to the console. If `ENABLE_CONSOLE` is unset, or set to any value other than `true`, logs are written to the directory specified by `HABANA_LOGS`. For example, if you set the following environment variables, all SynapseAI errors will be logged to the console:

```bash
$ export ENABLE_CONSOLE=true
$ export LOG_LEVEL_ALL=4
$ # Train your model as usual
```
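The same configuration can also be applied from the training script itself with `os.environ`. A minimal sketch, assuming the variables take effect only if they are set before the Habana and TensorFlow modules are loaded:

```python
import os

# Set the logging configuration before importing TensorFlow or the Habana
# modules; the libraries read these variables when they are loaded.
os.environ["HABANA_LOGS"] = os.path.expanduser("~/.habana_logs")
os.environ["LOG_LEVEL_ALL"] = "0"      # Trace: log everything
os.environ["ENABLE_CONSOLE"] = "true"  # also mirror logs to the console
```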


### Log Levels

| Value | Level | Description |
| --- | --- | --- |
| 0 | Trace | Log everything, including traces of progress |
| 1 | Debug | Log all errors, warnings, and all information useful for debugging |
| 2 | Info | Log errors, warnings, and some informative messages |
| 3 | Warning | Log all errors and warnings |
| 4 | Error | Log all errors |
| 5 | Critical | Log only critical errors |
| 6 | Off | Log nothing |

### Component-Level Logs

The value of `LOG_LEVEL_ALL=[log level]` sets the logging level for all components. However, it is sometimes useful to view detailed information for a single component.

To specify the log level for a particular component, append the name of the component to `LOG_LEVEL_`.

For example, if you set the following environment variables, all components will log only critical errors (`LOG_LEVEL_ALL=5`) except for the SynapseAI API (`LOG_LEVEL_SYN_API=3`), which will log all errors and warnings:

```bash
$ export HABANA_LOGS=~/.habana_logs
$ export LOG_LEVEL_ALL=5
$ export LOG_LEVEL_SYN_API=3
$ # Train your model as usual
```


### Names of Components that Produce Logs

| Component | Log Component Name(s) |
| --- | --- |
| The SynapseAI API | `SYN_API` |
| The profiling subsystem | `SYN_PROF`, `PROF_hl[0-7]`, and `HLPROF` |
| The graph compiler | `PARSER`, `GC`, and `GRAPH_DATA` |
| The Habana performance library | `PERF_LIB` |
| The Habana Communication Library | `HCL` and `HCL_SUBMISSIONS` |

## Generate TensorFlow Logs

You can set the following environment variables to obtain TensorFlow Habana bridge-level logs:

```bash
$ export TF_CPP_MIN_LOG_LEVEL=xxxx
$ export TF_CPP_MIN_VLOG_LEVEL=yyyy
```


Please refer to the Runtime Environment Variables section for a description of the above environment variables.

## Log TensorFlow Operation Assignments

`tf.debugging.set_log_device_placement(True)` prints the assignments of tensors and operations to devices.
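For example, in a stock TensorFlow 2 session (no Gaudi device is required to see the placement messages), the following sketch logs where each operation runs:

```python
import tensorflow as tf

# Log the device each operation and tensor is assigned to.
tf.debugging.set_log_device_placement(True)

a = tf.constant([1.0, 2.0])
b = tf.constant([3.0, 4.0])
c = a + b  # the placement of this add is logged
```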

## Move TensorFlow Operators

Under certain circumstances, a Habana operator does not support a tensor of a given shape or data type. If the tensor cannot be reshaped or cast to a supported type, use the `tf.device` context manager in your TensorFlow model to schedule the operator for execution on the CPU.
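A minimal sketch of this fallback, assuming (purely for illustration) that the float64 matrix multiply below is the unsupported operator:

```python
import tensorflow as tf

# Assume the HPU kernel does not support this shape/dtype combination and the
# tensor cannot be cast or reshaped; pin just this operator to the host CPU.
x = tf.random.uniform([4, 4], dtype=tf.float64)

with tf.device("/CPU:0"):
    y = tf.linalg.matmul(x, x)  # runs on the CPU; surrounding ops stay on Gaudi
```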

## Let TensorFlow Choose the Device

Adding `tf.config.set_soft_device_placement(True)` lets TensorFlow fall back to an available device when an operation cannot be placed on the requested one, which may prevent some compilation errors.
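A minimal sketch: with soft placement enabled, an operation pinned to a device that is absent (here a hypothetical GPU, used only for illustration) falls back to an available device instead of raising an error:

```python
import tensorflow as tf

# Let TensorFlow pick a fallback device when the requested one is unavailable
# or lacks a kernel for the operation.
tf.config.set_soft_device_placement(True)

with tf.device("/device:GPU:0"):  # illustrative: this device may not exist
    z = tf.constant(1.0) + tf.constant(2.0)  # placed on the CPU if needed
```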

## Error Codes

When calling the SynapseAI API directly, it is useful to check the return codes against the following symbolic or integer values to understand the outcome of each operation.