Debugging Slow Convergence

This section provides suggested courses of action to take if your TensorFlow model converges slowly.

TensorBoard Usage


SynapseAI® Software can generate data describing HPU clusters for visualization in TensorBoard. When TensorBoard visualization is enabled, SynapseAI adds a tag, post_optimization_graph, that shows the clustered TF graph. In addition, if the environment variable GRAPH_VISUALIZATION is set to 1, SynapseAI creates additional tags for each Habana Op cluster, showing that cluster's Synapse graph before and after compilation by the Graph Compiler.
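As a minimal sketch, the variable can be set from Python instead of the shell. The helper name enable_cluster_graph_visualization is hypothetical, and the sketch assumes the variable must be in place before TensorFlow and the Habana module are loaded, which the text above does not state explicitly.

```python
import os


def enable_cluster_graph_visualization(env=None):
    """Set GRAPH_VISUALIZATION=1 in the given environment mapping.

    Hypothetical helper: defaults to the current process environment.
    """
    env = os.environ if env is None else env
    env["GRAPH_VISUALIZATION"] = "1"
    return env


# Assumption: call this before importing TensorFlow or loading the
# Habana module, so the setting is visible when the graph is clustered.
enable_cluster_graph_visualization()
```

Setting the variable in the launching shell has the same effect; the Python form is only convenient when the training script manages its own configuration.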

Trace Viewer

Profiling with TensorBoard is supported through standard callbacks injected into the model; see Collect performance data. By default, the SynapseAI® Profiling Subsystem dumps events from the Habana accelerator, and these events are displayed in the trace generated by the TensorBoard Profiler; see Tensorboard profiling keras. However, only a few iterations can be profiled in a single session because the trace buffer size is limited to 128 MB. This will be addressed in subsequent releases.
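Because the trace buffer is limited, it can help to restrict profiling to a short range of batches. The following is a minimal sketch using the standard Keras TensorBoard callback and its profile_batch argument; the helper names, the log directory, and the chosen batch range are illustrative assumptions, not values recommended by SynapseAI.

```python
def profiler_callback_kwargs(log_dir, first_batch, num_batches):
    """Build keyword arguments that profile a short, contiguous batch range.

    Hypothetical helper. Batches are numbered from 1 here, because
    profile_batch=0 disables profiling in the Keras TensorBoard callback.
    """
    if first_batch < 1 or num_batches < 1:
        raise ValueError("first_batch and num_batches must be >= 1")
    last_batch = first_batch + num_batches - 1
    return {"log_dir": log_dir, "profile_batch": (first_batch, last_batch)}


def make_profiler_callback(log_dir="logs/profile", first_batch=10, num_batches=3):
    """Create a tf.keras.callbacks.TensorBoard that profiles only a few batches."""
    # Imported lazily so the helper above can be used without TensorFlow.
    import tensorflow as tf

    kwargs = profiler_callback_kwargs(log_dir, first_batch, num_batches)
    return tf.keras.callbacks.TensorBoard(**kwargs)


# Usage, assuming `model` and `dataset` are defined elsewhere:
# model.fit(dataset, epochs=1, callbacks=[make_profiler_callback()])
```

Starting a few batches into training skips warm-up iterations; how many batches actually fit in the buffer depends on the model and is not specified here.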

Traces from the HPU might be displayed incorrectly in TensorBoard. For a better experience, follow the steps outlined in the Viewing Instructions section.

Model Graph

Set the following environment variables to generate a dump of the TensorFlow training graph:

$ # Train your model as usual

TensorFlow graphs will be written to the current directory.