Debugging Slow Convergence
This section provides suggested courses of action to take if your TensorFlow model converges slowly.
Use TensorBoard to visualize the model and its training progression
View the Model Graph
Use the Profiler to identify bottlenecks
SynapseAI® Software can generate data representing HPU clusters to be visualized by TensorBoard. When TensorBoard visualization is enabled, SynapseAI adds a tag, post_optimization_graph, that visualizes the clustered TF graph. Furthermore, if the environment variable GRAPH_VISUALIZATION=1 is set, additional tags are created for each Habana Op cluster, visualizing the cluster's Synapse graph before and after compilation by the Graph Compiler.
Profiling with TensorBoard is supported through standard callbacks injected into the model. See Collect performance data. By default, the SynapseAI® Profiling Subsystem dumps events from the Habana accelerator, which are then displayed in the trace generated by the TensorBoard Profiler. See Tensorboard profiling keras. However, only a few iterations can be profiled in a single session because the trace buffer size is limited to 128 MB. This will be addressed in subsequent releases.
Traces from the HPU might be displayed incorrectly in TensorBoard. For a better experience, follow the steps outlined in the Viewing Instructions section.
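As a rough illustration of the callback-based profiling described above, the following sketch builds a Keras TensorBoard callback that profiles only a short range of batches, which helps keep the trace within the limited buffer. The helper names, the "logs" directory, and the default batch numbers are assumptions made for this example and are not taken from the Habana documentation. Only `tf.keras.callbacks.TensorBoard` and its `log_dir` and `profile_batch` arguments are standard TensorFlow.

```python
from typing import Tuple


def profile_batch_range(first_batch: int, num_batches: int) -> Tuple[int, int]:
    """Return an inclusive (start, stop) batch range to profile.

    Hypothetical helper: keeping num_batches small limits how much
    trace data is collected in one profiling session.
    """
    if first_batch < 1 or num_batches < 1:
        raise ValueError("first_batch and num_batches must be >= 1")
    return (first_batch, first_batch + num_batches - 1)


def make_tensorboard_callback(log_dir: str = "logs",
                              first_batch: int = 10,
                              num_batches: int = 3):
    """Build a standard Keras TensorBoard callback with profiling enabled.

    TensorFlow is imported lazily so profile_batch_range can be used
    and tested on machines without TensorFlow installed.
    """
    import tensorflow as tf

    return tf.keras.callbacks.TensorBoard(
        log_dir=log_dir,
        # A pair of positive integers selects a range of batches to profile.
        profile_batch=profile_batch_range(first_batch, num_batches),
    )


# Usage (assumes an already compiled Keras model and a dataset):
# model.fit(dataset, epochs=1, callbacks=[make_tensorboard_callback()])
```

Starting the profiled range a few batches into training, rather than at the first batch, avoids capturing one-time warm-up work such as graph compilation.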
Set the following environment variables to generate a dump of the TensorFlow training graph:
$ export LOG_LEVEL_GRAPH_DATA=0 GRAPH_VISUALIZATION=1 HBN_TF_GRAPH_DUMP=2
$ # Train your model as usual
TensorFlow graphs will be written to the current directory.
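If training is launched from a Python driver rather than an interactive shell, the same variables can be passed to the training process explicitly. The sketch below is one possible way to do this. The function names and the `train.py` script name are hypothetical, and it assumes that the variables are read from the environment of the training process, as with the shell `export` above.

```python
import os
import subprocess
import sys
from typing import Dict, Mapping, Optional

# Values copied from the export command in this section.
GRAPH_DUMP_ENV = {
    "LOG_LEVEL_GRAPH_DATA": "0",
    "GRAPH_VISUALIZATION": "1",
    "HBN_TF_GRAPH_DUMP": "2",
}


def graph_dump_env(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the base environment with the graph dump variables set."""
    env = dict(os.environ if base is None else base)
    env.update(GRAPH_DUMP_ENV)
    return env


def run_training(script: str, workdir: str = ".") -> int:
    """Run a training script with graph dumping enabled.

    The graphs are written to the current directory of the training
    process, so workdir determines where the dump files end up.
    """
    result = subprocess.run(
        [sys.executable, script],
        env=graph_dump_env(),
        cwd=workdir,
        check=False,
    )
    return result.returncode


# Usage (train.py is a placeholder for your own training script):
# exit_code = run_training("train.py", workdir="graph_dumps")
```

Running the training in a dedicated working directory keeps the dumped graph files separate from the source tree.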