Troubleshooting your Model
This section provides troubleshooting guidance for common issues encountered when training TensorFlow models.
Runtime Errors
The following table outlines possible runtime errors:
Error | Description | Workaround |
---|---|---|
model/tf.nn.elu/Elu/elu_fwd_f32_n398: TPC kernel with guid “elu_fwd_f32” doesn’t support DS | The implementation of one of the operators (Elu in this example) does not support dynamic shapes. | Set TF_ENABLE_DYNAMIC_SHAPES=False |
INFO:tensorflow:Error reported to Coordinator: <class ‘tensorflow.python.framework.errors_impl.InternalError’>, Graph execution error: Old node tower_0/v/gpu_cached_inputs is not mapped | The model uses legacy (TensorFlow 1.x) variables, which are not supported by the Habana stack. Upgrading the script to use TensorFlow 2 resource variables is recommended. | Place legacy variables on the CPU by setting TF_HABANA_ALLOW_LEGACY_VARIABLES_ON_CPU=true. This significantly reduces performance. |
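
As a rough illustration of applying the workarounds in the table above, the following Python sketch matches a logged error against the two known messages and sets the corresponding environment variable. The helper names (`WORKAROUNDS`, `suggest_workaround`, `apply_workaround`) and the substring markers are hypothetical and not part of any Habana API. The sketch also assumes the variables are read when TensorFlow and the Habana module are initialized, so they would need to be set before those imports (or exported in the shell before launching the training script).

```python
import os

# Hypothetical mapping: substring of the logged error -> (environment variable, value).
# The variable names and values are taken from the table above.
WORKAROUNDS = {
    "support DS": ("TF_ENABLE_DYNAMIC_SHAPES", "False"),
    "is not mapped": ("TF_HABANA_ALLOW_LEGACY_VARIABLES_ON_CPU", "true"),
}


def suggest_workaround(log_line):
    """Return (variable, value) for a recognized error, or None otherwise."""
    for marker, setting in WORKAROUNDS.items():
        if marker in log_line:
            return setting
    return None


def apply_workaround(setting, env=os.environ):
    """Set the variable in the given environment mapping."""
    name, value = setting
    env[name] = value


if __name__ == "__main__":
    # Assumption: this runs before TensorFlow or the Habana module is imported.
    line = "Graph execution error: Old node tower_0/v/gpu_cached_inputs is not mapped"
    setting = suggest_workaround(line)
    if setting is not None:
        apply_workaround(setting)
        print(f"Set {setting[0]}={setting[1]}")
```

Note that the second workaround trades performance for compatibility, so upgrading the script to resource variables remains the preferred fix.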
Performance Issues
For details on how to get the best performance on HPU, refer to the Model Performance Optimization Guide for TensorFlow.