Troubleshooting your Model

This section provides troubleshooting instructions for common issues encountered when training PyTorch models. The issues below are functional problems that may occur when running on HPU but not on CPU or GPU.

Runtime Errors

Ensure that both the model and its inputs are moved to the device in the script before the training loop begins. When either one is left on the CPU, the most common symptom is a runtime error raised from the Python stack, accompanied by a backtrace. Move both to the HPU as follows:

model_inputs = model_inputs.to("hpu")
model = model.to("hpu")
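A device mismatch can also be caught before the training loop starts, with a clearer message than the backtrace. The following is a minimal sketch, not an official recipe. The helper names `pick_device` and `check_same_device` are hypothetical, `nn.Linear` stands in for a real model, and the CPU fallback is an assumption added so the snippet also runs on a machine without the Habana software stack. It assumes that importing `habana_frameworks.torch.core` registers the `hpu` device.

```python
import torch
import torch.nn as nn


def pick_device() -> torch.device:
    """Hypothetical helper: use HPU if the Habana bridge imports, else CPU."""
    try:
        # Assumption: importing this module registers the "hpu" device.
        import habana_frameworks.torch.core  # noqa: F401
        return torch.device("hpu")
    except ImportError:
        return torch.device("cpu")


def check_same_device(model: nn.Module, *tensors: torch.Tensor) -> None:
    """Hypothetical helper: fail early if the model and tensors disagree."""
    # Compare device types only, so "hpu" and "hpu:0" count as the same.
    model_types = {p.device.type for p in model.parameters()}
    tensor_types = {t.device.type for t in tensors}
    found = model_types | tensor_types
    if len(found) > 1:
        raise RuntimeError(
            f"Model is on {sorted(model_types)} but inputs are on "
            f"{sorted(tensor_types)}; move both with .to(device)."
        )


device = pick_device()

# Move the model and the inputs to the same device before training.
model = nn.Linear(4, 2).to(device)
model_inputs = torch.randn(8, 4).to(device)

# Verify placement once, up front, rather than inside the loop.
check_same_device(model, model_inputs)

outputs = model(model_inputs)
```

Calling the check once before the loop keeps it out of the hot path. In a real script, the same check could be applied to the first batch from the data loader, since batches are a common place for tensors to be left on the CPU.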

Performance Issues

For details on how to get the best performance on HPU, refer to the Model Performance Optimization Guide for PyTorch.