Placement of Ops on HPU

For optimal performance on HPU, avoid executing ops on CPU. When a model is ported to run on HPU, the software stack decides which ops are placed on CPU and which are placed on HPU.

This decision depends on whether the op is registered with PyTorch with HPU as the backend and whether the requested datatype is supported on HPU. An op automatically falls back to CPU if it is not registered with HPU as its backend, or if it is registered but the requested datatype is not supported on HPU.
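The two-condition rule above can be illustrated with a small toy model. This is only a sketch of the decision logic as described, not the software stack's actual implementation: the registry contents, the op names `example::op_a` and `example::op_b`, and the `placement` function are all hypothetical and say nothing about which ops or datatypes HPU really supports.

```python
# Toy model of the placement rule described above.
# HYPOTHETICAL registry: maps an op name to the datatypes assumed to be
# supported on HPU. These entries are made up for illustration only.
HYPOTHETICAL_HPU_REGISTRY = {
    "example::op_a": {"float32", "bfloat16"},
    "example::op_b": {"float32"},
}


def placement(op_name, dtype, registry=HYPOTHETICAL_HPU_REGISTRY):
    """Return "hpu" if the op is registered and the dtype is supported, else "cpu"."""
    supported_dtypes = registry.get(op_name)
    if supported_dtypes is None:
        # Op is not registered with HPU as its backend: fall back to CPU.
        return "cpu"
    if dtype not in supported_dtypes:
        # Op is registered, but the requested datatype is unsupported: fall back.
        return "cpu"
    return "hpu"


print(placement("example::op_a", "bfloat16"))  # registered, dtype supported
print(placement("example::op_b", "bfloat16"))  # registered, dtype unsupported
print(placement("example::op_c", "float32"))   # not registered
```

The point of the sketch is that either condition alone is enough to trigger a fallback, so an op that runs on HPU in one precision may still fall back to CPU in another.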

To check whether op execution fell back to CPU, enable CPU fallback logs by setting the environment variables as shown below:


For example, when the aten::digamma op falls back to CPU once, you will see logs like the following:

CPU fallback digamma : self=HPUBFloat16Type

Frequency of op and op name that were executed on CPU:
1       aten::digamma
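When a run produces many fallbacks, it can help to collect the frequency summary programmatically. The following is a minimal sketch that assumes the summary lines have the form shown in the sample above (a count, whitespace, then a namespaced op name); the real log format may differ between releases, and `parse_fallback_summary` is a hypothetical helper, not part of any library.

```python
import re

# ASSUMED format, taken from the sample log above:
# "<count><whitespace><namespace::op_name>"
SUMMARY_LINE = re.compile(r"^\s*(\d+)\s+(\S+::\S+)\s*$")


def parse_fallback_summary(log_text):
    """Return a dict mapping op name to the number of CPU fallbacks reported."""
    counts = {}
    for line in log_text.splitlines():
        match = SUMMARY_LINE.match(line)
        if match:
            count, op_name = match.groups()
            # Accumulate in case the same op appears in more than one summary.
            counts[op_name] = counts.get(op_name, 0) + int(count)
    return counts


sample_log = """CPU fallback digamma : self=HPUBFloat16Type

Frequency of op and op name that were executed on CPU:
1       aten::digamma
"""

print(parse_fallback_summary(sample_log))  # {'aten::digamma': 1}
```

Sorting the resulting dictionary by count gives a quick view of which ops are worth addressing first, for example by changing the datatype or replacing the op with a supported equivalent.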