1. TensorFlow Operators

1.1. Overview

This document summarizes the TensorFlow operators supported by SynapseAI for Gaudi.
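Operators marked HPU in Table 1.1 execute on the Gaudi device once the Habana module has been loaded. Below is a minimal sketch, assuming the habana-tensorflow package is installed; load_habana_module() is its documented entry point:

```python
import tensorflow as tf
from habana_frameworks.tensorflow import load_habana_module

# Registers the HPU device and the operators listed in Table 1.1.
load_habana_module()

# Add is registered for HPU with T={bfloat16,float32,int32} (see Table 1.1),
# so this computation can be placed directly on the Gaudi device.
with tf.device("/device:HPU:0"):
    x = tf.constant([1.0, 2.0], dtype=tf.bfloat16)
    y = tf.constant([3.0, 4.0], dtype=tf.bfloat16)
    print(tf.add(x, y))
```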

1.2. TensorFlow Operators Support Summary

Table 1.1 Supported TensorFlow Operators

| TF OP | Registration | Constraints | Notes / Limitations |
|-------|--------------|-------------|---------------------|
| Abs | HPU | T={bfloat16,float32,int32} | |
| Acos | HPU | T={float32} | |
| Acosh | HPU | T={float32} | |
| Add | HPU | T={bfloat16,float32,int32} | |
| AddN | HPU | T={bfloat16,float32,int32} | |
| AddV2 | HPU | T={bfloat16,float32,int32} | |
| Addons>Resampler | HPU | T={bfloat16,float32} | |
| Addons>ResamplerGrad | HPU | T={bfloat16,float32} | |
| AdjustContrastv2 | HPU | T={float32} | |
| All | HPU | Tidx={int32,int64} | |
| AnonymousIterator | HPU | | |
| AnonymousIteratorV2 | HPU | | |
| Any | HPU | Tidx={int32,int64} | |
| ApplyAdaMax | HPU | | See note (1). |
| ApplyAdadelta | HPU | | See note (1). |
| ApplyAdagrad | HPU | | See note (1). |
| ApplyAdagradV2 | HPU | | See note (1). |
| ApplyAdam | HPU | | See note (1). |
| ApplyAddSign | HPU | | See note (1). |
| ApplyCenteredRMSProp | HPU | | See note (1). |
| ApplyFtrl | HPU | | See note (1). |
| ApplyFtrlV2 | HPU | | See note (1). |
| ApplyGradientDescent | HPU | | See note (1). |
| ApplyMomentum | HPU | | See note (1). |
| ApplyPowerSign | HPU | | See note (1). |
| ApplyRMSProp | HPU | | See note (1). |
| ArgMax | HPU | T={bfloat16,float32}; Tidx={int32}; output_type={int32} | |
| ArgMin | HPU | T={bfloat16,float32}; Tidx={int32}; output_type={int32} | |
| Asin | HPU | T={float32} | |
| Asinh | HPU | T={float32} | |
| Assert | DEVICE_DEFAULT | | |
| Assign | HPU | | See note (1). |
| AssignAdd | HPU | | See note (1). |
| AssignAddVariableOp | HPU | dtype={float32,int32} | |
| AssignSub | HPU | | See note (1). |
| AssignSubVariableOp | HPU | dtype={float32,int32} | |
| AssignVariableOp | HPU | dtype={float32,int32} | |
| Atan | HPU | T={float32} | |
| Atanh | HPU | T={float32} | |
| AvgPool | HPU | T={bfloat16,float32} | |
| AvgPool3D | HPU | T={bfloat16,float32} | |
| AvgPool3DGrad | HPU | T={bfloat16,float32} | |
| AvgPoolGrad | HPU | T={bfloat16,float32} | |
| BatchMatMul | HPU | T={bfloat16,float32} | |
| BatchMatMulV2 | HPU | T={bfloat16,float32} | |
| BiasAdd | HPU | T={bfloat16,float32} | |
| BiasAddGrad | HPU | T={bfloat16,float32} | |
| BitwiseAnd | HPU | T={int16,int32,int8,uint16,uint32,uint8} | |
| BitwiseOr | HPU | T={int16,int32,int8,uint16,uint32,uint8} | |
| BitwiseXor | HPU | T={int16,int32,int8,uint16,uint32,uint8} | |
| BroadcastArgs | HPU | T={int32,int64} | |
| BroadcastGradientArgs | HPU | T={int32,int64} | |
| BroadcastTo | HPU | T={bfloat16,float32,int32}; Tidx={int32,int64} | |
| Cast | HPU | SrcT={bfloat16,bool,float32,int16,int32,int64,int8}; DstT={bfloat16,bool,float32,int16,int32,int8} | |
| Ceil | HPU | T={bfloat16,float32} | |
| ClipByValue | HPU | T={bfloat16,float32} | |
| CollectiveGather | HPU | | |
| CollectiveReduce | HPU | | |
| CombinedNonMaxSuppression | HPU | | |
| Concat | HPU | T={bfloat16,bool,float32,int32} | |
| ConcatOffset | HPU | | |
| ConcatV2 | HPU | T={bfloat16,bool,float32,int32}; Tidx={int32,int64} | |
| Const | HPU | dtype={bfloat16,bool,float32,int32,int64,int8} | |
| ConsumeMutexLock | HPU | | |
| ControlTrigger | DEVICE_DEFAULT | | |
| Conv2D | HPU | T={bfloat16,float32} | |
| Conv2DBackpropFilter | HPU | T={bfloat16,float32} | |
| Conv2DBackpropInput | HPU | T={bfloat16,float32} | |
| Cos | HPU | T={bfloat16,float32} | |
| Cosh | HPU | T={bfloat16,float32} | |
| CropAndResize | HPU | T={float32} | |
| CropAndResizeGradImage | HPU | T={float32} | |
| Cumprod | HPU | T={bfloat16,float32,int32}; Tidx={int32,int64} | |
| Cumsum | HPU | T={bfloat16,float32,int32}; Tidx={int32,int64} | |
| DebugIdentityV2 | HPU | | |
| DeleteIterator | HPU | | |
| DepthwiseConv2dNative | HPU | T={bfloat16,float32} | |
| DepthwiseConv2dNativeBackpropFilter | HPU | T={bfloat16,float32} | data_format must be NHWC. |
| DepthwiseConv2dNativeBackpropInput | HPU | T={bfloat16,float32} | data_format must be NHWC. |
| DestroyResourceOp | HPU | | |
| Div | HPU | T={bfloat16,float32} | |
| DivNoNan | HPU | T={float32} | |
| DynamicStitch | HPU | T={bfloat16,float32,int32} | |
| Elu | HPU | T={bfloat16,float32} | |
| EluGrad | HPU | T={bfloat16,float32} | |
| Enter | DEVICE_DEFAULT | | |
| Equal | HPU | T={bfloat16,bool,float32,int32,int8} | |
| EuclideanNorm | HPU | T={bfloat16,float32}; Tidx={int32,int64} | |
| Exit | DEVICE_DEFAULT | | |
| Exp | HPU | T={bfloat16,float32} | |
| ExpandDims | HPU | T={bfloat16,bool,float32,int32,int8}; Tdim={int32,int64} | |
| ExperimentalSleepDataset | HPU | | |
| FIFOQueueV2 | DEVICE_DEFAULT | | |
| Fill | HPU | T={bfloat16,float32,int32}; index_type={int32,int64} | |
| Floor | HPU | T={bfloat16,float32} | |
| FloorDiv | HPU | T={bfloat16,float32,int32} | |
| FloorMod | HPU | T={int32} | |
| FusedBatchNorm | HPU | T={float32} | |
| FusedBatchNormGrad | HPU | T={float32} | |
| FusedBatchNormGradV2 | HPU | T={bfloat16,float32} | |
| FusedBatchNormGradV3 | HPU | T={bfloat16,float32}; U={float32} | |
| FusedBatchNormV2 | HPU | T={bfloat16,float32}; U={float32} | |
| FusedBatchNormV3 | HPU | T={bfloat16,float32}; U={float32} | |
| GatherNd | HPU | Tparams={bfloat16,float32,int32}; Tindices={int32,int64} | |
| GatherV2 | HPU | Tparams={bfloat16,bool,float32,int32,int8}; Tindices={int32,int64}; Taxis={int32,int64} | |
| GeneratorDataset | HPU | | |
| Greater | HPU | T={bfloat16,float32,int32,int8} | |
| GreaterEqual | HPU | T={bfloat16,float32,int32,int8} | |
| HabanaBarrier | HPU | | |
| HabanaClampBwd | HPU | T={bfloat16,float32} | |
| HabanaClampFwd | HPU | T={bfloat16,float32} | |
| HabanaCluster | HPU | | |
| HabanaConv2DWithPadding | HPU | T={bfloat16,float32} | |
| HabanaConv2DWithPaddingBackpropFilter | HPU | | |
| HabanaConv2DWithPaddingBackpropInput | HPU | | |
| HabanaCopy | HPU | T={bfloat16,float32,int32} | |
| HabanaCorrelation | HPU | T={bfloat16,float32} | |
| HabanaCorrelationGrad | HPU | T={bfloat16,float32} | |
| HabanaDevConst | HPU | dtype={bfloat16,bool,float32,int32,int8} | |
| HabanaDropout | HPU | T={bfloat16,float32} | |
| HabanaDropoutGrad | HPU | T={bfloat16,float32} | |
| HabanaDropoutStateful | HPU | T={bfloat16,float32} | |
| HabanaFusedBatchNormV3 | HPU | | |
| HabanaGelu | HPU | | |
| HabanaGeluGrad | HPU | | |
| HabanaInstanceNorm | HPU | | |
| HabanaLaunch | HPU | | |
| HabanaLaunchCpuBackend | HPU | | |
| HabanaLayerNorm | HPU | T={bfloat16,float32} | |
| HabanaLayerNormGrad | HPU | | |
| HabanaLayerNormGradV2 | HPU | | |
| HabanaLayerNormV2 | HPU | T={bfloat16,float32} | |
| HabanaLogSoftmaxGrad | HPU | T={bfloat16,float32} | |
| HabanaMaskedSoftmax | HPU | | |
| HabanaMaxGrad | HPU | Tidx={int32,int64} | |
| HabanaMinGrad | HPU | Tidx={int32,int64} | |
| HabanaPreprocessScatterNd | HPU | T={bfloat16,float32}; Tindices={int32,int64} | |
| HabanaRandomSeed | HPU | | |
| HabanaRandomShuffle | HPU | T={int32} | |
| HabanaRandomUniformWithMaxval | HPU | dtype={bfloat16,float32}; T={int32,int64} | |
| HabanaRandomUniformWithScale | HPU | dtype={bfloat16,float32}; T={int32,int64} | |
| HabanaResizeNearestNeighbor | HPU | T={bfloat16,float32} | |
| HabanaResizeNearestNeighborGrad | HPU | T={bfloat16,float32} | |
| HabanaScatterNd | HPU | T={bfloat16,float32}; Tindices={int32,int64} | |
| HabanaSize | HPU | T={bfloat16,float32,int32,int8}; out_type={int32} | |
| HabanaSoftmaxGrad | HPU | T={bfloat16,float32} | |
| HabanaSparseSegmentSum | HPU | T={bfloat16,float32}; Tidx={int32} | |
| HabanaSparseSegmentSumBwd | HPU | T={bfloat16,float32}; Tidx={int32}; Tnumsegments={int32,int64} | |
| HabanaSum | HPU | T={bfloat16,float32,int32} | |
| HabanaWhere | HPU | | |
| HostConst | DEVICE_DEFAULT | | |
| HpuCollectiveGather | HPU | | |
| HpuCollectiveGroupInit | HPU | | |
| HpuCollectiveReduce | HPU | | |
| Identity | HPU,DEVICE_DEFAULT | T={bfloat16,bool,float32,int32,int64,int8,resource,string,variant} | |
| IdentityN | DEVICE_DEFAULT | | |
| Inv | HPU | T={bfloat16,float32} | |
| Invert | HPU | T={int16,int32,int8,uint16,uint32,uint8} | |
| InvertPermutation | HPU | T={int32,int64} | |
| IsFinite | HPU | T={bfloat16,float32} | |
| IsInf | HPU | T={bfloat16,float32} | |
| IsNan | HPU | T={bfloat16,float32} | |
| IteratorFromStringHandleV2 | HPU | | |
| IteratorGetNext | HPU | | |
| IteratorGetNextAsOptional | HPU | | |
| IteratorGetNextSync | HPU | | |
| IteratorToStringHandle | HPU | | |
| IteratorV2 | HPU | | |
| L2Loss | HPU | T={bfloat16,float32} | data_format must be NHWC. |
| LeakyRelu | HPU | T={bfloat16,float32} | |
| LeakyReluGrad | HPU | T={bfloat16,float32} | |
| LeftShift | HPU | T={int16,int32,int8,uint16,uint32,uint8} | |
| Less | HPU | T={bfloat16,float32,int32,int8} | |
| LessEqual | HPU | T={bfloat16,float32,int32,int8} | |
| Log | HPU | T={bfloat16,float32} | |
| Log1p | HPU | T={bfloat16,float32} | |
| LogSoftmax | HPU | T={bfloat16,float32} | |
| LogicalAnd | HPU | | |
| LogicalNot | HPU | | |
| LogicalOr | HPU | | |
| LoopCond | DEVICE_DEFAULT | | |
| MakeIterator | HPU | | |
| MatMul | HPU | T={bfloat16,float32} | |
| MatrixBandPart | HPU | T={bfloat16,float32}; Tindex={int32,int64} | |
| MatrixDiag | HPU | T={bfloat16,float32} | |
| MatrixDiagPart | HPU | T={bfloat16,float32} | |
| Max | HPU | T={bfloat16,float32}; Tidx={int32,int64} | |
| MaxPool | HPU | T={bfloat16,float32} | |
| MaxPool3D | HPU | T={bfloat16,float32} | |
| MaxPool3DGrad | HPU | T={bfloat16,float32} | |
| MaxPoolGrad | HPU | T={bfloat16,float32} | |
| MaxPoolGradV2 | HPU | T={bfloat16,float32} | |
| MaxPoolV2 | HPU | T={bfloat16,float32} | |
| Maximum | HPU | T={bfloat16,float32,int32} | |
| Mean | HPU | T={bfloat16,float32}; Tidx={int32,int64} | |
| Merge | DEVICE_DEFAULT | | |
| Min | HPU | T={bfloat16,float32}; Tidx={int32,int64} | |
| Minimum | HPU | T={bfloat16,float32,int32} | |
| Mod | HPU | T={int16,int32,int8} | Input data type must be int32, int16, or int8. |
| Mul | HPU | T={bfloat16,float32,int32} | |
| MulNoNan | HPU | T={float32} | |
| MutexLock | HPU | | |
| MutexV2 | HPU | | |
| Neg | HPU | T={bfloat16,float32,int32} | |
| NextIteration | DEVICE_DEFAULT | | |
| NoOp | HPU,DEVICE_DEFAULT | | |
| NonMaxSuppressionV3 | HPU | T={float32} | |
| NonMaxSuppressionV4 | HPU | T={float32} | |
| NotEqual | HPU | T={bfloat16,bool,float32,int32,int8} | |
| OneHot | HPU | T={float32}; TI={int32} | |
| OnesLike | HPU | T={bfloat16,float32,int32,int8} | |
| OptionalFromValue | HPU | | |
| OptionalGetValue | HPU | | |
| OptionalHasValue | HPU | | |
| OptionalNone | HPU | | |
| Pack | HPU | T={bfloat16,bool,float32,int32} | |
| Pad | HPU | T={bfloat16,float32,int32}; Tpaddings={int32,int64} | |
| PadV2 | HPU | T={bfloat16,float32,int32}; Tpaddings={int32,int64} | |
| PartitionedCall | HPU | | |
| Placeholder | DEVICE_DEFAULT | | |
| PlaceholderV2 | DEVICE_DEFAULT | | |
| PlaceholderWithDefault | HPU | dtype={bfloat16,bool,float32,int32,int8,resource} | |
| Pow | HPU | T={float32} | |
| PrefetchDataset | HPU | | |
| PreventGradient | HPU | T={bfloat16,bool,float32,int32,int64,int8,resource} | |
| Prod | HPU | T={bfloat16,float32,int32}; Tidx={int32,int64} | |
| PyramidRoiAlign | HPU | T={bfloat16,float32} | |
| PyramidRoiAlignGradImages | HPU | T={bfloat16,float32} | |
| QueueCloseV2 | DEVICE_DEFAULT | | |
| QueueDequeueV2 | DEVICE_DEFAULT | | |
| QueueEnqueueV2 | DEVICE_DEFAULT | | |
| QueueIsClosedV2 | DEVICE_DEFAULT | | |
| QueueSizeV2 | DEVICE_DEFAULT | | |
| RandomShuffle | HPU | T={bfloat16,float32,int32,int8} | |
| RandomStandardNormal | HPU | dtype={bfloat16,float32}; T={int32,int64} | |
| RandomUniform | HPU | dtype={bfloat16,float32}; T={int32,int64} | |
| RandomUniformInt | HPU | Tout={int32}; T={int32} | |
| Range | HPU | Tidx={int32} | |
| Rank | HPU | | |
| ReadVariableOp | HPU | | |
| RealDiv | HPU | T={bfloat16,float32} | |
| Reciprocal | HPU | T={bfloat16,float32} | |
| Recv | DEVICE_DEFAULT | | |
| Relu | HPU | T={bfloat16,float32} | |
| Relu6 | HPU | T={bfloat16,float32} | |
| Relu6Grad | HPU | T={bfloat16,float32} | |
| ReluGrad | HPU | T={bfloat16,float32} | |
| RemoteCall | HPU | | |
| Reshape | HPU | T={bfloat16,bool,float32,int32}; Tshape={int32,int64} | |
| ResizeBilinear | HPU | T={bfloat16,float32} | |
| ResizeBilinearGrad | HPU | T={bfloat16,float32} | |
| ResizeNearestNeighbor | HPU | T={float32} | |
| ResizeNearestNeighborGrad | HPU | T={float32} | |
| ResourceApplyAdaMax | HPU | T={float32} | |
| ResourceApplyAdadelta | HPU | T={float32} | |
| ResourceApplyAdam | HPU | T={float32} | |
| ResourceApplyAdamWithAmsgrad | HPU | T={float32} | |
| ResourceApplyCenteredRMSProp | HPU | T={float32} | |
| ResourceApplyFtrl | HPU | T={float32} | |
| ResourceApplyFtrlV2 | HPU | T={float32} | |
| ResourceApplyGradientDescent | HPU | T={float32} | |
| ResourceApplyKerasMomentum | HPU | T={float32} | |
| ResourceApplyMomentum | HPU | T={float32} | |
| ResourceApplyRMSProp | HPU | T={float32} | |
| ResourceGather | HPU | dtype={float32}; Tindices={int32,int64} | |
| ResourceGatherNd | HPU | Tindices={int32} | |
| ResourceScatterAdd | HPU | dtype={float32}; Tindices={int32,int64} | |
| RightShift | HPU | T={int16,int32,int8,uint16,uint32,uint8} | |
| Rint | HPU | T={bfloat16,float32} | |
| Round | HPU | T={bfloat16,float32} | |
| Rsqrt | HPU | T={bfloat16,float32} | |
| RsqrtGrad | HPU | T={bfloat16,float32} | |
| ScatterNd | HPU | T={bfloat16,float32}; Tindices={int32,int64} | |
| Select | HPU | T={bfloat16,float32,int32} | |
| SelectV2 | HPU | T={bfloat16,float32,int32} | |
| Selu | HPU | T={bfloat16,float32} | |
| SeluGrad | HPU | T={bfloat16,float32} | |
| Send | DEVICE_DEFAULT | | |
| Shape | HPU | T={bfloat16,float32,int32,int8}; out_type={int32,int64} | |
| ShapeN | HPU | T={bfloat16,float32,int32,int8}; out_type={int32,int64} | |
| Sigmoid | HPU | T={bfloat16,float32} | |
| SigmoidGrad | HPU | T={bfloat16,float32} | |
| Sign | HPU | T={bfloat16,float32} | |
| Sin | HPU | T={bfloat16,float32} | |
| Sinh | HPU | T={float32} | |
| Size | HPU | T={bfloat16,float32,int32,int8}; out_type={int32,int64} | |
| SleepDataset | HPU | | |
| Slice | HPU | T={bfloat16,float32,int32,int8} | |
| Snapshot | HPU | T={bfloat16,bool,float32,int32} | |
| Softmax | HPU | T={bfloat16,float32} | |
| SoftmaxCrossEntropyWithLogits | HPU | T={bfloat16,float32} | |
| Softplus | HPU | T={bfloat16,float32,int32,int8} | |
| SoftplusGrad | HPU | T={bfloat16,float32} | |
| Softsign | HPU | T={bfloat16,float32} | |
| SoftsignGrad | HPU | T={float32} | |
| SparseMatMul | HPU | Ta={bfloat16,float32}; Tb={bfloat16,float32} | |
| SparseSegmentSum | HPU | T={bfloat16,float32}; Tidx={int32} | |
| SparseSegmentSumWithNumSegments | HPU | T={bfloat16,float32}; Tidx={int32}; Tnumsegments={int32,int64}; Tsegmentids={int32} | |
| SparseSoftmaxCrossEntropyWithLogits | HPU | T={bfloat16,float32}; Tlabels={int32,int64} | |
| Split | HPU | T={bfloat16,float32} | |
| SplitV | HPU | Tlen={int32,int64} | |
| Sqrt | HPU | T={bfloat16,float32} | |
| SqrtGrad | HPU | T={bfloat16,float32} | |
| Square | HPU | T={bfloat16,float32} | |
| SquaredDifference | HPU | T={bfloat16,float32} | |
| Squeeze | HPU | T={bfloat16,float32,int32,int64} | |
| Stack | DEVICE_DEFAULT | | |
| StackClose | DEVICE_DEFAULT | | |
| StackCloseV2 | DEVICE_DEFAULT | | |
| StackPop | DEVICE_DEFAULT | elem_type={bfloat16,bool,complex128,complex64,float16,float32,float64,int16,int32,int64,int8,uint16,uint32,uint64,uint8} | |
| StackPopV2 | DEVICE_DEFAULT | elem_type={bfloat16,bool,complex128,complex64,float16,float32,float64,int16,int32,int64,int8,uint16,uint32,uint64,uint8} | |
| StackPush | DEVICE_DEFAULT | T={bfloat16,bool,complex128,complex64,float16,float32,float64,int16,int32,int64,int8,uint16,uint32,uint64,uint8} | |
| StackPushV2 | DEVICE_DEFAULT | T={bfloat16,bool,complex128,complex64,float16,float32,float64,int16,int32,int64,int8,uint16,uint32,uint64,uint8} | |
| StackV2 | DEVICE_DEFAULT | | |
| StatefulPartitionedCall | HPU | | |
| StopGradient | HPU | T={bfloat16,bool,float32,int32,int64,int8,resource} | |
| StridedSlice | HPU | T={bfloat16,bool,float32,int32,int8} | |
| StridedSliceGrad | HPU | T={bfloat16,bool,float32,int32,int8} | |
| Sub | HPU | T={bfloat16,float32,int32} | |
| Sum | HPU | T={bfloat16,float32,int32}; Tidx={int32,int64} | |
| Switch | DEVICE_DEFAULT | | |
| Tan | HPU | T={float32} | |
| Tanh | HPU | T={bfloat16,float32} | |
| TanhGrad | HPU | T={bfloat16,float32} | |
| TensorScatterUpdate | HPU | T={bfloat16,float32}; Tindices={int32,int64} | |
| Tile | HPU | T={bfloat16,bool,float32,int32,int8} | |
| TopK | HPU | T={float32,int32} | |
| TopKV2 | HPU | T={float32,int32} | |
| Transpose | HPU | T={bfloat16,bool,float32,int16,int32}; Tperm={int32,int64} | |
| TruncatedNormal | HPU | dtype={bfloat16,float32}; T={int32,int64} | |
| Unpack | HPU | T={bfloat16,float32,int32} | |
| UnravelIndex | HPU | Tidx={int32,int64} | |
| UnsortedSegmentSum | HPU | T={bfloat16,float32}; Tindices={int32,int64}; Tnumsegments={int32} | |
| UnwrapDatasetVariant | HPU | | |
| VarHandleOp | HPU | dtype={float32,int32} | |
| VarIsInitializedOp | HPU | | |
| Variable | HPU | | See note (1). |
| VariableShape | HPU | out_type={int32,int64} | |
| VariableV2 | HPU | | See note (1). |
| Where | HPU | | |
| WrapDatasetVariant | HPU | | |
| Xdivy | HPU | T={float32} | |
| ZerosLike | HPU | T={bfloat16,bool,float32,int16,int32,int8,uint16,uint8} | |
| _Arg | HPU | | |
| _DeviceArg | HPU | | |
| _DeviceRetval | HPU | | |
| _FusedConv2D | HPU | T={bfloat16,float32} | |
| _HostCast | DEVICE_DEFAULT | | |
| _HostRecv | DEVICE_DEFAULT | | |
| _HostSend | DEVICE_DEFAULT | | |
| _Recv | DEVICE_DEFAULT | | |
| _Retval | HPU | T={bfloat16,bool,float32,int16,int32,int64,int8,resource,string,uint16,uint32,uint64,uint8} | |
| _ScopedAllocator | HPU | | |
| _ScopedAllocatorConcat | HPU | | |
| _ScopedAllocatorSplit | HPU | | |
| _Send | DEVICE_DEFAULT | | |
| _SwitchN | DEVICE_DEFAULT | | |

(1) Refrain from using legacy variables on HPU. To explicitly use legacy variables, this restriction can be overridden by setting the environment variable TF_HABANA_ALLOW_LEGACY_VARIABLES_ON_CPU=true.

HPU = Op registered directly for Gaudi.
DEVICE_DEFAULT = Default implementation provided by TF Core is available and used.
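The override in note (1) can be applied programmatically. Below is a minimal sketch, assuming the habana-tensorflow package and its documented load_habana_module() entry point; the assumption that the variable must be set before the module is loaded follows the usual behavior of TF_* flags, which are read at initialization. TensorFlow's standard device-placement logging then shows which ops run on HPU and which fall back to the TF Core default implementation:

```python
import os

# Assumption: like other TF_* flags, this is read when the Habana module
# initializes, so set it before load_habana_module() is called.
os.environ["TF_HABANA_ALLOW_LEGACY_VARIABLES_ON_CPU"] = "true"

import tensorflow as tf
from habana_frameworks.tensorflow import load_habana_module

load_habana_module()

# Standard TensorFlow switch: logs every op's device, so you can confirm
# HPU placement versus DEVICE_DEFAULT fallback.
tf.debugging.set_log_device_placement(True)

# Resource variables (the TF2 default) are the recommended path on HPU;
# AssignAddVariableOp is registered for HPU with dtype={float32,int32}.
v = tf.Variable(tf.zeros([2, 2], dtype=tf.float32))
v.assign_add(tf.ones([2, 2]))
print(v.numpy())
```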