ScaledDotProductAttentionOperation
class ScaledDotProductAttentionOperation(parameters: Map<String, Any> = emptyMap()) : BaseOperation
Scaled dot-product attention operation for tape recording. The output shape equals the query shape: [batch, nHeads, seqLen, headDim].
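To illustrate why the output takes the query's shape, here is a minimal plain-Kotlin sketch of the standard scaled dot-product attention formula, softmax(Q·Kᵀ / sqrt(headDim))·V, applied to a single (batch, head) slice. It does not use this library's tensor or tape API: `attentionSlice` and its nested-array inputs are hypothetical names chosen for illustration, and it assumes the value rows have the same `headDim` as the query rows.

```kotlin
import kotlin.math.exp
import kotlin.math.sqrt

// Hypothetical helper, NOT part of the library: attention for one (batch, head) slice.
// q: [seqLen, headDim], k: [kvLen, headDim], v: [kvLen, headDim] -> result: [seqLen, headDim]
fun attentionSlice(
    q: Array<DoubleArray>,
    k: Array<DoubleArray>,
    v: Array<DoubleArray>
): Array<DoubleArray> {
    val headDim = q[0].size
    val scale = 1.0 / sqrt(headDim.toDouble())
    return Array(q.size) { i ->
        // Scaled dot product of query row i with every key row.
        val scores = DoubleArray(k.size) { j ->
            var dot = 0.0
            for (d in 0 until headDim) dot += q[i][d] * k[j][d]
            dot * scale
        }
        // Numerically stable softmax over the key axis.
        val maxScore = scores.maxOrNull() ?: 0.0
        val weights = DoubleArray(scores.size) { j -> exp(scores[j] - maxScore) }
        val sum = weights.sum()
        // Weighted sum of the value rows gives one output row per query row.
        DoubleArray(v[0].size) { d ->
            var acc = 0.0
            for (j in weights.indices) acc += weights[j] / sum * v[j][d]
            acc
        }
    }
}

fun main() {
    val q = arrayOf(doubleArrayOf(1.0, 0.0), doubleArrayOf(0.0, 1.0))
    val k = arrayOf(doubleArrayOf(1.0, 0.0), doubleArrayOf(0.0, 1.0), doubleArrayOf(1.0, 1.0))
    val v = arrayOf(doubleArrayOf(1.0, 2.0), doubleArrayOf(3.0, 4.0), doubleArrayOf(5.0, 6.0))
    val out = attentionSlice(q, k, v)
    println("${out.size} x ${out[0].size}") // prints "2 x 2", the same shape as q
}
```

The key/value length (3 in the example) is summed out by the weighted average, so only the query's sequence length and the head dimension remain in the result. Repeating the slice over every batch and head index would yield the [batch, nHeads, seqLen, headDim] shape stated above.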