LLMFusedOpHandlers

CPU fallback implementations for fused LLM operations.

These handlers decompose fused ops back into sequences of TensorOps calls. They produce correct results on any backend but don't provide the performance benefit of a true fused kernel. Platform-specific backends (Metal, CUDA) should register their own handlers to override these.
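The decomposition idea can be sketched as follows. This is an illustrative example only, not the library's actual API: the names `FusedOpHandler`, `mulScalar`, and `add` are hypothetical stand-ins for a registered handler and two primitive TensorOps calls. The point is that the fallback computes the same result as a fused kernel would, but materializes an intermediate array between the two passes.

```kotlin
// Hypothetical sketch of a CPU fallback for a fused multiply-add op.
// All names here are illustrative, not the library's real API.

fun interface FusedOpHandler {
    fun run(inputs: List<FloatArray>): FloatArray
}

// Stand-ins for two primitive, unfused TensorOps calls.
fun mulScalar(x: FloatArray, s: Float) = FloatArray(x.size) { x[it] * s }
fun add(a: FloatArray, b: FloatArray) = FloatArray(a.size) { a[it] + b[it] }

// Fallback handler: correct on any backend, but the result of
// mulScalar is materialized before add runs -- a true fused kernel
// would do both in one pass without the intermediate buffer.
val fusedMulAddFallback = FusedOpHandler { inputs ->
    val (x, bias) = inputs
    add(mulScalar(x, 2.0f), bias)
}

fun main() {
    val out = fusedMulAddFallback.run(
        listOf(floatArrayOf(1f, 2f), floatArrayOf(0.5f, 0.5f))
    )
    check(out.contentEquals(floatArrayOf(2.5f, 4.5f)))
    println(out.toList())
}
```

A platform backend overriding this handler would replace the two-pass body with a single fused kernel launch, keeping the same input/output contract.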

Register via:

LLMFusedOpHandlers.registerAll()

Functions

registerAll()

Register all CPU fallback handlers with ComputeGraphExecutor. Call once at application startup.