Minimal optimizer surface for training.
Register a raw module parameter to be optimized.
Register a parameter to be optimized.
Perform one optimization step, updating all registered parameters in place (reassigning their tensor values where needed).
Zero accumulated gradients on all registered parameters.
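
The four operations above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual interface: the names `Param`, `SGD`, `register_raw`, `register`, `step`, and `zero_grad` are hypothetical, a plain list of floats stands in for a tensor, and a "raw module parameter" is assumed to mean a parameter stored as an attribute on a module and looked up by name. The update rule is plain gradient descent, chosen only to keep the example short.

```python
from typing import Iterator, List, Tuple

Tensor = List[float]  # stand-in for a real tensor type (assumption)


class Param:
    """Hypothetical wrapper holding a value and its accumulated gradient."""

    def __init__(self, value: Tensor) -> None:
        self.value = value
        self.grad: Tensor = [0.0] * len(value)


class SGD:
    """Plain gradient descent exposing the four operations described above."""

    def __init__(self, lr: float) -> None:
        self.lr = lr
        self._params: List[Param] = []
        self._raw: List[Tuple[object, str]] = []

    def register_raw(self, module: object, name: str) -> None:
        # Assumption: a "raw" parameter lives as an attribute on a module,
        # so it is tracked as (module, attribute name) and resolved lazily.
        self._raw.append((module, name))

    def register(self, param: Param) -> None:
        self._params.append(param)

    def _all_params(self) -> Iterator[Param]:
        for module, name in self._raw:
            yield getattr(module, name)
        yield from self._params

    def step(self) -> None:
        for p in self._all_params():
            # Reassign the value instead of mutating its elements, so the
            # parameter object is updated in place while its tensor is replaced.
            p.value = [v - self.lr * g for v, g in zip(p.value, p.grad)]

    def zero_grad(self) -> None:
        for p in self._all_params():
            p.grad = [0.0] * len(p.value)


class Linear:
    """Hypothetical module owning one parameter."""

    def __init__(self) -> None:
        self.weight = Param([1.0, 2.0])


layer = Linear()
bias = Param([0.5])

opt = SGD(lr=0.1)
opt.register_raw(layer, "weight")  # parameter reached through its module
opt.register(bias)                 # parameter registered directly

# Gradients would normally come from a backward pass; set by hand here.
layer.weight.grad = [1.0, -1.0]
bias.grad = [2.0]

opt.step()       # apply one update to every registered parameter
opt.zero_grad()  # clear gradients before the next accumulation
```

Calling `zero_grad` after `step` matters when gradients accumulate across backward passes: without it, the next step would reuse stale gradient contributions.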