TRAINING

Training mode uses BackwardValue, which tracks gradients for backpropagation. Use this mode whenever gradients must be computed, for example when updating model parameters during training.
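
The interface of BackwardValue is not shown in this section, so the following is only a minimal sketch of what gradient tracking for backpropagation can look like, assuming a scalar reverse-mode design. The class name TrackedValue and all of its attributes and methods are hypothetical stand-ins, not the actual BackwardValue API.

```python
class TrackedValue:
    """Hypothetical gradient-tracking scalar (NOT the real BackwardValue API)."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0                    # accumulated d(output)/d(self)
        self._parents = parents            # values this one was computed from
        self._backward = lambda: None      # leaves have nothing to propagate

    def __add__(self, other):
        out = TrackedValue(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = TrackedValue(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Build a topological order so each node runs after all of its consumers.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0                    # seed: d(output)/d(output) = 1
        for node in reversed(order):
            node._backward()


# Usage: y = w * x + b, then backpropagate from y.
w = TrackedValue(3.0)
x = TrackedValue(2.0)
b = TrackedValue(1.0)
y = w * x + b
y.backward()
print(y.data, w.grad, x.grad, b.grad)
```

Each operation records its inputs and a small closure that applies the chain rule. This bookkeeping is the extra cost of tracking gradients, and it is only worth paying when gradients are actually needed.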