adamw
fun adamw(lr: Double = 0.001, beta1: Double = 0.9, beta2: Double = 0.999, epsilon: Double = 1.0E-8, weightDecay: Double = 0.01): Optimizer
Creates an AdamW optimizer (Adam with decoupled weight decay): unlike Adam with L2 regularization, the weight-decay term is applied directly to the parameters rather than being folded into the gradient.
Parameters
lr — Learning rate (default: 0.001)
beta1 — First moment decay rate (default: 0.9)
beta2 — Second moment decay rate (default: 0.999)
epsilon — Numerical stability constant (default: 1e-8)
weightDecay — Weight decay coefficient (default: 0.01)
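To make the decoupled decay concrete, here is a minimal, self-contained sketch of a single AdamW update step using the same defaults as the parameters above. The function and variable names (`adamwStep`, `m`, `v`, `t`) are illustrative only and are not part of this library's `Optimizer` API.

```kotlin
import kotlin.math.pow
import kotlin.math.sqrt

// Illustrative sketch of one AdamW step on a parameter vector.
// m and v hold the running first/second moment estimates; t is the
// 1-based step count used for bias correction.
fun adamwStep(
    param: DoubleArray, grad: DoubleArray,
    m: DoubleArray, v: DoubleArray, t: Int,
    lr: Double = 0.001, beta1: Double = 0.9, beta2: Double = 0.999,
    epsilon: Double = 1e-8, weightDecay: Double = 0.01
) {
    for (i in param.indices) {
        // Update biased moment estimates.
        m[i] = beta1 * m[i] + (1 - beta1) * grad[i]
        v[i] = beta2 * v[i] + (1 - beta2) * grad[i] * grad[i]
        // Bias-corrected estimates.
        val mHat = m[i] / (1 - beta1.pow(t))
        val vHat = v[i] / (1 - beta2.pow(t))
        // Decoupled weight decay: subtracted from the parameter directly,
        // not mixed into the gradient as L2 regularization would be.
        param[i] -= lr * (mHat / (sqrt(vHat) + epsilon) + weightDecay * param[i])
    }
}

fun main() {
    val p = doubleArrayOf(1.0)
    val g = doubleArrayOf(0.5)
    val m = DoubleArray(1)
    val v = DoubleArray(1)
    adamwStep(p, g, m, v, t = 1)
    println(p[0]) // parameter moves slightly below 1.0
}
```

Note that with `weightDecay = 0.0` this reduces to plain Adam, which is why AdamW is usually exposed as a separate optimizer rather than a flag on Adam.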