adamw

fun adamw(lr: Double = 0.001, beta1: Double = 0.9, beta2: Double = 0.999, epsilon: Double = 1.0E-8, weightDecay: Double = 0.01): Optimizer

Factory function for the AdamW optimizer (Adam with decoupled weight decay). This is equivalent to calling adam() with decoupledWeightDecay = true.
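As a usage sketch of the equivalence stated above (assuming the adam() factory in this library accepts the same hyperparameter names as adamw() plus the decoupledWeightDecay flag; the exact adam() signature is not shown on this page):

```kotlin
// Construct AdamW directly, overriding only the learning rate.
val opt1 = adamw(lr = 0.001, weightDecay = 0.01)

// Equivalent per the description above: Adam with decoupled weight
// decay enabled. Parameter names here are assumed to mirror adamw().
val opt2 = adam(
    lr = 0.001,
    beta1 = 0.9,
    beta2 = 0.999,
    epsilon = 1e-8,
    weightDecay = 0.01,
    decoupledWeightDecay = true,
)
```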

Parameters

lr

Learning rate (default: 0.001)

beta1

Exponential decay rate for the first moment estimates (default: 0.9)

beta2

Exponential decay rate for the second moment estimates (default: 0.999)

epsilon

Numerical stability constant (default: 1e-8)

weightDecay

Weight decay coefficient (default: 0.01)
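For reference, the decoupled weight-decay update that distinguishes AdamW from L2-regularized Adam (Loshchilov &amp; Hutter, 2019) can be sketched as follows, writing lr as η, beta1 as β₁, beta2 as β₂, epsilon as ε, and weightDecay as λ, with g_t the gradient, m_t and v_t the Adam moment estimates, and θ_t the parameters:

```latex
m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t
v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2
\hat{m}_t = \frac{m_t}{1 - \beta_1^{\,t}}, \qquad
\hat{v}_t = \frac{v_t}{1 - \beta_2^{\,t}}
% Decoupled: the decay term \lambda\,\theta_{t-1} is applied to the
% weights directly, not folded into g_t before the moment updates.
\theta_t = \theta_{t-1}
  - \eta \left( \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}
  + \lambda\, \theta_{t-1} \right)
```

Because λ multiplies the weights outside the adaptive rescaling, the effective regularization strength does not shrink for parameters with large gradient history, which is the practical motivation for preferring adamw() over adding L2 loss terms under Adam.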