LAMB
Layer-wise Adaptive Moments optimizer for Batch training (LAMB) combines the benefits of the Adam and LARS optimizers: it pairs Adam's per-parameter moment estimates with LARS's layer-wise trust ratio, which makes it well suited to training with very large batch sizes.
Learn more about LAMB in the original paper, Large Batch Optimization for Deep Learning: Training BERT in 76 Minutes (You et al., 2019).
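As a rough sketch of the idea (a NumPy illustration of the published update rule, not tinygrad's implementation; lamb_step and its arguments are hypothetical names), each step forms Adam-style moment estimates and then scales the resulting per-layer update by a LARS-style trust ratio:
import numpy as np

def lamb_step(w, g, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8, wd=0.0):
    # Adam-style exponential moving averages of the gradient and its square
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    # bias-corrected moment estimates
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    # Adam-style update direction, plus weight decay on the parameters
    update = m_hat / (np.sqrt(v_hat) + eps) + wd * w
    # LARS-style layer-wise trust ratio: scale the step by ||w|| / ||update||
    w_norm, u_norm = np.linalg.norm(w), np.linalg.norm(update)
    trust_ratio = w_norm / u_norm if w_norm > 0 and u_norm > 0 else 1.0
    return w - lr * trust_ratio * update, m, v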
Usage
from tinygrad.nn.state import get_parameters
from tinygrad.nn import optim
from models.resnet import ResNet

model = ResNet()
optimizer = optim.LAMB(get_parameters(model), lr=0.01)

for _ in range(5):
    ...  # train, eval
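A single training step inside that loop might look like the following sketch; the input batch x, the labels y, and the loss function are placeholder assumptions rather than part of the original example, while zero_grad, backward, and step are the standard tinygrad calls:
from tinygrad.tensor import Tensor

Tensor.training = True  # enable training mode (recent tinygrad versions assert this in optimizer.step)
for _ in range(5):
    out = model(x)                                 # x: placeholder input batch Tensor
    loss = out.sparse_categorical_crossentropy(y)  # y: placeholder integer label Tensor
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()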
Arguments
params
The parameters of the model to optimize.
lr (default: 0.001)
The learning rate of the optimizer.
b1 (default: 0.9)
The exponential decay rate for the first moment estimates.
b2 (default: 0.999)
The exponential decay rate for the second moment estimates.
eps (default: 1e-8)
The epsilon value for numerical stability.
wd (default: 0.0)
The weight decay value to apply.
adam (default: False)
If set to True, the layer-wise trust ratio is disabled and the optimizer behaves like plain Adam with the given weight decay.
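For instance, assuming the keyword argument is spelled adam as listed above, the following sketch turns the flag on; with it set, the trust ratio is skipped and the update reduces to an Adam-style step:
# hedged sketch: same hyperparameters as above, but with the trust ratio disabled
adam_like = optim.LAMB(get_parameters(model), lr=0.001, b1=0.9, b2=0.999,
                       eps=1e-8, wd=0.0, adam=True)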