Batch Norm

Paper: https://arxiv.org/pdf/1502.03167.pdf

Batch Normalization transformation:
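
For reference, a transcription of the transform from Algorithm 1 of the paper, applied per feature over a mini-batch B = {x_1, ..., x_m}. Here gamma and beta are the learned scale and shift, and epsilon is a small constant added for numerical stability:

$$
\begin{aligned}
\mu_B &= \frac{1}{m} \sum_{i=1}^{m} x_i \\
\sigma_B^2 &= \frac{1}{m} \sum_{i=1}^{m} (x_i - \mu_B)^2 \\
\hat{x}_i &= \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}} \\
y_i &= \gamma \hat{x}_i + \beta \equiv \mathrm{BN}_{\gamma,\beta}(x_i)
\end{aligned}
$$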

Training:

During training, normalize with the batch sample mean and variance, update gamma and beta by backpropagation, and also update the running mean E[x] and running variance Var[x].

At inference, use the running mean E[x] and running variance Var[x], rather than the sample mean and sample variance, to compute the output.
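
A minimal sketch of the two modes using the built-in torch.nn.BatchNorm1d. The tensor shapes and values are made up for illustration. It assumes the PyTorch defaults (momentum=0.1, running mean initialized to 0), under which the running statistics are an exponential moving average rather than the plain average over batches described in the paper:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(num_features=3)    # running_mean starts at 0, running_var at 1
x = torch.randn(8, 3) * 2.0 + 5.0      # illustrative batch: 8 samples, 3 features

bn.train()
y_train = bn(x)                        # normalized with the batch mean and variance
# Side effect of the training-mode call: the running mean moves toward the batch mean.
expected_running_mean = 0.9 * torch.zeros(3) + 0.1 * x.mean(dim=0)

bn.eval()
with torch.no_grad():
    y_eval = bn(x)                     # normalized with the running statistics instead
    # Same computation written out by hand from the stored buffers.
    manual_eval = (x - bn.running_mean) / torch.sqrt(bn.running_var + bn.eps)
    manual_eval = manual_eval * bn.weight + bn.bias
```

After a single training step the running statistics are still far from the batch statistics, so y_train and y_eval differ noticeably for the same input.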

Gradient computation during training, and updating the running mean and variance
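
For reference, the chain-rule gradients as given in the paper, where l is the loss, mu_B and sigma_B^2 are the mini-batch mean and variance over m samples, x-hat_i is the normalized input, and y_i = gamma * x-hat_i + beta is the output:

$$
\begin{aligned}
\frac{\partial \ell}{\partial \hat{x}_i} &= \frac{\partial \ell}{\partial y_i} \cdot \gamma \\
\frac{\partial \ell}{\partial \sigma_B^2} &= \sum_{i=1}^{m} \frac{\partial \ell}{\partial \hat{x}_i} \cdot (x_i - \mu_B) \cdot \left(-\tfrac{1}{2}\right) (\sigma_B^2 + \epsilon)^{-3/2} \\
\frac{\partial \ell}{\partial \mu_B} &= \left( \sum_{i=1}^{m} \frac{\partial \ell}{\partial \hat{x}_i} \cdot \frac{-1}{\sqrt{\sigma_B^2 + \epsilon}} \right) + \frac{\partial \ell}{\partial \sigma_B^2} \cdot \frac{\sum_{i=1}^{m} -2 (x_i - \mu_B)}{m} \\
\frac{\partial \ell}{\partial x_i} &= \frac{\partial \ell}{\partial \hat{x}_i} \cdot \frac{1}{\sqrt{\sigma_B^2 + \epsilon}} + \frac{\partial \ell}{\partial \sigma_B^2} \cdot \frac{2 (x_i - \mu_B)}{m} + \frac{\partial \ell}{\partial \mu_B} \cdot \frac{1}{m} \\
\frac{\partial \ell}{\partial \gamma} &= \sum_{i=1}^{m} \frac{\partial \ell}{\partial y_i} \cdot \hat{x}_i \\
\frac{\partial \ell}{\partial \beta} &= \sum_{i=1}^{m} \frac{\partial \ell}{\partial y_i}
\end{aligned}
$$

The running mean and variance receive no gradient. They are statistics tracked alongside training, not parameters updated by the optimizer.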

Coding in PyTorch
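
A from-scratch sketch for 2-D inputs of shape (N, C), not the library's actual implementation. The class name MyBatchNorm1d is hypothetical. It assumes a batch size greater than 1 and follows the torch.nn.BatchNorm1d conventions: exponential moving average with momentum 0.1, biased variance for normalizing the batch, unbiased variance for the running estimate. Autograd supplies the backward pass.

```python
import torch
import torch.nn as nn


class MyBatchNorm1d(nn.Module):
    """Hypothetical from-scratch batch norm for inputs of shape (N, C)."""

    def __init__(self, num_features, eps=1e-5, momentum=0.1):
        super().__init__()
        self.eps = eps
        self.momentum = momentum
        self.gamma = nn.Parameter(torch.ones(num_features))   # learned scale
        self.beta = nn.Parameter(torch.zeros(num_features))   # learned shift
        # Buffers are saved with the module but are not touched by the optimizer.
        self.register_buffer("running_mean", torch.zeros(num_features))
        self.register_buffer("running_var", torch.ones(num_features))

    def forward(self, x):
        if self.training:
            mean = x.mean(dim=0)
            var = x.var(dim=0, unbiased=False)   # biased variance normalizes the batch
            with torch.no_grad():
                m = x.shape[0]                   # assumes m > 1
                unbiased_var = var * m / (m - 1)
                # Exponential moving average of the batch statistics.
                self.running_mean.mul_(1 - self.momentum).add_(self.momentum * mean)
                self.running_var.mul_(1 - self.momentum).add_(self.momentum * unbiased_var)
        else:
            # Inference: use the stored running statistics.
            mean, var = self.running_mean, self.running_var
        x_hat = (x - mean) / torch.sqrt(var + self.eps)
        return self.gamma * x_hat + self.beta


# Compare against the built-in layer on an illustrative random batch.
torch.manual_seed(0)
x = torch.randn(16, 4) * 3.0 + 1.0
mine, ref = MyBatchNorm1d(4), nn.BatchNorm1d(4)

out_mine, out_ref = mine(x), ref(x)      # modules start in training mode
mine.eval()
ref.eval()
eval_mine, eval_ref = mine(x), ref(x)    # inference mode uses the running stats
```

Registering the running statistics as buffers rather than parameters keeps them out of the optimizer while still including them in state_dict().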
