You have finished training a convolutional neural network that places a BatchNormalization layer after every convolution. Before exporting the model, you call the network with training=False so that all BatchNormalization layers run in inference mode.
Which statement correctly describes how those BatchNormalization layers normalize their inputs when training=False?
It subtracts the layer's moving mean, divides by the square root of the moving variance plus ε, and then applies the learned γ and β parameters that were updated during training.
It normalizes each feature by second-order moment estimates (e.g., Adam's vₜ) stored in the optimizer, so no internal moving averages are required.
It recomputes the mean and variance of the current inference batch, applies them for normalization, and updates the layer's moving averages without back-propagating gradients.
It skips normalization altogether and only performs the affine transform γx + β using the parameters learned during training.
In inference mode, a BatchNormalization layer no longer uses statistics computed from the current batch. Instead, it re-centers and re-scales each feature with the exponential moving averages of the mean and variance that were accumulated during training, and then applies the learnable scale (γ) and shift (β). The other choices are incorrect because none of those behaviors occur when training=False in standard TensorFlow/Keras or PyTorch implementations: the layer does not recompute statistics on the inference batch, does not skip normalization, and does not rely on optimizer moment estimates.
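Concretely, inference-mode BatchNormalization computes y = γ · (x − μ_moving) / √(σ²_moving + ε) + β. The following is a minimal sketch in TensorFlow/Keras that verifies this against the layer's own output; the array shapes and the number of warm-up steps are arbitrary choices for illustration:

```python
import numpy as np
import tensorflow as tf

# Toy input: a batch of 4 samples with 8 features.
x = np.random.randn(4, 8).astype("float32")

bn = tf.keras.layers.BatchNormalization()
bn.build(x.shape)  # creates gamma, beta, moving_mean, moving_variance

# A few forward passes in training mode so the moving averages
# drift away from their initial values (0 mean, 1 variance).
# Note: moving statistics update on the forward pass itself;
# no gradient step is needed.
for _ in range(10):
    bn(x, training=True)

# Inference mode: the layer normalizes with its stored moving statistics.
y_layer = bn(x, training=False).numpy()

# Manual computation using the same stored statistics and affine parameters.
mean = bn.moving_mean.numpy()
var = bn.moving_variance.numpy()
gamma = bn.gamma.numpy()
beta = bn.beta.numpy()
eps = bn.epsilon

y_manual = gamma * (x - mean) / np.sqrt(var + eps) + beta

print(np.allclose(y_layer, y_manual, atol=1e-5))  # True
```

The same check works in PyTorch after calling model.eval(), where running_mean and running_var play the role of the moving averages.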