e.g. when you add an L2 regularization term to the loss function:
- the regularization increases the loss, which encourages the network to have smaller weights
- smaller weights make the model less complex, which reduces overfitting
- because a less complex model is encouraged to learn only the most important patterns
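The effect above can be sketched in a few lines: the penalty term grows with the squared weights, so larger weights mean a larger loss for the same fit. This is a minimal NumPy sketch, not a framework implementation; the function name `l2_regularized_loss` and the strength `lam=0.01` are illustrative choices.

```python
import numpy as np

def l2_regularized_loss(y_true, y_pred, weights, lam=0.01):
    """MSE loss plus an L2 penalty on the weights.

    lam is the regularization strength (a hypothetical value here);
    larger lam pushes the weights toward zero more strongly.
    """
    mse = np.mean((y_true - y_pred) ** 2)
    l2_penalty = lam * np.sum(weights ** 2)
    return mse + l2_penalty

# Two weight vectors producing the same predictions:
y_true = np.array([1.0, 2.0])
y_pred = np.array([1.1, 1.9])
w_small = np.array([0.1, -0.2])
w_large = np.array([3.0, -4.0])

loss_small = l2_regularized_loss(y_true, y_pred, w_small)
loss_large = l2_regularized_loss(y_true, y_pred, w_large)
# Same fit, larger weights -> larger loss, so training prefers w_small.
```

In practice the same idea appears as "weight decay" in optimizers, since the gradient of the L2 term shrinks each weight a little on every step.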