The Great Stabilizer: Batch Normalization and the Magic of Transfer Learning
With He initialization and the Adam optimizer, we have built neural networks whose activations and gradients stay well-scaled at the start of training and that converge quickly.
But there is still a massive underlying problem with deep networks.
rishiii2.hashnode.dev · 6 min read