Visualizing the Activation, Gradient, Gradient-to-Data Ratio, and Update-to-Data Ratio Distributions in Neural Network Training
Jan 21, 2025 · 21 min read · In my last three blog posts, I explained how to reduce a neural network's high confidence in wrong predictions, how to identify dead neurons, and how Kaiming initialization and batch normalization work. The core of these three blog posts is the correct ini...

