Identifying and Removing Dead Neurons in Training Neural Networks: Mechanistic Interpretability Part 2
My last blog post explained how to correctly initialize the final softmax layer of a neural network so that it does not start out with high but incorrect confidence in certain classes. It also covered how to achieve uniform logits, a proper probability distribution, a...
aabidkarim.hashnode.dev · 9 min read
Abdul Karim
Do you have a GitHub link for the code file?