Tag feed

#grokking

3 posts · 0 followers

Terry Tao · taot.hashnode.dev · Feb 8 · 3 min read

Grokking of Transformer on Modular Addition

Background: Grokking is a phenomenon where a model quickly achieves near-perfect training accuracy (memorization), while validation accuracy remains near chance for a long time, and then later transitions sharply to strong generalization after extende...

Terry Tao · taot.hashnode.dev · Jan 24 · 4 min read

Grokking of MLP on Modular Addition

Background: Grokking is a phenomenon where a model quickly achieves near-perfect training accuracy (memorization), while validation accuracy remains near chance (or even worse than chance) for a long time, and then later transitions sharply to strong ...

Code Sky · codesky.cloudhero.in · Nov 6, 2025 · 4 min read

Inside the Mind of Machines: Induction Heads, Grokking, and Memorization

Imagine a student sitting in a classroom. At first, he memorizes facts without truly understanding them — repeating history dates, formulas, or definitions. But then, one day, something clicks. He suddenly sees patterns — how one idea connects to ano...

