Grokking of MLP on Modular Addition
Background
Grokking is a phenomenon in which a model quickly reaches near-perfect training accuracy (memorization) while validation accuracy stays near chance (or even below it) for a long time, and then transitions sharply to strong generalization.
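To make the setup concrete, here is a minimal sketch of the modular addition task on which grokking is typically studied: every pair (a, b) with 0 ≤ a, b < p is labeled with (a + b) mod p, and a small random fraction is used for training. The function name, modulus p = 97, and 30% train split are illustrative assumptions, not values taken from the article.

```python
import numpy as np

def make_modular_addition_data(p=97, train_frac=0.3, seed=0):
    """Build the full (a, b) -> (a + b) mod p dataset and split it.

    p, train_frac, and seed are illustrative choices; grokking papers
    commonly use a prime modulus and a small training fraction.
    """
    # Enumerate all p * p input pairs and their modular-sum labels.
    pairs = np.array([(a, b) for a in range(p) for b in range(p)])
    labels = (pairs[:, 0] + pairs[:, 1]) % p

    # Random train/validation split over the full input space.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(pairs))
    n_train = int(train_frac * len(pairs))
    train_idx, val_idx = idx[:n_train], idx[n_train:]
    return (pairs[train_idx], labels[train_idx]), (pairs[val_idx], labels[val_idx])

(train_X, train_y), (val_X, val_y) = make_modular_addition_data()
print(train_X.shape, val_X.shape)
```

Because the training set covers only a fraction of the input space, the model can at first memorize it outright; generalization to the held-out pairs is what emerges late, producing the delayed jump in validation accuracy described above.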
taot.hashnode.dev