2d ago · 9 min read · Last time we cut BF16 weights in half by treating the exponent as a 16-entry palette instead of an 8-bit field. SCLP8: 7.9 GB instead of 15.0, perplexity slightly better than the original, token gener
May 29 · 9 min read · Most people compressing LLM weights are fighting the same war: squeeze 7 billion floats into less memory without wrecking the model. The standard weapons are quantization schemes — map each float to a
Nov 24, 2021 · 1 min read · Make a copy of the repo: git clone dirtySourceRepo newSourceRepo. OR clone from the actual git repo and prevent pushes: git remote set-url --push origin no_push. Make sure to check out the correct branch before the next step. Cloning from another local directory al...