Nagen K · thinkboundlessai.hashnode.dev · Feb 4, 2025 · Unlocking Efficiency in AI : Quantization & Distillation · With the noise and disruption created in the AI space with the release of DeepSeek R1, two terms are now heard prominently in the AI landscape: Quantization and Distillation. Today in our blog, let’s see what these terms are and how these processes h... · 58 reads · Deepseek
Nitin Agarwal · cognibits.hashnode.dev · Nov 17, 2024 · A Beginner's Guide to Generative AI · Generative AI refers to algorithms that can generate new content, whether it's text, images, audio, or even video. Unlike traditional AI, which typically focuses on classification or prediction tasks, generative models create new data instances that ... · 263 reads · generative ai
Robert Collins · selfenrichment.hashnode.dev · Oct 28, 2024 · On KL-Divergence and Context Size Optimization in GGUF Quantization · note: This paper was written with AI, and the exploration it describes was done collaboratively with AI. The "we" described here is us; me and a few models. With llama.cpp model quantization, properly adjusting models to keep their performance after ... · 57 reads · gguf
Hemanth Balgi for ACM-VIT · blog.acmvit.in · Oct 26, 2024 · Honey I shrunk the AI : Quantizing LLM's for Edge Hardware · One could argue that humanity’s rise to power on this planet came from its ability to walk on two legs, or the ability to throw sharp rocks at food, or even the ability to touch, hear and see at a deeper level than any other animal. However, one abil... · 73 likes · 225 reads · aitools
Pronod Bharatiya · data-intelligence.hashnode.dev · Oct 18, 2024 · Discrete and Continuous Models in Machine Learning: Understanding CDF as a Bridge · Machine learning (ML) models are often categorized as either discrete or continuous, based on the nature of the data they handle. Discrete models work with distinct, countable values, while continuous models operate on variables from a continuous ran... · 242 reads · Discrete Models
Gopinath Balu · blog.gopinathbalu.com · Sep 26, 2024 · Understanding Quantization: Part I · Introduction: Quantization in general can be defined as mapping values from a large set of real numbers, i.e., FP32 or even FP16, to values in a small discrete set, most likely Int8 or Int4. There are recent works trying to map to 1-bit models. Typically ... · 1 like · 29 reads · Quantization, faster inference
Siddartha Pullakhandam · siddartha10.hashnode.dev · Sep 5, 2024 · Getting Started with Quantization · What is Quantization? It is the process of mapping higher-precision weights and activations to lower precision. In simple terms, it shrinks a model to a smaller size so that it can run on resources with limited memory. Linear Quantizatio... · 11 likes · 52 reads · quantization
RJ Honicky · learning-exhaust.hashnode.dev · Aug 11, 2024 · Is it the model or the data that's low rank? · I mentioned this little bit of analysis that I recently did during the Latent Space Paper Club, and got a lot of positive feedback, so I did a quick writeup. The recently released Apple Intelligence Foundation Language Models paper has sparked a lot of... · 1 like · 100 reads · quantization
Venkat R · venkatr.hashnode.dev · Jul 27, 2024 · Learning from Less Data and Building Smaller Models · Learning from Less Data: Techniques and Applications. Introduction: In the age of big data, machine learning models typically need large datasets to perform well. However, gathering and labeling vast amounts of data can be tough and costly. Data-effici... · Active Learning
RJ Honicky · learning-exhaust.hashnode.dev · Jul 12, 2024 · Can we improve quantization by fine tuning? · As a followup to my previous post Are All Large Language Models Really in 1.58 Bits?, I've been wondering if we could apply the same ideas to post-training quantization. The authors trained models from scratch in The Era of 1-bit LLMs: All Large Lang... · 62 reads · quantization