Are All Large Language Models Really in 1.58 Bits?
RJ Honicky · learning-exhaust.hashnode.dev · Apr 12, 2024
Introduction: This post is my learning exhaust from reading an exciting pre-print paper titled "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" about very efficient representations of high-performing LLMs. I am trying to come up to s...
3 likes · 947 reads · Tag: llm

Leveraging Hyperscaler Clouds for Machine Learning Inferencing on Cumulocity IoT Data
TECHcommunity_SAG · techcommsag.hashnode.dev · Mar 15, 2024
Authors: @kanishk.chaturvedi, @Nick_Van_Damme1. Introduction: In the fast-paced world of IoT, processing and analyzing data in real-time is crucial. With billions of devices generating vast amounts of data, leveraging Machine Learning (ML) is key to turn...
Tag: cumulocity

How to convert HF (safetensors) 🤗 model to gguf
Kaushal Powar · writtenbykaushal.hashnode.dev · Jan 4, 2024
You want to convert Huggingface model to gguf format? I was struggling to tackle the same problem a few days ago. I finetuned a Llama 7B model and the model was saved in safetensor format. I wanted to use gguf model so I searched a lot and found a sol...
1 like · 884 reads · Tags: LLM, llamacpp

Nosana's New Direction: AI Inference
Nosana CI · nosana.hashnode.dev · Oct 18, 2023
Today, we’re excited to share a significant update about the future of Nosana. After careful consideration, we’ve decided to pivot away from CI/CD services. Instead, Nosana will now focus on providing a massive GPU-compute grid for AI inference. The ...
Tag: GPU

Inferring using Prompt Engineering
aansh savla · aanshsavla.hashnode.dev · Aug 9, 2023
Inferring means deducing or concluding using some evidence or reasoning. In terms of AI, inferring is also known as making decisions based on available information or data. A Machine Learning model takes input and performs some analysis such as extra...
Tag: inference

Parameter Estimation for the Multivariate t Distribution
Mathias Winther Madsen · todayilearned.hashnode.dev · Dec 30, 2022
The t distribution has long tails and is therefore robust against outliers. If we can fit a multivariate t distribution to a data set of input-output pairs, we can therefore perform outlier-robust linear regression. This post provides code that does ...
130 reads · Tag: probability

Why I decided to use NVIDIA Triton for inference
shes · shes.hashnode.dev · Oct 12, 2022
Neural networks are cool. They can solve various tasks and are used everywhere. Let's imagine you have trained one for medicine. It performs well. What's next? Nobody has access to it yet. You somehow need to run inference with it. I've faced the same problem. ...
187 reads · Tag: triton