Kavya · portkey-llm-elo-rating.hashnode.dev · Oct 14, 2024
⭐ Reducing LLM Costs & Latency with Semantic Cache
Implementing semantic cache from scratch for production use cases.
Vrushank Vyas · Jul 11, 2023 · 5 min
Image credits: Our future AI overlords. (No, seriously, Stability AI)
Latency and Cost are significant hurdles for developers building on top of Large...
semantic cache

James Perkins · unkey.hashnode.dev · Jun 26, 2024
Semantic caching
Large language models are getting faster and cheaper. The charts below show progress in OpenAI's GPT family of models over the past year:
[Charts: Cost per million tokens ($); Tokens per second]
Recent releases like Meta's Llama 3 and Gemini Flash have pushed...
Unkey Launchweek 1 · semantic cache