Semantic caching
James Perkins for Unkey · unkey.hashnode.dev · Jun 26, 2024

Large language models are getting faster and cheaper. The charts below show progress in OpenAI's GPT family of models over the past year:

[Charts: Cost per million tokens ($); Tokens per second]

Recent releases like Meta's Llama 3 and Gemini Flash have pushed...

Unkey Launchweek 1 · semantic cache