Hi Elon Musk. RAG is essentially a technique for taking text data and vectorizing it so that it can be searched. In the case of a website, you need to scrape the site's text and then follow the RAG process as described in this article.
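A minimal sketch of the retrieval step described above: chunk the scraped text, vectorize each chunk, and return the chunks most similar to the query. Note this is an illustration only — the toy word-count "embedding" here stands in for a real embedding model (e.g. OpenAI embeddings) and a vector database, so the example runs without any external service.

```python
import math
import re
from collections import Counter

def vectorize(text: str) -> Counter:
    # Toy "embedding": word-frequency counts. A real RAG system would
    # call an embedding model here instead.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(chunks: list[str], query: str, k: int = 2) -> list[str]:
    # Return the k chunks most similar to the query.
    qv = vectorize(query)
    return sorted(chunks, key=lambda c: cosine(vectorize(c), qv), reverse=True)[:k]

chunks = [
    "RAG stores text as vectors so it can be searched.",
    "Scrape the website text before indexing it.",
    "Models charge per token for input and output.",
]
print(retrieve(chunks, "text vectors searched", k=1))
```

Only the top-matching chunks get sent to the model, which is what keeps the prompt small.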
You don't always need to use RAG. With LangChain tools, for example, you can simply query an API or scrape a page and provide that data to the model in real time.
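The no-RAG approach looks roughly like this: fetch the data at request time and inject it straight into the prompt. The fetch step is stubbed out below — in practice you would scrape the page (e.g. with requests/BeautifulSoup) or use a LangChain tool, and the prompt wording here is just an assumption for illustration.

```python
def build_prompt(context: str, question: str) -> str:
    # The entire fetched text goes into the prompt, so this only works
    # while the context fits within the model's token limit.
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Stand-in for text you scraped or fetched from an API in real time.
page_text = "Acme Corp was founded in 1999 and sells rocket skates."
prompt = build_prompt(page_text, "When was Acme Corp founded?")
print(prompt)
```

This is fine for small, fresh data; once the context grows past the token limit (or the cost adds up), that's when RAG becomes worthwhile.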
RAG helps narrow down large bodies of text. Remember, you pay for each token, and models usually have a limit on the maximum number of tokens you can send in a prompt; by using a RAG system, you minimize the number of tokens sent.
OpenAI uses a token-based costing system. A token is roughly 4 characters (in English).
You are billed at different rates for text fed into the model (input) and text generated by the model (output).
GPT-3.5 Turbo is generally the most cost-effective model. It's not the best model, but for most use cases it's usually fine.
With GPT-3.5 Turbo, expect to pay $0.50 / 1M tokens for text you pass into the model, and $1.50 / 1M tokens for text the model generates.
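Putting those rates together with the ~4-characters-per-token rule of thumb, a rough cost estimate for a single call looks like this (the character counts in the example are made up for illustration):

```python
# GPT-3.5 Turbo rates from above, expressed per token.
INPUT_RATE = 0.50 / 1_000_000   # dollars per input token
OUTPUT_RATE = 1.50 / 1_000_000  # dollars per output token

def estimate_cost(prompt_chars: int, completion_chars: int) -> float:
    # ~4 characters per token is a rough heuristic for English text,
    # not an exact count (use a tokenizer like tiktoken for that).
    prompt_tokens = prompt_chars / 4
    completion_tokens = completion_chars / 4
    return prompt_tokens * INPUT_RATE + completion_tokens * OUTPUT_RATE

# e.g. an 8,000-character prompt (~2,000 tokens) with a
# 2,000-character answer (~500 tokens):
print(f"${estimate_cost(8_000, 2_000):.5f}")  # about $0.00175
```

So a fairly large prompt still costs a fraction of a cent, but across thousands of requests per day the input tokens dominate — which is exactly why RAG's trimming of the prompt matters.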
Elon Musk
Software Developer
Thank you Kevin Naidoo for this useful guide.
If someone wants to develop a chatbot whose knowledge base is drawn from the content (posts, articles, and comments) of a website, how can they dynamically update the memory of the chain? By the way, can you tell me about the pricing of OpenAI API usage, with some examples for the gpt-3.5-turbo model?
Thank you!