Comment by Soothfy App on "23ai Vector Search & RAG"

To build a Retrieval-Augmented Generation (RAG) solution combining 23ai Vector Search and OpenAI in Oracle APEX, follow this process:

Overview of RAG

Retrieval-Augmented Generation enhances large language models (LLMs) by providing access to external knowledge via retrieval methods. The idea is to retrieve relevant information before generating a contextually accurate response.

Steps to Implement RAG Solution

Query Reduction: Simplify user queries to enhance search accuracy. For example:
- Original: "Explain APEX Template Components"
- Reduced: "APEX Template Components" Utilize advanced models like GPT-4o for better performance in query reduction.
Run Vector Search: Use the previously established methods from the earlier post to conduct semantic searches on the relevant blog content.

Send Instructions to OpenAI: Construct a prompt to guide OpenAI in generating a response based on the user's question and the retrieved blog post chunks. Ensure to include:

Instructions for context
The original user question
Relevant text chunks from the vector search
Blog post IDs for source referencing

Example Prompt:

INSTRUCTIONS:
You are an expert on Oracle technology. Answer the users QUESTION: using the BLOG POSTS: text below. Always Include the number enclosed in ##number## to reference sources used from the included BLOG POSTS:. Prioritize the content of the BLOG POSTS: when determining your answer.
QUESTION: Can APEX be used to Extend Oracle Fusion Applications?
BLOG POSTS:
##2272## I enjoyed returning to the APEX Alpe Adria conference...
...

Supplement with Links: Ensure the AI includes blog post IDs in its response. This allows for easy reference generation with links to the original content.
Display Response: Present the final answer to users, incorporating the relevant sources for transparency and follow-up.

Code Example for RAG Function

Here’s a simplified version of a PL/SQL function that encapsulates the RAG logic:

FUNCTION rag_blog_chunks
 (p_question   IN VARCHAR2,
  p_months_ago IN NUMBER,
  p_max_chunks IN NUMBER,
  p_max_bytes  IN NUMBER,
  p_ai_model   IN VARCHAR2,
  p_ai_temp    IN NUMBER,
  p_qr_model   IN VARCHAR2,
  p_ai_qr_temp IN NUMBER) RETURN VARCHAR2 IS

  CURSOR cr_content (cp_question_vector IN VECTOR) IS
    WITH vs AS (
      SELECT cbp.post_id, cbc.id AS chunk_id, cbc.chunk_text,
             vector_distance(cbc.embedding, cp_question_vector) AS vector_distance
      FROM   cnba_post_chunks cbc, cnba_blog_posts cbp
      WHERE  cbp.post_id = cbc.post_id
      AND    cbp.post_published_date_utc > (SYSDATE - NUMTOYMINTERVAL(p_months_ago, 'MONTH')))
    )
    SELECT vs.post_id, vs.chunk_id, vs.chunk_text, vs.vector_distance
    FROM   vs
    ORDER BY vector_distance
    FETCH EXACT FIRST p_max_chunks ROWS ONLY WITH TARGET ACCURACY 95;

  l_question_vector  VECTOR;
  l_chunks           CLOB;
  l_openai_params    JSON;

BEGIN
  -- Capture Parameter Values and execute the retrieval and generation process
  -- ... Additional code logic ...

  RETURN l_generated_response; -- The final response from OpenAI
END;

Conclusion

This RAG architecture leverages both semantic search and the generative capabilities of AI to provide users with precise, contextually informed responses while ensuring proper sourcing of information.

Search Hashnode

Overview of RAG

Steps to Implement RAG Solution

Code Example for RAG Function

Conclusion