RAG Query Engine Evaluations
The RAG query engine plays a crucial role in retrieving context and generating responses. To assess its performance and response quality, we run the following evaluations:
- Context Relevance: Determines whether the context retrieved for the query is relevant to answering it.
- Factual Accuracy: Assesses whether the LLM is hallucinating or providing incorrect information.
- Response Completeness: Checks whether the response contains all the information requested by the query.
How to do it?
Install UpTrain and LlamaIndex
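Both libraries are on PyPI, and the UpTrain callback handler ships as a separate integration package. A typical install looks like this (exact package set may vary with your environment):

```shell
pip install -q uptrain llama-index llama-index-callbacks-uptrain
```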
Import required libraries
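The snippets in the rest of this walkthrough assume the following imports; the callback handler comes from the llama-index-callbacks-uptrain integration package:

```python
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.callbacks import CallbackManager
from llama_index.core.node_parser import SentenceSplitter
from llama_index.callbacks.uptrain.base import UpTrainCallbackHandler
```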
Set Up UpTrain Open-Source Software (OSS)
You can use the open-source evaluation service to evaluate your model. In this case, you will need to provide an OpenAI API key. You can get yours here: https://platform.openai.com/api-keys
Parameters:
- `key_type="openai"`
- `api_key="OPENAI_API_KEY"`
- `project_name_prefix="PROJECT_NAME_PREFIX"`
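A minimal setup sketch using the parameters above; the API key and project name prefix values are placeholders you should replace with your own:

```python
import os

# Placeholder key; LlamaIndex's default OpenAI models read it from the environment too.
os.environ["OPENAI_API_KEY"] = "sk-..."

callback_handler = UpTrainCallbackHandler(
    key_type="openai",            # run the open-source evaluations via OpenAI
    api_key=os.environ["OPENAI_API_KEY"],
    project_name_prefix="llama",  # hypothetical prefix; choose your own
)

# Register the handler so every query, context, and response is captured and evaluated.
Settings.callback_manager = CallbackManager([callback_handler])
```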
Load and Parse Documents
Load documents from Paul Graham’s essay “What I Worked On”.
Parse the document into nodes.
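A sketch of both steps, assuming the essay text has been saved to a local data/ directory (the path is illustrative):

```python
# Load the essay from disk; SimpleDirectoryReader picks up every file in the folder.
documents = SimpleDirectoryReader("./data").load_data()

# Split the document into sentence-based chunks (nodes) for indexing.
parser = SentenceSplitter()
nodes = parser.get_nodes_from_documents(documents)
```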
RAG Query Engine Evaluation
The UpTrain callback handler automatically captures the query, context, and response once the response is generated, and runs the following three evaluations (each graded from 0 to 1) on it:
- Context Relevance: Determines whether the context retrieved for the query is relevant to answering it.
- Factual Accuracy: Assesses whether the LLM is hallucinating or providing incorrect information.
- Response Completeness: Checks whether the response contains all the information requested by the query.
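For example, building an index over the parsed nodes and running a few queries (the questions below are illustrative) is enough to trigger all three evaluations; no extra evaluation code is needed:

```python
# Build a vector index over the nodes and expose it as a query engine.
index = VectorStoreIndex(nodes)
query_engine = index.as_query_engine()

# Each call is intercepted by the UpTrain callback handler, which captures the
# query, retrieved context, and response, then scores the three evaluations.
queries = [
    "What did Paul Graham do growing up?",
    "What did Paul Graham do during his time at Y Combinator?",
]
for query in queries:
    response = query_engine.query(query)
    print(response)
```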