- Context Relevance: Determines if the context extracted from the query is relevant to the response.
- Factual Accuracy: Assesses if the LLM is hallucinating or providing incorrect information.
- Response Completeness: Checks if the response contains all the information requested by the query.
The complete list of evaluations that UpTrain supports is available in the UpTrain documentation.
How to do it?
Setup UpTrain Open-Source Software (OSS)
You can use the open-source evaluation service to evaluate your model. In this case, you will need to provide an OpenAI API key, which you can get from your OpenAI account.
Parameters:
- key_type="openai"
- api_key="OPENAI_API_KEY"
- project_name_prefix="PROJECT_NAME_PREFIX"
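A minimal sketch of passing these parameters to the handler, assuming a recent LlamaIndex release with the UpTrain callback integration installed. The import paths may differ between versions, and the helper names `uptrain_handler_kwargs` and `install_uptrain_handler` are hypothetical, introduced only for illustration. The key is read from the `OPENAI_API_KEY` environment variable rather than hard-coded.

```python
import os


def uptrain_handler_kwargs(project_name_prefix="llama"):
    """Collect the three handler parameters (helper name is hypothetical)."""
    return {
        "key_type": "openai",
        # Read the key from the environment; empty string if it is not set.
        "api_key": os.environ.get("OPENAI_API_KEY", ""),
        "project_name_prefix": project_name_prefix,
    }


def install_uptrain_handler(kwargs):
    """Register the handler globally so later queries are captured.

    The import paths below are assumptions and may vary by llama-index version.
    """
    from llama_index.core import Settings
    from llama_index.core.callbacks import CallbackManager
    from llama_index.callbacks.uptrain.base import UpTrainCallbackHandler

    handler = UpTrainCallbackHandler(**kwargs)
    Settings.callback_manager = CallbackManager([handler])
    return handler
```

Calling `install_uptrain_handler(uptrain_handler_kwargs("my_project"))` once, before building any index or query engine, should be enough for the handler to observe later queries.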
Load and Parse Documents
Load documents from Paul Graham’s essay “What I Worked On”.
Parse the document into nodes.
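The two steps can be sketched as follows, assuming llama-index 0.10 or later and that the essay has already been saved as a text file in a local directory. The function name `load_nodes` and the directory you pass in are hypothetical; the sentence splitter is used with its default settings, which may not match the original notebook.

```python
from pathlib import Path


def load_nodes(data_dir):
    """Load every file in data_dir and split the documents into nodes."""
    # Fail early with a clear error if the directory is missing.
    if not Path(data_dir).is_dir():
        raise FileNotFoundError(f"No such directory: {data_dir}")

    # Imported lazily; these paths assume llama-index >= 0.10.
    from llama_index.core import SimpleDirectoryReader
    from llama_index.core.node_parser import SentenceSplitter

    documents = SimpleDirectoryReader(data_dir).load_data()
    parser = SentenceSplitter()  # default chunk size and overlap
    return parser.get_nodes_from_documents(documents)
```

For example, `nodes = load_nodes("./data/paul_graham")` would work if the essay were stored under that (assumed) path.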
RAG Query Engine Evaluation
The UpTrain callback handler automatically captures the query, context, and response once they are generated, and runs the following three evaluations (each graded from 0 to 1) on the response:
- Context Relevance: Determines if the context extracted from the query is relevant to the response.
- Factual Accuracy: Assesses if the LLM is hallucinating or providing incorrect information.
- Response Completeness: Checks if the response contains all the information requested by the query.
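A minimal sketch of the query step that triggers these evaluations, assuming the UpTrain handler has already been registered on the global callback manager and that the default OpenAI-backed LLM and embedding model are available. The function name `run_queries` is hypothetical, and no particular scores or output format are implied.

```python
def run_queries(nodes, queries):
    """Build a vector index over the nodes and answer each query in turn."""
    if not queries:
        raise ValueError("queries must not be empty")

    # Imported lazily; this path assumes llama-index >= 0.10.
    from llama_index.core import VectorStoreIndex

    index = VectorStoreIndex(nodes)
    query_engine = index.as_query_engine()
    # Each call retrieves context and generates a response; a registered
    # callback handler can observe both and run its evaluations.
    return [query_engine.query(q) for q in queries]
```

Passing a short list of questions about the essay is enough to exercise all three evaluations, since each query produces its own context and response.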

