Sub-Query Completeness checks whether the sub-queries generated from a question are complete. It considers all the sub-queries together and evaluates whether, taken collectively, they answer every aspect of the original question.
Columns required:
- question: The question asked by the user
- sub_questions: The sub-questions generated from the question
How to use it?
from uptrain import EvalLLM, Evals
OPENAI_API_KEY = "sk-********************"
data = [
    {
        'question': 'What is the Taj Mahal? When was it built, where and by whom?',
        'sub_questions': '1. What is the Taj Mahal? 2. When was the Taj Mahal built? 3. Where is the Taj Mahal? 4. Who built the Taj Mahal?'
    }
]

eval_llm = EvalLLM(openai_api_key=OPENAI_API_KEY)

res = eval_llm.evaluate(
    data = data,
    checks = [Evals.SUB_QUERY_COMPLETENESS]
)
By default, we use GPT-3.5 Turbo for evaluations. If you want to use a different model, check out this tutorial.
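For reference, here is a minimal sketch of what switching to another model might look like. It assumes a Settings class that accepts a model parameter and can be passed to EvalLLM; treat the linked tutorial as the authoritative reference.

from uptrain import EvalLLM, Evals, Settings

# Assumption: Settings takes a `model` parameter and is accepted by EvalLLM.
# The exact, supported interface is described in the tutorial mentioned above.
settings = Settings(model="gpt-4", openai_api_key=OPENAI_API_KEY)
eval_llm = EvalLLM(settings)

res = eval_llm.evaluate(
    data=data,
    checks=[Evals.SUB_QUERY_COMPLETENESS]
)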
Sample Response:
[
    {
        "question": "What is the Taj Mahal? When was it built, where and by whom?",
        "sub_questions": "1. What is the Taj Mahal? 2. When was the Taj Mahal built? 3. Where is the Taj Mahal? 4. Who built the Taj Mahal?",
        "score_sub_query_completeness": 1.0,
        "explanation_sub_query_completeness": "Step by step reasoning:\n\n1. What is the Taj Mahal? - This sub-question covers the aspect of understanding what the Taj Mahal is, providing information about its nature and purpose.\n2. When was the Taj Mahal built? - This sub-question covers the aspect of the time of construction, addressing the historical timeline of the Taj Mahal's creation.\n3. Where is the Taj Mahal? - This sub-question covers the aspect of location, providing information about the geographical placement of the Taj Mahal.\n4. Who built the Taj Mahal? - This sub-question covers the aspect of the creator, addressing the individuals or entities responsible for the construction of the Taj Mahal.\n\nConclusion:\nThe sub-questions collectively cover all the aspects of the main question.\n\n[Choice]: (A) Sub Questions collectively all the aspects of the main question."
    }
]
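The response is a list of dictionaries, one per row in data. As a minimal sketch of how you might consume it programmatically, the snippet below prints the explanation whenever the score falls under a threshold; the 0.5 cutoff is an arbitrary assumption for illustration, not an UpTrain default.

# `res` is the list returned by eval_llm.evaluate(...) above.
for row in res:
    score = row["score_sub_query_completeness"]
    if score < 0.5:  # arbitrary threshold, assumed for illustration
        print(f"Incomplete sub-queries for: {row['question']}")
        print(row["explanation_sub_query_completeness"])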
A higher Sub-Query Completeness score reflects that the generated sub-questions cover all aspects of the question asked.
In the example above, the sub_questions cover every part of the question, so the score is 1.0. Had they omitted some parts of the question, such as "When was the Taj Mahal built?", "Who built the Taj Mahal?", or "Where is the Taj Mahal?", the result would have been a low Sub-Query Completeness score.
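As an illustration, here is a sketch of a data point with an incomplete set of sub-questions, which would be expected to receive a lower score (the exact value depends on the evaluation LLM's judgement):

incomplete_data = [
    {
        'question': 'What is the Taj Mahal? When was it built, where and by whom?',
        # Only one aspect of the question is covered; "when", "where" and
        # "by whom" are missing, so the completeness score should be low.
        'sub_questions': '1. What is the Taj Mahal?'
    }
]

res_incomplete = eval_llm.evaluate(
    data=incomplete_data,
    checks=[Evals.SUB_QUERY_COMPLETENESS]
)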
How it works?
We evaluate Sub-Query Completeness by determining which of the following three cases applies to the given task data:
- Sub Questions collectively cover all the aspects of the main question.
- Sub Questions collectively cover only a few aspects of the main question.
- Sub Questions collectively do not cover any aspects of the main question.
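These three cases correspond to decreasing scores. The mapping sketched below is an assumption for illustration: only the 1.0 value for the first case is confirmed by the sample response above; the other two values are guesses at how the intermediate and worst cases might be scored.

# Assumed mapping from the grader's verdict to the reported score.
# Only case (A) -> 1.0 is confirmed by the sample response; (B) and (C)
# are illustrative assumptions, not values taken from UpTrain's source.
CHOICE_TO_SCORE = {
    "A": 1.0,  # sub-questions cover all aspects of the main question
    "B": 0.5,  # sub-questions cover only a few aspects
    "C": 0.0,  # sub-questions cover no aspects
}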