Sub-Query Completeness checks whether the sub-queries generated from a question are complete. This check considers all the sub-queries and evaluates whether, taken together, they answer every aspect of the question.

Columns required:

  • question: The question asked by the user
  • sub_questions: Sub questions generated from the question

How to use it?

from uptrain import EvalLLM, Evals

OPENAI_API_KEY = "sk-********************"  # Insert your OpenAI key here

data = [
  {
    'question': 'What is the Taj Mahal? When was it built, where and by whom?',
    'sub_questions': '1. What is the Taj Mahal? 2. When was the Taj Mahal built? 3. Where is the Taj Mahal? 4. Who built the Taj Mahal?'        
  }
]

eval_llm = EvalLLM(openai_api_key=OPENAI_API_KEY)

res = eval_llm.evaluate(
    data = data,
    checks = [Evals.SUB_QUERY_COMPLETENESS]
)
By default, we use GPT-3.5 Turbo for evaluations. If you want to use a different model, check out this tutorial.
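For reference, switching models typically means passing a Settings object when constructing EvalLLM. The snippet below is a minimal sketch; the Settings class and its model and openai_api_key parameters are assumed here, so confirm the exact, up-to-date API in the tutorial above.

from uptrain import EvalLLM, Evals, Settings

# Assumed configuration: Settings lets you choose the evaluation model
# (e.g. GPT-4) instead of the default GPT-3.5 Turbo.
settings = Settings(model="gpt-4", openai_api_key=OPENAI_API_KEY)
eval_llm = EvalLLM(settings=settings)

res = eval_llm.evaluate(
    data=data,
    checks=[Evals.SUB_QUERY_COMPLETENESS]
)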

Sample Response:

[
  {
    "question": "What is the Taj Mahal? When was it built, where and by whom?",
    "sub_questions": "1. What is the Taj Mahal? 2. When was the Taj Mahal built? 3. Where is the Taj Mahal? 4. Who built the Taj Mahal?",
    "score_sub_query_completeness": 1.0,
    "explanation_sub_query_completeness": "Step by step reasoning:\n\n1. What is the Taj Mahal? - This sub-question covers the aspect of understanding what the Taj Mahal is, providing information about its nature and purpose.\n2. When was the Taj Mahal built? - This sub-question covers the aspect of the time of construction, addressing the historical timeline of the Taj Mahal's creation.\n3. Where is the Taj Mahal? - This sub-question covers the aspect of location, providing information about the geographical placement of the Taj Mahal.\n4. Who built the Taj Mahal? - This sub-question covers the aspect of the creator, addressing the individuals or entities responsible for the construction of the Taj Mahal.\n\nConclusion:\nThe sub-questions collectively cover all the aspects of the main question.\n\n[Choice]: (A) Sub Questions collectively all the aspects of the main question."
  }
]
A higher Sub-Query Completeness score reflects that the generated sub-questions cover all aspects of the question asked.

Here, the sub_questions cover every part of the question, such as: “When was the Taj Mahal built?”, “Who built the Taj Mahal?”, “Where is the Taj Mahal?”

Resulting in a high Sub-Query Completeness score.
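The response is a list with one dictionary per row of data, so the fields shown in the sample above can be read directly:

import json

# Pretty-print the full response (a list with one result per data row)
print(json.dumps(res, indent=3))

# Or pull out the individual fields shown in the sample response above
for row in res:
    print(row["score_sub_query_completeness"])        # e.g. 1.0
    print(row["explanation_sub_query_completeness"])  # step-by-step reasoning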

How it works?

We evaluate Sub-Query Completeness by determining which of the following three cases applies to the given task data:

  • Sub Questions collectively cover all the aspects of the main question.
  • Sub Questions collectively cover only a few aspects of the main question.
  • Sub Questions collectively do not cover any aspect of the main question.
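As an illustration, the grader's choice (see the "[Choice]: (A) ..." line in the sample explanation above) is then mapped to a score. Only the all-aspects case is confirmed by the 1.0 score in the sample response; the values used for the other two cases below are assumptions for illustration, not UpTrain's actual internals.

# Illustrative mapping from the grader's choice to a score.
# Only choice (A) -> 1.0 is confirmed by the sample response above;
# the values for (B) and (C) are assumed for illustration.
CHOICE_TO_SCORE = {
    "A": 1.0,  # sub-questions cover all aspects of the main question
    "B": 0.5,  # sub-questions cover only a few aspects (assumed value)
    "C": 0.0,  # sub-questions cover none of the aspects (assumed value)
}

def score_from_choice(choice: str) -> float:
    return CHOICE_TO_SCORE[choice.strip().upper()]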