Response Quality Evals
Response Completeness
Checks whether the response answers every aspect of the question asked.
The response completeness score measures whether the generated response adequately answers all aspects of the question being asked.
This check is important to ensure that the model is not generating incomplete responses.
Columns required:
- question: The question asked by the user
- response: The response given by the model
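As a rough illustration of the expected input shape, the sketch below builds one data row with the two required columns and checks that neither is missing or empty. The helper `missing_columns` and the sample row are hypothetical: they are not part of any library API, and the question wording is only an illustrative reconstruction in the spirit of the Taj Mahal sample discussed on this page.

```python
# Names of the columns this check needs in every row.
REQUIRED_COLUMNS = ("question", "response")


def missing_columns(row):
    """Return the required columns that are absent or blank in a row.

    Hypothetical helper for illustration only; not a library function.
    """
    missing = []
    for col in REQUIRED_COLUMNS:
        value = row.get(col)
        if value is None or not str(value).strip():
            missing.append(col)
    return missing


# Illustrative row (assumed wording, not taken from the original sample).
data = [
    {
        "question": "Where is the Taj Mahal located, and when was it built?",
        "response": "The Taj Mahal is located in Agra, India.",
    }
]

for row in data:
    problems = missing_columns(row)
    if problems:
        raise ValueError(f"Row is missing required columns: {problems}")
```

Validating rows up front like this is optional, but it surfaces a missing or blank column before any evaluation call is made.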
How to use it?
By default, evaluations use GPT-3.5 Turbo. To use a different model, check out this tutorial.
Sample Response:
A higher response completeness score means that the response has answered more aspects of the user’s question.
The question can be divided into the following two parts:
- Where is the Taj Mahal located?
- When was the Taj Mahal built?
Though the response answers “Where is the Taj Mahal located?”, it does not state when it was built, which results in a low response completeness score.
How it works?
We evaluate response completeness by determining which of the following three cases applies to the given data:
- The generated answer doesn’t answer the given question at all.
- The generated answer only partially answers the given question.
- The generated answer adequately answers the given question.
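The three cases above can be sketched as a small classification step followed by a score lookup. This is a minimal sketch under stated assumptions, not the actual implementation: in practice the case is chosen by the evaluating model, whereas here it is derived from a hypothetical count of answered question parts, and the numeric values 0.0, 0.5, and 1.0 are assumed for illustration because this page does not state the scores attached to each case.

```python
from enum import Enum


class Completeness(Enum):
    """The three cases listed above."""
    NOT_ANSWERED = "not answered at all"
    PARTIALLY_ANSWERED = "partially answered"
    FULLY_ANSWERED = "adequately answered"


# Assumed score per case; the actual values are not given on this page.
CASE_SCORES = {
    Completeness.NOT_ANSWERED: 0.0,
    Completeness.PARTIALLY_ANSWERED: 0.5,
    Completeness.FULLY_ANSWERED: 1.0,
}


def classify(parts_answered, parts_total):
    """Pick a case from how many parts of the question were answered.

    Simplified stand-in for the judgment an evaluating model would make.
    """
    if parts_total <= 0:
        raise ValueError("parts_total must be positive")
    if parts_answered <= 0:
        return Completeness.NOT_ANSWERED
    if parts_answered >= parts_total:
        return Completeness.FULLY_ANSWERED
    return Completeness.PARTIALLY_ANSWERED


def completeness_score(case):
    """Look up the assumed score for a case."""
    return CASE_SCORES[case]


# Taj Mahal sample: one of the two parts (location) was answered.
case = classify(parts_answered=1, parts_total=2)
score = completeness_score(case)
```

Under these assumptions the Taj Mahal sample falls into the "partially answered" case, which matches the low score described above.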