question: The question asked by the user
context: Information retrieved to answer the question
response: The response given by the model
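As a rough illustration of the three columns, the sketch below builds one data row and checks that it has every required key. It assumes rows are plain Python dictionaries; the `EvalRow` type, the `missing_columns` helper, and the sample Taj Mahal text are hypothetical and used only for illustration.

```python
from typing import List, TypedDict


class EvalRow(TypedDict):
    """One row of task data (hypothetical type, for illustration only)."""
    question: str   # the question asked by the user
    context: str    # information retrieved to answer the question
    response: str   # the response given by the model


# Illustrative sample row; the text is made up for this example.
data: List[EvalRow] = [
    {
        "question": "Where is the Taj Mahal located?",
        "context": "The Taj Mahal is a mausoleum in Agra, India, built in the 17th century.",
        "response": "The Taj Mahal is located in Agra, India.",
    }
]


def missing_columns(row: dict) -> List[str]:
    """Return the required column names that are absent from a row."""
    required = ("question", "context", "response")
    return [name for name in required if name not in row]
```

Checking rows this way before running an evaluation can surface a missing `context` or `response` early, rather than as a confusing score later on.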
How to use it?
By default, we use GPT-3.5 Turbo for evaluations. To use a different model, check out this tutorial.
A higher context utilization score indicates that the generated response makes fuller use of the retrieved context; the highest score means the response uses all of the relevant information in it.
- Where is the Taj Mahal located?
- When was the Taj Mahal built?
How it works?
We evaluate context utilization by determining which of the following three cases applies to the given task data:
- The generated response incorporates all the relevant information present in the context.
- The generated response incorporates some of the information present in the context, but misses some information in the context that is relevant for answering the given question.
- The generated response doesn’t incorporate any information present in the context.
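The three cases above can be sketched as a small mapping from case to score. This is only an assumption for illustration: the source does not state the numeric values, so the 1.0 / 0.5 / 0.0 mapping, the `UtilizationCase` enum, and both helper functions are hypothetical. The only property taken from the text is that fuller use of the context yields a higher score.

```python
from enum import Enum
from typing import Iterable


class UtilizationCase(Enum):
    """The three cases described above (names are hypothetical)."""
    ALL = "all"     # response incorporates all relevant context
    SOME = "some"   # response incorporates some, but misses relevant parts
    NONE = "none"   # response incorporates nothing from the context


# Assumed score mapping; the actual values are not given in the source.
ASSUMED_SCORES = {
    UtilizationCase.ALL: 1.0,
    UtilizationCase.SOME: 0.5,
    UtilizationCase.NONE: 0.0,
}


def score_for_case(case: UtilizationCase) -> float:
    """Map a judged case to its assumed numeric score."""
    return ASSUMED_SCORES[case]


def mean_score(cases: Iterable[UtilizationCase]) -> float:
    """Average the assumed scores over several judged rows."""
    scores = [score_for_case(c) for c in cases]
    if not scores:
        raise ValueError("need at least one judged case")
    return sum(scores) / len(scores)
```

In this sketch the judgment of which case applies is left to the evaluating model; the code only shows how such a judgment could be turned into a number and aggregated across rows.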