Code Related Evals
Code hallucination
Checks whether the code present in the generated response is grounded in the context.
The Code Hallucination score measures whether the code mentioned in the generated response is supported by the retrieved context, rather than invented by the model.
Columns required:
- `question`: The question asked by the user
- `context`: Information retrieved to answer the question
- `response`: The response given by the model
How to use it?
By default, we use GPT-3.5 Turbo for evaluations. If you want to use a different model, check out this tutorial.
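The invocation itself is not reproduced on this page. The sketch below shows roughly how the eval would be run, assuming an UpTrain-style Python client; `EvalLLM`, `Evals.CODE_HALLUCINATION`, and the argument names are assumptions, so consult the API reference for the exact signatures.

```python
# Sketch of running the Code Hallucination eval.
# ASSUMPTION: an UpTrain-style client (`EvalLLM`, `Evals.CODE_HALLUCINATION`);
# check the library's API reference for the exact names and signatures.
from uptrain import EvalLLM, Evals

# Each row supplies the three required columns described above.
data = [
    {
        "question": "How do I install Pandas in Python?",
        "context": "You can install Pandas by running pip install pandas.",
        "response": "Run import pandas as pd to get started.",
    }
]

eval_llm = EvalLLM(openai_api_key="sk-...")  # GPT-3.5 Turbo by default

results = eval_llm.evaluate(
    data=data,
    checks=[Evals.CODE_HALLUCINATION],
)
print(results)
```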
Sample Response:
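The sample output itself is not shown on this page; the snippet below illustrates the shape of a result for the pandas example discussed next. The field names follow a `score_<eval>` / `explanation_<eval>` convention but are assumptions, not a verified schema.

```python
# Illustrative result only -- field names are assumed, not a verified schema.
sample_response = [
    {
        "question": "How do I install Pandas in Python?",
        "context": "You can install Pandas by running pip install pandas.",
        "response": "Run import pandas as pd to get started.",
        "score_code_hallucination": 1.0,
        "explanation_code_hallucination": (
            "The response contains `import pandas as pd`, which does not "
            "appear in the retrieved context."
        ),
    }
]
```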
A higher code hallucination score indicates that the generated response contains code that is not grounded in the context.

In the example above, the context mentioned `pip install pandas` as the command required to install the Pandas package in Python, while the generated response mentions `import pandas as pd`, which does not appear in the context. This results in a high code hallucination score.
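For intuition about what "grounded" means here, a crude string-matching sketch is shown below. It is purely illustrative: the actual eval uses an LLM (GPT-3.5 Turbo by default) to judge grounding, not substring matching, and `naive_code_hallucination_score` is a hypothetical helper.

```python
def naive_code_hallucination_score(context: str, code_snippets: list[str]) -> float:
    """Fraction of response code snippets that do NOT appear in the context.

    Purely illustrative -- the real eval uses an LLM judge, not substring
    matching. 1.0 means every snippet is ungrounded; 0.0 means all grounded.
    """
    if not code_snippets:
        return 0.0  # no code in the response, so nothing can be hallucinated
    ungrounded = [s for s in code_snippets if s not in context]
    return len(ungrounded) / len(code_snippets)


# The pandas example: `import pandas as pd` never appears in the context,
# so the hallucination score is high.
print(naive_code_hallucination_score(
    context="You can install Pandas by running pip install pandas.",
    code_snippets=["import pandas as pd"],
))  # -> 1.0
```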