Creating Custom Evals
Custom Prompts
Allows you to create your own set of evaluations
Each LLM application has its own unique needs, so no one-size-fits-all evaluation tool is possible. A sales assistant bot needs to be evaluated differently from a calendar automation bot. Custom prompts let you grade your model the way you want.
Parameters:
- prompt: Evaluation prompt used to generate the grade
- choices: List of choices/grades to choose from
- choices_scores: Scores associated with each choice
- eval_type: One of ["classify", "cot_classify"]; determines whether chain-of-thought prompting is applied
- prompt_var_to_column_mapping (optional): Mapping between variables defined in the prompt and column names in the data
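To make the parameters concrete, here is a minimal sketch of how they fit together. The dictionary layout, the `render_prompt` and `score_choice` helpers, and the column names are all illustrative assumptions, not the library's actual API:

```python
# Illustrative custom-eval configuration (field names mirror the
# parameters above; the dict layout itself is an assumption).
custom_eval = {
    "prompt": "Given the context: {context}\n"
              "Does '{response}' fully answer '{question}'? Pick one choice.",
    "choices": ["Complete", "Partial", "Incomplete"],
    "choices_scores": [1.0, 0.5, 0.0],
    "eval_type": "cot_classify",  # "classify" would skip chain-of-thought
    # Map prompt variables to the column names used in your data.
    "prompt_var_to_column_mapping": {
        "question": "user_query",
        "context": "retrieved_context",
        "response": "model_response",
    },
}

def render_prompt(eval_cfg: dict, row: dict) -> str:
    """Fill the prompt's variables from a data row via the mapping."""
    mapping = eval_cfg.get("prompt_var_to_column_mapping") or {}
    values = {var: row[col] for var, col in mapping.items()}
    return eval_cfg["prompt"].format(**values)

def score_choice(eval_cfg: dict, graded_choice: str) -> float:
    """Look up the numeric score paired with the grade the LLM picked."""
    idx = eval_cfg["choices"].index(graded_choice)
    return eval_cfg["choices_scores"][idx]

row = {"user_query": "Q", "retrieved_context": "C", "model_response": "R"}
print(render_prompt(custom_eval, row))
print(score_choice(custom_eval, "Partial"))  # 0.5
```

The grading model sees the rendered prompt, picks one entry from `choices`, and `choices_scores` converts that pick into a numeric score for your metrics.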
How to use it?
By default, we use GPT-3.5 Turbo. If you want to use another model, check out this tutorial.
Sample Response:
Here, we have evaluated the data according to the prompt above. Though the response seems correct, it does not completely answer the question according to the information provided in the context.