Response Matching compares the LLM-generated response against the gold (ideal) response and scores how closely they match, using the selected matching method.

Columns required:

  • question: The question asked by the user
  • response: The response given by the model
  • ground_truth: The ideal response

Optional Parameters:

  • method: The method used to check for response matching
    • llm (default): Uses an LLM to check whether the response matches the ground truth
    • exact: Checks whether the response is exactly the same as the ground_truth
    • rouge: Uses the ROUGE score to check how closely the response matches the ground truth

How to use it?

from uptrain import EvalLLM, ResponseMatching

OPENAI_API_KEY = "sk-********************"  # Insert your OpenAI key here

data = [{
    "question": "Who were the two finalists of the 2023 ICC Cricket World Cup?",
    "ground_truth": "The finalists of the 2023 ICC Cricket World Cup were India and Australia.",
    "response": "Australia was a finalist in the 2023 ICC Cricket World Cup."
}]

eval_llm = EvalLLM(openai_api_key=OPENAI_API_KEY)

res = eval_llm.evaluate(
    data = data,
    checks = [ResponseMatching(method = 'llm')]    # method: llm/exact/rouge
)

By default, we are using GPT-3.5 Turbo for evaluations. If you want to use a different model, check out this tutorial.
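
The returned res is a list of dictionaries, one per input row; you can print it to inspect the scores, for example:

import json

# Inspect the scores returned for each row
print(json.dumps(res, indent=2))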

Sample Response:

[
   {
      "response_match_precision": 1.0,
      "response_match_recall": 0.5,
      "score_response_match": 0.57,
      "response_match_method": "llm"
   }
]
A higher response matching score indicates a closer match between the generated response and the ground truth.

In this example, the response correctly states that Australia was a finalist (precision of 1.0) but omits India, so it covers only half of the ground truth about the two finalists of the 2023 ICC Cricket World Cup (recall of 0.5).

Hence, it receives a low response matching score.
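
These scores can also be used to gate responses in a downstream pipeline; a minimal sketch, assuming a hypothetical cutoff of 0.8:

THRESHOLD = 0.8  # hypothetical cutoff; tune for your use case

for row in res:
    # Flag any response whose match score falls below the cutoff
    if row["score_response_match"] < THRESHOLD:
        print(f"Low match score ({row['score_response_match']}): review this response")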