This tutorial will walk you through the process of getting started with UpTrain using our open source Evaluator. You can use this Evaluator to log and evaluate your data, and get results in a few simple steps.

Get your OpenAI API key

You can get your OpenAI API key here.

Install UpTrain

Run the following commands in your terminal to install UpTrain:

pip install uptrain

Create an EvalLLM Evaluator

Before we can start using UpTrain, we need to create an EvalLLM Evaluator. You can do this by passing your API key to the EvalLLM constructor.

from uptrain import EvalLLM, Evals, CritiqueTone
import json

eval_llm = EvalLLM(openai_api_key=OPENAI_API_KEY)

Create your data

You can define your data as a simple dictionary with the following keys:

  • question: The question you want to ask
  • context: The context relevant to the question
  • response: The response to the question
data = [{
    "question": [
        "Which is the most popular global sport?",
        "Who created the Python language?",
    ]
    "context": [
        "The popularity of sports can be measured in various ways, including TV viewership, social media presence, number of participants, and economic impact. Football is undoubtedly the world's most popular sport with major events like the FIFA World Cup and sports personalities like Ronaldo and Messi, drawing a followership of more than 4 billion people.",
        "Python, created by Guido van Rossum in the late 1980s, is a high-level general-purpose programming language. Its design philosophy emphasizes code readability, and its language constructs aim to help programmers write clear, logical code for both small and large-scale software projects.",
    ]
    "response":[
        "Football is the most popular sport with around 4 billion followers worldwide",
        "Python language was created by Guido van Rossum.",
    ]
}]

Evaluate your data

Now that we have our data, we can log it and evaluate it using UpTrain. We use the evaluate method to do this. This method takes the following arguments:

  • data: The data you want to log and evaluate
  • checks: The evaluations you want to perform on your data

You can find the list of all available evaluations here.

results = eval_llm.evaluate(
    data=data,
    checks=[Evals.CONTEXT_RELEVANCE, Evals.FACTUAL_ACCURACY, Evals.RESPONSE_RELEVANCE, CritiqueTone(persona="teacher")]
)

Get your results

print(json.dumps(results, indent=3))