Overview: In this example, we will use UpTrain’s framework to create customized evaluations for your models. We will create a Check that runs two Operators on a given column of data and generates a table and a histogram as output.

When to use: UpTrain’s framework is useful when you want to run a specific set of checks on your data and generate a specific set of plots. You can build your own Checks and Operators from scratch, or start from UpTrain’s pre-built ones.

Install UpTrain with all dependencies

pip install uptrain
uptrain-add --feature full

Choose and create Operators

An Operator runs a function on a given column of data and returns the result in a new column.
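Conceptually, an Operator behaves like a column-to-column transform over a DataFrame (UpTrain handles data as polars DataFrames). The sketch below is only an illustration of that contract, not UpTrain’s actual Operator interface:

import polars as pl

# Illustration only -- not UpTrain's actual Operator interface.
# An Operator reads one column and writes its result to a new column.
def toy_text_length_op(data: pl.DataFrame, col_in: str, col_out: str) -> pl.DataFrame:
    lengths = pl.Series(col_out, [len(text) for text in data[col_in]])
    return data.with_columns(lengths)

df = pl.DataFrame({"model_response": ["Hello world.", "Short."]})
df = toy_text_length_op(df, "model_response", "response_length")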

In our case, we want to use the GrammarScore and TextLength Operators, so we create them by specifying the input and output column names as follows:

from uptrain.operators import GrammarScore, TextLength

grammar_score = GrammarScore(
    col_in_text="model_response",
    col_out="grammar_correctness_score"
)

text_length = TextLength(
    col_in_text="model_response",
    col_out="response_length"
)

Create the Checks

A Check runs the given list of Operators in the order in which they are specified.

A Check in UpTrain takes three arguments:

  1. name - The name of the check
  2. operators - The operators that are to be run when the check is executed
  3. plots - The plots that are to be generated when the check is executed

from uptrain.framework import Check
from uptrain.operators import Table, Histogram

response_analysis = Check(
    name="response_analysis",
    operators=[grammar_score, text_length],
    plots=[
        Table(),
        # The Histogram arguments shown here are illustrative; consult the
        # chart operators' documentation for the exact parameters they accept.
        Histogram(x="response_length", color="grammar_correctness_score")
    ]
)

Here, we show a table with all the columns and a histogram that depicts the relationship between the length of the responses and their grammatical correctness. You can create more than one Check, but for the sake of simplicity, we will just create one.
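If you did want a second Check, it would follow the same pattern. Here is a sketch that reuses the text_length operator defined above and only plots a table (we won’t add it to our CheckSet):

# Sketch of a second, independent Check reusing an existing operator.
length_check = Check(
    name="length_analysis",
    operators=[text_length],
    plots=[Table()]
)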

Using pre-built Checks

UpTrain comes with many pre-built Checks that you can use. These have been pre-configured with operators and plots. You can use them as follows:

from uptrain.framework.builtins import CheckResponseCompleteness

response_completeness_check = CheckResponseCompleteness()

To learn more about the pre-built Checks, check out their documentation.

Configure the Settings

We need to tell UpTrain where we want our results to be stored (logs_folder). Since GrammarScore uses OpenAI’s API, we also need to provide an OpenAI API key.

To learn more about how to get an OpenAI API key and set it as an environment variable, check out this guide. It is recommended to store your API key as an environment variable for security reasons.
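If you want to fail fast when the key is missing, a small guard like this helps (plain Python, nothing UpTrain-specific):

import os

# Stop early with a clear message if the OpenAI key is not set.
if "OPENAI_API_KEY" not in os.environ:
    raise RuntimeError("Please set the OPENAI_API_KEY environment variable.")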

We configure the settings for our checks as follows:

import os
from uptrain.framework import Settings

LOGS_DIR = "/tmp/uptrain_logs"
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]

settings = Settings(
    logs_folder=LOGS_DIR,
    openai_api_key=OPENAI_API_KEY
)

Create a CheckSet

You can create as many Checks as you want and add them to a CheckSet. Running the CheckSet runs all of its Checks, and you can view each one’s results individually. Here, we add the response_analysis Check we created above along with the pre-built response_completeness_check.

Here, we use UpTrain’s CsvReader to read the content from our CSV file containing the responses.

from uptrain.framework import CheckSet
from uptrain.operators import CsvReader

check_set = CheckSet(
    checks=[response_analysis, response_completeness_check],
    source=CsvReader(fpath="responses.csv")
)
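For reference, here is one way to produce a minimal responses.csv. The model_response column is the one our operators read; the question and response column names are assumptions about what the pre-built completeness check expects, so match them to your data and to that check’s documentation:

import csv

# Illustrative data only. "model_response" is read by our custom operators;
# "question"/"response" are assumed column names for the completeness check.
rows = [
    {
        "question": "What does UpTrain do?",
        "response": "UpTrain evaluates LLM applications.",
        "model_response": "UpTrain evaluates LLM applications.",
    },
]
with open("responses.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["question", "response", "model_response"])
    writer.writeheader()
    writer.writerows(rows)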

Set up and run the CheckSet

Now, we will set up the CheckSet with the settings we created above and run it.

check_set.setup(settings)
check_set.run()

Visualize the results

Finally, we use UpTrain’s StreamlitRunner to visualize the results.

from uptrain.dashboard import StreamlitRunner

st_runner = StreamlitRunner(LOGS_DIR)
st_runner.start()

This will open a new tab in your browser with the UpTrain dashboard, where you can view the results of the Checks we created above.

Here is a screenshot of the dashboard:

[Screenshot: dashboard.png]