Ollama is a great solution for running large language models (LLMs) on your local system.

How will this help?

Using Ollama, you can run models like Llama and Gemma locally on your system.

In this tutorial, we will walk you through running evaluations with UpTrain using your local models hosted on Ollama.

Prerequisites

  1. Install Ollama on your system; you can download it from here

  2. Pull the model using the command:

    ollama pull <model_name>
    

    For the list of models supported by Ollama, you can refer here

  3. You can enter http://localhost:11434/ in your web browser to confirm Ollama is running
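If you prefer to confirm this from code instead, here is a minimal sketch that pings the local Ollama server; it assumes Ollama is listening on its default port 11434 (the root endpoint replies with a short status message when the server is up).

import urllib.request

# Ping the local Ollama server on its default port. When Ollama is up,
# the root endpoint replies with a short status message.
try:
    with urllib.request.urlopen('http://localhost:11434/', timeout=5) as resp:
        print(resp.read().decode())  # e.g. 'Ollama is running'
except OSError as err:
    print(f'Ollama does not appear to be running: {err}')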

How to integrate?

First, let’s import the necessary packages

# %pip install uptrain
from uptrain import EvalLLM, Evals, Settings
import json

Create your data

You can define your data as a simple dictionary with the following keys:

  • question: The question you want to ask
  • context: The context relevant to the question
  • response: The response to the question
data = [
    {
        'question': 'What is the capital of France?',
        'context': 'France, known for its exquisite pastries and fashion, has a capital city called Paris. It\'s a place where people speak French and enjoy baguettes. I once heard that the Eiffel Tower was built by aliens, but don\'t quote me on that.',
        'response': 'The capital of France is Paris.'
    }
]

Define the model

We will be using Stable LM 2 1.6B for this example. You can refer to its documentation on Ollama.

Remember to add “ollama/” at the beginning of the model name to let UpTrain know that you are using an Ollama model.

You can check whether you have already downloaded Stable LM 2 1.6B by running !ollama list. If you haven't, you can download it with !ollama pull stablelm2.

settings = Settings(model='ollama/stablelm2')
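If you would rather check for the model programmatically, the sketch below queries Ollama's local /api/tags endpoint, which lists the models you have pulled. The models/name fields reflect Ollama's REST API as we understand it; verify them against your Ollama version.

import json
import urllib.request

# List locally pulled models via Ollama's REST API (/api/tags).
# Assumed response shape: {'models': [{'name': 'stablelm2:latest', ...}, ...]}
with urllib.request.urlopen('http://localhost:11434/api/tags') as resp:
    local_models = json.load(resp)['models']

names = [m['name'] for m in local_models]
print(names)
assert any(n.startswith('stablelm2') for n in names), 'Run: ollama pull stablelm2'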

Create an EvalLLM Evaluator

Before we can start using UpTrain, we need to create an EvalLLM Evaluator.

eval_llm = EvalLLM(settings)

We have used the following 3 metrics from UpTrain’s library:

  1. Context Relevance: Evaluates how relevant the retrieved context is to the question specified.

  2. Response Conciseness: Evaluates how concise the generated response is, i.e., whether it includes any additional irrelevant information for the question asked.

  3. Response Relevance: Evaluates how relevant the generated response was to the question specified.

You can look at the complete list of UpTrain’s supported metrics here

results = eval_llm.evaluate(
    project_name = 'Ollama-Demo',
    data=data,
    checks=[Evals.CONTEXT_RELEVANCE, Evals.RESPONSE_CONCISENESS, Evals.RESPONSE_RELEVANCE]
)

View your results

We can pretty-print the results using the json package we imported earlier:

print(json.dumps(results, indent=3))

Sample Response:

[  
  {  'context': 'France, known for its exquisite pastries and fashion, has '
                 "a capital city called Paris. It's a place where people speak "
                 'French and enjoy baguettes. I once heard that the Eiffel '
                 "Tower was built by aliens, but don't quote me on that.",
      'explanation_context_relevance': '{\n'
                                       '     "Reasoning": "The extracted '
                                       'context can answer the given query '
                                       'because it provides information about '
                                       "France's capital city, Paris. This "
                                       'information is relevant to answering '
                                       'the question about the capital of '
                                       'France.",\n'
                                       '     "Choice": "A"\n'
                                       '}',
      'explanation_response_conciseness': '{\n'
                                          '     "Reasoning": "The given '
                                          'response provides information about '
                                          'the capital of France, but it lacks '
                                          'the detail that could be expected '
                                          'in this context.",\n'
                                          '     "Choice": "B"\n'
                                          '}',
      'explanation_response_relevance': 'Response Precision: 0.5{\n'
                                        '     "Reasoning": "The given response '
                                        'provides information about the '
                                        'capital of France, but it lacks the '
                                        'detail that could be expected in this '
                                        'context.",\n'
                                        '     "Choice": "B"\n'
                                        '}\n'
                                        'Response Recall: 1.0{\n'
                                        '     "Reasoning": "The given response '
                                        'adequately answers the question about '
                                        'the capital of France because it '
                                        'provides the correct answer, i.e., '
                                        'Paris.",\n'
                                        '     "Choice": "A"\n'
                                        '}',
      'question': 'What is the capital of France?',
      'response': 'The capital of France is Paris.',
      'score_context_relevance': 1.0,
      'score_response_conciseness': 0.5,
      'score_response_relevance': 0.6666666666666666
  }
]
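Each entry in results is a plain dictionary, so you can pull out just the metric scores for logging or thresholding. Below is a minimal sketch that assumes the score_* keys shown in the sample above:

# Collect only the metric scores from each evaluated row.
# Assumes the 'score_*' keys seen in the sample output above.
for row in results:
    scores = {k: v for k, v in row.items() if k.startswith('score_')}
    print(row['question'], scores)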