Anyscale
Anyscale Endpoints provides fully managed API endpoints for open-source LLMs, so developers can integrate these models into their applications without managing the underlying infrastructure.
How will this help?
Anyscale offers API endpoints for models like Llama-2, Mistral-7B, CodeLlama, and more. You can use these endpoints to evaluate the performance of these models using UpTrain.
Before we start, you will need an Anyscale API key. You can get it here.
How to integrate?
First, let’s import the necessary packages and define the Anyscale API key.
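A minimal sketch of this setup step follows. It assumes UpTrain is installed (`pip install uptrain`) and that the key is stored in an environment variable named `ANYSCALE_API_KEY`; both the variable name and the `load_anyscale_key` helper are illustrative choices, not part of UpTrain.

```python
import os

try:
    # UpTrain's evaluation entry points (requires `pip install uptrain`).
    from uptrain import EvalLLM, Evals, Settings
except ImportError:
    # Keeps this snippet importable on machines without UpTrain installed.
    EvalLLM = Evals = Settings = None


def load_anyscale_key(env_var: str = "ANYSCALE_API_KEY") -> str:
    """Read the Anyscale API key from the environment (hypothetical helper)."""
    key = os.environ.get(env_var, "")
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


# Usage (uncomment once the environment variable is set):
# ANYSCALE_API_KEY = load_anyscale_key()
```

Reading the key from the environment keeps it out of source control; assigning it to a variable directly also works for a quick local test.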
The model name should start with anyscale/ for UpTrain to recognize that you are using models hosted on Anyscale. For example, if you are using mistralai/Mistral-7B-Instruct-v0.1 via Anyscale, the model name should be anyscale/mistralai/Mistral-7B-Instruct-v0.1.
We have used Mistral-7B-Instruct-v0.1 for this example. You can find a full list of available models here.
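The naming rule above can be sketched as a small helper that adds the prefix only when it is missing. The `to_uptrain_model_name` and `build_settings` functions are hypothetical names introduced for illustration; the `Settings(model=..., anyscale_api_key=...)` call reflects how UpTrain is assumed to accept the key, so check it against the UpTrain version you have installed.

```python
ANYSCALE_PREFIX = "anyscale/"


def to_uptrain_model_name(model_id: str) -> str:
    """Prefix an Anyscale model ID so UpTrain routes it to Anyscale."""
    if model_id.startswith(ANYSCALE_PREFIX):
        return model_id  # already prefixed, leave unchanged
    return ANYSCALE_PREFIX + model_id


def build_settings(model_id: str, api_key: str):
    """Create UpTrain settings for an Anyscale-hosted model (assumed API)."""
    # Imported lazily so the naming helper works without UpTrain installed.
    from uptrain import Settings

    return Settings(
        model=to_uptrain_model_name(model_id),
        anyscale_api_key=api_key,
    )


print(to_uptrain_model_name("mistralai/Mistral-7B-Instruct-v0.1"))
```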
Let’s define the dataset on which we want to run the evaluations.
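As a sketch, a dataset for these two checks can be a list of dictionaries with `question`, `context`, and `response` keys. The single row below is an illustrative example written for this page, not real evaluation data, and `missing_keys` is a hypothetical sanity check rather than an UpTrain function.

```python
# Illustrative one-row dataset; replace with your own questions,
# retrieved contexts, and model responses.
data = [
    {
        "question": "Which is the most popular global sport?",
        "context": (
            "Football (soccer) is widely regarded as the most popular "
            "sport in the world, with fans on every continent."
        ),
        "response": "Football is the most popular sport globally.",
    }
]

REQUIRED_KEYS = {"question", "context", "response"}


def missing_keys(rows):
    """Return (row index, missing key names) for every incomplete row."""
    problems = []
    for i, row in enumerate(rows):
        missing = sorted(REQUIRED_KEYS - row.keys())
        if missing:
            problems.append((i, missing))
    return problems


# Fail early if any row lacks a field the checks depend on.
assert not missing_keys(data)
```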
Now, let’s use UpTrain to evaluate Context Relevance and Factual Accuracy. You can find the complete list of metrics supported by UpTrain here.
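A minimal sketch of the evaluation call is shown here, assuming UpTrain's `EvalLLM`, `Evals`, and `Settings` classes and an `anyscale_api_key` settings field. The `run_anyscale_evals` wrapper is a hypothetical name used to keep the example self-contained, and running it requires a valid Anyscale key and network access.

```python
import json

DEFAULT_MODEL = "anyscale/mistralai/Mistral-7B-Instruct-v0.1"


def run_anyscale_evals(data, api_key, model=DEFAULT_MODEL):
    """Score each row for Context Relevance and Factual Accuracy."""
    # Imported lazily so the module loads even without UpTrain installed.
    from uptrain import EvalLLM, Evals, Settings

    settings = Settings(model=model, anyscale_api_key=api_key)
    eval_llm = EvalLLM(settings)
    return eval_llm.evaluate(
        data=data,
        checks=[Evals.CONTEXT_RELEVANCE, Evals.FACTUAL_ACCURACY],
    )


# Usage (needs a real key; `data` is a list of question/context/response dicts):
# results = run_anyscale_evals(data, api_key="<your Anyscale API key>")
# print(json.dumps(results, indent=3))
```

Each check is evaluated by the Anyscale-hosted model named in `model`, so scores can differ between models.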
Let’s look at the output of the above code:
According to these evaluations:
- Context Relevance: Since the context has information on the most popular sport globally, UpTrain has rated the context to be relevant to the question.
- Factual Accuracy: Since the facts mentioned in the response are grounded in the context, UpTrain has rated the response as factually accurate.