Qdrant
Qdrant is a vector similarity search engine and vector database. While working on RAG based applications, you can use QDrant to retrieve information from your context documents.
How will this help?
Vector databases store data as high-dimensional vectors, enabling fast and efficient similarity search and retrieval of data based on their vector representations. You can use UpTrain along with vector databases such as QDrant for evaluations such as the relevance of the retrieved context ensuring a good retrieval quality.
How to integrate?
First, let’s import the necessary packages
from uptrain import EvalLLM, Evals, APIClient
from qdrant_client import models, QdrantClient
from sentence_transformers import SentenceTransformer
import json
Let’s define a dataset to create embeddings
texts = [
{
"name": "A Gift From The Stars",
"description": "This is the true story of an abduction and a rescue by benevolent extraterrestrials, various direct contacts Elena Danaan had throughout the years with UFOs and visitors from other worlds.",
"author": "Elena Dannan",
"year": 2020,
},
{
"name": "The Royal Abduction",
"description": "Shreya Singh, a princess from Rajasthan, has been abducted! A woman of beauty and substance, she is living a lavish life. But while there are abundant riches in her palace, there are also dark secrets about her family buried in the past.",
"author": "Vikram Singh",
"year": 2023,
}
]
Creating a memory instance using QDrant:
# Create sentence transformer model to generate embeddings
encoder = SentenceTransformer("all-MiniLM-L6-v2")
# Create in-memory Qdrant instance
qdrant = QdrantClient(":memory:")
# Create collection to store books
qdrant.recreate_collection(
collection_name="my_books",
vectors_config=models.VectorParams(
size=encoder.get_sentence_embedding_dimension(),
distance=models.Distance.COSINE,
)
)
Generate embedding and vectorise from the defined text
# Generate embeddings for the texts
embeddings = encoder.encode(texts)
# Upload embeddings to the specified collection
qdrant.upload_points(
collection_name="my_books",
points=[
models.Record(id=idx, vector=embedding.tolist())
for idx, embedding in enumerate(embeddings)
],
)
# Vectorize descriptions and upload to qdrant
qdrant.upload_points(
collection_name="my_books",
points=[
models.Record(
id=idx, vector=encoder.encode(doc["description"]).tolist(), payload=doc
)
for idx, doc in enumerate(texts)
],
)
Define question and fetch relevant information from Qdrant
question = "What are some books that talk about alien abductions?"
hits = qdrant.search(
collection_name="my_books", query_vector=encoder.encode(question).tolist()
)
Create a list with the fetched information to perform evaluations
data = []
for hit in hits:
data.append(
{
"question": question,
"context": "Book Description for "
+ hit.payload.get("name", "")
+ " : "
+ hit.payload.get("description", ""),
}
)
Evaluate the retrieval quality using UpTrain
OPENAI_API_KEY = "sk-*****************" # Insert your OpenAI key here
eval_llm = EvalLLM(openai_api_key=OPENAI_API_KEY)
res = eval_llm.evaluate(
data=data,
checks=[Evals.CONTEXT_RELEVANCE]
)
print(json.dumps(res, indent=3))
Let’s look at the retrieval quality of the context documents
[
{
"question": "What are some books that talk about alien abductions?",
"context": "Book Description for A Gift From The Stars : This is the true story of an abduction and a rescue by benevolent extraterrestrials, various direct contacts Elena Danaan had throughout the years with UFOs and visitors from other worlds.",
"score_context_relevance": 1.0,
"explanation_context_relevance": "The extracted context provides a specific book description that directly addresses the question about books on alien abductions. Hence, the selected choice is (A)"
},
{
"question": "What are some books that talk about alien abductions?",
"context": "Book Description for The Royal Abduction : Shreya Singh, a princess from Rajasthan, has been abducted! A woman of beauty and substance, she is living a lavish life. But while there are abundant riches in her palace, there are also dark secrets about her family buried in the past.",
"score_context_relevance": 0.0,
"explanation_context_relevance": "The extracted context does not contain any information about books that talk about alien abductions. It discusses a book called 'The Royal Abduction' which is about a princess being abducted, but it does not relate to alien abductions. Therefore, the selected choice is (C)"
}
]
According to these evaluations:
- Example 1: The context contains information about a book “A Gift From The Stars”, which is related to the question asked i.e. books about alien abductions.
- Example 2: The context contains information about a book “The Royal Abduction”, which is related to a abduction but there is no reference specific to alien abduction, making the context irrelevant.