Analyzing RAG Failure Cases
Analyze the causes of failure in a RAG pipeline
By the end of this tutorial, you will be able to:
- Understand the failure cases in a RAG pipeline
- Perform Root Cause Analysis on your RAG pipeline
- Get actionable insights to improve your RAG pipeline
Let’s start by understanding what RAG is and how this tutorial will help you.
What is RAG?
RAG (Retrieval-Augmented Generation) is the process of using external knowledge in your LLM-based application.
For example, imagine you have a knowledge document outlining various scenarios for handling customer queries (the question). With an LLM-powered bot at your disposal, the goal is to give users accurate responses based on the information in that document.
You can give the LLM relevant chunks of this information (the retrieved context) so that it can better answer the user’s query. The LLM can then use a portion of this retrieved context to generate a response.
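The retrieve-then-generate flow described above can be sketched in a few lines. This is a toy illustration only: the word-overlap retriever, the helper names (`tokenize`, `retrieve`, `build_prompt`) and the two sample chunks are assumptions made for this example, and a real pipeline would typically use embeddings and an actual LLM call.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks that share the most words with the question."""
    q_tokens = tokenize(question)
    ranked = sorted(chunks, key=lambda c: len(q_tokens & tokenize(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, retrieved: list[str]) -> str:
    """Combine the retrieved context and the question into one prompt string."""
    context = "\n".join(retrieved)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Made-up knowledge chunks, for illustration only.
chunks = [
    "You can get a refund within 7 days of a cancelled order.",
    "New users receive a 10% discount offer on their first order.",
]
question = "Can I get a refund?"
retrieved_context = retrieve(question, chunks)
prompt = build_prompt(question, retrieved_context)  # this would be sent to the LLM
print(prompt)
```

Each stage here (retrieval, prompt construction, generation) is a place where the pipeline can fail, which is why the analysis later in this tutorial looks at them separately.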
How will this tutorial help?
Let’s say you already have a RAG pipeline, but you are not satisfied with the quality of the responses you are getting.
Figuring out the root cause of a failure can be difficult because RAG involves multiple steps, and you would have to inspect each one to find what went wrong.
This tutorial walks you through an easy way to identify the failure cases in your RAG pipeline. Let’s look at some major failure cases first:
Failure Case | Explanation | Example |
---|---|---|
Poor Context Utilization | The citations taken from the context are irrelevant to the user’s query | For the question “Can I get a refund?”, the LLM cites information on offers rather than refunds, although the context contains information on both |
Poor Retrieval | The context given to the LLM does not contain information relevant to the question | The user asks “Do you deliver to Bangalore?” but the context does not have any information about deliveries in Bangalore |
Hallucinations | The generated response is not supported by the information present in the context | The LLM responds “We deliver to Bangalore” when the context only says “We are going to start deliveries in Bangalore soon” |
Poor Citation | The generated response cannot be verified with the citation | The LLM cites “We deliver to Delhi” from the context for a response saying “We deliver to Bangalore” |
Incomplete Question | The user’s question is incomplete or does not make sense on its own | The user asks something like: “When delivery?”, “What location?” |
How does it work?
Let’s jump to the code
Install UpTrain
%pip install uptrain
Let's define a sample dataset to run evaluations
UpTrain uses these 4 parameters to perform RCA on your RAG pipeline:
Parameter | Explanation |
---|---|
question | The query asked by your user. |
context | The context that you pass to the LLM (retrieved context). |
response | The response generated by the LLM. |
cited_context | The portion of the retrieved context that the LLM cites to generate the response. |
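Before running an evaluation, it can help to check that every row carries these four fields. The helper below is a hypothetical convenience function (`find_invalid_rows` is not part of UpTrain), and the sample rows are shortened stand-ins for real data.

```python
REQUIRED_KEYS = ("question", "context", "response", "cited_context")


def find_invalid_rows(rows: list[dict]) -> list[tuple[int, list[str]]]:
    """Return (row index, problems) for rows that miss a key or hold a non-string value."""
    problems = []
    for i, row in enumerate(rows):
        issues = [f"missing '{k}'" for k in REQUIRED_KEYS if k not in row]
        issues += [
            f"'{k}' is not a string"
            for k in REQUIRED_KEYS
            if k in row and not isinstance(row[k], str)
        ]
        if issues:
            problems.append((i, issues))
    return problems


# Shortened stand-in rows; the second one lacks "cited_context".
sample_rows = [
    {
        "question": "Can FedL send a delivery to Bangalore?",
        "context": "FedL was established in 2020.",
        "response": "Yes, FedL can send a delivery to Bangalore.",
        "cited_context": "",  # an empty citation is still a valid string
    },
    {
        "question": "Delivery?",
        "context": "FedL was established in 2020.",
        "response": "Your question seems incomplete.",
    },
]
invalid = find_invalid_rows(sample_rows)
print(invalid)
```

An empty `cited_context` string is treated as valid here, since a row may legitimately have no citation.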
data = [
{
"question": "Can FedL send a delivery to Bangalore?",
"context": "FedL was established in 2020. Using FedL you can send deliveries to over 1000+ cities in India including major cities like Bangalore, Mumbai and Delhi. Recently we crossed a milestone by completing 1 million deliveries. Using FedL you can deliver any goods under 10kg(not more than that) to anyone whether it be your friends or family. P.S.: We can't deliver electronic devices. You can also use FedL to deliver a car.",
"response": "Yes, FedL can send a delivery to Bangalore.",
"cited_context": "FedL was established in 2020.",
},
{
"question": "Can I send a delivery to my friend living in the USA?",
"context": "FedL was established in 2020. Using FedL you can send deliveries to over 1000+ cities in India including major cities like Bangalore, Mumbai and Delhi. Recently we crossed a milestone by completing 1 million deliveries. Using FedL you can deliver any goods under 10kg(not more than that) to anyone whether it be your friends or family. P.S.: We can't deliver electronic devices. You can also use FedL to deliver a car.",
"response": "No, FedL does not operate outside of India.",
"cited_context": "Using FedL you can send deliveries to over 1000+ cities in India",
},
{
"question": "Can I send a car through FedL?",
"context": "FedL was established in 2020. Using FedL you can send deliveries to over 1000+ cities in India including major cities like Bangalore, Mumbai and Delhi. Recently we crossed a milestone by completing 1 million deliveries. Using FedL you can deliver any goods under 10kg(not more than that) to anyone whether it be your friends or family. P.S.: We can't deliver electronic devices. You can also use FedL to deliver a car.",
"response": "No, you can't use FedL to send a car.",
"cited_context": "Using FedL you can deliver any goods under 10kg(not more than that) to anyone whether it be your friends or family.",
},
{
"question": "Can I use FedL to send my car which weighs 1000kg?",
"context": "FedL was established in 2020. Using FedL you can send deliveries to over 1000+ cities in India including major cities like Bangalore, Mumbai and Delhi. Recently we crossed a milestone by completing 1 million deliveries. Using FedL you can deliver any goods under 10kg(not more than that) to anyone whether it be your friends or family. P.S.: We can't deliver electronic devices. You can also use FedL to deliver a car.",
"response": "Yes, you can use FedL to deliver your 1000kg car .",
"cited_context": "Using FedL you can deliver any goods under 10kg(not more than that) to anyone whether it be your friends or family.",
},
{
"question": "Delivery?",
"context": "FedL was established in 2020. Using FedL you can send deliveries to over 1000+ cities in India including major cities like Bangalore, Mumbai and Delhi. Recently we crossed a milestone by completing 1 million deliveries. Using FedL you can deliver any goods under 10kg(not more than that) to anyone whether it be your friends or family. P.S.: We can't deliver electronic devices. You can also use FedL to deliver a car.",
"response": "Your question seems incomplete, can you provide some more context about your query.",
"cited_context": "",
}
]
Perform failure analysis using UpTrain
Here we will be using an instance of EvalLLM to perform RCA on your RAG pipeline.
You need an OpenAI key to generate evaluations using UpTrain.
from uptrain import RcaTemplate, EvalLLM
OPENAI_API_KEY = "sk-***********" # Insert your OpenAI API key here
eval_client = EvalLLM(openai_api_key=OPENAI_API_KEY)
res = eval_client.perform_root_cause_analysis(
data = data,
rca_template = RcaTemplate.RAG_WITH_CITATION
)
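If you would rather not paste the key into your code, one option is to read it from an environment variable. This is a minimal sketch: the helper name `load_openai_key` and the choice of the `OPENAI_API_KEY` variable are assumptions for this example, not something UpTrain requires.

```python
import os


def load_openai_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Read the API key from an environment variable and fail early if it is unset."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running the evaluation."
        )
    return key


# Usage with the client created above (left as a comment so this block has no dependencies):
# eval_client = EvalLLM(openai_api_key=load_openai_key())
```

Failing early with a clear message avoids starting an evaluation run that would only fail once the first request is made.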
Let's look at the results
Key information present in your results:
Parameter | Explanation |
---|---|
error_mode | The specific failure reason identified in your data |
error_resolution_suggestion | Actionable insights to improve your RAG pipeline |
In addition, the results provide scores for different aspects of your data, along with the reasoning behind each score.
You can also look at our docs to learn more about these evaluations.
import json
print(json.dumps(res[0], indent = 3))
Here’s a sample response:
{
"question": "Can FedL send a delivery to Bangalore?",
"context": "FedL was established in 2020. Using FedL you can send deliveries to over 1000+ cities in India including major cities like Bangalore, Mumbai and Delhi. Recently we crossed a milestone by completing 1 million deliveries. Using FedL you can deliver any goods under 10kg(not more than that) to anyone whether it be your friends or family. P.S.: We can't deliver electronic devices. You can also use FedL to deliver a car.",
"response": "Yes, FedL can send a delivery to Bangalore.",
"cited_context": "FedL was established in 2020.",
"error_mode": "Poor Context Utilization",
"error_resolution_suggestion": "Add intermediary steps so as the LLM can better understand context and generate a complete response",
"score_question_completeness": 1,
"score_valid_response": 1.0,
"explanation_valid_response": "The response \"Yes, FedL can send a delivery to Bangalore\" contains information that directly addresses the question. It confirms that FedL can indeed send a delivery to Bangalore. Therefore, the selected choice is A.\n\n[Choice]: A",
"score_context_relevance": 1.0,
"explanation_context_relevance": "['The extracted context can answer the given user query completely, as it explicitly states that using FedL, deliveries can be sent to over 1000+ cities in India, including major cities like Bangalore, Mumbai, and Delhi. Therefore, it confirms that FedL can send a delivery to Bangalore, fulfilling the user query completely.']",
"score_factual_accuracy": 1.0,
"explanation_factual_accuracy": "1. FedL can send a delivery to Bangalore.\nReasoning for yes: The context explicitly states that using FedL you can send deliveries to over 1000+ cities in India including major cities like Bangalore.\nReasoning for no: No arguments.\nJudgement: yes. as the context explicitly supports the fact.\n\n",
"score_cited_context_relevance": 0.0,
"explanation_cited_context_relevance": " \"The extracted context doesn't contain any information about FedL's delivery services or their coverage area. Therefore, it is impossible to determine whether FedL can send a delivery to Bangalore based on the given context alone.\"\n",
"score_factual_accuracy_wrt_cited": 0.5,
"explanation_factual_accuracy_wrt_cited": "1. FedL can send a delivery to Bangalore.\nReasoning for yes: The context does not explicitly mention whether FedL can send a delivery to Bangalore or not.\nReasoning for no: The context only provides information about the establishment of FedL and does not mention anything about its delivery services to Bangalore.\nJudgement: unclear. as the fact cannot be explicity supported or contradicted by the context.\n\n"
}
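To spot the weak aspects of a result like this one at a glance, you could filter its `score_*` fields by a cutoff. The helper `low_scores` and the 0.5 threshold are assumptions for illustration; the score values are copied from the sample result above, with the explanations omitted.

```python
def low_scores(result: dict, threshold: float = 0.5) -> dict[str, float]:
    """Return the score_* fields of one result whose value is strictly below threshold."""
    return {
        key: value
        for key, value in result.items()
        if key.startswith("score_")
        and isinstance(value, (int, float))
        and value < threshold
    }


# Score fields copied from the sample result (explanations omitted).
sample_result = {
    "error_mode": "Poor Context Utilization",
    "score_question_completeness": 1,
    "score_valid_response": 1.0,
    "score_context_relevance": 1.0,
    "score_factual_accuracy": 1.0,
    "score_cited_context_relevance": 0.0,
    "score_factual_accuracy_wrt_cited": 0.5,
}
weak = low_scores(sample_result)
print(weak)
```

With this cutoff, only `score_cited_context_relevance` is flagged, which matches the reported `error_mode`.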
The result above was generated for the following data point:
{
"question": "Can FedL send a delivery to Bangalore?",
"context": "FedL was established in 2020. Using FedL you can send deliveries to over 1000+ cities in India including major cities like Bangalore, Mumbai and Delhi. Recently we crossed a milestone by completing 1 million deliveries. Using FedL you can deliver any goods under 10kg(not more than that) to anyone whether it be your friends or family. P.S.: We can't deliver electronic devices. You can also use FedL to deliver a car.",
"response": "Yes, FedL can send a delivery to Bangalore.",
"cited_context": "FedL was established in 2020.",
}
Here, the user asks about a specific delivery location, but the LLM cites irrelevant information about when FedL was established.
Hence, the failure case is Poor Context Utilization.
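Once every row has been analyzed, counting how often each `error_mode` occurs shows where the pipeline needs the most attention. The sketch below assumes the results form a list of dictionaries with an `error_mode` key, as in the sample result; the `sample_results` list is made-up stand-in data, and `summarize_error_modes` is a hypothetical helper.

```python
from collections import Counter


def summarize_error_modes(results: list[dict]) -> Counter:
    """Count how many results fall under each error_mode."""
    return Counter(r.get("error_mode", "unknown") for r in results)


# Made-up stand-in for the list returned by the analysis.
sample_results = [
    {"error_mode": "Poor Context Utilization"},
    {"error_mode": "Hallucinations"},
    {"error_mode": "Poor Context Utilization"},
    {"error_mode": "Incomplete Question"},
]
summary = summarize_error_modes(sample_results)
for mode, count in summary.most_common():
    print(f"{mode}: {count}")
```

In practice you would pass the actual results list instead of `sample_results`.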