Analyzing RAG Failure Cases
This tutorial helps you analyse the causes of failure in a RAG pipeline.
By the end of this tutorial, you will be able to:
- Understand the failure cases in a RAG pipeline
- Perform Root Cause Analysis on your RAG pipeline
- Get actionable insights to improve your RAG pipeline
Let’s start by understanding what RAG is and how this tutorial will help you.
What is RAG?
RAG (Retrieval-Augmented Generation) is the process of utilising external knowledge in your LLM-based application.
For example: Imagine you have a knowledge document outlining various scenarios for handling customer queries (question). With an LLM-powered bot at your disposal, the goal is to provide users with accurate responses based on the information in the knowledge document.
You can give an LLM relevant chunks of this information (retrieved context) so it can provide a better answer to the user’s query. The LLM can utilize a certain portion of this retrieved context (the cited context) to generate a response.
How will this tutorial help?
Let’s say you already have a RAG pipeline but you are not satisfied with the quality of responses you are getting.
Figuring out the root cause of this failure might be a bit difficult as RAG involves multiple steps and you would have to go through each step to figure out what went wrong.
In this tutorial, we will walk you through an easy way to figure out the failure cases in your RAG pipeline. Let’s look at some major failure cases first:
| Failure Case | Explanation | Example |
|---|---|---|
| Poor Context Utilization | The citations from the context are irrelevant to the user’s query | For the question “Can I get a refund?”, the LLM cites information on offers rather than refunds from a context containing information on both refunds and offers |
| Poor Retrieval | The context given to the LLM does not have information relevant to the question | The user asks “Do you deliver to Bangalore?” but the context does not have any information about deliveries in Bangalore |
| Hallucinations | The generated response is not supported by information present in the context | The LLM generates the response “We deliver to Bangalore” when the information present in the context is: “We are going to start deliveries in Bangalore soon” |
| Poor Citation | The generated response cannot be verified with the citation | The LLM cites “We deliver to Delhi” from the context for a response saying “We deliver to Bangalore” |
| Incomplete Question | The user’s question itself does not make sense | The user asks something like: “When delivery?”, “What location?” |
How does it work?
Let’s jump to the code
Install UpTrain
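UpTrain can be installed from PyPI:

```bash
pip install uptrain
```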
Let's define a sample dataset to run evaluations
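Below is a hypothetical datapoint; the company name “FedL” and all of the text are illustrative (not taken from UpTrain’s docs) and mirror the delivery example discussed later in this tutorial:

```python
# A hypothetical datapoint; "FedL" and the text below are illustrative only.
data = [
    {
        "question": "Do you deliver to Bangalore?",
        "context": (
            "FedL was established in 2020. We currently deliver to Delhi and Mumbai. "
            "We are going to start deliveries in Bangalore soon."
        ),
        "response": "FedL was established in 2020.",
        "cited_context": "FedL was established in 2020.",
    }
]
```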
UpTrain uses these 4 parameters to perform RCA on your RAG pipeline:
| Parameter | Explanation |
|---|---|
| question | This is the query asked by your user. |
| context | This is the context that you pass to the LLM (retrieved context). |
| response | The response generated by the LLM. |
| cited_context | The relevant portion of the retrieved context that the LLM cites to generate the response. |
Perform failure analysis using UpTrain
Here we will be using an instance of EvalLLM to perform RCA on your RAG pipeline.
You need an OpenAI key to generate evaluations using UpTrain.
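Here is a minimal sketch assuming UpTrain’s RcaTemplate interface; check the UpTrain docs if the names differ in your version:

```python
from uptrain import EvalLLM, RcaTemplate

OPENAI_API_KEY = "sk-*****************"  # replace with your own key

# Create an EvalLLM instance and run root cause analysis on the data defined above
eval_llm = EvalLLM(openai_api_key=OPENAI_API_KEY)

results = eval_llm.perform_root_cause_analysis(
    data=data,
    rca_template=RcaTemplate.RAG_WITH_CITATION,
)
```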
Let's look at the results
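One way to inspect the output for the first datapoint (assuming results is a list of dictionaries):

```python
import json

# Pretty-print the RCA output for the first datapoint
print(json.dumps(results[0], indent=3))
```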
Key information present in your results:
| Parameter | Explanation |
|---|---|
| error_mode | The specific failure reason identified in your data |
| error_resolution_suggestion | Actionable insights to improve your RAG pipeline |
Besides this, the results also provide scores for different aspects of your data, along with reasoning.
You can also look at our docs to know more about these evaluations.
Here’s a sample response generated on the datapoint defined above:
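The exact output depends on your data and UpTrain version; an illustrative (not actual) response for the datapoint above might look like this:

```json
{
   "question": "Do you deliver to Bangalore?",
   "cited_context": "FedL was established in 2020.",
   "error_mode": "Poor Context Utilization",
   "error_resolution_suggestion": "Retrieval works fine; improve how the LLM selects citations from the retrieved context."
}
```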
Here we can see that the user is asking about a specific delivery location but the LLM has cited irrelevant information on when FedL was established.
Hence the failure case is Poor Context Utilization