Smarter Query Handling: Multi-Level Prompting in a RAG-Based Chatbot (Part 3)

In Part 1, we built a basic RAG-based chatbot using OpenAI and FAISS to answer questions from a custom knowledge base. In Part 2, we added conversational memory to make the chatbot aware of previous messages.
Now, in Part 3, we're making the chatbot even smarter by giving it the ability to understand the user's intent and route each query accordingly.
Let’s dive into multi-level prompting and intent routing.
The Problem
As PointsCrowd.com grows beyond blog content and becomes a full-fledged award redemption engine, users may ask more than just “how many points do I need for a Hyatt stay?” They might want to:
- Find redemptions using their personal points wallet
- Discover redemptions based on credit card bonuses
- Or simply learn about loyalty programs
Our original chatbot treated all queries the same — searching for a matching answer in the knowledge base. This approach made it difficult to handle queries that required personalized logic or data.
The Solution: Multi-Level Prompting
To solve this, we introduced a lightweight intent classifier: a small LLM-powered module that detects what the user is trying to do before we pass the query to a response engine.
Based on the classified intent, the chatbot now supports three routing paths:
- Knowledge Base (RAG): For general research and learning
- Wallet Search: For users with stored loyalty points
- Credit Card Bonus Search: For users exploring redemptions based on signup bonuses
This architecture makes the bot modular and scalable.
How It Works
We start with a simple prompt-based classifier:
from typing import Literal, cast

from openai import OpenAI

Intent = Literal["research", "wallet", "credit_card", "unknown"]

class IntentClassifier:
    def __init__(self, model: str = "gpt-4.1-nano"):
        # The OpenAI client doesn't take a model name; the model is
        # chosen per request, so we store it separately.
        self.client = OpenAI()
        self.model = model

    def classify(self, user_input: str) -> Intent:
        prompt = (
            "Classify the following query into one of: "
            "'research', 'wallet', 'credit_card', or 'unknown'.\n\n"
            f"User query: {user_input}\nIntent:"
        )
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
        )
        label = (response.choices[0].message.content or "").strip().lower()
        # Fall back to "unknown" if the model replies with anything
        # outside our label set.
        if label not in ("research", "wallet", "credit_card"):
            return "unknown"
        return cast(Intent, label)
Once we know the intent, we route accordingly:
def route_query(user_input: str, intent: Intent) -> str:
    if intent == "research":
        return handle_research_query(user_input)
    elif intent == "wallet":
        return handle_wallet_query(user_input)
    elif intent == "credit_card":
        return handle_credit_card_query(user_input)
    else:
        return "Sorry, I couldn't determine what you're looking for."
Each of the handle_* functions can be developed independently and can rely on entirely different logic paths or external data.
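To see the routing run end to end, here's a minimal sketch. The handler bodies below are placeholder stubs (not the real PointsCrowd implementations), and the sample query is a hypothetical one:

# Placeholder stubs so the router runs end to end. The real handlers
# would call the RAG pipeline from Part 1, a wallet store, and a
# signup-bonus dataset, respectively.
def handle_research_query(q: str) -> str:
    return f"[research] answering from the knowledge base: {q}"

def handle_wallet_query(q: str) -> str:
    return f"[wallet] matching redemptions to your stored balances: {q}"

def handle_credit_card_query(q: str) -> str:
    return f"[credit_card] matching redemptions to signup bonuses: {q}"

classifier = IntentClassifier()
query = "What can I book with the points I already have?"
intent = classifier.classify(query)
print(route_query(query, intent))  # should route to the wallet path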
Why It Matters
This routing layer helps the bot stay:
- Focused — no vague answers from irrelevant data
- Efficient — skips unnecessary retrievals or computations
- Extensible — easily add more paths in the future (e.g., bookings, alerts); see the sketch after this list
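One way to keep that extension cheap (a sketch, not the code from the video): swap the if/elif chain for a dispatch table, so a new path is one dictionary entry plus its handler. The route_query_v2 name is ours, just to distinguish it from the version above:

from typing import Callable

# Each intent maps to its handler; adding a "bookings" path would be
# a single new entry here plus a handle_bookings_query function.
HANDLERS: dict[str, Callable[[str], str]] = {
    "research": handle_research_query,
    "wallet": handle_wallet_query,
    "credit_card": handle_credit_card_query,
}

def route_query_v2(user_input: str, intent: Intent) -> str:
    handler = HANDLERS.get(intent)
    if handler is None:  # covers "unknown" and any unexpected label
        return "Sorry, I couldn't determine what you're looking for."
    return handler(user_input)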
It’s a big step toward real-world AI agents that can work across mixed datasets and services — not just regurgitate documentation.
Watch the Video
We walk through the full implementation and demo the smart routing system in Part 3 of our YouTube series.
Coming Up Next
In future parts, we’ll explore:
- Security & Guardrails — protecting your bot from malicious prompts
- Quality Feedback Loop — evaluating and improving response quality automatically
Until then — keep building!