Smarter Query Handling: Multi-Level Prompting in a RAG-Based Chatbot (Part 3)

In Part 1, we built a basic RAG-based chatbot using OpenAI and FAISS to answer questions from a custom knowledge base. In Part 2, we added conversational memory to make the chatbot aware of previous messages.
Now, in Part 3, we're making the chatbot even smarter by giving it the ability to understand the user's intent and route each query accordingly.
Let’s dive into multi-level prompting and intent routing.
The Problem
As PointsCrowd.com grows beyond blog content and becomes a full-fledged award redemption engine, users may ask more than just “how many points do I need for a Hyatt stay?” They might want to:
- Find redemptions using their personal points wallet
- Discover redemptions based on credit card bonuses
- Or simply learn about loyalty programs
Our original chatbot treated all queries the same — searching for a matching answer in the knowledge base. This approach made it difficult to handle queries that required personalized logic or data.
The Solution: Multi-Level Prompting
To solve this, we introduced a lightweight intent classifier: a small LLM-powered module that detects what the user is trying to do before we pass the query to a response engine.
Based on the classified intent, the chatbot now supports three routing paths:
- Knowledge Base (RAG): For general research and learning
- Wallet Search: For users with stored loyalty points
- Credit Card Bonus Search: For users exploring redemptions based on signup bonuses
This architecture makes the bot modular and scalable.
How It Works
We start with a simple prompt-based classifier:
from typing import Literal, cast

from openai import OpenAI

Intent = Literal["research", "wallet", "credit_card", "unknown"]

class IntentClassifier:
    def __init__(self, model: str = "gpt-4.1-nano"):
        # The OpenAI client doesn't take a model name; the model is
        # chosen per request, so we store it separately.
        self.client = OpenAI()
        self.model = model

    def classify(self, user_input: str) -> Intent:
        prompt = (
            "Classify the following query into one of: "
            "'research', 'wallet', 'credit_card', or 'unknown'.\n\n"
            f"User query: {user_input}\nIntent:"
        )
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
        )
        label = (response.choices[0].message.content or "").strip().lower()
        # Fall back to "unknown" if the model replies with anything
        # outside our label set.
        if label not in ("research", "wallet", "credit_card"):
            return "unknown"
        return cast(Intent, label)
Once we know the intent, we route accordingly:
def route_query(user_input: str, intent: Intent) -> str:
    if intent == "research":
        return handle_research_query(user_input)
    elif intent == "wallet":
        return handle_wallet_query(user_input)
    elif intent == "credit_card":
        return handle_credit_card_query(user_input)
    else:
        return "Sorry, I couldn't determine what you're looking for."
Each of the handle_* functions can be developed independently and can rely on entirely different logic paths or external data.
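To see the routing run end to end, here's a minimal sketch. The handler bodies below are placeholder stubs (not the real PointsCrowd implementations), and the sample query is a hypothetical one:

# Placeholder stubs so the router runs end to end. The real handlers
# would call the RAG pipeline from Part 1, a wallet store, and a
# signup-bonus dataset, respectively.
def handle_research_query(q: str) -> str:
    return f"[research] answering from the knowledge base: {q}"

def handle_wallet_query(q: str) -> str:
    return f"[wallet] matching redemptions to your stored balances: {q}"

def handle_credit_card_query(q: str) -> str:
    return f"[credit_card] matching redemptions to signup bonuses: {q}"

classifier = IntentClassifier()
query = "What can I book with the points I already have?"
intent = classifier.classify(query)
print(route_query(query, intent))  # should route to the wallet path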
Why It Matters
This routing layer helps the bot stay:
- Focused — no vague answers from irrelevant data
- Efficient — skips unnecessary retrievals or computations
- Extensible — easily add more paths in the future (e.g., bookings, alerts); see the sketch after this list
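One way to keep that extension cheap (a sketch, not the code from the video): swap the if/elif chain for a dispatch table, so a new path is one dictionary entry plus its handler. The route_query_v2 name is ours, just to distinguish it from the version above:

from typing import Callable

# Each intent maps to its handler; adding a "bookings" path would be
# a single new entry here plus a handle_bookings_query function.
HANDLERS: dict[str, Callable[[str], str]] = {
    "research": handle_research_query,
    "wallet": handle_wallet_query,
    "credit_card": handle_credit_card_query,
}

def route_query_v2(user_input: str, intent: Intent) -> str:
    handler = HANDLERS.get(intent)
    if handler is None:  # covers "unknown" and any unexpected label
        return "Sorry, I couldn't determine what you're looking for."
    return handler(user_input)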
It’s a big step toward real-world AI agents that can work across mixed datasets and services — not just regurgitate documentation.
Watch the Video
We walk through the full implementation and demo the smart routing system in Part 3 of our YouTube series.
Coming Up Next
In future parts, we’ll explore:
- Security & Guardrails — protecting your bot from malicious prompts
- Quality Feedback Loop — evaluating and improving response quality automatically
Until then — keep building!