
Building a GPT-Powered Chatbot: A Step-by-Step Journey

Nick Fursa · Published: 26 Dec 2024

AI can sometimes feel like an intimidating “black box,” but my recent project helped me understand that it doesn’t have to be. With some curiosity and hands-on experimentation, I built a GPT-4-powered chatbot that answers questions from a specific knowledge base. In this article, I’ll walk you through the process and share what I learned along the way.

For a detailed walkthrough, check out my YouTube video where I demonstrate how I replicated the functionality of commercial AI tools using Python, FAISS, and OpenAI’s GPT-4.

You can also explore the code on GitHub: GitHub Repository. Want to try the chatbot? It’s live in beta here: PointsCrowd Beta.

What I Built

I aimed to replicate the functionality of tools like WonderChat, which train AI models on custom knowledge bases for use in websites or apps. The chatbot uses GPT-4 to answer questions based on a collection of blog articles and FAQs from my website, PointsCrowd.com. For each question, the bot retrieves the most relevant documents and passes them to the model, so its conversational responses stay grounded in this specific knowledge base.

The Technical Stack

Here’s the technology I used for this project:

• Python: The core programming language for scripting.

• FAISS: A similarity-search library developed by Meta, used here as a vector store to efficiently index and retrieve document embeddings.

• OpenAI GPT-4: The pre-trained language model for generating responses.

• FastAPI: To make the chatbot accessible via API.

• EC2 Instance: For deploying the chatbot.

Step-by-Step Implementation

Before diving into the project, ensure you have all necessary dependencies installed. Below is a quick guide to set up your environment:

Install Python

Make sure you have Python 3.12 or a later version installed on your machine. You can download it from the official Python website.

Install Required Libraries

The following Python libraries are used in this project:

• fastapi: For building the chatbot API.

• uvicorn: For running the API server.

• faiss-cpu: For vector similarity search.

• langchain, langchain-community, and langchain-openai: For connecting the OpenAI models to the FAISS vector store.

Install the libraries by running:

pip install fastapi uvicorn faiss-cpu langchain langchain-community langchain-openai

Alternatively, use a requirements.txt file to manage dependencies:

fastapi
uvicorn
faiss-cpu==1.8.0.post1
langchain
langchain-community
langchain-openai

Run the following command to install everything:

pip install -r requirements.txt
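
As a quick sanity check after installation, a short script along these lines can report which packages are present. This is only a sketch built on the standard library's importlib.metadata; the helper name check_packages is made up for illustration, and langchain-community is in the list because the scripts in this article import from langchain_community.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as published on PyPI. langchain-community is listed
# because the scripts in this article import from langchain_community.
REQUIRED = ["fastapi", "uvicorn", "faiss-cpu", "langchain",
            "langchain-community", "langchain-openai"]

def check_packages(names):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found

if __name__ == "__main__":
    for name, ver in check_packages(REQUIRED).items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED can then be installed with pip before moving on.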

1. Preparing the Knowledge Base

The first step is to collect articles and FAQs from PointsCrowd.com to serve as the bot’s knowledge base. Each document consists of:

• A .txt file for the content.

• Metadata (e.g., title, URL) at the top of the file.

Here’s how I structured the dataset:

data/
  ├── article1.txt
  ├── article2.txt
  ├── ...
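
The exact layout of the metadata header is not shown in this article, so the following is a sketch under an assumed format: a few "Key: value" lines (for example Title and URL), a blank line, then the article body. The helper name split_metadata and the sample text are hypothetical.

```python
def split_metadata(text):
    """Split a raw article into (metadata dict, body).

    Assumed layout (hypothetical): 'Key: value' lines, a blank line, then the body.
    """
    header, sep, body = text.partition("\n\n")
    if not sep:
        # No blank line found, so treat the whole file as body text.
        return {}, text
    metadata = {}
    for line in header.splitlines():
        key, colon, value = line.partition(":")
        if not colon:
            # Header is not in the assumed format; fall back to body only.
            return {}, text
        metadata[key.strip().lower()] = value.strip()
    return metadata, body


# Sample content invented for illustration.
sample = "Title: Earning points faster\nURL: https://example.com/earning-points\n\nArticle text goes here."
meta, body = split_metadata(sample)
print(meta)
print(body)
```

A dictionary produced this way could be passed as the metadata argument of a LangChain Document, so that answers can later be traced back to a title or URL.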

With the files in place, the next step converts the documents into vector embeddings using the OpenAI embeddings API and stores them in a FAISS vector store for fast similarity search. The following script performs this task:

import os
from pathlib import Path
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings
# Load articles and prepare documents
def collect_articles():
    articles_folder = Path("data")
    documents = []
    for file_path in sorted(articles_folder.rglob("*.txt")):
        # Read article content (explicit encoding avoids platform-dependent defaults)
        article_content = file_path.read_text(encoding="utf-8")
        # Create a Document object, recording the source file as metadata
        documents.append(Document(
            page_content=article_content,
            metadata={"source": str(file_path)},
        ))
    return documents
# Create FAISS vector store
def create_faiss_index():
    # Collect articles
    documents = collect_articles()
    # Generate embeddings; the API key is read from the OPENAI_API_KEY
    # environment variable instead of being hard-coded in the script
    embeddings = OpenAIEmbeddings(openai_api_key=os.environ["OPENAI_API_KEY"])
    # Create FAISS database
    db = FAISS.from_documents(documents, embeddings)
    # Save the database for future use
    db.save_local("model")
# Run the indexing process
if __name__ == "__main__":
    create_faiss_index()
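
To make "similarity search" more concrete, here is a toy version of what a vector store does at query time: compare the query vector against every stored vector and keep the closest ones. The three-dimensional vectors, file names, and the top_k helper are invented for illustration; real embeddings have far more dimensions, and FAISS performs this kind of nearest-neighbour lookup far more efficiently than this brute-force loop.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k(query_vec, indexed, k=3):
    """Return the k (name, distance) pairs closest to query_vec, nearest first."""
    scored = [(name, l2_distance(query_vec, vec)) for name, vec in indexed.items()]
    return sorted(scored, key=lambda pair: pair[1])[:k]

# Toy "embeddings" invented for illustration only.
index = {
    "hotel-points.txt": [0.9, 0.1, 0.0],
    "airline-miles.txt": [0.1, 0.9, 0.0],
    "credit-cards.txt": [0.5, 0.5, 0.1],
}
query = [0.8, 0.2, 0.0]
for name, dist in top_k(query, index, k=2):
    print(f"{name}: {dist:.3f}")
```

The smaller the distance, the more similar the document is considered to be to the query.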

2. Building the Chatbot API

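Next, let’s set up FastAPI to create a simple chatbot API that accepts user queries and returns answers. The API uses the FAISS vector store to retrieve relevant documents and an OpenAI chat model to generate the response. Note that the code below uses gpt-4o-mini, a smaller model in the GPT-4o family, rather than the original GPT-4.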

Here’s the code:

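import os
from fastapi import FastAPI
from pydantic import BaseModel
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
api_key = os.environ["OPENAI_API_KEY"]  # read the key from the environment
app = FastAPI()
# Load FAISS vector store. Recent langchain-community releases require the flag
# below because the index is pickled; only load files you created yourself.
embeddings = OpenAIEmbeddings(openai_api_key=api_key)
db = FAISS.load_local("model", embeddings, allow_dangerous_deserialization=True)
# Retrieve the 3 most similar documents for each query
retriever = db.as_retriever(search_kwargs={"k": 3})
llm = ChatOpenAI(temperature=0.0, model_name="gpt-4o-mini", openai_api_key=api_key)
# "stuff" places all retrieved documents into a single prompt
qa_chain = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever)
class Query(BaseModel):
    query: str
@app.post("/ask/")
def ask(query: Query):
    # invoke() returns a dict; the generated answer is under "result"
    response = qa_chain.invoke({"query": query.query})["result"]
    return {"response": response}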
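
The "stuff" chain type can be pictured as simple string assembly: every retrieved document is pasted into one prompt together with the question. The sketch below illustrates that idea only. The prompt wording and the build_stuff_prompt helper are made up, and LangChain's actual template differs.

```python
def build_stuff_prompt(question, passages):
    """Combine every retrieved passage and the question into one prompt string."""
    context = "\n\n".join(passages)
    return (
        "Use the following context to answer the question. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Placeholder passages standing in for retrieved documents.
passages = ["Passage one (placeholder text).", "Passage two (placeholder text)."]
prompt = build_stuff_prompt("How do I earn points?", passages)
print(prompt)
```

Because all retrieved documents end up in a single prompt, longer articles or a larger k directly increase the number of tokens sent to the model.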

3. Setting Up the Environment and Deploying

Let’s set up an AWS EC2 instance to host the chatbot API.

Instance details:

amazon/al2023-ami-2023.5.20241001.1-kernel-6.1-arm64

Instance type:

t4g.nano

Here’s how to configure the environment:

# Update system packages
sudo yum update -y
# Install build dependencies for compiling Python
sudo yum install gcc make openssl-devel bzip2-devel libffi-devel zlib-devel wget -y
# Download, build, and install Python 3.12
wget https://www.python.org/ftp/python/3.12.0/Python-3.12.0.tgz
tar xzf Python-3.12.0.tgz
cd Python-3.12.0
./configure --enable-optimizations
make -j 4
sudo make altinstall
cd ..
# Install pip and dependencies for the python3.12 interpreter
curl -O https://bootstrap.pypa.io/get-pip.py
python3.12 get-pip.py
python3.12 -m pip install fastapi uvicorn faiss-cpu langchain langchain-community langchain-openai

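To deploy the API, start it with uvicorn. The handler:app argument means the FastAPI code is saved as handler.py and exposes an object named app: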

uvicorn handler:app --host 0.0.0.0 --port 8000

4. Testing the Chatbot

To test the deployment, send queries to the chatbot API using curl or Postman:

curl --location 'http://<your-ec2-instance-ip>:8000/ask/' \
--header 'Content-Type: application/json' \
--data '{"query": "What’s the fastest way to earn Marriott Bonvoy points?"}'
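
The same request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names build_request and ask are hypothetical, and the host placeholder must be replaced with your own instance address.

```python
import json
import urllib.request

def build_request(base_url, question):
    """Build a POST request for the /ask/ endpoint with a JSON body."""
    body = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/ask/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(base_url, question, timeout=30):
    """Send the question and return the 'response' field of the JSON reply."""
    request = build_request(base_url, question)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]

# Example call (requires the API to be running and reachable):
# print(ask("http://<your-ec2-instance-ip>:8000", "What is the fastest way to earn Marriott Bonvoy points?"))
```

Keeping request construction separate from the network call makes the payload easy to inspect without a running server.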

What’s Next?

The chatbot is now live in beta on PointsCrowd.com. If you are interested, join the beta and share your feedback.

Watch the Full Walkthrough

Want a step-by-step explanation? Watch the YouTube video:

👉 YouTube Video

Check Out the Code

Everything is on GitHub:

👉 GitHub Repository

Final Thoughts

This project was a way for me to learn and experiment with AI in a hands-on way. My goal is to show that AI-powered tools are achievable with the right resources and mindset. Whether you’re a beginner or an experienced developer, I hope this inspires you to explore and build.

If you have any feedback or questions, feel free to reach out—I’d love to hear from you! 🚀
