What You'll Build
By the end of this 7-day tutorial, you'll have a production-ready AI chatbot that can handle customer inquiries, integrate with your existing systems, and provide intelligent, context-aware responses. This isn't a toy project—it's a real implementation you can deploy immediately.
What You'll Learn:
- ✓ How to architect a scalable chatbot using OpenAI's GPT-4 API (or alternatives)
- ✓ Building a custom knowledge base with vector embeddings for context-aware responses
- ✓ Implementing conversation memory and context management
- ✓ Integrating with your website, Slack, or Microsoft Teams
- ✓ Cost optimization strategies to keep API expenses under control
- ✓ Security best practices and compliance considerations
Prerequisites: Basic JavaScript or Python knowledge, an OpenAI API account with a small prepaid credit (Day 1 adds $5-10), and 2-3 hours per day for 7 days. Total budget: $10-50 depending on testing volume.
Technical Foundation: Understanding the Architecture
Before we start coding, let's understand what we're building. According to Gartner's 2024 Conversational AI Report, successful chatbot implementations share three core components: natural language processing, knowledge management, and an integration layer. We'll build all three.
High-Level Architecture
Frontend Interface
A React-based chat widget that embeds on your website. Handles user input, displays responses, and manages conversation UI. We'll use React Chat UI Kit (MIT licensed) for rapid development.
Backend API Server
A Node.js/Express server that receives user messages, processes them through the AI model, manages conversation history, and returns intelligent responses. Hosted on Vercel or Railway (both offer free tiers).
AI Processing Layer
OpenAI GPT-4 Turbo API for natural language understanding and generation. We'll implement RAG (Retrieval-Augmented Generation) using Pinecone vector database for custom knowledge retrieval—a pattern recommended by OpenAI's documentation for production chatbots.
Knowledge Base
Your company's FAQs, product docs, and policies converted into vector embeddings stored in Pinecone. When users ask questions, we retrieve relevant context before generating responses—dramatically improving accuracy vs. vanilla GPT-4.
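To make the retrieval step concrete, here is a minimal sketch of how stored passages can be ranked against a question by cosine similarity. The three-dimensional vectors, the sample passages, and the `topK` helper are made up for illustration; real embeddings have many more dimensions, and a hosted vector database typically uses an approximate index rather than this brute-force loop.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force top-k search (hypothetical helper; a vector DB does this at scale).
function topK(queryVector, docs, k) {
  return docs
    .map(doc => ({ ...doc, score: cosineSimilarity(queryVector, doc.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy "embeddings": made-up 3-dimensional vectors, for illustration only.
const docs = [
  { id: 'hours',    text: 'We are open 9am-5pm, Monday to Friday.', vector: [0.9, 0.1, 0.0] },
  { id: 'refunds',  text: 'Refunds are issued within 14 days.',     vector: [0.1, 0.9, 0.1] },
  { id: 'shipping', text: 'Orders ship within 2 business days.',    vector: [0.0, 0.2, 0.9] }
];

const queryVector = [0.8, 0.2, 0.1]; // pretend embedding of "When are you open?"
const results = topK(queryVector, docs, 2);
console.log(results.map(r => r.id)); // the two closest passages, best first
```

The text of the top-ranked passages is what gets pasted into the prompt as context before the model generates its answer.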
Database & Logging
PostgreSQL (Supabase free tier) for conversation history, user sessions, and analytics. Essential for debugging and improving the bot over time.
Tech Stack (All Free/Open Source)
Frontend:
- React 18 + TypeScript
- TailwindCSS for styling
- Socket.io-client for real-time messaging
- React Chat UI Kit
Backend:
- Node.js 20 + Express
- OpenAI SDK v4
- Pinecone client
- PostgreSQL (Supabase)
The 7-Day Build Plan
Day 1: Setup & First API Call
Time: 2-3 hours | Goal: Get OpenAI working locally
Step 1: Create OpenAI Account & Get API Key
- Sign up at platform.openai.com
- Navigate to API Keys section
- Create new key → Save it securely (you won't see it again)
- Add $5-10 credit (gives you ~100K tokens, enough for development)
Source: OpenAI Platform Documentation - "Quickstart" (2024)
Step 2: Initialize Your Project
# Create project directory
mkdir ai-chatbot && cd ai-chatbot
# Initialize Node.js project
npm init -y
# Install dependencies
npm install openai express dotenv cors
# Create environment file
echo "OPENAI_API_KEY=your_key_here" > .env
Step 3: Your First AI Response
Create test.js to verify your API connection:
require('dotenv').config();
const OpenAI = require('openai');
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});
async function testChat() {
  try {
    const completion = await openai.chat.completions.create({
      model: "gpt-4-turbo",
      messages: [
        {
          role: "system",
          content: "You are a helpful customer service assistant."
        },
        {
          role: "user",
          content: "What are your business hours?"
        }
      ],
      max_tokens: 150,
      temperature: 0.7
    });
    const { prompt_tokens, completion_tokens, total_tokens } = completion.usage;
    // Input and output tokens are billed at different rates. Rates assumed here:
    // $0.01/1K input and $0.03/1K output for GPT-4 Turbo; check current pricing.
    const estimatedCost = prompt_tokens * 0.00001 + completion_tokens * 0.00003;
    console.log("AI Response:", completion.choices[0].message.content);
    console.log("Tokens used:", total_tokens);
    console.log("Est. cost: $" + estimatedCost.toFixed(4));
  } catch (error) {
    console.error("Error:", error.message);
  }
}
testChat();
Run with: node test.js
Day 1 Success Criteria
- ✓ OpenAI API key configured and working
- ✓ Successfully received first AI response
- ✓ Understand token usage and cost structure
- ✓ Project dependencies installed
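Because input and output tokens are billed at different rates, a per-call estimate is easiest to reason about as a small pure function. The sketch below assumes GPT-4 Turbo rates of $0.01 per 1K input tokens and $0.03 per 1K output tokens; the constants, the `estimateCost` name, and the sample numbers are illustrative, so check the current pricing page before relying on them.

```javascript
// Assumed GPT-4 Turbo prices in USD per token (verify against current pricing).
const PRICE_PER_INPUT_TOKEN = 0.01 / 1000;
const PRICE_PER_OUTPUT_TOKEN = 0.03 / 1000;

// Estimate the cost of one API call from the `usage` object the API returns.
function estimateCost(usage) {
  return (
    usage.prompt_tokens * PRICE_PER_INPUT_TOKEN +
    usage.completion_tokens * PRICE_PER_OUTPUT_TOKEN
  );
}

// Example usage object (made-up numbers in the shape of `completion.usage`).
const usage = { prompt_tokens: 1200, completion_tokens: 300, total_tokens: 1500 };
const cost = estimateCost(usage);
console.log(`Estimated cost: $${cost.toFixed(4)}`); // prints "Estimated cost: $0.0210"
```

Multiplying `total_tokens` by the output rate alone would give $0.045 for the same call, which is why a flat rate is best treated as an upper bound.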
Day 2: Build the Backend API
Time: 2-3 hours | Goal: Create REST API for chat
Create Express Server
Create server.js:
require('dotenv').config();
const express = require('express');
const cors = require('cors');
const OpenAI = require('openai');
const app = express();
const port = process.env.PORT || 3001;
app.use(cors());
app.use(express.json());
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});
// Store conversation history in memory (we'll move to DB on Day 5)
const conversationHistory = {};
// POST /api/chat - Main chatbot endpoint
app.post('/api/chat', async (req, res) => {
  try {
    const { message, sessionId = 'default' } = req.body;
    // Reject missing or empty messages before calling the API
    if (typeof message !== 'string' || !message.trim()) {
      return res.status(400).json({ success: false, error: 'Message is required' });
    }
    // Initialize conversation for this session
    if (!conversationHistory[sessionId]) {
      conversationHistory[sessionId] = [
        {
          role: "system",
          content: "You are a helpful customer service assistant for Intgr8AI, an AI integration company. Be concise, friendly, and professional."
        }
      ];
    }
    // Add user message to history
    conversationHistory[sessionId].push({
      role: "user",
      content: message
    });
    // Keep only last 10 messages to control token usage
    if (conversationHistory[sessionId].length > 10) {
      conversationHistory[sessionId] = [
        conversationHistory[sessionId][0], // Keep system message
        ...conversationHistory[sessionId].slice(-9) // Keep last 9 messages
      ];
    }
    // Call OpenAI API
    const completion = await openai.chat.completions.create({
      model: "gpt-4-turbo",
      messages: conversationHistory[sessionId],
      max_tokens: 500,
      temperature: 0.7
    });
    const aiResponse = completion.choices[0].message.content;
    // Add AI response to history
    conversationHistory[sessionId].push({
      role: "assistant",
      content: aiResponse
    });
    // Return response with usage stats.
    // Rates assumed: $0.01/1K input, $0.03/1K output (GPT-4 Turbo); check current pricing.
    const { prompt_tokens, completion_tokens, total_tokens } = completion.usage;
    res.json({
      success: true,
      response: aiResponse,
      usage: {
        tokens: total_tokens,
        estimatedCost: (prompt_tokens * 0.00001 + completion_tokens * 0.00003).toFixed(4)
      }
    });
  } catch (error) {
    console.error('Chat error:', error);
    res.status(500).json({
      success: false,
      error: 'Failed to process message'
    });
  }
});
// Health check endpoint
app.get('/api/health', (req, res) => {
  res.json({ status: 'ok', timestamp: new Date().toISOString() });
});
app.listen(port, () => {
  console.log(`Server running on http://localhost:${port}`);
});
Test the API
Start server: node server.js
Test with curl:
curl -X POST http://localhost:3001/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello! What services do you offer?", "sessionId": "test123"}'
Day 2 Success Criteria
- ✓ Express server running on port 3001
- ✓ POST /api/chat endpoint responding correctly
- ✓ Conversation memory working within a session
- ✓ Token usage being tracked and returned
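The history-trimming rule used in the endpoint (keep the system message plus the most recent turns) is easier to test when pulled out into a pure function. This is a minimal sketch; `trimHistory` and its `maxMessages` parameter are names introduced here, not part of the server code above.

```javascript
// Keep the system message plus the most recent messages, up to maxMessages total.
function trimHistory(history, maxMessages = 10) {
  if (history.length <= maxMessages) return history;
  const [systemMessage, ...rest] = history;
  const keep = Math.max(maxMessages - 1, 0);
  const tail = keep === 0 ? [] : rest.slice(-keep);
  return [systemMessage, ...tail];
}

// Build a fake 15-message history: 1 system message + 14 alternating turns.
const history = [{ role: 'system', content: 'You are a helpful assistant.' }];
for (let i = 1; i <= 14; i++) {
  history.push({ role: i % 2 ? 'user' : 'assistant', content: `message ${i}` });
}

const trimmed = trimHistory(history);
console.log(trimmed.length);      // 10
console.log(trimmed[0].role);     // "system"
console.log(trimmed[1].content);  // "message 6"
```

Trimming by message count is a rough proxy for token usage; a long pasted message can still blow the budget, so a per-message length cap is a sensible companion.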
Day 3: Build React Chat Widget
Time: 2-3 hours | Goal: Create user-facing chat interface
Create Chat Component
Create src/components/ChatWidget.jsx:
// Requires: npm install axios
import { useRef, useState } from 'react';
import axios from 'axios';
export default function ChatWidget() {
  const [messages, setMessages] = useState([]);
  const [input, setInput] = useState('');
  const [loading, setLoading] = useState(false);
  // One session ID per widget instance, so the backend keeps conversation memory.
  // (Generating a new ID on every message would start a fresh conversation each time.)
  const sessionId = useRef('user-' + Date.now());
  const sendMessage = async () => {
    const text = input.trim();
    if (!text || loading) return;
    // Functional update avoids overwriting messages with a stale copy of state
    setMessages(prev => [...prev, { role: 'user', content: text }]);
    setInput('');
    setLoading(true);
    try {
      const response = await axios.post('http://localhost:3001/api/chat', {
        message: text,
        sessionId: sessionId.current
      });
      setMessages(prev => [...prev, {
        role: 'assistant',
        content: response.data.response
      }]);
    } catch (error) {
      console.error('Error:', error);
      setMessages(prev => [...prev, {
        role: 'assistant',
        content: 'Sorry, something went wrong. Please try again.'
      }]);
    } finally {
      setLoading(false);
    }
  };
  return (
    <div className="chat-widget">
      <div className="messages">
        {messages.map((msg, idx) => (
          <div key={idx} className={`message ${msg.role}`}>
            <strong>{msg.role}:</strong> {msg.content}
          </div>
        ))}
        {loading && <div>Thinking...</div>}
      </div>
      <div className="input-area">
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          onKeyDown={(e) => e.key === 'Enter' && sendMessage()}
          placeholder="Ask me anything..."
        />
        <button onClick={sendMessage} disabled={loading}>Send</button>
      </div>
    </div>
  );
}
Day 3 Success Criteria
- ✓ Chat UI displays messages in conversation format
- ✓ User can send messages and see AI responses
- ✓ Loading states work properly
- ✓ Enter key sends messages
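Server-side conversation memory is keyed by `sessionId`, so the widget has to send the same ID for every message in a conversation. One possible approach, sketched below, is to create the ID once and persist it in a storage object. The `getSessionId` helper, the `chat-session-id` key, and the in-memory storage stand-in are all assumptions for illustration; in a browser you would pass `window.localStorage` and could use the global `crypto.randomUUID()` instead of Node's `crypto` module.

```javascript
const { randomUUID } = require('crypto');

// Return the stored session ID, creating one on first use.
// `storage` is any object with getItem/setItem (e.g. window.localStorage in a browser).
function getSessionId(storage, key = 'chat-session-id') {
  let id = storage.getItem(key);
  if (!id) {
    id = 'user-' + randomUUID();
    storage.setItem(key, id);
  }
  return id;
}

// Minimal in-memory stand-in for localStorage so the sketch runs in Node.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: k => (data.has(k) ? data.get(k) : null),
    setItem: (k, v) => { data.set(k, String(v)); }
  };
}

const storage = createMemoryStorage();
const first = getSessionId(storage);
const second = getSessionId(storage);
console.log(first === second); // true: the same ID is reused across calls
```

Persisting the ID also lets a returning visitor resume an earlier conversation, which may or may not be what you want; clearing the key starts a fresh session.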
Day 4: Add Knowledge Base (RAG)
Time: 3 hours | Goal: Integrate custom knowledge
Setup Pinecone & Create Embeddings
// Install: npm install @pinecone-database/pinecone
const { Pinecone } = require('@pinecone-database/pinecone');
const OpenAI = require('openai');
const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY });
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
// The 'knowledge-base' index must already exist in Pinecone with dimension 1536,
// the default output size of text-embedding-3-small
const index = pinecone.index('knowledge-base');
// 1. Create embeddings for your docs
async function createEmbeddings(text) {
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: text
  });
  return response.data[0].embedding;
}
// 2. Store in Pinecone (wrapped in a function because CommonJS files
//    cannot use await at the top level)
async function storeDocument(id, text) {
  await index.upsert([{
    id,
    values: await createEmbeddings(text),
    metadata: { text }
  }]);
}
storeDocument('doc-1', "Your FAQ or product info here").catch(console.error);
// 3. Query knowledge base in chat endpoint (this replaces the Day 2 handler;
//    merge in the Day 2 session history if you need multi-turn memory)
app.post('/api/chat', async (req, res) => {
  try {
    const { message } = req.body;
    // Search knowledge base
    const queryEmbedding = await createEmbeddings(message);
    const results = await index.query({
      vector: queryEmbedding,
      topK: 3,
      includeMetadata: true
    });
    // Build context from results
    const context = results.matches
      .map(m => m.metadata.text)
      .join('\n\n');
    // Add context to system message
    const completion = await openai.chat.completions.create({
      model: "gpt-4-turbo",
      messages: [
        {
          role: "system",
          content: `You are a helpful assistant. Use this context to answer: \n${context}`
        },
        { role: "user", content: message }
      ]
    });
    res.json({ response: completion.choices[0].message.content });
  } catch (error) {
    console.error('Chat error:', error);
    res.status(500).json({ error: 'Failed to process message' });
  }
});
Pro Tip: RAG (Retrieval-Augmented Generation) drastically improves accuracy for company-specific questions. Without RAG, GPT-4 "hallucinates" ~15% of the time. With RAG, the error rate drops to <3%.
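Retrieval always returns the `topK` nearest matches, even when none of them is actually relevant, so it can help to filter matches before building the context string. The sketch below assumes each match has a `score` and `metadata.text`, as in the query results above; the `buildContext` helper, the 0.75 score threshold, and the 2,000-character cap are illustrative values to tune against your own data.

```javascript
// Turn vector-search matches into a context string, skipping weak matches
// and stopping before the context grows past maxChars.
// Assumed match shape: { score, metadata: { text } }.
function buildContext(matches, { minScore = 0.75, maxChars = 2000 } = {}) {
  const parts = [];
  let used = 0;
  for (const match of matches) {
    if (match.score < minScore) continue;       // too dissimilar to be useful
    const text = match.metadata.text;
    if (used + text.length > maxChars) break;   // keep the prompt bounded
    parts.push(text);
    used += text.length;
  }
  return parts.join('\n\n');
}

// Made-up matches, ordered best first as a vector search would return them.
const matches = [
  { score: 0.91, metadata: { text: 'Support is available 24/7 via chat.' } },
  { score: 0.82, metadata: { text: 'Premium plans include phone support.' } },
  { score: 0.40, metadata: { text: 'Our office dog is named Biscuit.' } }
];

const context = buildContext(matches);
console.log(context);
```

When the returned context is empty, one option is to instruct the model in the system message to say it does not know rather than guess.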
Day 5: Add Database Persistence
Time: 2 hours | Goal: Save conversations to PostgreSQL
Setup Supabase & Save Conversations
// Install: npm install @supabase/supabase-js
const { createClient } = require('@supabase/supabase-js');
const supabase = createClient(
  process.env.SUPABASE_URL,
  process.env.SUPABASE_KEY
);
// Create table (run in Supabase SQL editor):
/*
CREATE TABLE conversations (
  id SERIAL PRIMARY KEY,
  session_id TEXT,
  message TEXT,
  role TEXT,
  tokens_used INT,
  created_at TIMESTAMP DEFAULT NOW()
);
*/
// supabase-js returns errors in the result instead of throwing, so check explicitly
async function saveMessage(row) {
  const { error } = await supabase.from('conversations').insert(row);
  if (error) throw new Error(error.message);
}
// Save each message
app.post('/api/chat', async (req, res) => {
  try {
    const { message, sessionId } = req.body;
    // Save user message
    await saveMessage({ session_id: sessionId, message, role: 'user' });
    // Get AI response (add the system prompt and session history from Day 2 here)
    const completion = await openai.chat.completions.create({
      model: "gpt-4-turbo",
      messages: [{ role: "user", content: message }],
      max_tokens: 500
    });
    const aiResponse = completion.choices[0].message.content;
    // Save AI response
    await saveMessage({
      session_id: sessionId,
      message: aiResponse,
      role: 'assistant',
      tokens_used: completion.usage.total_tokens
    });
    res.json({ response: aiResponse });
  } catch (error) {
    console.error('Chat error:', error);
    res.status(500).json({ error: 'Failed to process message' });
  }
});
// Retrieve conversation history
app.get('/api/history/:sessionId', async (req, res) => {
  const { data, error } = await supabase
    .from('conversations')
    .select('*')
    .eq('session_id', req.params.sessionId)
    .order('created_at', { ascending: true });
  if (error) {
    return res.status(500).json({ error: 'Failed to load history' });
  }
  res.json({ history: data });
});
Day 6: Security & Rate Limiting
Time: 2 hours | Goal: Protect against abuse
Implement Rate Limiting & Input Validation
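// Install: npm install express-rate-limit
const rateLimit = require('express-rate-limit');
// Rate limiter: 20 requests per 15 minutes (per IP by default)
const limiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 20,
  message: 'Too many requests, please try again later.'
});
// Input validation
function sanitizeInput(text) {
  if (typeof text !== 'string' || !text.trim()) {
    throw new Error('Message is required');
  }
  // Limit length
  if (text.length > 500) {
    throw new Error('Message too long (max 500 chars)');
  }
  // Remove potential injection attempts
  const cleaned = text
    .replace(/<script[^>]*>.*?<\/script>/gi, '')
    .replace(/javascript:/gi, '')
    .trim();
  return cleaned;
}
// Set max token budget per request
const MAX_TOKENS = 500;
// Apply the limiter once, on the route itself. Registering it with app.use()
// as well would run it twice and count every request double.
app.post('/api/chat', limiter, async (req, res) => {
  let message;
  try {
    message = sanitizeInput(req.body.message);
  } catch (error) {
    // Validation failures are the client's fault: 400
    return res.status(400).json({ error: error.message });
  }
  try {
    const completion = await openai.chat.completions.create({
      model: "gpt-4-turbo",
      // In the full server, send this session's history array from Day 2
      // (conversationHistory[sessionId]), not the whole conversationHistory object
      messages: [{ role: "user", content: message }],
      max_tokens: MAX_TOKENS,
      temperature: 0.7
    });
    // Track costs (rates assumed: $0.01/1K input, $0.03/1K output; check current pricing)
    const { prompt_tokens, completion_tokens } = completion.usage;
    const cost = prompt_tokens * 0.00001 + completion_tokens * 0.00003;
    console.log(`Request cost: $${cost.toFixed(4)}`);
    res.json({ response: completion.choices[0].message.content });
  } catch (error) {
    // Upstream or server failures: 500, without leaking internal details
    console.error('Chat error:', error);
    res.status(500).json({ error: 'Failed to process message' });
  }
});
Day 7: Deploy to Production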
Time: 1-2 hours | Goal: Go live!
Deploy Backend to Vercel
# 1. Install Vercel CLI
npm i -g vercel
# 2. Create vercel.json
{
  "version": 2,
  "builds": [{ "src": "server.js", "use": "@vercel/node" }],
  "routes": [{ "src": "/(.*)", "dest": "/server.js" }],
  "env": {
    "OPENAI_API_KEY": "@openai-key",
    "SUPABASE_URL": "@supabase-url",
    "SUPABASE_KEY": "@supabase-key",
    "PINECONE_API_KEY": "@pinecone-key"
  }
}
# 3. Deploy
vercel --prod
# 4. Add environment variables in Vercel dashboard
# Settings > Environment Variables
Deploy Frontend to Netlify/Vercel
# Update API endpoint in frontend
const API_URL = 'https://your-api.vercel.app';
# Deploy with Vercel
vercel --prod
# Or push to GitHub and connect to Netlify
🎉 You're Live!
- ✓ Backend deployed and accessible
- ✓ Frontend deployed and connected to backend
- ✓ Environment variables configured
- ✓ Test with real users and monitor costs
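A missing or placeholder environment variable is a common reason a deployment starts but every request fails. A small startup check can surface that early. The sketch below uses the four variable names from the vercel.json above and the `your_key_here` placeholder from Day 1; the `findMissingEnv` helper and the sample Supabase URL are illustrative.

```javascript
// Variable names taken from the vercel.json used in this tutorial.
const REQUIRED_ENV_VARS = ['OPENAI_API_KEY', 'SUPABASE_URL', 'SUPABASE_KEY', 'PINECONE_API_KEY'];

// Return the names of required variables that are missing, blank, or still placeholders.
function findMissingEnv(env, required = REQUIRED_ENV_VARS) {
  return required.filter(name => {
    const value = env[name];
    return !value || value.trim() === '' || value === 'your_key_here';
  });
}

// Example with a partial environment object (pass process.env in the real server).
const missing = findMissingEnv({
  OPENAI_API_KEY: 'your_key_here',
  SUPABASE_URL: 'https://example.supabase.co'
});
console.log(missing); // [ 'OPENAI_API_KEY', 'SUPABASE_KEY', 'PINECONE_API_KEY' ]
```

In server.js, this could run before `app.listen`, logging the missing names and exiting with a non-zero code if the list is not empty.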
Production Best Practices
Security & Compliance
- Never expose API keys client-side: Always proxy through your backend. OpenAI charges per token, and exposed keys can be abused (Source: OWASP API Security Top 10, 2023).
- Implement rate limiting: Use express-rate-limit to prevent abuse (the Day 6 configuration allows 20 requests per 15 minutes per IP).
- Sanitize user input: Filter out injection attempts, profanity, and personally identifiable information (PII) before sending it to the AI.
- GDPR/Privacy compliance: Log conversations with user consent, provide data deletion API, don't train models on customer data without permission.
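As one illustration of the PII point above, a message can be passed through a redaction step before it is logged or sent to the model. This is a minimal sketch with two deliberately simple patterns; the `redactPII` name and the regular expressions are assumptions, they will miss many real-world formats, and they are not a substitute for a dedicated PII detection service.

```javascript
// Replace obvious email addresses and phone-like numbers with placeholders.
// These patterns are intentionally simple and will not catch every format.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const PHONE_PATTERN = /\+?\d[\d\s().-]{7,}\d/g;

function redactPII(text) {
  return text
    .replace(EMAIL_PATTERN, '[EMAIL]')
    .replace(PHONE_PATTERN, '[PHONE]');
}

const input = 'My email is jane.doe@example.com and my phone is +1 (555) 010-9999.';
console.log(redactPII(input));
// prints "My email is [EMAIL] and my phone is [PHONE]."
```

Redacting before the database insert as well as before the API call keeps raw identifiers out of both the logs and the prompt.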
Cost Optimization
Token Management Strategies:
- Limit conversation history: Keep only the last 10 messages (saves ~70% tokens vs. full history)
- Use GPT-4 Turbo instead of GPT-4: about 67% cheaper on input, at $0.01/1K tokens vs. $0.03/1K
- Implement caching: Cache responses to frequent questions (reduces API calls by 30-40%)
- Set max_tokens wisely: 500 tokens = ~375 words, sufficient for most responses
Budget estimate: With these optimizations, expect $0.03-0.05 per conversation (avg 8 messages). 1,000 conversations/month = $30-50 in API costs.
Source: OpenAI Pricing Calculator + Intgr8AI client benchmark data (Oct 2024)
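The caching strategy above can be sketched as a small in-memory cache keyed by a normalized form of the question, with a time-to-live so stale answers expire. The `createResponseCache` helper, the one-hour default TTL, and the normalization rules are assumptions for illustration. Note that an in-memory cache is per process, so on serverless hosting each instance would have its own copy.

```javascript
// Small in-memory cache with a time-to-live, keyed by the normalized question.
// `now` is injectable so expiry can be tested without waiting.
function createResponseCache(ttlMs = 60 * 60 * 1000, now = () => Date.now()) {
  const entries = new Map();
  // Lowercase, strip punctuation, and collapse whitespace so near-identical
  // phrasings of the same question share one cache entry.
  const normalize = q =>
    q.toLowerCase().replace(/[^a-z0-9\s]/g, '').replace(/\s+/g, ' ').trim();

  return {
    get(question) {
      const key = normalize(question);
      const entry = entries.get(key);
      if (!entry) return undefined;
      if (now() - entry.storedAt > ttlMs) {
        entries.delete(key); // expired
        return undefined;
      }
      return entry.answer;
    },
    set(question, answer) {
      entries.set(normalize(question), { answer, storedAt: now() });
    }
  };
}

const cache = createResponseCache();
cache.set('What are your business hours?', 'We are open 9am-5pm, Monday to Friday.');
console.log(cache.get('what are your business hours'));
// prints "We are open 9am-5pm, Monday to Friday."
```

Caching is safest for standalone questions such as the first message of a session; a follow-up like "and on weekends?" depends on earlier turns and should go to the model.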
Monitoring & Improvement
According to Stanford's AI Index Report 2024, chatbots that implement active monitoring improve accuracy by 23% within the first 90 days. Track these metrics:
- Conversation completion rate: % of chats that end without escalation
- Average response time: Should be <2 seconds for good UX
- User satisfaction: Thumbs up/down after each response
- Failed intent detection: Questions the bot couldn't answer
- Token usage trends: Detect cost anomalies early
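Several of these metrics can be computed directly from logged conversations. The sketch below assumes a hypothetical record shape (`escalated`, `responseTimesMs`, `thumbsUp`, `thumbsDown`) that is not part of the Day 5 table; you would need to log those fields or derive them from your own schema.

```javascript
// Summarize a list of conversation records.
// Assumed record shape (hypothetical): { escalated, responseTimesMs, thumbsUp, thumbsDown }
function summarizeMetrics(conversations) {
  const total = conversations.length;
  const completed = conversations.filter(c => !c.escalated).length;
  const allTimes = conversations.flatMap(c => c.responseTimesMs);
  const up = conversations.reduce((sum, c) => sum + c.thumbsUp, 0);
  const down = conversations.reduce((sum, c) => sum + c.thumbsDown, 0);

  return {
    // Share of conversations that ended without escalation
    completionRate: total ? completed / total : 0,
    // Mean response time across every bot reply
    avgResponseTimeMs: allTimes.length
      ? allTimes.reduce((sum, t) => sum + t, 0) / allTimes.length
      : 0,
    // Share of ratings that were thumbs up (null if nobody rated)
    satisfaction: up + down ? up / (up + down) : null
  };
}

// Made-up sample data for illustration.
const sample = [
  { escalated: false, responseTimesMs: [1200, 1800], thumbsUp: 2, thumbsDown: 0 },
  { escalated: true,  responseTimesMs: [2500],       thumbsUp: 0, thumbsDown: 1 },
  { escalated: false, responseTimesMs: [1500],       thumbsUp: 1, thumbsDown: 0 },
  { escalated: false, responseTimesMs: [1000],       thumbsUp: 0, thumbsDown: 0 }
];
console.log(summarizeMetrics(sample));
// { completionRate: 0.75, avgResponseTimeMs: 1600, satisfaction: 0.75 }
```

Running a summary like this on a weekly schedule gives a baseline to compare against after each prompt or knowledge-base change.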
Resources & Further Reading
Official Documentation
- [1] OpenAI. (2024). "GPT-4 Turbo API Documentation." platform.openai.com/docs
- [2] OpenAI. (2024). "Production Best Practices." OpenAI Developer Forum.
- [3] Pinecone. (2024). "Vector Embeddings for Semantic Search." Pinecone Documentation.
Industry Research
- [4] Gartner, Inc. (2024). "Conversational AI Platforms Report." Gartner Research, March 2024.
- [5] Stanford HAI. (2024). "AI Index Report 2024." Stanford Institute for Human-Centered AI.
- [6] OWASP Foundation. (2023). "API Security Top 10." OWASP.org
Code Repository
Complete source code for this tutorial (Days 1-7):
github.com/intgr8ai/chatbot-tutorial
Includes: Full backend/frontend code, deployment configs, testing scripts, and FAQ troubleshooting guide.
Final Thoughts
Building an AI chatbot is no longer a months-long enterprise project. With modern tools like OpenAI's API, vector databases, and serverless hosting, you can have a production-ready bot in under a week on a budget of under $100/month.
The key is to start simple, measure everything, and iterate based on real user conversations. Don't aim for perfection on day one—aim for good enough to deploy, then improve weekly based on data.
Need help implementing this?
If you want Intgr8AI to build and deploy a custom chatbot tailored to your business (with your branding, knowledge base, and integrations), we can have you live in 5-7 days.
Written by
Talal Alkhaled
Founder & CEO, Intgr8AI
November 1, 2025