Portfolio

Watch my comprehensive portfolio showcase videos featuring my latest work in AI & Machine Learning, Full-Stack Development, and Modern Web Technologies

💡 Tip: Increase video quality settings for the best viewing experience

Agentic AI Project

MAG7 SEC Filings Analyzer — Agentic Financial Intelligence Platform

OPENAI / ANTHROPIC / OLLAMA • DETERMINISTIC ROUTER • FAST RAG (SINGLE-CALL) • SEC (10-K / 10-Q) • CITATIONS • PINECONE (SERVERLESS) • SENTENCE-TRANSFORMERS • HYBRID RETRIEVAL • RERANKING • QUERY REWRITE • SEMANTIC CACHING (MD5) • MULTI-PROVIDER LLM • FASTAPI • REACT 18

This is not just RAG. It is a controlled, production-shaped system for answering questions over real SEC 10-K / 10-Q filings with VERIFIABLE CITATIONS.

The workflow is intentionally ULTRA OPTIMIZED for latency + cost: a DETERMINISTIC ROUTER classifies intent (single-company, comparison, ingestion) with NO LLM CALL, then a FAST RAG AGENT compresses retrieval + analysis + reporting into ONE LLM call (about 3× fewer calls than traditional chains).
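
The routing idea above can be sketched in plain Python: a few string rules classify the query with zero model calls. This is a minimal illustration, not the project's actual router; the ticker set and keyword lists are assumptions for the example.

```python
import re

# Hypothetical sketch of a deterministic intent router: classify a query
# as "comparison", "ingestion", or "single_company" using string rules
# only, so routing itself costs no LLM call.
TICKERS = {"AAPL", "MSFT", "GOOGL", "AMZN", "META", "NVDA", "TSLA"}

def route(query: str) -> str:
    tokens = set(re.findall(r"[A-Z]{2,5}", query))
    mentioned = tokens & TICKERS
    lowered = query.lower()
    if any(w in lowered for w in ("ingest", "index", "load filing")):
        return "ingestion"
    if len(mentioned) >= 2 or " vs " in lowered or lowered.startswith("compare"):
        return "comparison"
    return "single_company"

print(route("Compare AAPL vs MSFT: biggest risks"))  # comparison
print(route("Ingest the latest 10-K for NVDA"))      # ingestion
print(route("What changed from last quarter?"))      # single_company
```

Because the rules are deterministic, routing decisions are also trivially testable and auditable, which is harder to guarantee when an LLM does the classification.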

Example questions:

  • "Compare AAPL vs MSFT: biggest risks and how they differ."
  • "What were the biggest YoY changes in operating margin, and why?"
  • "What changed from last quarter and why?"
  • "Extract and summarize all risk factors from the latest 10-K."
  • "Identify key growth drivers mentioned in management commentary."
  • "Analyze revenue trends and segment performance across quarters."

Retrieval runs on PINECONE (SERVERLESS) with sentence-transformer embeddings, with advanced options for HYBRID RETRIEVAL, SECTION BOOSTING, and RERANKING + QUERY REWRITING. An MD5-KEYED SEMANTIC CACHE returns repeated queries in ~20ms.

Ships with a MULTI-PROVIDER LLM abstraction (OpenAI / Anthropic / Ollama) with pooled + cached instances, plus performance engineering via async FastAPI, request deduplication, and concurrent comparison execution.
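
The pooled-instance idea can be sketched with a cached factory. `EchoProvider` and `get_provider` are illustrative names invented for this example; real adapters would wrap the OpenAI, Anthropic, and Ollama SDKs behind the same `complete` interface.

```python
from functools import lru_cache

# Hypothetical sketch of a multi-provider abstraction: each provider is a
# thin adapter behind one interface, and instances are cached so repeated
# requests reuse the same client instead of constructing a new one.
class EchoProvider:
    def __init__(self, name: str):
        self.name = name

    def complete(self, prompt: str) -> str:
        # A real adapter would call the provider's SDK here.
        return f"[{self.name}] {prompt}"

@lru_cache(maxsize=None)
def get_provider(name: str) -> EchoProvider:
    if name not in {"openai", "anthropic", "ollama"}:
        raise ValueError(f"unknown provider: {name}")
    return EchoProvider(name)

assert get_provider("ollama") is get_provider("ollama")  # pooled instance
print(get_provider("openai").complete("Summarize the 10-K"))
```

Swapping providers then becomes a one-string change, and the cache guarantees each provider client is constructed once per process.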

📊 Real SEC Filings

Ingests and indexes actual 10-K / 10-Q filings from all MAG7 companies — answers are grounded in real financial documents with verifiable citations.

⚡ 3× Fewer LLM Calls

Deterministic router classifies intent with zero LLM overhead, then a Fast RAG Agent compresses retrieval + analysis + reporting into a single call.

🔍 Hybrid Retrieval

Semantic + keyword search with section boosting and Cross-Encoder reranking — surfaces the most relevant filing passages even for complex multi-part questions.

🚀 ~20ms Cache Hits

MD5-keyed semantic cache returns identical or near-identical queries almost instantly — dramatically reducing cost and latency for repeated analysis.

🤖 Multi-Provider LLM

Pooled + cached LLM instances across OpenAI, Anthropic, and Ollama — swap providers freely with concurrent comparison execution for benchmarking.

📈 Company Comparison

Ask cross-company questions like "Compare AAPL vs MSFT risks" — the agent routes to a parallel comparison pipeline and merges results into a unified analysis.

πŸ—οΈ Infrastructure

ContainerizationDocker + Docker Compose
App HostingAWS EC2
Edge / HTTPSAWS CloudFront
Reverse ProxyNginx
IaCPulumi (TypeScript)


Agentic AI - RAG App Project

Policy Guardian — Enterprise Policy RAG + Citations + Streaming

RAG • CITATIONS • STREAMING • MULTIMODAL • QUERY EXPANSION • HYBRID SEARCH • RERANKING • CROSS-ENCODER • AUTO REWRITE • LANGGRAPH • LANGCHAIN • PINECONE • REDIS • FASTAPI • MULTI-PROVIDER LLM • SSE • POSTGRESQL AUDIT LOGS • DOCKER COMPOSE

📂 Upload any contract, policy document, or regulatory filing and the agent answers your questions, grounded directly in the content of your documents — no hallucinations, no guesswork. Every response is backed by exact citations with page numbers so you can verify every claim instantly.

πŸ–ΌοΈ You can also upload image files β€” photos, scans, or screenshots β€” and run them against your loaded policy documents. Ask real-world comparison questions like: "Is this damaged suitcase eligible for a refund under our baggage reimbursement policy?" The agent visually analyzes the image and cross-references it against the relevant policy clauses to give you a grounded, citation-backed answer.

🧠 Powered by a custom fine-tuned model — purpose-built and specialized on contract, policy, and regulatory document language. Unlike generic LLMs, this model has been trained specifically for compliance-style Q&A, resulting in significantly higher accuracy, more precise clause interpretation, and faster, more consistent responses compared to out-of-the-box models on the same domain.

A production-style policy compliance assistant that answers questions from internal policy documents with REAL CITATIONS, REAL-TIME STREAMING, and optional MULTIMODAL (text + images) support using the CLIP library.

Example questions the app can answer:

  • "Is this damaged baggage eligible for refund under our travel reimbursement policy?"
  • "If an employee shares customer data externally by mistake, what steps must be taken within the first 24 hours?"
  • "Compare our Data Retention Policy vs Vendor Security Policy β€” where do responsibilities overlap?"

The app returns:

  • βœ… Confidence score
  • πŸ“„ Exact citations with page numbers
  • πŸ”— Reference documents
  • 🎯 Full traceability and verification

The RAG pipeline is orchestrated with LANGGRAPH + LANGCHAIN and follows a clear flow (query embedding → vector search → top-k retrieval → context assembly → LLM generation → citation extraction). Retrieval runs on PINECONE (SERVERLESS) with semantic search + metadata filtering, enhanced with QUERY EXPANSION, HYBRID SEARCH (semantic + keyword), and relevance boosting via RERANKING with a CROSS-ENCODER (plus optional AUTO REWRITE to refine user queries). A REDIS caching layer reduces repeated retrieval/LLM calls for faster, more consistent answers. Supports a unified MULTI-PROVIDER interface (Ollama, OpenAI, Anthropic) and streams token-by-token with SSE. Every answer includes citations and is recorded to POSTGRESQL AUDIT LOGS for traceability; shipped with a production-friendly DOCKER COMPOSE setup.
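
The hybrid search + rerank step can be sketched in miniature. This is an illustration only: the semantic scores are stubbed (a real system would get them from Pinecone), and the "reranker" reuses a keyword score where a Cross-Encoder model would re-score each (query, passage) pair.

```python
# Hypothetical sketch of hybrid retrieval: blend a keyword-overlap score
# with a (stubbed) semantic similarity score, take the top-k, then let a
# reranker reorder the survivors.
def keyword_score(query: str, passage: str) -> float:
    q, p = set(query.lower().split()), set(passage.lower().split())
    return len(q & p) / max(len(q), 1)

def hybrid_search(query, passages, semantic_scores, alpha=0.5, k=2):
    scored = [
        (alpha * semantic_scores[i] + (1 - alpha) * keyword_score(query, p), p)
        for i, p in enumerate(passages)
    ]
    top_k = sorted(scored, reverse=True)[:k]
    # Stand-in reranker: a Cross-Encoder would re-score each pair here;
    # we reuse the keyword score purely for illustration.
    return sorted(top_k, key=lambda sp: keyword_score(query, sp[1]), reverse=True)

passages = [
    "Data retention periods for customer records",
    "Vendor security responsibilities and audits",
    "Travel reimbursement for damaged baggage",
]
semantic = [0.2, 0.1, 0.9]
for score, p in hybrid_search("damaged baggage reimbursement policy", passages, semantic):
    print(p)
```

The `alpha` weight between semantic and keyword signals is the usual tuning knob in hybrid setups.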

📄 Cited Answers

Every response includes exact citations with page numbers and reference documents — no hallucinations, full traceability back to the source policy.

⚡ Real-Time Streaming

Token-by-token SSE streaming delivers answers instantly as the LLM generates them — no waiting for the full response to complete.
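
The token-by-token delivery above boils down to SSE framing. A minimal sketch, with the LLM call omitted: each token is wrapped as a `data:` event that a browser `EventSource` can consume; in the real app these frames would come from a FastAPI `StreamingResponse`.

```python
# Hypothetical sketch of token-by-token SSE framing: wrap each generated
# token as a "data:" event, with a terminal [DONE] marker so the client
# knows the stream is complete.
def sse_frames(tokens):
    for token in tokens:
        yield f"data: {token}\n\n"
    yield "data: [DONE]\n\n"

frames = list(sse_frames(["Policy", "section", "4.2", "applies."]))
print("".join(frames))
```

The blank line after each `data:` field is what delimits events in the `text/event-stream` format.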

πŸ” Hybrid Retrieval

Combines semantic vector search (Pinecone) with keyword matching, reranked by a Cross-Encoder for maximum relevance β€” plus optional auto query rewriting.

πŸ€– Multi-Provider LLM

Unified abstraction supports OpenAI, Anthropic, and Ollama (local) β€” swap providers without changing any application logic.

πŸ—„οΈ PostgreSQL Audit Logs

Every query and answer is recorded to a persistent audit log β€” enabling compliance reporting, usage tracking, and full answer verification.

πŸ–ΌοΈ Multimodal Support

Optional image understanding via CLIP library β€” ask questions about policy diagrams, charts, and visual content embedded in documents.

πŸ—οΈ Infrastructure

ContainerizationDocker + Docker Compose
App HostingAWS EC2
Edge / HTTPSAWS CloudFront
Reverse ProxyNginx
IaCPulumi (TypeScript)


Agentic AI Project

Agentic AI Travel Planner

MULTI-AGENT • LANGGRAPH • SUPERVISOR • FASTAPI • AWS LAMBDA • AMADEUS API • POSTGRESQL (NEON) • PRISMA • TANSTACK QUERY • LANGSMITH

An agentic AI travel-planning platform that generates fully personalized itineraries using a MULTI-AGENT system orchestrated by LANGGRAPH. A SUPERVISOR agent analyzes user intent and delegates work across SPECIALIZED AGENTS (RESEARCH, LOGISTICS, COMPLIANCE, EXPERIENCE) to produce structured, consistent plans.
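
The supervisor pattern can be sketched without the actual LangGraph dependency. This is a toy illustration of the delegation idea only; the agent names follow the list above, but their bodies and the selection rules are invented for the example.

```python
# Hypothetical sketch of the supervisor pattern: the supervisor inspects
# the request, decides which specialized agents run, then merges their
# outputs into one structured plan. Stub lambdas stand in for real agents.
AGENTS = {
    "research": lambda req: f"top sights in {req['city']}",
    "logistics": lambda req: f"{req['days']}-day schedule with travel times",
    "compliance": lambda req: "visa / entry requirements check",
    "experience": lambda req: f"picks matching {', '.join(req['prefs'])}",
}

def supervisor(request: dict) -> dict:
    # Every trip needs research + logistics; the other agents run only
    # when the request calls for them.
    selected = ["research", "logistics"]
    if request.get("international"):
        selected.append("compliance")
    if request.get("prefs"):
        selected.append("experience")
    return {name: AGENTS[name](request) for name in selected}

plan = supervisor({"city": "Lisbon", "days": 3,
                   "prefs": ["food", "culture"], "international": True})
for agent, output in plan.items():
    print(f"{agent}: {output}")
```

In the real system LangGraph supplies the state graph, retries, and checkpointing around exactly this kind of routing logic.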

Users can:

  • Choose any city or location
  • Set trip duration (e.g., 3 days, 7 days, etc.)
  • Define preferences (budget, food, culture, adventure, etc.)
  • Instantly receive a structured daily plan

The AI generates:

  • πŸ“ Exact place names
  • 🏠 Real addresses
  • πŸ—Ί Logical day-by-day schedules
  • 🍽 Recommended restaurants
  • 🎯 Attractions and experiences
  • ⏱ Organized timeline throughout the day

Plans are enriched using real travel data via the Amadeus API to ensure locations are accurate and realistic.

πŸ—ΊοΈ Google Maps Route

Interactive day-by-day route map with numbered, color-coded pins for each stop β€” attractions, restaurants, and landmarks plotted in logical travel order.

πŸ’° Estimated Costs

Full trip cost breakdown β€” accommodation, food, activities & transport β€” with a total estimate so travelers can plan their budget at a glance.

πŸ’‘ Local Tips

AI-curated insider tips for each city β€” transport hacks, free entry days, local markets, and hidden gems recommended by the agent.

🏨 Recommended Hotels

Curated hotel picks with star ratings, price ranges, addresses, and descriptions β€” spanning budget to luxury, tailored to the destination.

🍽️ Dining / Restaurant Recommendations

Breakfast, lunch & dinner recommendations with cuisine type, price tier, and exact addresses (and shown on the google Map) β€” so every meal is part of the experience.

The orchestration runs in a separate FASTAPI service deployed on AWS LAMBDA for stateless, pay-per-use scaling, enriched with real travel data via AMADEUS API. Trips persist to POSTGRESQL (NEON) with PRISMA, while the Next.js UI supports planning, saving, and reviewing across devices. Reliability comes from retries + structured flows and LANGSMITH debugging.

Tech: Next.js • React 18 • FastAPI • AWS Lambda • LangGraph • LangChain • OpenAI • PostgreSQL (Neon) • Prisma • Tailwind CSS • Clerk • TanStack Query • Amadeus API

πŸ—οΈ Infrastructure

ContainerizationDocker + Docker Compose
App HostingAWS EC2
Edge / HTTPSAWS CloudFront
Reverse ProxyNginx
IaCPulumi (TypeScript)


Agentic AI E-Commerce (MCP + RAG)

Tweeky-Queeky Shop — Agentic AI E-Commerce (Agent Gateway + MCP + RAG)

AGENT GATEWAY • PINECONE + EMBEDDINGS • ONE /CHAT SURFACE • INTENT ROUTING • MCP TOOL SERVER • TOOL USE (PRODUCTS/ORDERS) • RAG PIPELINE (TF-IDF) • FASTAPI • REACT 18 • REDUX TOOLKIT • RTK QUERY • DOCKER • BEANIE + MONGODB • JWT • PAYPAL • STRIPE • DOCKERIZED • HYBRID SEARCH

An enterprise-scale project intentionally built to showcase AGENTIC AI patterns inside a real e-commerce app. The React frontend talks to an AGENT GATEWAY through a single chat surface. The agent performs INTENT ROUTING and delegates to either an internal MCP TOOL SERVER (explicit tools for product search/lookup + order tracking) or a RAG SERVICE (local docs retrieval for policy/support Q&A).

Instead of traditional filters, users can simply ask:

  • "Find me products between $200–$300 with the best ratings."
  • "What microphones are best for podcasting?"
  • "Find chairs shorter than 40 inches and under 5kg."
  • "Track my latest order."
  • "What's your return policy for electronics?"

This is a production-style full-stack e-commerce application built using React, FASTAPI, MONGODB, and DOCKER. It provides a complete end-to-end shopping experience—from user authentication and product browsing to secure checkout and order tracking—all backed by scalable async REST APIs and ROLE-BASED admin management.

The system includes a fully functional admin dashboard for managing users, products, orders, and inventory, along with secure JWT-based authentication, image upload support, and PAYPAL + STRIPE payment integration. 🛡️ The frontend communicates with a containerized backend API and MongoDB database, forming a deployable, cloud-ready full-stack solution.

The system understands intent and automatically chooses the correct backend action. 🤖 The key production-shaped boundary is that the UI never calls TOOLS or RETRIEVAL directly—everything stays behind the GATEWAY—making the tool-use layer AUDITABLE and TESTABLE, while enabling optional HYBRID SEARCH (BM25 + embeddings) via PINECONE/OPENAI.

🔧 What You Can Do with This App

  • You can create an account, securely log in, and manage your profile.
  • You can browse products, add them to your cart, and complete checkout.
  • You can make secure payments using PayPal and Stripe.
  • You can view your past orders and track delivery status.
  • You can use the admin dashboard to create products, upload images, manage inventory, and process customer orders.
Fine-Tuning Project

Fine-Tuning with Ollama + QLoRA — Policy Compliance LLM

QLORA • OLLAMA • LLAMA 3.1 8B • 4-BIT NF4 • PEFT • HUGGINGFACE TRANSFORMERS • BITSANDBYTES • GGUF • EVALUATION HARNESS • PYTHON

A reproducible, end-to-end fine-tuning pipeline for building a domain-specialized POLICY COMPLIANCE model using QLORA (4-bit quantization), then deploying it for local inference with OLLAMA. Built for repeatable experiments and real deployment, from dataset → fine-tune → packaging.

Implements parameter-efficient fine-tuning of LLAMA 3.1 8B with 4-BIT NF4 quantization—enabling training on a SINGLE GPU. POLICY ANSWER ACCURACY IMPROVED BY ABOUT 70% versus the base model, with strong convergence (training loss reduced by ~79% over 3 EPOCHS). Includes an automated EVALUATION HARNESS, a clean END-TO-END PIPELINE (data generation → fine-tuning → adapter merge → evaluation → packaging), and config-driven experiments with clean artifact separation. Ships a merged model packaged for OLLAMA (adapter merge → GGUF) to support fast, OFFLINE INFERENCE.
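
The evaluation-harness idea can be sketched in plain Python: score each model's answers against expected key phrases and report accuracy, so base and fine-tuned models are compared on identical questions. The questions, phrases, and `evaluate` helper here are all invented for illustration.

```python
# Hypothetical sketch of an evaluation harness: an answer counts as
# correct when it contains the expected key phrase, and accuracy is the
# fraction of questions answered correctly.
def evaluate(answers: dict[str, str], expected: dict[str, str]) -> float:
    hits = sum(
        1 for q, gold in expected.items()
        if gold.lower() in answers.get(q, "").lower()
    )
    return hits / len(expected)

expected = {
    "Data retention period?": "seven years",
    "Who approves exceptions?": "compliance officer",
}
fine_tuned = {
    "Data retention period?": "Records are kept for seven years.",
    "Who approves exceptions?": "The compliance officer approves them.",
}
base = {"Data retention period?": "It depends on the policy."}

print(f"base: {evaluate(base, expected):.0%}, fine-tuned: {evaluate(fine_tuned, expected):.0%}")
```

Phrase matching is a crude but automatable proxy; a fuller harness would add category-level breakdowns and regression checks against earlier runs.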

Tech: Python • HuggingFace Transformers • PEFT (LoRA / QLoRA) • BitsAndBytes • Accelerate • Ollama • GGUF

Fine-Tuning Project

GPT-4 Fine-Tuning — Enterprise Policy Compliance AI

FINE-TUNING • GPT-4O-MINI • OPENAI API • PYTHON • JSONL • BENCHMARKING • TOKEN OPTIMIZATION • LATENCY ANALYSIS • MLOPS • AUTOMATION

A production-ready FINE-TUNING and evaluation pipeline for GPT-4O-MINI, designed to improve enterprise policy compliance accuracy while reducing TOKEN USAGE and LATENCY. The domain-specialized model trained on 164 curated JSONL examples boosted accuracy from 35% → 77.5% (~2.2× better) on compliance-focused evaluation sets. Includes an AUTOMATED EVALUATION SUITE with CATEGORY-LEVEL SCORING, REGRESSION CHECKS, and performance reports—token usage dropped by ~20%, and response latency improved from 1.61s → 1.06s (~34% faster). The pipeline follows an MLOPS-STYLE WORKFLOW with training data versioning, job automation, and model validation in OPENAI PLAYGROUND.
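
For context, chat-model fine-tuning data is laid out as JSONL: one JSON object per line, each holding a full messages array, which is the format the OpenAI fine-tuning API expects. The example content below is invented; only the structure matters.

```python
import json

# Hypothetical sketch of one JSONL training example for chat-format
# fine-tuning: system prompt, user question, and the target assistant
# answer, serialized as a single line.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a policy compliance assistant."},
            {"role": "user", "content": "How long do we retain customer records?"},
            {"role": "assistant", "content": "Customer records are retained for seven years per Policy 4.2."},
        ]
    },
]

jsonl = "\n".join(json.dumps(ex) for ex in examples)

# Round-trip check: every line must parse back to a messages array.
for line in jsonl.splitlines():
    assert "messages" in json.loads(line)

print(jsonl.splitlines()[0][:60])
```

A real training file would hold many such lines (164 in this project), validated before the fine-tuning job is submitted.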


Serverless Project

Serverless Order Management System — Event-Driven Cloud Platform

SERVERLESS • AWS LAMBDA • API GATEWAY • SQS • SNS • DYNAMODB • EVENTBRIDGE • STEP FUNCTIONS • AWS CDK • TYPESCRIPT • REACT • CHART.JS

A fully SERVERLESS, EVENT-DRIVEN Order Management System on AWS, built for high scalability, operational visibility, and real-time analytics. Order workflows run through AWS LAMBDA behind API GATEWAY, with decoupled processing via SQS, SNS, and DYNAMODB.

Complex lifecycle flows are coordinated with STEP FUNCTIONS (retries, failure paths, DLQs) and optional EVENTBRIDGE pipelines. The whole environment is provisioned with AWS CDK (TypeScript) + CloudFormation, and shipped with a modern REACT + TYPESCRIPT dashboard featuring analytics charts (Chart.js) and CSV import/export workflows. Security and observability lean on least-privilege IAM + CloudWatch monitoring.

Tech: AWS Lambda • API Gateway • SQS • SNS • DynamoDB • S3 • Step Functions • EventBridge • AWS CDK • React 19 • TypeScript • Chart.js


Full-Stack Project

Cocktail Maker App - Serverless Lambda + React Query + PostgreSQL

Mixmaster is a full-stack cocktail discovery app built with React ⚛️, React Query ⚡, Node.js 🌐, and PostgreSQL 🗄️. It blends real-time data from TheCocktailDB 🍸 with custom user-created recipes stored locally, with image uploads to an S3 bucket 🪣, enhanced by intelligent caching for high-speed performance. A serverless AWS Lambda microservice ☁️ handles newsletter automation with Amazon SES ✉️, delivering a smooth, modern UI and fast, reliable CRUD operations.


E-Commerce Website

Tweeky Queeky Shop (MERN)

A fully Dockerized MERN e-commerce platform built with React ⚛️ + Redux Toolkit 🧩, Node/Express 🚀, and MongoDB 🗄️. It delivers a real-time product experience with secure JWT authentication 🔐, PayPal/credit-card checkout 💳, image uploads 📸, and full admin management workflows. Designed with a scalable, production-ready architecture ideal for real-world store operations.