Priyanshu Rawat  

Building production RAG systems with LangGraph

About

Data Scientist and ML Engineer specializing in production-grade RAG systems, agentic AI, and MLOps infrastructure. Currently at the University of Rochester's Center for Integrated Research Computing, building multimodal RAG systems and ML-powered intelligence platforms that serve 1000+ researchers.

Expertise in LLM optimization (LoRA/QLoRA fine-tuning, quantization, vLLM), vector databases (pgvector, ChromaDB, Pinecone), and production ML deployment (Docker, Kubernetes, CI/CD). Strong foundation in PyTorch, big data technologies (Spark, Kafka, Airflow), and cloud platforms (AWS).

Recent projects include a cybersecurity threat intelligence system with fine-tuned LLMs achieving 3x throughput improvements, and a Wegmans capstone project predicting gluten sensitivity across 5.6M transactions with optimized business ROI.

Let's connect and collaborate on cutting-edge AI solutions!

Experience

Center for Integrated Research Computing, UoR

Current Employer
  • Built an agentic RAG system with hybrid search and context reranking that integrates multi-source knowledge (SQL databases, documentation, cluster metrics) through dynamic tool-calling and LangGraph orchestration. Deployed via FastAPI and Docker with LangFuse monitoring, it serves 1000+ users for HPC job scheduling, resource optimization, and troubleshooting.
  • Deployed a self-hosted LLM via vLLM and FastAPI for automated support ticket summaries and resolution suggestions, using RAG retrieval from a vector database of 200K+ tickets (anonymized for compliance); reduced support response time by 40%.
  • Built a multi-task BERT classifier, deployed via FastAPI with JWT authentication, for ticket categorization across 4 teams and 4 priority levels (85% precision, 83% recall on 200K+ tickets); it provides the classification layer for the LLM summarization and resolution workflow.
  • Designed an evaluation framework with 1000 curated query-resolution pairs, measuring retrieval quality (precision@k, MRR) and generation quality (RAGAS faithfulness, answer relevancy) with a self-hosted LLM-as-judge for automated, reproducible scoring.
  • Configured GitLab CI/CD pipelines with Prometheus/Grafana monitoring and comprehensive automated testing, enabling reliable continuous deployment of ML and RAG systems with 99.5% uptime and automated rollback capabilities.
  • LangGraph
  • FastAPI
  • Docker
  • RAG Systems
  • Vector Databases
  • SQL
  • vLLM
  • BERT
  • JWT
  • LangFuse
  • GitLab CI
  • Prometheus
  • Grafana
  • HPC Systems
  • Agentic AI
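
The retrieval metrics named in the evaluation framework above (precision@k and MRR) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the framework used at CIRC: the function names, the ticket IDs, and the toy `runs` data are all hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved IDs that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant ID, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (retrieved, relevant) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ret, rel) for ret, rel in runs) / len(runs)


# Hypothetical toy data: each entry is (ranked retrieval result, relevant ID set).
runs = [
    (["t3", "t1", "t7"], {"t1"}),        # first relevant hit at rank 2
    (["t9", "t4", "t2"], {"t9", "t2"}),  # first relevant hit at rank 1
]

if __name__ == "__main__":
    for retrieved, relevant in runs:
        print(precision_at_k(retrieved, relevant, k=3))
    print(mean_reciprocal_rank(runs))
```

Both metrics need only ranked IDs and a labeled relevant set per query, which is why a fixed set of curated query-resolution pairs makes the scoring reproducible.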

FLX AI

K-Labs: Continual Learning Lab, UoR

Greene Career Center, UoR

Insignia Consultancy

Education

University of Rochester

Rochester, New York

Key Coursework
  • Machine Learning
  • Computational Statistics
  • Data Science at Scale
  • End-to-End Deep Learning

Graphic Era Hill University

Dehradun, India

Key Coursework
  • Machine Learning
  • Data Structures and Algorithms
  • Deep Learning
  • Object-Oriented Programming

Projects (6)

SolarTrack - AI-Powered Solar Analytics Platform

Period
12.2024 - Present

Full-stack SaaS platform combining AI vision models for automated handwritten log digitization with natural language querying. Solves the manual data entry bottleneck in renewable energy monitoring by enabling users to photograph logbooks and extract structured readings through computer vision.

  • Next.js 15
  • React 19
  • Three.js
  • FastAPI
  • Python
  • Supabase

CyberIntel Summarizer: Real-Time Threat Intelligence System

Period
09.2024 - Present

Real-time cybersecurity threat intelligence system analyzing 100+ daily CVE updates from NVD, CISA, and MITRE ATT&CK feeds. Features a LoRA fine-tuned LLM with 4-bit quantization achieving a 3x throughput improvement, plus an interactive Streamlit dashboard for threat analytics.

  • LoRA
  • vLLM
  • 4-bit Quantization
  • FastAPI
  • PostgreSQL
  • Streamlit

Gluten Sensitivity Prediction System (Wegmans Capstone)

Period
08.2024 - 12.2024

XGBoost classification system analyzing 5.6M transaction records for Wegmans Food Markets. Implemented threshold optimization that improved precision by 49% while demonstrating reduced coupon waste and higher marketing ROI.

  • XGBoost
  • Feature Engineering
  • Class Imbalance
  • Threshold Optimization
  • Cost-Sensitive Learning
  • F1-optimal
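
As a rough illustration of F1-optimal threshold selection on an imbalanced classifier, the sketch below sweeps candidate cutoffs over predicted probabilities and keeps the one with the highest F1. It is not the capstone's code: the function names and the toy labels and scores are hypothetical, and a real pipeline would take the probabilities from the trained model and tune on a held-out validation split.

```python
from typing import List, Sequence, Tuple


def f1_at_threshold(y_true: Sequence[int], scores: Sequence[float], threshold: float) -> float:
    """F1 score when every score >= threshold is predicted positive."""
    tp = fp = fn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 1:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_f1_threshold(y_true: Sequence[int], scores: Sequence[float]) -> Tuple[float, float]:
    """Try each distinct score as a cutoff; return (threshold, f1) with the best F1."""
    best_t, best_f1 = 0.5, -1.0
    for t in sorted(set(scores)):
        f1 = f1_at_threshold(y_true, scores, t)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Hypothetical toy data: positives are the minority class.
y_true: List[int] = [0, 0, 0, 0, 1, 0, 1, 1]
scores: List[float] = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.70, 0.90]

if __name__ == "__main__":
    print("F1 at default 0.5 cutoff:", f1_at_threshold(y_true, scores, 0.5))
    print("Best (threshold, F1):", best_f1_threshold(y_true, scores))
```

On this toy data the default 0.5 cutoff misses a positive scored at 0.40, so the sweep settles on a lower threshold. A cost-sensitive variant would replace F1 with a business cost function, such as coupon cost per false positive.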

HPC Documentation Assistant

Period
02.2025 - Present

AI-powered documentation assistant for High Performance Computing systems using RAG. Helps 1000+ researchers quickly find answers across massive technical documentation without manual searching.

  • RAG Systems
  • Vector Databases
  • FastAPI
  • Docker
  • HPC Systems
  • Agentic AI

Real-Time Tweet Sentiment Analysis Pipeline

Period
01.2025 - 03.2025

Real-time sentiment analysis system processing 50K+ tweets per hour to detect emerging public opinion trends. Uses transformer-based NLP achieving 92% accuracy for monitoring sentiment shifts across millions of mentions simultaneously.

  • Apache Spark Streaming
  • Delta Lake
  • Hugging Face Transformers
  • MLflow
  • Databricks
  • Grafana

Steam Insights (Gaming Market Analysis & Forecasting)

Period
08.2024 - 12.2024

Predictive analytics platform forecasting gaming market trends and demand with 85% accuracy. Analyzes 8M+ data points across 140K+ games to guide development priorities, pricing strategies, and marketing decisions.

  • Apache Airflow
  • Databricks Spark
  • Kafka
  • XGBoost
  • Random Forest
  • ARIMA

Bookmarks

Attention Is All You Need

Author
Vaswani et al.

LangChain Documentation

Author
LangChain

Retrieval-Augmented Generation for Large Language Models

Author
Lewis et al.

The Illustrated Transformer

Author
Jay Alammar

LLM Optimization with LoRA and QLoRA

Author
Hu et al.

vLLM: Easy, Fast, and Cheap LLM Serving

Author
UC Berkeley