Priyanshu Rawat
Building production RAG systems with LangGraph
About
Data Scientist and ML Engineer specializing in production-grade RAG systems, agentic AI, and MLOps infrastructure. Currently at the University of Rochester's Center for Integrated Research Computing, building multimodal RAG systems and ML-powered intelligence platforms serving 1000+ researchers.
Expertise in LLM optimization (LoRA/QLoRA fine-tuning, quantization, vLLM), vector databases (pgvector, ChromaDB, Pinecone), and production ML deployment (Docker, Kubernetes, CI/CD). Strong foundation in PyTorch, big data technologies (Spark, Kafka, Airflow), and cloud platforms (AWS).
Recent projects include a cybersecurity threat intelligence system whose fine-tuned LLM achieves a 3x throughput improvement, and a Wegmans capstone project predicting gluten sensitivity from 5.6M transactions, with decision thresholds tuned to reduce coupon waste and raise marketing ROI.
Let's connect and collaborate on cutting-edge AI solutions!
Experience
Center for Integrated Research Computing, UoR
Current Employer
- Built an agentic RAG system with hybrid search and context reranking, integrating multi-source knowledge (SQL databases, documentation, cluster metrics) through dynamic tool-calling and LangGraph orchestration; deployed via FastAPI and Docker with LangFuse monitoring, serving 1000+ users for HPC job scheduling, resource optimization, and troubleshooting
- Deployed a self-hosted LLM via vLLM and FastAPI for automated support ticket summaries and resolution suggestions, retrieving context from a vector database of 200K+ tickets anonymized for compliance and reducing support response time by 40%
- Built multi-task BERT classifier deployed via FastAPI with JWT authentication for ticket categorization across 4 teams and 4 priority levels (85% precision, 83% recall on 200K+ tickets), providing classification layer for LLM summarization and resolution workflow
- Designed evaluation framework with 1000 curated query-resolution pairs, measuring retrieval quality (precision@k, MRR) and generation quality (RAGAS faithfulness, answer relevancy) using self-hosted LLM-as-judge for automated, reproducible scoring
- Configured GitLab CI/CD pipelines with Prometheus/Grafana monitoring and comprehensive automated testing, enabling reliable continuous deployment of ML and RAG systems with 99.5% uptime and automated rollback capabilities
- LangGraph
- FastAPI
- Docker
- RAG Systems
- Vector Databases
- SQL
- vLLM
- BERT
- JWT
- LangFuse
- GitLab CI
- Prometheus
- Grafana
- HPC Systems
- Agentic AI
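As an illustration of the retrieval metrics named in the evaluation bullet above (precision@k and MRR), here is a minimal Python sketch. The function names and the sample document IDs are hypothetical and are not taken from the actual evaluation framework; it only shows the standard definitions of the two metrics.

```python
from typing import List, Sequence, Set, Tuple


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved IDs that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (retrieved, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ret, rel) for ret, rel in runs) / len(runs)


if __name__ == "__main__":
    # Hypothetical ticket IDs, for illustration only.
    runs = [
        (["t1", "t7", "t3"], {"t1", "t3"}),
        (["t9", "t2", "t4"], {"t2"}),
    ]
    print(precision_at_k(runs[0][0], runs[0][1], k=2))
    print(mean_reciprocal_rank(runs))
```

Generation-quality scores such as RAGAS faithfulness would be computed separately by the judge model and are not covered by this sketch.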
FLX AI
K-Labs: Continual Learning Lab, UoR
Greene Career Center, UoR
Insignia Consultancy
Education
University of Rochester
Rochester, New York
Key Coursework
- Machine Learning
- Computational Statistics
- Data Science at Scale
- End-to-End Deep Learning
Graphic Era Hill University
Dehradun, India
Key Coursework
- Machine Learning
- Data Structures and Algorithms
- Deep Learning
- Object Oriented Programming
Projects (6)

SolarTrack - AI-Powered Solar Analytics Platform
- Period: 12.2024 to Present
Full-stack SaaS platform combining AI vision models for automated handwritten log digitization with natural language querying. Solves the manual data entry bottleneck in renewable energy monitoring by enabling users to photograph logbooks and extract structured readings through computer vision.
- Next.js 15
- React 19
- Three.js
- FastAPI
- Python
- Supabase

CyberIntel Summarizer: Real-Time Threat Intelligence System
- Period: 09.2024 to Present
Real-time cybersecurity threat intelligence system analyzing 100+ daily CVE updates from NVD, CISA, and MITRE ATT&CK feeds. Features a LoRA-fine-tuned LLM with 4-bit quantization that achieves a 3x throughput improvement, and an interactive Streamlit dashboard for threat analytics.
- LoRA
- vLLM
- 4-bit Quantization
- FastAPI
- PostgreSQL
- Streamlit

Gluten Sensitivity Prediction System (Wegmans Capstone)
- Period: 08.2024 to 12.2024
XGBoost classification system analyzing 5.6M transaction records for Wegmans Food Market. Implemented threshold optimization improving precision by 49% while demonstrating reduced coupon waste and higher marketing ROI.
- XGBoost
- Feature Engineering
- Class Imbalance
- Threshold Optimization
- Cost-Sensitive Learning
- F1-optimal
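The threshold optimization mentioned for this project can be sketched as follows. This is a minimal, assumption-laden illustration of choosing an F1-optimal decision threshold from classifier scores; the function name and the toy labels and scores are hypothetical, and the capstone's actual procedure (including any cost-sensitive weighting) is not shown in the source.

```python
from typing import Sequence, Tuple


def f1_optimal_threshold(y_true: Sequence[int], scores: Sequence[float]) -> Tuple[float, float]:
    """Scan each distinct score as a candidate threshold and return the
    (threshold, F1) pair with the highest F1. A sample is predicted
    positive when its score is >= the threshold."""
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")
    best_threshold, best_f1 = 0.5, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        fn = sum(1 for y, s in zip(y_true, scores) if s < t and y == 1)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Strict '>' keeps the lowest threshold among ties.
        if f1 > best_f1:
            best_threshold, best_f1 = t, f1
    return best_threshold, best_f1


if __name__ == "__main__":
    # Toy data, for illustration only.
    labels = [0, 0, 1, 0, 1, 1]
    probs = [0.10, 0.40, 0.35, 0.80, 0.65, 0.90]
    print(f1_optimal_threshold(labels, probs))
```

On imbalanced data, a threshold chosen this way often differs from the default 0.5, which is why it is usually selected on a held-out validation split rather than on the training data.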

HPC Documentation Assistant
- Period: 02.2025 to Present
AI-powered documentation assistant for High Performance Computing systems using RAG. Helps 1000+ researchers quickly find answers across large volumes of technical documentation without searching it manually.
- RAG Systems
- Vector Databases
- FastAPI
- Docker
- HPC Systems
- Agentic AI

Real-Time Tweet Sentiment Analysis Pipeline
- Period: 01.2025 to 03.2025
Real-time sentiment analysis system processing 50K+ tweets per hour to detect emerging public opinion trends. Uses a transformer-based NLP model that achieves 92% accuracy to monitor sentiment shifts across millions of mentions.
- Apache Spark Streaming
- Delta Lake
- Hugging Face Transformers
- MLflow
- Databricks
- Grafana

Steam Insights (Gaming Market Analysis & Forecasting)
- Period: 08.2024 to 12.2024
Predictive analytics platform forecasting gaming market trends and demand with 85% accuracy. Analyzes 8M+ data points across 140K+ games to guide development priorities, pricing strategies, and marketing decisions.
- Apache Airflow
- Databricks Spark
- Kafka
- XGBoost
- Random Forest
- ARIMA
Bookmarks
Attention Is All You Need
- Author: Vaswani et al.
LangChain Documentation
- Author: LangChain
Retrieval-Augmented Generation for Large Language Models
- Author: Lewis et al.
The Illustrated Transformer
- Author: Jay Alammar
LLM Optimization with LoRA and QLoRA
- Author: Hu et al.
vLLM: Easy, Fast, and Cheap LLM Serving
- Author: UC Berkeley