Priyanshu Rawat

Data Scientist @CIRC, UofR

New York

prwt.dev

he/him

LinkedIn: prwt

GitHub: PRawat00

About

Data Scientist and ML Engineer specializing in production-grade RAG systems, agentic AI, and MLOps infrastructure. Currently at University of Rochester's Center for Integrated Research Computing, building multimodal RAG systems and ML-powered intelligence platforms serving 1000+ researchers.

Expertise in LLM optimization (LoRA/QLoRA fine-tuning, quantization, vLLM), vector databases (pgvector, ChromaDB, Pinecone), and production ML deployment (Docker, Kubernetes, CI/CD). Strong foundation in PyTorch, big data technologies (Spark, Kafka, Airflow), and cloud platforms (AWS).

Recent projects include a cybersecurity threat intelligence system with fine-tuned LLMs achieving a 3x throughput improvement, and a Wegmans capstone project that predicts gluten sensitivity across 5.6M transactions to reduce coupon waste and raise marketing ROI.

Let's connect and collaborate on cutting-edge AI solutions!

Stack

1,869 contributions in 2025 on GitHub.

Experience

Center for Integrated Research Computing, UoR

Built an agentic RAG system with hybrid search and context reranking that integrates multiple knowledge sources (SQL databases, documentation, cluster metrics) through dynamic tool calling and LangGraph orchestration. Deployed via FastAPI and Docker with LangFuse monitoring, it serves 1000+ users for HPC job scheduling, resource optimization, and troubleshooting.
Deployed a self-hosted LLM via vLLM and FastAPI to generate automated support ticket summaries and resolution suggestions, using RAG retrieval over a vector database of 200K+ tickets anonymized for compliance. Reduced support response time by 40%.
Built a multi-task BERT classifier, deployed via FastAPI with JWT authentication, that categorizes tickets across 4 teams and 4 priority levels (85% precision, 83% recall on 200K+ tickets) and serves as the classification layer for the LLM summarization and resolution workflow.
Designed an evaluation framework with 1000 curated query-resolution pairs that measures retrieval quality (precision@k, MRR) and generation quality (RAGAS faithfulness, answer relevancy), using a self-hosted LLM as judge for automated, reproducible scoring.
Configured GitLab CI/CD pipelines with Prometheus/Grafana monitoring and automated testing, enabling continuous deployment of ML and RAG systems with 99.5% uptime and automated rollback.
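
For readers unfamiliar with the retrieval metrics named in the evaluation framework above, a minimal sketch of precision@k and MRR could look like the following. The function names and the representation of results as lists of document IDs are assumptions made for illustration; this is not the framework's actual code.

```python
from typing import Sequence, Set, Tuple


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved IDs that are in the relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Sequence[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (retrieved, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ret, rel) for ret, rel in runs) / len(runs)
```

Each query contributes one (retrieved, relevant) pair, so a curated set of query-resolution pairs maps directly onto the `runs` argument.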

LangGraph
FastAPI
Docker
RAG Systems
Vector Databases
SQL
vLLM
BERT
JWT
LangFuse
GitLab CI
Prometheus
Grafana
HPC Systems
Agentic AI

FLX AI

K-Labs: Continual Learning Lab, UoR

Greene Career Center, UoR

Insignia Consultancy

Education

University of Rochester

Rochester, New York

Key Coursework

Machine Learning
Computational Statistics
Data Science at Scale
End-to-End Deep Learning

Graphic Era Hill University

Dehradun, India

Key Coursework

Machine Learning
Data Structures and Algorithms
Deep Learning
Object Oriented Programming

Projects (6)

SolarTrack - AI-Powered Solar Analytics Platform

Period: 12.2024—Present

Full-stack SaaS platform combining AI vision models for automated handwritten log digitization with natural language querying. Solves the manual data entry bottleneck in renewable energy monitoring by enabling users to photograph logbooks and extract structured readings through computer vision.

Next.js 15
React 19
Three.js
FastAPI
Python
Supabase

CyberIntel Summarizer: Real-Time Threat Intelligence System

Period: 09.2024—Present

Real-time cybersecurity threat intelligence system analyzing 100+ daily CVE updates from NVD, CISA, and MITRE ATT&CK feeds. Features a LoRA fine-tuned LLM with 4-bit quantization that achieves a 3x throughput improvement, plus an interactive Streamlit dashboard for threat analytics.

LoRA
vLLM
4-bit Quantization
FastAPI
PostgreSQL
Streamlit

Gluten Sensitivity Prediction System (Wegmans Capstone)

Period: 08.2024—12.2024

XGBoost classification system analyzing 5.6M transaction records for Wegmans Food Market. Implemented threshold optimization improving precision by 49% while demonstrating reduced coupon waste and higher marketing ROI.
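
As a rough illustration of what threshold optimization means here, the sketch below sweeps candidate decision thresholds over predicted probabilities and keeps the one with the highest F1 score. It is an assumption-laden example in plain Python: the helper names are hypothetical, and the project's actual objective (for instance a cost-sensitive one) and code are not shown in this page.

```python
from typing import Sequence


def f1_at_threshold(y_true: Sequence[int], scores: Sequence[float], threshold: float) -> float:
    """F1 score when every score >= threshold is predicted positive."""
    tp = fp = fn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 1:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_f1_threshold(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Pick the observed score that maximizes F1 (hypothetical helper)."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at_threshold(y_true, scores, t))
```

In practice the threshold would be chosen on a validation split rather than on the data used for the final report, so the reported precision is not inflated by the tuning itself.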

XGBoost
Feature Engineering
Class Imbalance
Threshold Optimization
Cost-Sensitive Learning
F1-optimal

HPC Documentation Assistant

Period: 02.2025—Present

AI-powered documentation assistant for High Performance Computing systems using RAG. Helps 1000+ researchers quickly find answers across large volumes of technical documentation without manual searching.

RAG Systems
Vector Databases
FastAPI
Docker
HPC Systems
Agentic AI

Real-Time Tweet Sentiment Analysis Pipeline

Period: 01.2025—03.2025

Real-time sentiment analysis system processing 50K+ tweets per hour to detect emerging public opinion trends. Uses a transformer-based NLP model with 92% accuracy to monitor sentiment shifts across millions of mentions.

Apache Spark Streaming
Delta Lake
Hugging Face Transformers
MLflow
Databricks
Grafana

Steam Insights (Gaming Market Analysis & Forecasting)

Period: 08.2024—12.2024

Predictive analytics platform forecasting gaming market trends and demand with 85% accuracy. Analyzes 8M+ data points across 140K+ games to guide development priorities, pricing strategies, and marketing decisions.

Apache Airflow
Databricks Spark
Kafka
XGBoost
Random Forest
ARIMA

Bookmarks

Attention Is All You Need

Author: Vaswani et al.

Bookmarked on: 20.12.2025

LangChain Documentation

Author: LangChain

Bookmarked on: 18.12.2025

Retrieval-Augmented Generation for Large Language Models

Author: Lewis et al.

Bookmarked on: 16.12.2025

The Illustrated Transformer

Author: Jay Alammar

Bookmarked on: 14.12.2025

LLM Optimization with LoRA and QLoRA

Author: Hu et al.

Bookmarked on: 12.12.2025

vLLM: Easy, Fast, and Cheap LLM Serving

Author: UC Berkeley

Bookmarked on: 10.12.2025

Experience

Center for Integrated Research Computing, UoR

Data Scientist

FLX AI

Data Science Intern

K-Labs: Continual Learning Lab, UoR

ML Research Assistant

Greene Career Center, UoR

Data Analyst

Insignia Consultancy

Data Science Intern

Education

University of Rochester

Master of Science in Data Science

Key Coursework

Graphic Era Hill University

Bachelor of Science in Computer Science

Key Coursework

