Projects
A showcase of my data science and AI projects.
Real-Time Tweet Sentiment Analysis Pipeline
01.2025 - 03.2025
Real-time data pipeline for analyzing tweet sentiment at scale, processing 50K+ tweets per hour through Bronze, Silver, Gold, and Application layers with 99.5% uptime. Deployed an MLflow-packaged Hugging Face transformer that achieves 92% accuracy with sub-200ms inference latency. Implemented auto-scaling Databricks clusters with optimized partitioning strategies, reducing query response times by 65% and enabling real-time aggregations across 1M+ tweet mentions. Engineered a monitoring system with Grafana dashboards and automated alerting for sentiment anomalies.
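To give a feel for the sentiment-anomaly alerting mentioned above, here is a minimal sketch in plain Python. It is not the project's actual alerting logic: the rolling z-score rule, the function name `detect_anomalies`, and the `window` and `threshold` parameters are all assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(scores, window=5, threshold=3.0):
    """Return indices of scores that deviate sharply from the trailing window.

    Hypothetical rule: a score is flagged when it lies more than `threshold`
    sample standard deviations from the mean of the previous `window` scores.
    """
    history = deque(maxlen=window)  # trailing window of recent scores
    anomalies = []
    for index, score in enumerate(scores):
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            # Skip flat windows to avoid dividing by zero.
            if sigma > 0 and abs(score - mu) / sigma > threshold:
                anomalies.append(index)
        history.append(score)
    return anomalies


# Average sentiment per time bucket (made-up values), with one sudden drop.
bucket_scores = [0.50, 0.52, 0.48, 0.51, 0.49, 0.50, -0.40, 0.51]
flagged = detect_anomalies(bucket_scores)
```

In a setup like the one described, a flagged bucket would be the point at which an alert fires; a production rule would likely also account for tweet volume and time-of-day effects.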
Steam Insights (Gaming Market Analysis & Forecasting)
08.2024 - 12.2024
Gaming market analysis and forecasting system processing 8M+ data points from 140K+ games. Built an ETL pipeline with Apache Airflow, Databricks Spark, and Kafka. Developed ML models (XGBoost, Random Forest) for review analysis and pricing forecasts. Implemented time series forecasting (ARIMA, Prophet), achieving 85% accuracy in genre demand predictions and producing reliable sales forecasts.
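As a rough illustration of how a forecast-accuracy figure can be computed, the sketch below scores a seasonal-naive baseline with mean absolute percentage error (MAPE) and reports accuracy as 1 - MAPE. This is an assumption made for the example: the project used ARIMA and Prophet, and the blurb does not say which metric backs the 85% figure. The function names and sample numbers are hypothetical.

```python
def seasonal_naive_forecast(history, season_length, horizon):
    """Forecast by repeating the last observed season (a simple baseline,
    swapped in here for ARIMA/Prophet to keep the example dependency-free)."""
    if len(history) < season_length:
        raise ValueError("history must cover at least one full season")
    last_season = history[-season_length:]
    return [last_season[step % season_length] for step in range(horizon)]


def mape(actual, predicted):
    """Mean absolute percentage error, as a fraction (0.1 means 10%)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Made-up demand counts for one genre, with a season length of 3 periods.
history = [100, 120, 80, 110, 130, 90]
actual_next = [100, 130, 100]

forecast = seasonal_naive_forecast(history, season_length=3, horizon=3)
accuracy = 1 - mape(actual_next, forecast)
```

A baseline like this is mainly useful as a yardstick: a fitted ARIMA or Prophet model is worth its complexity only if it beats the seasonal-naive score on held-out periods.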