List of Projects

Sales Performance Analysis Dashboard Using Excel

Designed an interactive Excel dashboard to analyze sales trends, customer segmentation, product analytics, and geographic insights with time-based analysis for seasonal trends. Implemented VBA macros with event-driven programming to automate data updates, dynamic filtering, and report generation.

Supply Chain Analysis Using SQL

Conducted comprehensive supply chain performance analysis using SQL to evaluate supplier reliability, inventory management, shipping efficiency, and product profitability. Designed and executed complex queries to identify key metrics such as supplier on-time performance, stockout frequency, defect rates, and high-demand regions.
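
As an illustration of the supplier on-time metric mentioned above, the following minimal sketch runs one such query against an in-memory SQLite database from Python. The `deliveries` table, its column names, and the sample rows are hypothetical and are not taken from the project's actual schema or data.

```python
import sqlite3


def supplier_on_time_rates(rows):
    """Return [(supplier, on_time_rate)] sorted best-first.

    rows: iterable of (supplier, promised_date, delivered_date), with dates
    as ISO-8601 strings so that string comparison matches date order.
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, used only for this illustration.
    conn.execute(
        "CREATE TABLE deliveries ("
        "supplier TEXT, promised_date TEXT, delivered_date TEXT)"
    )
    conn.executemany("INSERT INTO deliveries VALUES (?, ?, ?)", rows)
    query = """
        SELECT supplier,
               ROUND(AVG(CASE WHEN delivered_date <= promised_date
                              THEN 1.0 ELSE 0.0 END), 2) AS on_time_rate
        FROM deliveries
        GROUP BY supplier
        ORDER BY on_time_rate DESC, supplier
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Made-up sample rows: Acme delivers one of two orders late.
sample = [
    ("Acme", "2024-01-10", "2024-01-09"),
    ("Acme", "2024-02-10", "2024-02-12"),
    ("Birch", "2024-01-15", "2024-01-15"),
    ("Birch", "2024-03-01", "2024-02-27"),
]
rates = supplier_on_time_rates(sample)
print(rates)
```

The same `AVG(CASE ...)` pattern could be reused for stockout frequency or defect rates by changing the condition inside the `CASE` expression.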

Analysis of Spatial Temporal Project: Healthcare Accessibility Analysis

Analyzed healthcare accessibility in Toronto to identify optimal locations for new facilities and improve emergency response. Applied spatial analysis techniques, including spatial clustering and kernel density estimation (KDE), to locate areas with high demand and underserved populations. Utilized Inverse Distance Weighting (IDW) for estimating healthcare demand and identifying suitable new facility locations. Employed road network analysis with Dijkstra’s algorithm to determine the shortest routes to existing facilities, enhancing emergency response strategies.
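
The shortest-route step can be sketched with a small Dijkstra implementation over an adjacency list. This is a generic illustration, not the project's code: the `roads` graph, its node names, and the edge weights (read here as travel minutes) are invented for the example.

```python
import heapq


def dijkstra(graph, source):
    """Return the lowest total cost from source to every reachable node.

    graph: dict mapping node -> list of (neighbor, weight) pairs,
    where every weight is non-negative.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


# Hypothetical road network; weights are assumed travel minutes.
roads = {
    "home": [("a", 4.0), ("b", 1.0)],
    "b": [("a", 2.0), ("clinic", 7.0)],
    "a": [("clinic", 3.0)],
    "clinic": [],
}
distances = dijkstra(roads, "home")
print(distances["clinic"])
```

In this toy network the cheapest route to the clinic goes through `b` and then `a` rather than along either direct edge, which is the kind of result a road-network analysis relies on.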

Download Report

Machine Learning Project: Movie Recommendation & Insights Analysis

Utilized natural language processing (NLP) and machine learning techniques to extract valuable insights from a movie dataset. Developed a content-based movie recommendation system that suggests films based on similarities in movie descriptions, leveraging NLP to analyze and compare textual data. Implemented multilabel genre classification by analyzing movie overviews, enabling the categorization of films into multiple genres simultaneously. Additionally, predicted movie ratings using various features such as genre, budget, and revenue, applying machine learning methods and feature importance analysis.
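
One common way to build a content-based recommender from text descriptions is to weight terms with TF-IDF and rank films by cosine similarity. The sketch below assumes that approach, which the summary above does not confirm, and uses only the Python standard library. The titles and overviews are invented.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each document to a sparse {term: weight} TF-IDF vector."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)  # smoothed IDF
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(title, overviews):
    """Rank every other title by overview similarity, most similar first."""
    titles = list(overviews)
    vectors = dict(zip(titles, tfidf_vectors([overviews[t] for t in titles])))
    scores = [(t, cosine(vectors[title], vectors[t])) for t in titles if t != title]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical movie overviews.
overviews = {
    "Space Quest": "astronauts explore a distant planet in space",
    "Star Voyage": "a crew travels through space to a distant star",
    "Love in Paris": "two strangers fall in love in paris",
}
ranked = recommend("Space Quest", overviews)
print(ranked[0][0])
```

A production version would normally add tokenization, stop-word removal, and a library vectorizer, but the ranking logic stays the same.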

Download Report

Analysis of Big Data Project: Exploratory Data Analysis using PySpark

Used PySpark for Exploratory Data Analysis (EDA) of US Census data from 2015 to 2017. Visualized demographic attributes, including ethnic composition and gender distribution. Investigated socioeconomic factors such as poverty rates, employment patterns, income disparities, and commute patterns across states, and examined how they changed over that period.

Neural Network Project: Symptom-Driven Plant Disease Classification

Designed a plant disease diagnosis system using Gemini-Vision-Pro to extract visual features and generate symptom descriptions from images. Integrated these insights with a multi-modal fusion model to compare symptom-based and image-based classification methods, improving classification accuracy.

Download Report

Early Patient Readmission Prediction

Developed a machine learning pipeline to predict early patient readmissions using a decade of clinical records from 130 US hospitals. Performed extensive data preprocessing, addressed class imbalance with SMOTE, and evaluated multiple models, including Logistic Regression and Random Forest; the best model reached 62.2% accuracy. Assessed model performance with ROC curves and confusion matrices.
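
The core idea of SMOTE is to create synthetic minority-class samples by interpolating between a minority point and one of its nearest minority neighbors. The sketch below illustrates only that idea with the standard library. It is not the implementation used in the project (a library such as imbalanced-learn would be the usual choice), and the sample points and the `smote` helper are hypothetical.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class tuples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority point and one of its k nearest minority neighbors.
    Requires at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of base, excluding base itself.
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (q - b) for b, q in zip(base, neighbor))
        )
    return synthetic


# Hypothetical two-feature minority samples.
minority = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (3.0, 3.0)]
new_points = smote(minority, n_new=6)
print(len(new_points))
```

Oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the data used for evaluation.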

Download Report

Data Visualization Project: Customer Shopping Trends Dashboard

Developed an interactive dashboard in Google Looker Studio to analyze customer behavior and purchasing patterns. The visualizations cover product categories, regional trends, age group preferences, subscription status, seasonal behavior, promotions, payment methods, and shipping preferences.

Download Report

Machine Learning Project: Ames Housing Dataset Analysis

Preprocessed the data by handling missing values, encoding categorical variables, and engineering features. Applied feature selection and outlier detection to improve model accuracy. Used cross-validation for hyperparameter tuning and finalized a model to predict house sale prices on the test data.