Projects

Projects :

Data analysis

Customer Behaviour Modeling

Predicted sentiments/emotions using NLP techniques.
Clustered customers to identify potential issues.
Flagged problematic customers for proactive solutions.

Ask me!

Identification of Bullying Victims Using Data Mining Techniques

Reduced 204 features to 31 for improved model accuracy.
Evaluated and tuned 54 ML model combinations.
Awarded best class project for exceptional performance.

View Project

Conversational Q/A Chatbot

Integrated GPT-3.5, LangChain, and RAG for Q&A NLP.
Built ML workflows with API and Hugging Face deployment.
Enabled memory, intent detection, and real-time retrieval.

View Project

Healthcare Cost and Insurance Analytics with Predictive Modeling

Created Tableau dashboards for healthcare insights.
Used Python modeling and SQL for claims analysis.
Analyzed costs and trends to drive decisions.

View Project

Text To Image Generation iOS App

Fine-tuned Stable Diffusion XL for science diagrams.
Deployed model on SageMaker with iOS app integration.
Achieved seamless text-to-image functionality.

View Project

Unsupervised Topic Modeling on the 20 Newsgroups Dataset

Applied LDA (Latent Dirichlet Allocation) topic modeling to the 20 Newsgroups dataset.
Optimized topic coherence with preprocessing and tuning.
Visualized insights using word clouds and bar charts.

View Project

Data Analysis :

Feature Engineering

I have performed feature engineering on various types of datasets, including text, image, and audio. It is the process of selecting, manipulating, and transforming raw data into features a model can use.
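As a small illustration of turning raw data into features, here is a minimal sketch for text. The function name `text_features`, the feature names, and the sample sentence are hypothetical and chosen only for this example; real projects typically use richer features.

```python
from collections import Counter

def text_features(text, vocabulary):
    """Turn a raw string into numeric features: size stats plus bag-of-words counts."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    features = {
        "n_chars": len(text),      # length of the raw string
        "n_tokens": len(tokens),   # number of whitespace-separated tokens
    }
    # One count feature per vocabulary word (0 when the word is absent).
    for word in vocabulary:
        features[f"count_{word}"] = counts[word]
    return features

# Hypothetical input, used only to show the shape of the output.
features = text_features("Great product great price", vocabulary=["great", "bad"])
print(features)
```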

Visit My Work

Exploratory Data Analysis

Exploratory Data Analysis is a data analytics process for understanding data in depth and learning its characteristics, often by visual means. Here you can find my notebooks exploring various datasets.

Visit My Work

Pandas, NumPy

Pandas and NumPy are Python libraries used for data analysis. Here I use them to look for interesting patterns in datasets.

Visit My Work

Machine Learning

Machine Learning Algorithms :

Unsupervised

K Means Clustering

K Means Clustering is an iterative algorithm that divides an unlabeled dataset into k clusters in such a way that each data point belongs to exactly one cluster of points with similar properties.
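The assign-then-update loop can be sketched in plain Python as follows. This is a minimal illustration, not the notebook's code: the farthest-first initialisation is an assumption made here so the result is deterministic (k-means is more commonly initialised randomly), and the sample points are made up.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def init_centroids(points, k):
    """Farthest-first initialisation (an assumption, chosen for determinism)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, max_iterations=100):
    centroids = init_centroids(points, k)
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each point gets the index of its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its assigned points.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            updated.append(tuple(sum(dim) / len(members) for dim in zip(*members))
                           if members else centroids[c])
        if updated == centroids:  # converged: no centroid moved
            break
        centroids = updated
    return labels, centroids

# Made-up 2-D points forming two visibly separate groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = kmeans(points, k=2)
print(labels, centroids)
```

Every point ends up with exactly one label, which is the "belongs to exactly one cluster" property described above.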

Go to notebook

Unsupervised

Apriori

The Apriori algorithm uses frequent itemsets to generate association rules. It is based on the principle that every subset of a frequent itemset must also be frequent.
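A minimal sketch of the level-wise search, assuming `min_support` is an absolute count and using a made-up basket dataset. The prune step is where the subset principle is applied: a candidate is kept only if all of its smaller subsets were already found frequent.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support_count} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    current = {frozenset([item]) for t in transactions for item in t}
    frequent, size = {}, 1
    while current:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        size += 1
        # Join step: combine frequent itemsets into candidates one item larger.
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        # Prune step: drop candidates that have an infrequent subset.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, size - 1))}
    return frequent

# Made-up shopping baskets for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent = apriori(transactions, min_support=3)

# Confidence of the rule {diapers} -> {beer}: support(both) / support(diapers).
confidence = frequent[frozenset({"diapers", "beer"})] / frequent[frozenset({"diapers"})]
print(len(frequent), confidence)
```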

Go to notebook

Both

Neural Networks

Neural networks are a set of algorithms, modeled loosely after the human brain, that are designed to recognize patterns.

There are multiple types of neural networks, e.g. ANN, CNN, and RNN. Feed-forward describes how data flows through a network, and backpropagation is the algorithm used to train it.
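To make the forward pass and backpropagation concrete, here is a minimal sketch of a network with one hidden layer, written in plain Python. The layer sizes, learning rate, squared-error loss, and XOR training data are all assumptions chosen for illustration, and convergence on XOR is not guaranteed for every random seed.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_params(n_in, n_hidden, seed=0):
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b2": 0.0,
    }

def forward(p, x):
    """Feed-forward pass: inputs -> sigmoid hidden layer -> sigmoid output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(p["w1"], p["b1"])]
    output = sigmoid(sum(w * h for w, h in zip(p["w2"], hidden)) + p["b2"])
    return hidden, output

def loss(p, x, target):
    return 0.5 * (forward(p, x)[1] - target) ** 2

def gradients(p, x, target):
    """Backpropagation: push the output error back through each layer."""
    hidden, output = forward(p, x)
    delta_out = (output - target) * output * (1 - output)
    delta_hidden = [delta_out * w * h * (1 - h) for w, h in zip(p["w2"], hidden)]
    return {
        "w1": [[d * xi for xi in x] for d in delta_hidden],
        "b1": delta_hidden,
        "w2": [delta_out * h for h in hidden],
        "b2": delta_out,
    }

def sgd_step(p, x, target, lr=0.5):
    """Move every parameter a small step against its gradient."""
    g = gradients(p, x, target)
    p["w1"] = [[w - lr * gw for w, gw in zip(row, grow)]
               for row, grow in zip(p["w1"], g["w1"])]
    p["b1"] = [b - lr * gb for b, gb in zip(p["b1"], g["b1"])]
    p["w2"] = [w - lr * gw for w, gw in zip(p["w2"], g["w2"])]
    p["b2"] = p["b2"] - lr * g["b2"]

xor_data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
params = init_params(n_in=2, n_hidden=3)
for _ in range(2000):
    for x, target in xor_data:
        sgd_step(params, x, target)
```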

Go to notebook

Unsupervised

FP Growth

Frequent Pattern Growth (FP-Growth) is an algorithm for finding frequent patterns without candidate set generation. It constructs an FP-tree rather than using the generate-and-test strategy of Apriori.
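A compact sketch of the idea: transactions are compressed into a prefix tree (the FP-tree), and frequent itemsets are read off by recursively mining conditional trees instead of generating candidates. The class and function names, the count-based `min_support`, and the basket data are assumptions made for this example.

```python
from collections import Counter, defaultdict

class Node:
    def __init__(self, item, parent):
        self.item, self.parent, self.count, self.children = item, parent, 0, {}

def build_tree(weighted_transactions, min_support):
    """Build an FP-tree from (items, count) pairs; return header: item -> tree nodes."""
    support = Counter()
    for items, count in weighted_transactions:
        for item in set(items):
            support[item] += count
    frequent = {i for i, n in support.items() if n >= min_support}
    root, header = Node(None, None), defaultdict(list)
    for items, count in weighted_transactions:
        # Keep frequent items only, ordered by descending support (ties by name).
        path = sorted((i for i in set(items) if i in frequent),
                      key=lambda i: (-support[i], i))
        node = root
        for item in path:
            if item not in node.children:
                node.children[item] = Node(item, node)
                header[item].append(node.children[item])
            node = node.children[item]
            node.count += count
    return header

def fp_growth(weighted_transactions, min_support, suffix=frozenset()):
    """Return {itemset: support_count} by recursively mining conditional FP-trees."""
    result = {}
    header = build_tree(weighted_transactions, min_support)
    for item, nodes in header.items():
        itemset = suffix | {item}
        result[itemset] = sum(n.count for n in nodes)
        # Conditional pattern base: the prefix path above each node, with its count.
        conditional = []
        for n in nodes:
            path, parent = [], n.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, n.count))
        result.update(fp_growth(conditional, min_support, itemset))
    return result

# Made-up shopping baskets; each transaction has weight 1.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
patterns = fp_growth([(b, 1) for b in baskets], min_support=3)
print(len(patterns))
```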

Go to notebook

Supervised

Support Vector Machine

The goal of SVM is to find the best line or decision boundary (a hyperplane) that separates n-dimensional space into classes, so that new data points can easily be placed in the correct category.
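One simple way to find such a boundary is subgradient descent on the regularised hinge loss of a linear SVM, sketched below in plain Python. This is an illustrative training method chosen for brevity (library solvers usually work differently), and the learning rate, regularisation strength, and toy points are assumptions.

```python
def train_linear_svm(X, y, lr=0.1, reg=0.01, epochs=200):
    """Full-batch subgradient descent on the hinge loss; labels must be -1 or +1."""
    n, dim = len(X), len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [reg * wj for wj in w]  # gradient of (reg / 2) * ||w||^2
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # inside the margin or misclassified
                for j in range(dim):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Classify by which side of the hyperplane w.x + b = 0 the point falls on."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Made-up, linearly separable 2-D points.
X = [(-2.0, -2.0), (-1.0, -2.0), (-2.0, -1.0), (2.0, 2.0), (1.0, 2.0), (2.0, 1.0)]
y = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(X, y)
print(w, b, [predict(w, b, x) for x in X])
```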

Go to notebook

Supervised

Decision Tree

A decision tree is a flowchart-like structure where each internal node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node holds a class label.
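The structure described above can be sketched as nested dictionaries (internal nodes holding a test) with plain labels as leaves. This minimal version chooses splits by Gini impurity, which is one common criterion and an assumption here; the feature meanings and data are made up.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 when all labels are identical."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold) that lowers weighted Gini most, else None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            left = [l for r, l in zip(rows, labels) if r[f] <= threshold]
            right = [l for r, l in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, threshold), score
    return best

def build_tree(rows, labels):
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class label
    f, threshold = split
    left = [(r, l) for r, l in zip(rows, labels) if r[f] <= threshold]
    right = [(r, l) for r, l in zip(rows, labels) if r[f] > threshold]
    return {
        "feature": f, "threshold": threshold,   # internal node: test on an attribute
        "left": build_tree(*zip(*left)),        # branch: test outcome is true
        "right": build_tree(*zip(*right)),      # branch: test outcome is false
    }

def predict(tree, row):
    """Follow the tests from the root down to a leaf."""
    while isinstance(tree, dict):
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree

# Made-up rows: (hours_studied, hours_slept) -> outcome.
rows = [(1, 8), (2, 6), (3, 7), (6, 5), (7, 8), (8, 6)]
labels = ["fail", "fail", "fail", "pass", "pass", "pass"]
tree = build_tree(rows, labels)
print(tree, predict(tree, (5, 7)))
```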

Go to notebook

Unsupervised

PCA

PCA (Principal Component Analysis) is a dimensionality reduction method that transforms a large set of variables into a smaller one that still contains most of the information in the original set.
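A minimal NumPy sketch of the transformation, assuming the covariance-eigendecomposition formulation (libraries often use SVD instead). The function name `pca` and the small two-column dataset are illustrative only.

```python
import numpy as np

def pca(X, n_components):
    """Project X (n_samples x n_features) onto its top principal components."""
    X_centered = X - X.mean(axis=0)
    covariance = np.cov(X_centered, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1][:n_components]
    components = eigenvectors[:, order]
    # Share of total variance kept by each retained component.
    explained_ratio = eigenvalues[order] / eigenvalues.sum()
    return X_centered @ components, components, explained_ratio

# Made-up data with two strongly correlated variables.
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
              [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]])
projected, components, explained_ratio = pca(X, n_components=1)
print(projected.shape, explained_ratio)
```

Because the two columns are strongly correlated, a single component retains most of the variance, which is the sense in which the smaller set "still contains most of the information".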

Go to notebook

Supervised

Fisher's LDA

LDA (Linear Discriminant Analysis) is a supervised dimensionality reduction technique used as a preprocessing step for pattern classification and machine learning applications.
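For the two-class case, Fisher's criterion gives a closed-form projection direction proportional to Sw^-1 (m1 - m0), where Sw is the within-class scatter and m0, m1 are the class means. The NumPy sketch below assumes exactly two classes labeled 0 and 1 and uses made-up points.

```python
import numpy as np

def fisher_lda_direction(X, y):
    """Two-class Fisher discriminant: unit vector along Sw^-1 (m1 - m0)."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: summed outer products of deviations from each class mean.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    w = np.linalg.solve(Sw, m1 - m0)
    return w / np.linalg.norm(w)

# Made-up 2-D points for two classes.
X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
              [5.0, 5.0], [6.0, 5.0], [5.0, 6.0]])
y = np.array([0, 0, 0, 1, 1, 1])
w = fisher_lda_direction(X, y)
z = X @ w  # each 2-D point reduced to a single discriminant score
print(w, z)
```

The projected scores `z` are one-dimensional, so a downstream classifier only needs a single threshold to tell the classes apart.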

Go to notebook