Machine Learning Projects
KNN Classification Models
01. Breast Cancer Classification
In this project, I have built a K-Nearest Neighbor classifier that is trained to predict whether a patient has benign or malignant breast cancer.
Project Goals:
- Classify breast cancer as benign or malignant with the highest accuracy the model can reach.
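To illustrate how a K-Nearest Neighbor classifier arrives at a prediction, here is a minimal from-scratch sketch. The toy points, their labels, and the function name `knn_predict` are assumptions made up for illustration; they are not the project's data or code, which may rely on a library instead.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbours' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical two-feature samples (made up, not real patient data).
toy_points = [(1.0, 1.2), (0.9, 0.8), (1.1, 1.0), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
toy_labels = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

prediction = knn_predict(toy_points, toy_labels, query=(2.9, 3.1), k=3)
print(prediction)
```

The choice of `k` trades off noise sensitivity (small `k`) against blurring the class boundary (large `k`), so in practice it is usually tuned on held-out data.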
Naive Bayes
01. Email Classifier
In this project, I’ve applied a Naive Bayes classifier to several different datasets. Comparing the classifier’s accuracy across them shows which datasets are harder to tell apart.
Project Goals:
- How difficult is it to distinguish emails about hockey from emails about baseball?
- How hard is it to tell the difference between emails about hockey and emails about tech?
- Build an email classifier for emails on contentious political topics (Politics_guns, Middle East, Religions).
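The idea behind these comparisons can be sketched with a small from-scratch multinomial Naive Bayes classifier using add-one smoothing. The example sentences, the function names, and the tiny test set are all made up for illustration; the real project works on actual email datasets and may use a library implementation.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(documents, labels):
    """Count words per class and documents per class."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    vocabulary = set()
    for text, label in zip(documents, labels):
        words = text.lower().split()
        word_counts[label].update(words)
        vocabulary.update(words)
    return word_counts, class_counts, vocabulary

def classify(text, word_counts, class_counts, vocabulary):
    """Return the class with the highest log-posterior for `text`."""
    total_docs = sum(class_counts.values())
    scores = {}
    for label, doc_count in class_counts.items():
        # Log prior: fraction of training documents in this class.
        score = math.log(doc_count / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical training sentences standing in for real emails.
train_docs = [
    "the goalie stopped the puck on the ice",
    "a slap shot beat the goalie",
    "the pitcher threw a fast ball",
    "a home run in the ninth inning",
]
train_labels = ["hockey", "hockey", "baseball", "baseball"]
model = train_naive_bayes(train_docs, train_labels)

# Hypothetical held-out examples used to report accuracy.
test_set = [("the puck hit the goalie", "hockey"),
            ("the pitcher threw the ball", "baseball")]
correct = sum(classify(text, *model) == label for text, label in test_set)
accuracy = correct / len(test_set)
print(accuracy)
```

Running the same train-and-score loop on different pairs of topics gives one accuracy per pair, and a lower accuracy suggests the two topics share more vocabulary.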
Bayes Theorem
01. Does MCQ Grading Really Judge Students’ Knowledge?
Grading a multiple choice exam is easy. But how much do multiple choice exams tell us about what a student really knows? Dr. Dirac is administering a statistics midterm exam and wants to use Bayes’ Theorem to help him understand the following:
Project Goals:
- Given that a student answered a question correctly, what is the probability that she really knows the material?
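The question above is a direct application of Bayes' Theorem: P(knows | correct) = P(correct | knows) · P(knows) / P(correct). Here is a minimal sketch of that calculation. The three input probabilities are hypothetical values chosen only to show the arithmetic (for instance, 0.25 assumes a blind guess among four options); they are not figures from the project.

```python
def posterior_knows_given_correct(p_knows, p_correct_given_knows, p_correct_given_guess):
    """P(student knows the material | she answered correctly)."""
    # Law of total probability: a correct answer comes either from
    # knowing the material or from guessing.
    p_correct = (p_correct_given_knows * p_knows
                 + p_correct_given_guess * (1 - p_knows))
    # Bayes' Theorem.
    return p_correct_given_knows * p_knows / p_correct

# Hypothetical inputs (assumptions, not data from the exam).
posterior = posterior_knows_given_correct(
    p_knows=0.6,                 # prior: share of students who know the material
    p_correct_given_knows=0.85,  # chance of a correct answer when she knows it
    p_correct_given_guess=0.25,  # chance of a correct answer from guessing
)
print(round(posterior, 3))
```

With these assumed inputs, the posterior is 0.51 / 0.61, so a correct answer raises the probability of real knowledge above the 0.6 prior but does not make it certain.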
Decision Trees and Random Forest
01. Predicting Income with Random Forest
In this project, I will be using a dataset containing census information from UCI’s Machine Learning Repository. By using this census data with a random forest, I will try to predict whether or not a person makes more than 50,000 dollars.
02. Guessing Continent From Flag Color
Which features of a flag could provide clues about the continent its country belongs to? Some colors may be good indicators, and the presence or absence of certain shapes could provide a hint.
In this project, I’ve used decision trees to try to predict the continent of flags based on several of these features. The Flag Attribute Information for this dataset is from UCI’s Machine Learning Repository.
Project Goals:
- Which continent is a flag from?
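A decision tree decides which feature to split on by measuring how much each split purifies the labels. One common criterion is Gini impurity, sketched below. The feature names `has_green` and `has_stars`, the rows, and the labels are invented for illustration and are not taken from the UCI dataset; the project itself may use a different criterion or a library.

```python
from collections import Counter

def gini_impurity(labels):
    """0.0 when all labels agree; larger when classes are mixed."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

def gini_gain(rows, labels, feature_index):
    """Impurity reduction from splitting on one binary (0/1) feature."""
    left = [label for row, label in zip(rows, labels) if row[feature_index] == 0]
    right = [label for row, label in zip(rows, labels) if row[feature_index] == 1]
    # Weight each branch's impurity by its share of the samples.
    weighted = (len(left) * gini_impurity(left)
                + len(right) * gini_impurity(right)) / len(labels)
    return gini_impurity(labels) - weighted

# Made-up rows: (has_green, has_stars). Not real flag data.
rows = [(1, 0), (1, 0), (0, 1), (0, 1), (1, 1), (0, 0)]
labels = ["Africa", "Africa", "Europe", "Europe", "Africa", "Europe"]

gains = [gini_gain(rows, labels, index) for index in range(2)]
best_feature = gains.index(max(gains))
print(gains, best_feature)
```

In this toy data the first feature separates the two labels perfectly, so it has the larger gain and would be chosen for the first split.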
Neural Networks
01. Perceptron Implementation in Logic Gates
In this project, I’ve used perceptrons, the building blocks of neural networks, to model the fundamental building blocks of computers: logic gates.
- AND gate - Given two inputs, an AND gate outputs a 1 only if both inputs are a 1.
- XOR gate - A gate that outputs a 1 only if exactly one of the inputs is a 1.
Project Goals:
- Show that the AND gate is linearly separable, so a perceptron can be trained to perform AND.
- Show that the XOR gate isn’t linearly separable, so a single perceptron fails to learn XOR.
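The contrast between the two gates can be reproduced with a minimal from-scratch perceptron using the classic perceptron learning rule. This is a sketch under simple assumptions (zero initial weights, a step activation, 20 epochs, learning rate 1.0); the function names are illustrative and the project may use a library implementation instead.

```python
def predict(weights, bias, x1, x2):
    """Step activation: output 1 when the weighted sum is positive."""
    return 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0

def train_perceptron(samples, targets, epochs=20, learning_rate=1.0):
    """Perceptron learning rule: nudge weights by the prediction error."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), target in zip(samples, targets):
            error = target - predict(weights, bias, x1, x2)
            weights[0] += learning_rate * error * x1
            weights[1] += learning_rate * error * x2
            bias += learning_rate * error
    return weights, bias

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_targets = [0, 0, 0, 1]
xor_targets = [0, 1, 1, 0]

for name, targets in (("AND", and_targets), ("XOR", xor_targets)):
    weights, bias = train_perceptron(inputs, targets)
    outputs = [predict(weights, bias, x1, x2) for x1, x2 in inputs]
    print(name, "learned:", outputs == targets)
```

No amount of extra training helps with XOR here: a single perceptron draws one straight line through the input space, and no single line puts (0, 1) and (1, 0) on one side with (0, 0) and (1, 1) on the other.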
Regression Models
01. Feature Engineering
Here I’ve built single-feature, two-feature, and multiple-feature linear regression models for feature selection and model tuning.
Project Goals:
tennis_stats.csv contains data from the men’s professional tennis league, the ATP (Association of Tennis Professionals). It covers the top 1500 ranked ATP players from 2009 to 2017. The statistics recorded for each player in each year include service game (offensive) statistics, return game (defensive) statistics, and outcomes.
- To determine what it takes to be one of the best tennis players in the world.
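One simple way to compare candidate features is to fit a single-feature linear regression for each and compare how well each fit explains the outcome. The sketch below does this from scratch with ordinary least squares and R². The numbers and the names `feature_a`, `feature_b`, and `outcome` are made up for illustration and do not come from tennis_stats.csv.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    residual = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - residual / total

# Hypothetical values, not taken from tennis_stats.csv.
feature_a = [1, 2, 3, 4, 5]
feature_b = [3, 1, 4, 1, 5]
outcome = [2.1, 3.9, 6.2, 7.8, 10.1]

scores = {}
for name, feature in (("feature_a", feature_a), ("feature_b", feature_b)):
    slope, intercept = fit_line(feature, outcome)
    scores[name] = r_squared(feature, outcome, slope, intercept)

best = max(scores, key=scores.get)
print(best, scores)
```

A score computed on the training data alone tends to be optimistic, so a held-out test split is the safer basis for choosing between features.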
02. Titanic Survival Prediction
In this project, I’ve built a regression model that predicts which passengers survived the sinking of the Titanic, based on passenger features. The data I’ll be using for training the model is provided by the Kaggle Titanic competition.
Project Goals:
Predicting what happened to:
- 3rd class passenger Jack
- 1st class passenger Rose
- 3rd class youngest passenger onboard, Millvina Dean
03. Prediction of Future Production
Honeybees are in a precarious state right now. Articles have reported a decline in the honeybee population for various reasons. This project investigates this decline and how past trends predict the future for honeybees.
The dataset on honey production in the United States comes from Kaggle.