Data Science Projects

A comprehensive portfolio of data science and machine learning projects, showcasing various techniques and technologies.

Project 01: Music Preferences Analysis

Python, pandas

Analyzed and compared music preferences between two different cities to identify patterns and trends in listening habits.

Project 02: Credit Scoring Analysis

Python, pandas

Evaluated various metrics to predict the likelihood of a customer defaulting on a loan, helping financial institutions make informed lending decisions.

Project 03: Vehicle Price Analysis

Python, pandas, numpy, matplotlib

Investigated factors influencing vehicle prices by analyzing classified ads data to assist in better pricing strategies.

Project 04: Cell Plans Analysis

Python, pandas, numpy, matplotlib, math, scipy

Examined client behavior and identified which telecom packages generate the most income, providing insights for marketing and sales strategies.

Project 05: Video Games Analysis

Python, pandas, numpy, scipy, matplotlib

Tested hypotheses about video game users and critics to identify promising projects and plan effective advertising campaigns.

Project 06: Taxi Trip Analysis

Python, pandas, numpy, scipy, matplotlib

Analyzed taxi trip durations in relation to weather conditions to test and validate hypotheses, aiding in optimizing taxi services.

Project 07: Cell Phone Plans Prediction

Python, pandas, sklearn, Decision Tree, Random Forest, Logistic Regression

Developed a classification model to help clients select the best cell phone plan, achieving a performance metric of at least 0.75.

Project 08: Client Retention Prediction

Python, pandas, matplotlib, sklearn, Decision Tree, Random Forest, Logistic Regression

Created a prediction model for client retention with an F1 score of at least 0.59, helping businesses improve their customer retention strategies.
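For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The snippet below is a minimal sketch in plain Python with made-up labels. It is not the project's code, and the function name and sample data are illustrative assumptions.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = client left, 0 = client stayed.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

f1 = f1_score(y_true, y_pred)
meets_target = f1 >= 0.59  # threshold quoted in the project description
```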

Project 09: Oil Well Location Prediction

Python, pandas, numpy, scipy, matplotlib, sklearn, Random Forest, Linear Regression

Validated oil reserve volume prediction models and calculated profits and risks for different regions, aiding in strategic decision-making for OilGiant's operations.

Project 10: Gold Production Modeling

Python, pandas, numpy, matplotlib, sklearn, Linear Regression, Decision Tree, Random Forest

Modeled the production process in the gold mining industry to improve efficiency and developed a prototype machine learning model for industrial applications.

Project 11: Insurance Modeling

Python, numpy, pandas, seaborn, matplotlib, sklearn, math, K-Nearest Neighbors

Identified similar customers and predicted insurance benefit amounts while ensuring data privacy, enhancing customer service and risk management in insurance.

Project 12: Car Price Prediction

Python, numpy, pandas, matplotlib, sklearn, catboost, LightGBM, XGBoost, Decision Tree, Random Forest, Linear Regression

Built a model to determine the market value of cars with an emphasis on prediction quality and speed, supporting automotive market analysis.

Project 13: Taxi Orders Prediction

Python, numpy, pandas, statsmodels, sklearn, Decision Tree, Random Forest, Linear Regression, LGBM

Predicted the number of taxi orders in the next hour with an RMSE (root mean squared error) not exceeding 48, helping optimize taxi fleet management.
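As a quick illustration of the metric, the sketch below computes the root mean squared error between actual and forecast hourly order counts. The numbers and names are invented for the example and are not taken from the project.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical hourly order counts and a model's forecasts for the same hours.
hourly_orders = [120, 95, 140, 80]
forecast = [110, 100, 150, 90]

score = rmse(hourly_orders, forecast)
meets_target = score <= 48  # threshold quoted in the project description
```

Because the errors are squared before averaging, a few large misses raise the score more than many small ones.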

Project 14: Movie Reviews Categorization

Python, numpy, re, pandas, seaborn, matplotlib, nltk, transformers, tqdm, spacy, sklearn, Logistic Regression, LGBM

Trained models to automatically detect negative movie reviews, aiding in sentiment analysis and customer feedback management.

Project 15: Face Identification

Python, pandas, numpy, matplotlib, PIL, tensorflow.keras, CNN, Sequential, GlobalAveragePooling2D, Dense, Dropout

Built and evaluated a neural network regression model to estimate age based on photographs, supporting biometric analysis applications.

Project 16: Customer Retention Prediction

Python, numpy, pandas, matplotlib, tensorflow, sklearn, Logistic Regression, Decision Tree, Random Forest, CNN

Developed a model predicting contract cancellations with an AUC-ROC greater than or equal to 0.75, helping businesses reduce churn rates.
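AUC-ROC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition in plain Python. The data and function name are illustrative assumptions, and the pairwise loop is meant for small examples rather than production use.

```python
def auc_roc(y_true, scores):
    """AUC-ROC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = contract cancelled) and predicted probabilities.
y_true = [1, 0, 1, 0, 1]
scores = [0.9, 0.3, 0.6, 0.7, 0.8]

auc = auc_roc(y_true, scores)
meets_target = auc >= 0.75  # threshold quoted in the project description
```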