Statistical Analysis of Heart Disease Risk Factors with R
This project used data from the 2013 Behavioral Risk Factor Surveillance System (BRFSS) to explore how Body Mass Index (BMI), exercise, cholesterol levels, and other health variables relate to the risk of coronary heart disease (CHD). The dataset was cleaned in R, and feature selection was used to identify significant predictors. Several machine learning models, including logistic regression and decision trees, were tested as classifiers of CHD risk. The final model highlighted cholesterol levels, hypertension, and regular exercise as key determinants. It reached an Area Under the Curve (AUC) of 0.801, which indicates good discrimination between respondents with and without CHD.
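As a rough illustration of the workflow described above, the following sketch fits a logistic regression in base R and computes the AUC on a held-out split. It is not the project's actual code: the data are simulated rather than taken from BRFSS, the column names (`bmi`, `high_chol`, `hypertension`, `exercise`, `chd`), the effect sizes, and the 70/30 split are all assumptions made for the example, and the `auc` helper is a hypothetical hand-written function based on the rank-sum formulation.

```r
# Minimal sketch: logistic regression for a binary CHD outcome plus AUC.
# The data below are SIMULATED, not BRFSS; column names are hypothetical.
set.seed(42)
n <- 2000
df <- data.frame(
  bmi          = rnorm(n, mean = 27, sd = 5),
  high_chol    = rbinom(n, 1, 0.35),
  hypertension = rbinom(n, 1, 0.30),
  exercise     = rbinom(n, 1, 0.70)
)

# Assumed log-odds, chosen only so the example contains some signal.
log_odds <- -3 + 0.03 * (df$bmi - 27) + 0.8 * df$high_chol +
  0.9 * df$hypertension - 0.5 * df$exercise
df$chd <- rbinom(n, 1, plogis(log_odds))

# Hold out 30% of the rows for evaluation (an assumed split).
test_idx <- sample(n, size = round(0.3 * n))
train <- df[-test_idx, ]
test  <- df[test_idx, ]

# Fit the logistic regression on the training rows.
fit <- glm(chd ~ bmi + high_chol + hypertension + exercise,
           data = train, family = binomial)

# Predicted probabilities of CHD for the held-out rows.
pred <- predict(fit, newdata = test, type = "response")

# AUC via the rank-sum (Mann-Whitney) identity; ties get average ranks.
auc <- function(labels, scores) {
  r <- rank(scores)
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

test_auc <- auc(test$chd, pred)

print(summary(fit)$coefficients)  # estimates, standard errors, p-values
print(test_auc)
```

The coefficient table is where predictors such as cholesterol or hypertension would show up as significant. Evaluating the AUC on held-out rows rather than on the training data gives a less optimistic estimate of how well the model separates the two classes.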