Machine Learning Approach To Predict Mortality Rates Based On Hospital Clinical Data
Abstract
This thesis integrates fundamental concepts from conventional statistics with the more explanatory, algorithmic, and computational techniques offered by machine learning to predict the early mortality risk of surgical patients. Well-known classification methods, including Random Forest, Decision Trees, Nearest Neighbor, Stochastic Gradient Descent, Logistic Regression, Naive Bayes, Bayes Network, Neural Networks, and Support Vector Machines, are used to predict the mortality risk of elective general surgical patients treated between January 2005 and September 2010 at the Cleveland Clinic [33]. The clinical factors recorded for each patient include surgery type, age, gender, race, BMI, underlying chronic conditions, surgical risk indices, surgical timing predictors, 30-day mortality, and in-hospital complications. 10×10-fold cross-validation experiments are conducted to evaluate prediction performance on low, medium, and high mortality risk groups. A Decision Tree classification model consisting of 83 low-risk and 135 high-risk patterns is presented. The overall average accuracy of the classifiers applied to predict low and high mortality risk is 85.2%, with a precision of 0.89, a recall of 0.95, and an F-measure of 0.92. The overall average accuracy of the classifiers applied to predict low, medium, and high mortality risk is 84.7%, with a precision of 0.89, a recall of 0.94, and an F-measure of 0.91.
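As an illustration of the evaluation protocol, the following minimal Python sketch shows how a 10×10-fold cross-validation split and the F-measure can be computed. It assumes that "10×10-fold" means ten repetitions of 10-fold cross-validation and that the F-measure is the harmonic mean of precision and recall (F1); neither is spelled out in the abstract. The function names `repeated_kfold` and `f_measure` are hypothetical, the folds are not stratified by risk group, and the sample size of 50 is a placeholder rather than the size of the study cohort.

```python
import random


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1); assumed definition."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def repeated_kfold(n_samples, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Assumption: each repetition reshuffles the sample indices and splits
    them into k disjoint folds. Folds are plain, not stratified.
    """
    rng = random.Random(seed)
    for r in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # Fold f takes every k-th shuffled index, starting at offset f.
        folds = [indices[f::k] for f in range(k)]
        for f, test_idx in enumerate(folds):
            train_idx = [i for j, fold in enumerate(folds) if j != f
                         for i in fold]
            yield r, f, train_idx, test_idx


# Placeholder sample size; 10 repetitions x 10 folds gives 100 splits.
splits = list(repeated_kfold(n_samples=50))

# F-measure computed from the precision and recall quoted in the abstract.
two_class = round(f_measure(0.89, 0.95), 2)    # low vs. high risk
three_class = round(f_measure(0.89, 0.94), 2)  # low, medium, high risk
print(len(splits), two_class, three_class)
```

Under these assumptions, the rounded values agree with the reported F-measures of 0.92 and 0.91. Because the reported figures are averages over several classifiers, this agreement is only a plausibility check: the F-measure of averaged precision and recall need not equal the average of per-classifier F-measures.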