Multi-Class Logical Analysis of Data with Relaxed Patterns and Its Extension to Survival Analysis
Bain, Travaughn Coren
This dissertation builds on Logical Analysis of Data (LAD), a previously successful multi-class classification method built on linear programming optimization. It improves the method's generalization capability by relaxing the constraints of its pattern generation model, and then extends and applies it to Survival Analysis. First, we propose relaxations of the constraints of the mixed integer linear program (MILP) used in the pattern generation phase of LAD. These modifications aim to reduce over-fitting to noise by widening the solution space, in the hope of discovering more robust classification rules. Specifically, the proposed method introduces relaxed homogeneity and minimum prevalence measures that give the underlying MILP more flexibility in its solutions. The advantage of this added flexibility is demonstrated through experiments on several multi-class benchmark datasets. Next, we demonstrate an application of our methodology by combining it with Kaplan-Meier estimation for Survival Analysis. The combined techniques aim to produce robust survival models that accurately predict the underlying survival distributions and improve risk stratification. Lastly, we propose an adapted survival function estimator, which adjusts the weight of the baseline survival function in proportion to the coverage of the related pattern survival curve(s). Its utility is evaluated by comparing its empirical results with those of two previously established survival function estimators on two publicly available clinical datasets.
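For readers unfamiliar with the Kaplan-Meier estimation mentioned above, the following is a minimal sketch of the standard product-limit estimator, S(t) = Π over event times t_i ≤ t of (1 − d_i / n_i), where d_i is the number of events at t_i and n_i the number of subjects still at risk. It is not the adapted estimator proposed in the dissertation; the function name `kaplan_meier` and its arguments are hypothetical and chosen only for illustration.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Standard Kaplan-Meier (product-limit) estimate.

    durations: observed follow-up time for each subject.
    events:    1 if the event was observed at that time, 0 if censored.
    Returns a list of (time, survival probability) pairs, one per
    distinct time at which at least one event occurred.
    (Illustrative sketch; names are assumptions, not from the source.)
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    # Number of observed events at each time.
    deaths = Counter(t for t, e in zip(durations, events) if e)
    # Number of subjects leaving the risk set (event or censoring) at each time.
    exits = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone observed at time t leaves the risk set afterwards.
        at_risk -= exits[t]
    return curve


if __name__ == "__main__":
    # Toy data: five subjects, two of them censored (event flag 0).
    for time, prob in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
        print(f"t={time}: S(t)={prob:.3f}")
```

In a pattern-based setting, one plausible use would be to run such an estimator separately on the observations covered by each pattern to obtain a pattern survival curve; how those curves are weighted against the baseline is specific to the dissertation and is not reproduced here.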