dc.contributor.advisor: Nezamoddini-Kachouie, Nezamoddin
dc.contributor.author: Deebani, Wejdan
dc.date.accessioned: 2019-10-18T17:46:45Z
dc.date.available: 2019-10-18T17:46:45Z
dc.date.created: 2019-05
dc.date.issued: 2019-05
dc.date.submitted: May 2019
dc.identifier.uri: http://hdl.handle.net/11141/2965
dc.description: Thesis (Ph.D.) - Florida Institute of Technology, 2019 [en_US]
dc.description.abstract: Subjects in a population are described by their characteristics, which are represented by variables. Identifying the relationships among these variables is essential for prediction, hypothesis testing, and decision making. The relationship between two variables is often quantified using a correlation factor. Once the correlations between the response and the independent variables are known, they can be used to predict the response: if two variables are correlated, observing one allows us to make predictions about the other, and the stronger the relationship, the more accurate the prediction. Several correlation factors have been introduced. Among them, Pearson’s Correlation Coefficient has been commonly used, while Distance Correlation and the Maximal Information Coefficient have been introduced more recently to address its shortcomings. These coefficients differ in how well they identify a given underlying relationship and in how they behave under different noise conditions, which makes it very challenging to choose the right correlation factor for a specific dataset when the underlying relationship is unknown. In this dissertation, we first compare these factors through a set of Monte Carlo simulations covering different relationships and a variety of noise conditions. We then propose a method that aggregates them into an ensemble, yielding a more robust factor that can be used with a variety of relationship types under different noise conditions. Next, we apply the proposed ensemble method to DNA copy numbers obtained from patients with non-small cell lung cancer to identify genes associated with lung cancer. Finally, we introduce Robust Distance Correlation, a method that we developed to improve Distance Correlation and make it robust to both the relationship type and the noise environment. [en_US]
dc.format.mimetype: application/pdf
dc.language.iso: en_US [en_US]
dc.rights: CC BY 4.0 [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/legalcode [en_US]
dc.title: Ensemble Correlation Coefficient for Variable Association Detection [en_US]
dc.type: Dissertation [en_US]
dc.date.updated: 2019-06-13T14:49:48Z
thesis.degree.name: Doctorate of Philosophy in Operations Research [en_US]
thesis.degree.level: Doctoral [en_US]
thesis.degree.discipline: Operations Research [en_US]
thesis.degree.department: Mathematical Sciences [en_US]
thesis.degree.grantor: Florida Institute of Technology [en_US]
dc.type.material: text
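
The kind of comparison the abstract describes, Pearson's coefficient against Distance Correlation under a Monte Carlo simulation, can be illustrated with a minimal sketch. This is not the dissertation's code. The function names, the sample size, the number of trials, the uniform design for x, and the Gaussian noise model are all assumptions chosen for illustration, and the Maximal Information Coefficient and the proposed ensemble are not included.

```python
import numpy as np


def pearson(x, y):
    """Pearson's correlation coefficient of two 1-D samples."""
    return float(np.corrcoef(x, y)[0, 1])


def _double_centered_distances(v):
    """Pairwise |v_i - v_j| matrix with row, column, and grand means removed."""
    d = np.abs(v[:, None] - v[None, :])
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())


def distance_correlation(x, y):
    """Sample distance correlation (V-statistic form) of two 1-D samples."""
    a = _double_centered_distances(np.asarray(x, dtype=float))
    b = _double_centered_distances(np.asarray(y, dtype=float))
    dcov2 = (a * b).mean()            # squared sample distance covariance
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    if denom == 0:
        return 0.0                    # a constant sample carries no association
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def simulate(relationship, noise_sd, n=200, trials=50, seed=0):
    """Monte Carlo average of (|Pearson|, distance correlation).

    Hypothetical setup: x ~ Uniform(-1, 1), y = relationship(x) + N(0, noise_sd^2).
    """
    rng = np.random.default_rng(seed)
    results = []
    for _ in range(trials):
        x = rng.uniform(-1.0, 1.0, n)
        y = relationship(x) + rng.normal(0.0, noise_sd, n)
        results.append((abs(pearson(x, y)), distance_correlation(x, y)))
    return np.mean(results, axis=0)


if __name__ == "__main__":
    for name, f in [("linear", lambda x: x), ("quadratic", lambda x: x ** 2)]:
        p, d = simulate(f, noise_sd=0.1)
        print(f"{name:10s} |Pearson| = {p:.3f}   dCor = {d:.3f}")
```

Under these assumptions both coefficients should be high for the linear case, while for the symmetric quadratic case Pearson's coefficient stays near zero and Distance Correlation does not, which is the shortcoming the abstract refers to.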



Except where otherwise noted, this item's license is described as CC BY 4.0