A new local covariance matrix estimation for the classification of gene expression profiles in high dimensional RNA-Seq data
- Journal Title: Expert Systems with Applications
- Recent developments in next-generation sequencing based on RNA sequencing (RNA-Seq) allow researchers to measure the expression levels of thousands of genes across multiple samples simultaneously. Many classification models have been proposed in the literature to analyze such data sets. Most existing classifiers assume that genes are independent; however, this assumption is unrealistic for real RNA-Seq classification problems. For this reason, other classification methods that incorporate the dependence structure between genes into the model have been proposed. The recently proposed quantile transformed Quadratic Discriminant Analysis (qtQDA) is one such classifier; it estimates the covariance matrix with the Maximum Likelihood Estimator (MLE). However, the MLE may not reflect the real dependence between genes. We therefore propose a new approach, based on the local dependence function, to estimate the covariance matrix used in the qtQDA classification model. This approach assumes that the dependencies between genes are locally defined rather than complete. The performance of the qtQDA classifier under the two covariance matrix estimates is compared on two real RNA-Seq data sets in terms of classification error rates. The results show that the local dependence function approach yields a better estimate of the covariance matrix and improves the performance of the qtQDA classifier.
- RNA-seq; Gene expression; Local Covariance matrix; Classification; Quadratic Discriminant Analysis
- Rights Notice: Refer to copyright notice on published article.
Creation Date: 2021-02-01 12:07:24
Last Modified: 2021-03-02 02:07:30
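
The abstract describes a quadratic discriminant classifier whose per-class covariance matrix can be estimated in more than one way. The Python sketch below illustrates that idea only in outline and is not the authors' implementation. It is a plain Gaussian QDA that takes the covariance estimator as an argument and defaults to the maximum likelihood estimate. It assumes the data have already been transformed to an approximately Gaussian scale, and it omits the quantile transformation step of qtQDA. The local dependence estimator is not specified in the abstract, so it is not implemented here; any function with the same signature could be passed as `cov_estimator`. The function names and the small `ridge` term (added so the matrix stays invertible when genes outnumber samples) are assumptions made for this illustration.

```python
import numpy as np


def mle_covariance(X):
    """Maximum likelihood covariance estimate (divides by n, not n - 1)."""
    centered = X - X.mean(axis=0)
    return centered.T @ centered / X.shape[0]


def fit_qda(X, y, cov_estimator=mle_covariance, ridge=1e-6):
    """Estimate mean, covariance and prior for each class.

    cov_estimator is a hypothetical hook: any function mapping an
    (n_samples, n_genes) array to an (n_genes, n_genes) matrix.
    ridge is an assumed stabiliser, not part of the published method.
    """
    params = {}
    for label in np.unique(y):
        Xk = X[y == label]
        cov = cov_estimator(Xk) + ridge * np.eye(X.shape[1])
        params[label] = (Xk.mean(axis=0), cov, Xk.shape[0] / X.shape[0])
    return params


def predict_qda(params, X):
    """Assign each row of X to the class with the highest discriminant score."""
    labels = list(params)
    scores = []
    for label in labels:
        mean, cov, prior = params[label]
        diff = X - mean
        _, logdet = np.linalg.slogdet(cov)
        # Squared Mahalanobis distance of every sample to the class mean.
        maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
        scores.append(-0.5 * logdet - 0.5 * maha + np.log(prior))
    return np.array(labels)[np.argmax(np.vstack(scores), axis=0)]


if __name__ == "__main__":
    # Synthetic stand-in for transformed expression data (not real RNA-Seq).
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, (50, 5)), rng.normal(3.0, 1.0, (50, 5))])
    y = np.array([0] * 50 + [1] * 50)
    model = fit_qda(X, y)
    error_rate = np.mean(predict_qda(model, X) != y)
    print("training error rate:", error_rate)
```

Comparing two covariance estimates, as the abstract does, would amount to calling `fit_qda` twice with different `cov_estimator` arguments and comparing the resulting error rates on held-out samples.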