SparSNP: Fast and memory-efficient analysis of all SNPs for phenotype prediction
Journal Title
BMC BIOINFORMATICS
Publication Type
Journal Article
Abstract
Background: A central goal of genomics is to predict phenotypic variation from genetic variation. Fitting predictive models to genome-wide and whole-genome single nucleotide polymorphism (SNP) profiles allows us to estimate the predictive power of the SNPs and potentially develop diagnostic models for disease. However, many current datasets cannot be analysed with standard tools due to their large size. Results: We introduce SparSNP, a tool for fitting lasso linear models to massive SNP datasets quickly and with very low memory requirements. In an analysis of a large celiac disease case/control dataset, we show that SparSNP runs substantially faster than four other state-of-the-art tools for fitting large-scale penalised models. SparSNP was one of only two tools that could successfully fit models to the entire celiac disease dataset, and it did so with superior performance. Compared with the other tools, the models generated by SparSNP had equal or better predictive performance in cross-validation. Conclusions: Genomic datasets are rapidly increasing in size, rendering existing approaches to model fitting impractical due to their prohibitive time or memory requirements. This study shows that SparSNP is an essential addition to the genomic analysis toolkit. SparSNP is available at http://www.genomics.csse.unimelb.edu.au/SparSNP
Publisher
BIOMED CENTRAL LTD
Keywords
WHOLE-GENOME ASSOCIATION; SUPPORT VECTOR MACHINES; WIDE ASSOCIATION; CLASSIFICATION; REGRESSION; SELECTION; COMMON; LASSO; MAP
WEHI Research Division(s)
Immunology
Rights Notice
© 2012 Abraham et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Creation Date: 2012-05-10 12:00:00