SparSNP: Fast and memory-efficient analysis of all SNPs for phenotype prediction
- Author(s)
- Abraham, G; Kowalczyk, A; Zobel, J; Inouye, M
- Journal Title
- BMC BIOINFORMATICS
- Publication Type
- Journal Article
- Abstract
- Background: A central goal of genomics is to predict phenotypic variation from genetic variation. Fitting predictive models to genome-wide and whole-genome single nucleotide polymorphism (SNP) profiles allows us to estimate the predictive power of the SNPs and potentially develop diagnostic models for disease. However, many current datasets cannot be analysed with standard tools due to their large size. Results: We introduce SparSNP, a tool for fitting lasso linear models for massive SNP datasets quickly and with very low memory requirements. In an analysis of a large celiac disease case/control dataset, we show that SparSNP runs substantially faster than four other state-of-the-art tools for fitting large-scale penalised models. SparSNP was one of only two tools that could successfully fit models to the entire celiac disease dataset, and it did so with superior performance. Compared with the other tools, the models generated by SparSNP had equal or better predictive performance in cross-validation. Conclusions: Genomic datasets are rapidly increasing in size, rendering existing approaches to model fitting impractical due to their prohibitive time or memory requirements. This study shows that SparSNP is an essential addition to the genomic analysis toolkit. SparSNP is available at http://www.genomics.csse.unimelb.edu.au/SparSNP
- Publisher
- BIOMED CENTRAL LTD
- Keywords
- WHOLE-GENOME ASSOCIATION; SUPPORT VECTOR MACHINES; WIDE ASSOCIATION; CLASSIFICATION; REGRESSION; SELECTION; COMMON; LASSO; MAP
- Research Division(s)
- Immunology
- Publisher's Version
- https://doi.org/10.1186/1471-2105-13-88
- Open Access at Publisher's Site
- http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3483007/
- Terms of Use/Rights Notice
- © 2012 Abraham et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Creation Date: 2012-05-10 12:00:00