Analysis of genome-wide association study data using the protein knowledge base
- Author(s)
- Ballouz, S; Liu, JY; Oti, M; Gaeta, B; Fatkin, D; Bahlo, M; Wouters, MA
- Journal Title
- BMC GENETICS
- Publication Type
- Journal Article
- Abstract
- Background: Genome-wide association studies (GWAS) aim to identify causal variants and genes for complex disease by independently testing a large number of SNP markers for disease association. Although genes have been implicated in these studies, few utilise the multiple-hit model of complex disease to identify causal candidates. A major benefit of multi-locus comparison is that it compensates for some shortcomings of current statistical analyses, which test the frequency of each SNP in isolation in the phenotype population versus controls. Results: Here we developed and benchmarked several protocols for GWAS data analysis using different in-silico gene prediction and prioritisation methodologies. We adopted a high-sensitivity approach to the data, using less conservative statistical SNP associations. Multiple gene search spaces, either fixed-width or proximity-based, were generated around each SNP marker. We used the candidate disease gene prediction system Gentrepid to identify candidates based on shared biomolecular pathways or domain-based protein homology. Predictions were made either with phenotype-specific known disease genes as input, or without a priori knowledge, by exhaustive comparison of genes in distinct loci. Because Gentrepid uses biomolecular data to find interactions and common features between genes in distinct loci of the search spaces, it takes advantage of the multi-locus aspect of the data. Conclusions: Results suggest that testing multiple SNP-to-gene search spaces compensates for differences in phenotypes, populations and SNP platforms. Surprisingly, domain-based homology information was more informative when benchmarked against gene candidates reported by GWA studies than when benchmarked against previously determined disease genes, possibly suggesting a larger contribution of gene homologs to complex diseases than to Mendelian diseases.
- Publisher
- BIOMED CENTRAL LTD
- Keywords
- DISEASE GENES; FAMILIES DATABASE; CANDIDATE GENES; SEQUENCE; COMPLEX; ANNOTATION; PREDICTION; BLOCKS; POWER; LOCI
- Research Division(s)
- Bioinformatics
- Publisher's Version
- https://doi.org/10.1186/1471-2156-12-98
- Open Access at Publisher's Site
- http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3261104/
- Terms of Use/Rights Notice
- © 2011 Ballouz et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Creation Date: 2011-11-13 12:00:00
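
The abstract describes generating gene search spaces around each SNP marker, either fixed-width or proximity-based. The Python sketch below illustrates one plausible reading of those two constructions. It is not Gentrepid's implementation: the `Gene` record, the function names, the window half-width and the number of nearest genes are all hypothetical, since the abstract does not state the values or data structures used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    # Hypothetical minimal gene record: name plus chromosomal interval (bp).
    name: str
    chrom: str
    start: int
    end: int


def distance(gene, chrom, pos):
    """Distance in bp from a SNP to a gene.

    Returns 0 if the SNP lies inside the gene and None if the gene is
    on a different chromosome.
    """
    if gene.chrom != chrom:
        return None
    if gene.start <= pos <= gene.end:
        return 0
    return min(abs(gene.start - pos), abs(gene.end - pos))


def fixed_width_space(genes, chrom, pos, half_width):
    """Fixed-width search space: genes within half_width bp of the SNP."""
    space = []
    for g in genes:
        d = distance(g, chrom, pos)
        if d is not None and d <= half_width:
            space.append(g.name)
    return space


def nearest_genes_space(genes, chrom, pos, k=1):
    """Proximity-based search space: the k genes closest to the SNP."""
    ranked = sorted(
        (distance(g, chrom, pos), g.name) for g in genes if g.chrom == chrom
    )
    return [name for _, name in ranked[:k]]


# Toy annotation with made-up gene names and coordinates.
genes = [
    Gene("GENE_A", "1", 1000, 2000),
    Gene("GENE_B", "1", 50000, 60000),
    Gene("GENE_C", "2", 1500, 2500),
]

print(fixed_width_space(genes, "1", 2500, half_width=1000))
print(nearest_genes_space(genes, "1", 40000, k=1))
```

Under these assumptions, a fixed window can come back empty in gene-poor regions, whereas a nearest-gene rule always returns a candidate when the chromosome has any annotated gene. That difference is one reason a study might test several search spaces, as the abstract reports doing.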