Genotype and Phenotype Association Analysis Based on Multi-omics Statistical Data


Background: Multi-omics analysis of clinical data is often limited by too few omics data types and relatively small sample sizes, a consequence of patient privacy protection and institutional data management requirements, while each omics data type carries a relatively large number of features. This paper describes how multi-omics pathway relationships can be analyzed using statistical data when clinical data are unavailable.

Methods: We propose a novel approach that exploits easily accessible statistics in public databases. The approach introduces phenotypic associations that are not included in the clinical data and uses them to build a three-layer heterogeneous network. To simplify the analysis, we decomposed the three-layer network into two two-layer networks and predicted the weights of the inter-layer associations in each. The weights of the two two-layer networks were then merged using a hyperparameter β, and k-fold cross-validation was used to evaluate the accuracy of the method. To calculate the weights of the two-layer networks, random walk with restart (RWR) with a fixed restart probability was combined with PBMDA and CIPHER to produce PCRWR, which has biased weights and improved accuracy.
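The two building blocks named in the Methods, RWR with a fixed restart probability and the β-controlled merge of two weight matrices, can be illustrated with a minimal sketch. This is not the authors' PCRWR implementation: the function names, the toy network, the restart value of 0.7, and the choice of a convex combination for the β merge are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def random_walk_with_restart(adj, seed, restart=0.7, tol=1e-10, max_iter=1000):
    """Generic RWR: iterate p <- (1 - r) * W @ p + r * p0 until convergence.

    `restart` (r) is the fixed restart probability; 0.7 is an assumed value.
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0      # isolated nodes keep an all-zero column
    w = adj / col_sums                 # column-normalised transition matrix
    p0 = np.asarray(seed, dtype=float)
    p0 = p0 / p0.sum()                 # initial distribution over seed nodes
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (w @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


def merge_layer_weights(w_a, w_b, beta):
    """Assumed merge rule: convex combination beta * A + (1 - beta) * B."""
    w_a = np.asarray(w_a, dtype=float)
    w_b = np.asarray(w_b, dtype=float)
    return beta * w_a + (1.0 - beta) * w_b


# Toy example: a 4-node path graph 0 - 1 - 2 - 3 with node 0 as the seed.
toy_adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
scores = random_walk_with_restart(toy_adj, seed=[1, 0, 0, 0])
merged = merge_layer_weights([[1.0, 0.0]], [[0.0, 1.0]], beta=0.25)
print(scores)
print(merged)
```

In this sketch the steady-state scores rank nodes by proximity to the seed, and β would be tuned inside the k-fold cross-validation loop by comparing the resulting AUC values.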

Results: With initial weights applied to the RWR, the area under the receiver operating characteristic curve (AUC) increased by approximately 7%.

Conclusion: Genotype and phenotype correlation networks were built from multi-omics statistical data, and their analysis produced results similar to those of clinical multi-omics analysis.

About the authors

Xinpeng Guo

School of Air and Missile Defense, Air Force Engineering University

Principal contact for editorial correspondence.
Email: info@benthamscience.net

Yafei Song

Department of Basic Sciences, Air Force Engineering University

Email: info@benthamscience.net

Dongyan Xu

Department of Basic Sciences, Air Force Engineering University

Email: info@benthamscience.net

Xueping Jin

School of Air and Missile Defense, Air Force Engineering University

Email: info@benthamscience.net

Xuequn Shang

School of Computer Science and Engineering, Northwestern Polytechnical University

Principal contact for editorial correspondence.
Email: info@benthamscience.net

Copyright © Bentham Science Publishers, 2024