Open Access Original Research
Identification of Gene-Environment Interactions by Non-Parametric Kendall’s Partial Correlation with Application to TCGA Ultrahigh-Dimensional Survival Genomic Data
1 Department of Statistics, Feng Chia University, 40724 Taichung, Taiwan
*Correspondence: jhwang@mail.fcu.edu.tw (Jie-Huei Wang)
Academic Editor: Alexandros G. Georgakilas
Front. Biosci. (Landmark Ed) 2022, 27(8), 225; https://doi.org/10.31083/j.fbl2708225
Submitted: 19 April 2022 | Revised: 29 June 2022 | Accepted: 7 July 2022 | Published: 18 July 2022
Copyright: © 2022 The Author(s). Published by IMR Press.
This is an open access article under the CC BY 4.0 license.
Abstract

Background: In biomedical and epidemiological studies, gene-environment (G-E) interactions play an important role in the etiology and progression of many complex diseases. For ultrahigh-dimensional survival genomic data, two common approaches (marginal and joint modeling) have been proposed to identify important interaction biomarkers. Most existing methods for detecting G-E interactions (the marginal Cox model and the marginal accelerated failure time model) lack robustness to contamination and outliers in the response outcome and the predictor biomarkers. In particular, right-censored survival outcomes and an ultrahigh-dimensional feature space make the screening of relevant features even more challenging.

Methods: In this paper, we use the non-parametric Kendall’s partial correlation method to measure the pure (partial) correlation and thereby determine the importance of G-E interactions with respect to clinical survival data under a marginal modeling framework.

Results: We conduct a series of simulation scenarios to compare the performance of our proposed method (Kendall’s partial correlation) with some commonly used methods (the marginal Cox model, the marginal accelerated failure time model, and the censoring quantile partial correlation approach). In real data applications, we use Kendall’s partial correlation method to identify G-E interactions related to the clinical survival outcomes of patients with esophageal, pancreatic, and lung carcinomas, using survival genomic data from The Cancer Genome Atlas (TCGA), and we further establish survival prediction models.

Conclusions: Overall, both the simulations with a medium censoring level and the real data studies show that our method performs well and outperforms existing methods in the selection, estimation, and prediction accuracy of main and interaction biomarkers. These applications reveal the advantages of the non-parametric Kendall’s partial correlation approach over alternative semi-parametric marginal modeling methods. We also identify cancer-related G-E interaction biomarkers and report the corresponding coefficients with p-values.
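
To make the central quantity concrete, the following is a minimal sketch of the classical Kendall partial correlation between two variables given a third, computed from pairwise Kendall taus. It assumes fully observed (uncensored) data and a single conditioning variable. The abstract does not describe how right-censored outcomes are handled, so that step is not reproduced here. The function names `kendall_tau` and `kendall_partial` and the toy data are illustrative assumptions, not the authors' implementation.

```python
from itertools import combinations
from math import sqrt


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    score = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        score += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant, 0 tie
    return score / (n * (n - 1) / 2)


def kendall_partial(x, y, z):
    """Classical Kendall partial correlation of x and y given z.

    Assumption: all observations are uncensored; censoring is NOT handled.
    """
    t_xy = kendall_tau(x, y)
    t_xz = kendall_tau(x, z)
    t_yz = kendall_tau(y, z)
    denom = sqrt((1 - t_xz ** 2) * (1 - t_yz ** 2))
    if denom == 0:
        raise ValueError("partial correlation undefined: |tau| = 1 with z")
    return (t_xy - t_xz * t_yz) / denom


# Hypothetical toy data: x could stand for a G-E interaction term,
# y for an (uncensored) outcome, and z for a main effect to adjust for.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
z = [1, 3, 2, 5, 4]
partial_tau = kendall_partial(x, y, z)
print(round(partial_tau, 4))
```

In a marginal screening setting, a statistic of this kind would be computed one candidate interaction at a time and the candidates ranked by its absolute value. The pairwise loop is quadratic in the sample size, which is acceptable for a sketch but would likely need a faster tau routine for ultrahigh-dimensional data.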

Keywords
gene-environment interaction
Kendall's correlation
marginal modeling
partial correlation
survival prediction
TCGA
Funding
MOST 110-2118-M-035-001-MY2/Ministry of Science and Technology of Republic of China (Taiwan)