†These authors contributed equally.
Academic Editor: Andrea Ciavattini
Backgrounds: Cervical cancer was a primary epithelial malignant tumor in the cervix, which was one of the most common malignant tumor in gynecology. We aimed to investigate the relation of tumor microenvironment and the prognosis of cervical cancer patients. Methods: We conducted an extensive bioinformatics analysis aims to study the correlation between stromal/immune cells and the prognosis of cervical cancer. In order to investigate the associations between genes and overall survival (OS) of cervical cancer. We performed large-scale data analysis through a global gene expression profile. We analyzed the expression profile of cervical cancer using the Cancer Genome Atlas (TCGA) database. An immune score and stromal score depending on the estimation algorithm which can quantify the stromal or immune components of cervical cancer was obtained. Based on that, we divided the cervical cancer patients in the TCGA database into high- and low-score groups, and then the identified different expression genes (DEGs) that expression associated with cervical cancer patient’s prognosis was identified. After that, we generated protein-protein interaction (PPI) networks and interrelationship analyses of the immune system by performing functional enrichment analysis. Results: Our study showed that these 363 genes were primarily associated with immune/inflammatory responses. Meanwhile, Gene Expression Omnibus (GEO) confirmed that 9 genes (CX3CL1, SCML4, LYZ, FGD2, SLAMF6, GIMAP7, CCL19, SELP and POU2AF1) were significantly associated with cervical cancer prognosis. Conclusions: We have made a list of genes related to tumor microenvironment which would be potential biomarkers for the prognosis of cervical cancer patients.
Cervical cancer ranks third among women’s cancers in the whole world. About 528,000 people suffers from cervical cancer annually, and 266,000 people died of the disease [1, 2]. It is a primary epithelial malignant tumor of cervix, which referred to cancer occurring in the cervix and arise from cervical intraepithelial lesions [3]. Long-term persistent HPV infection significantly increases the incidence of cervical cancer [4, 5, 6]. Without treatment, about a third of Cervical intraepithelial neoplasia (CIN) 3 will become invasive cervical cancer within 30 years [7, 8, 9]. Cervical cancer was a disease that can be prevented and cured for the following reasons: (1) Understanding the main causes of human papillomavirus (HPV) infection [10]; (2) Careful screening and follow-up, active treatment of precancerous lesions [11]; (3) Early diagnosis can lead to a cure [12]. Cervical cytology was undoubtedly the best screening method, although it was the first method used for cancer screening. The purpose of cervical cytology is to detect cancer cells at early stage of cervical cancer [13]. However, the debate was on the operation of a randomized controlled trial (RCT). Most RCT was not always feasible because of the limitation of ethics and cost. It was also different between developed and developing countries [14]. As a result, cervical cancer remains one of the leading causes of cancer deaths worldwide [15], and the need for high level clinical and research evidence was still a key issue to address.
The micro-environment of malignant tumors was composed of immune cells, stromal cells, extracellular stromal molecules and inflammatory mediators [16]. The balance alteration of the immune environment resulted in immunosuppression, which allowed tumor avoidance and immune monitoring avoidance [17]. From gene expression profiles, the ESTIMATE algorithm was used to determine stromal or immune cell ratios in tumor samples [18]. The ESTIMATE algorithm has been applied to hepatocellular carcinoma [19], glioma [20] and acute promyelocytic leukemia [21] as a newly developed algorithm based on tumor micro-environment. However, whether the ESTIMATE algorithm can evaluate the prognosis of cervical cancer was still unclear.
In this study, we aimed to obtain a complete genes expression profile and summarize key genes related with micro-environment in order to predict the prognosis of cervical cancer patients.
Based on the Cancer Genome Atlas (TCGA) data, gene expression profiles and clinical profiles were generated from cervical cancer patients (https://tcga-data.nci.nih.gov/tcga/). And the gene expression RNAseq was obtained from Affymetrix HT-HGU133A (March 15, 2020). We created stromal scores and immune scores by downloaded estimation algorithm database. Finally, we used the Gene Expression Omnibus (GEO) datasets to obtain gene expression profiles of cervical cancer patients for verification (https://www.ncbi.nlm.nih.gov/).
The cervical cancers patients were classified into high stromal/immune groups
and low stromal/immune groups according Estimate scores. We analyzed
differentially expressed genes using the Limma software package. Differentially
expressed genes were screened with adj. p
ClustVwas was used to draw heatmaps and cluster plots.
The database STRING was used for PPI network analysis, Cytoscape software was used to reconstructed it subsequently. Cytoscape is an open source software platform for visualizing complex networks and integrating these with any type of attribute data. The connectivity degree was analyzed by The MCODE plug-in in Cytoscape of each node of the network.
The association between the DEGs and OS were illustrated by Kaplan-Meier plots. Log-rank test was applied to access the statistical significance.
We use DAVID (The Database for Annotation, Visualization and Integrated
Discovery) for functional enrichment analysis of DEGs, which include their
biological processes (BP), molecular functions (MF), or cellular components (CC).
Meanwhile, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways analysis was
also conducted in DAVID. The cut-off of false discovery rate (FDR) was
Stromal and immune scores were particularly relevant to the clinical characters of cervical cancer. We extracted the full gene expression profiles and clinical information of all 306 cervical cancer patients from the TCGA database. The age of subjects for initial pathological diagnosis was 20~88 years, and the median age was 47.44 years. Histological types of cervical cancer in these patients included adenosquamous (6, 2.0%), squamous cell carcinoma (253, 82.6%), and adenocarcinoma (47, 15.4%). The ESTIMATE algorithm showed that the stromal score ranged from 307.40 to 2568.14, meanwhile, the immune score ranged from 1556.77 to 2086.32. In addition, the scores of stromal and immune were significantly correlated with histological type (Fig. 1A,B).
Immune scores were associated with Cervical cancer subtypes and OS. (A,B) Distribution of stromal scores and Immune scores for Cervical cancer subtypes. (C) The higher immune scores were found to be statistical related to longer OS through the Kaplan-Meier survival curves analysis (log-rank test, p = 0.0243). (D) The median OS of low stromal score group were shorter than the high group, but showed no significant difference (log-rank test, p = 0.3041). (E) Distribution of immune scores for different cancers pathological stage (M0: distant transfer; M1: no distant transfer; p = 0.0226). (F) Distribution of stromal scores for different cancers pathological stage (M0: distant transfer; M1: no distant transfer; p = 0.0558).
To investigate possible associations between stromal or immune system scores and OS, we divided 306 cervical cancer patients into higher and lower halves (low score and high score) according to their scores. Forty-three cases with OS less than 30 days or more than 3000 days were excluded. The Kaplan-Meier survival curve (Fig. 1C) indicated that compared to patients with a low immune score, patients with high immune score had a longer median OS time (2520 vs. 2032 days, p = 0.0243). Besides, the median OS time of high stromal group was longer than the low group (2520 vs. 2052 days, p = 0.3041), although it was not statistically significant (Fig. 1D). In addition, the immune score correlated significantly with the metastatic status (Fig. 1E,F).
A recent meta-analysis showed that different histopathological cervical cancer
subtypes have different survival prognosis [22]. We made a Pearson correlation
analysis in different histopathological cervical cancer subtypes. As shown in
Supplementary Fig. 1 and Supplementary
Table 1, immune group, stromal group and histological type were statistically
correlated (p
To study the relationship between gene expression profile and stromal/immune scores, we evaluated RNA-Seq data from all 306 cervical cancer patients in TCGA. We identified 1225 and 1112 DEGs in accordance with the stromal and Immune scores, respectively (Fig. 2A). We also used the heat map to delineated the DEGs of the low compared to high immune scores/stromal scores groups (Fig. 2B). Besides, we obtained 8 up-regulated genes and 471 down-regulated genes from the immune/stromal subgroup (Fig. 2C). It was worth mentioning that immune scores was superior to stromal scores in the statistically significant differences in clinical characteristics of cervical cancer (Fig. 1). Thus, the DEGs of all subsequent analysis in this study was based on immune scores.
Comparison of gene expression profile with
stromal scores and Immune scores in Cervical cancer. (A) The differential genes
were identified based on stromal score and immune score. DEGs volcano plots of
low immune score group and high immune score group/stromal score group.
p
To further investigate the potential functions of these DEGs,
we performed a functional enrichment analysis of 930 genes down-regulated and 295
genes up-regulated in the low-immune scores group (Supplementary Table
2). The top-level GO terms of down-regulated genes identified included plasma
membrane, receptor activity, immune and inflammatory responses, and chemokine
activity (Fig. 2D–F). However, the functional enrichment analysis of
up-regulated genes didn’t meet our requirements of FDR
We used Kaplan-Meier survival curves to analysis the inherent
roles of DEGs in predicting the OS in cervical cancer patients. The log-rank test
indicated that of 930 down-regulated DEGs in the low-immune scores group, 363
(Supplementary Table 4) were positively in relation with OS (p
Conjunction in TCGA between individual DEGs expression and
cervical cancer OS. After compared the two gene expression groups (high in red;
low in green), we picked up the DEGs, then Kaplan-Meier survival curves were made
(log rank test p
In order to investigate the interaction between determined 363 DEGs and prognostic value, we built a PPI network, which consists of 230 nodes and 1043 edges based on the STRING tool. As showed in Fig. 4, the tops four significant modules were further accessed. And these modules LCK, CCR7, GIMAP6 and CTLA4 modules were named respectively for convenience of description. For the LCK module (Fig. 4A), it was formed by 201 edges containing 24 nodes, LCK, ZAP70, CD4, CD28, and CD86 were the nodes of note among them because they were most closely connected to others of the module. For the CCR7 module (Fig. 4B), the degree values were higher for CCR7, CXCR5, CCR5, GNGT2 and CXCR6 nodes. As showed in the GIMAP6 module, several immune-related protein (GIMAP) family genes tenanted the center of it, which include GIMAP6, GIMAP5, GIMAP7, GIMAP8, and GIMAP4 (Fig. 4C). In CTLA4 module, CTLA4, CD19, FOXP3 and TYROBP had higher degree values (Fig. 4D).
Top 4 PPI networks of LCK, CCR7, GIMAP6 and CTLA4 modules. In the PPI network, the gene expression Z-score was indicated by the node’s color, while the amount of proteins that interacted with a given protein were represented by the node’s size. (A) LCK module, (B) CCR7 module, (C) GIMAP6 module, and (D) CTLA4 module.
The functional aggregation of those 363 genes were also closely related to the
immune response. Totally, 10 cellular components, 41 biological processes and 9
molecular functions of GO terms were found to be important (-log FDR
Significant analysis of DEGs for GO term and
KEGG pathway. (A) Cellular component, (B) biological process, (C) molecular
function, and (D) KEGG pathway (Top pathways with FDR
For determination of the prognostic importance of these genes among TCGA in other cervical cancer cases, we analyzed gene expression data of 300 cervical cancer cases from GEO. Based on the GSE44001 sequence, a total of 9 genes (CX3CL1, SCML4, LYZ, FGD2, SLAMF6, GIMAP7, CCL19, SELP and POU2AF1) have been shown to be significantly associated with prognosis of cervical cancer (Fig. 6).
The prognostic value of genes verified in GEO
database. To demonstrate the prognostic value of genes (high in red;
low in green), Cox
Regression survival curves were made by a log-rank test. p
In this study, we analyzed the micro-environment of cervical cancer and identified the relevant genes with prognostic value in TCGA. We respectively identified 1225 and 1112 DEGs through stromal and immune scores. Next, we analyzed these DEGs through GO terms analysis, enrichment analysis of signaling pathway and construction of PPI network. Importantly, we also validated these potential prognostic genes in the GEO database.
First, we identified 1225 differentially expressed genes from low vs. high immune scores groups, partially showed connected with the tumor micro-environment, which was shown in Fig. 2 by GO term analysis. An algorithm named ESTIMATE was used to determine Stromal scores or immune scores for patients with cervical cancer, which indicated that these scores were particularly connect to the categorization of cervical cancer subtypes. Besides, immune scores were significantly correlated to OS in different pathological stages of cancer.
Next, we analyzed the OS of 1225 genes, and found that 363 genes were positively correlated with OS of cervical cancer. In addition, we constructed 4 PPI modules in Fig. 4, which were all associated with immune/inflammatory responses. As previously reported, the nodes in modules which were greatly interconnected that include LCK, CCR7, and GIMAP4, played an vital role in the function of mature T-cells and the selection and maturation of developing T-cells in the thymus [23, 24]. They also were regulatory factors that mediate Epstein-Barr virus (EBV) effecting on B lymphocytes [25, 26]. Moreover, these nodes played a vitally role in the regulation of apoptosis [27, 28].
Finally, we found that gene expression of 9 tumor micro-environment related genes was significantly correlated with prognosis through the validation of GEO in an independent cohort of 300 cervical cancer patients (Fig. 6). Two (LYZ and CCL19) of the 9 identified genes, were reported to be related to cervical cancer proliferation or OS prediction [29], which indicated the predictive values for our big data through analysis TCGA and GEO databases. The remaining 7 genes (CX3CL1, SCML4, FGD2, SLAMF6, GIMAP7, SELP, POU2AF1) were not obviously correlated with the cervical cancer prognosis in previous reports and their roles in cervical cancer deserve further investigation.
Conclusively, we conducted an integrated bioinformatics analysis of the cervical cancer, which focused on the immune micro-environment. Meanwhile, we determined, explored, and verified common DEGs to illuminate the prognostic value of these DEGs in cervical cancer patients. The genes that we identified in this study would be potential target for further investigation of cervical cancer. Our results added to the current understanding of the interaction relationship between cervical cancer and the immune micro-environment, which may provide new possibilities for therapeutic and prognostic targets.
The gene expression profile and clinical profiles of cervical cancer patients was obtained from the TCGA (https://tcga-data.nci.nih.gov/tcga/) and GEO Datasets (https://www.ncbi.nlm.nih.gov/).
Among the authors in the list, PQZ and ML designed the research. PSC, WJW and XXY performed the research and wrote the manuscript. XXY, PSC and MZH improved the draft. And all authors read and approved the final manuscript.
Ethics approval was not applicable because TCGA and GEO data is publicly available and without specific identifiers.
We would like to express our gratitude to all those who helped us during the writing of this manuscript. Thanks to all the peer reviewers for their opinions and suggestions.
This research received no external funding.
The authors declare no conflict of interest.