IMR Press / CEOG / Volume 49 / Issue 6 / DOI: 10.31083/j.ceog4906144
Open Access Original Research
Identification and Validation of LYZ and CCL19 as Prognostic Genes in the Cervical Cancer Micro-Environment
Show Less
1 Department of Laboratory Medicine, The First Affiliated Hospital, Sun Yat-sen University, 510000 Guangzhou, Guangdong, China
2 Traditional Chinese Medicine Hospital of Guangdong Province, Guangzhou University of Traditional Chinese Medicine, 510000 Guangzhou, Guangdong, China
3 Gynecology Department, Guangdong Women and Children Hospital, 511442 Guangzhou, Guangdong, China
*Correspondence: (Peisong Chen); (Min Liu)
These authors contributed equally.
Academic Editor: Andrea Ciavattini
Clin. Exp. Obstet. Gynecol. 2022 , 49(6), 144;
Submitted: 25 January 2022 | Revised: 19 April 2022 | Accepted: 5 May 2022 | Published: 17 June 2022
Copyright: © 2022 The Author(s). Published by IMR Press.
This is an open access article under the CC BY 4.0 license.

Backgrounds: Cervical cancer was a primary epithelial malignant tumor in the cervix, which was one of the most common malignant tumor in gynecology. We aimed to investigate the relation of tumor microenvironment and the prognosis of cervical cancer patients. Methods: We conducted an extensive bioinformatics analysis aims to study the correlation between stromal/immune cells and the prognosis of cervical cancer. In order to investigate the associations between genes and overall survival (OS) of cervical cancer. We performed large-scale data analysis through a global gene expression profile. We analyzed the expression profile of cervical cancer using the Cancer Genome Atlas (TCGA) database. An immune score and stromal score depending on the estimation algorithm which can quantify the stromal or immune components of cervical cancer was obtained. Based on that, we divided the cervical cancer patients in the TCGA database into high- and low-score groups, and then the identified different expression genes (DEGs) that expression associated with cervical cancer patient’s prognosis was identified. After that, we generated protein-protein interaction (PPI) networks and interrelationship analyses of the immune system by performing functional enrichment analysis. Results: Our study showed that these 363 genes were primarily associated with immune/inflammatory responses. Meanwhile, Gene Expression Omnibus (GEO) confirmed that 9 genes (CX3CL1, SCML4, LYZ, FGD2, SLAMF6, GIMAP7, CCL19, SELP and POU2AF1) were significantly associated with cervical cancer prognosis. Conclusions: We have made a list of genes related to tumor microenvironment which would be potential biomarkers for the prognosis of cervical cancer patients.

cervical cancer
immune micro-environment
bioinformatics analysis
1. Background

Cervical cancer ranks third among women’s cancers in the whole world. About 528,000 people suffers from cervical cancer annually, and 266,000 people died of the disease [1, 2]. It is a primary epithelial malignant tumor of cervix, which referred to cancer occurring in the cervix and arise from cervical intraepithelial lesions [3]. Long-term persistent HPV infection significantly increases the incidence of cervical cancer [4, 5, 6]. Without treatment, about a third of Cervical intraepithelial neoplasia (CIN) 3 will become invasive cervical cancer within 30 years [7, 8, 9]. Cervical cancer was a disease that can be prevented and cured for the following reasons: (1) Understanding the main causes of human papillomavirus (HPV) infection [10]; (2) Careful screening and follow-up, active treatment of precancerous lesions [11]; (3) Early diagnosis can lead to a cure [12]. Cervical cytology was undoubtedly the best screening method, although it was the first method used for cancer screening. The purpose of cervical cytology is to detect cancer cells at early stage of cervical cancer [13]. However, the debate was on the operation of a randomized controlled trial (RCT). Most RCT was not always feasible because of the limitation of ethics and cost. It was also different between developed and developing countries [14]. As a result, cervical cancer remains one of the leading causes of cancer deaths worldwide [15], and the need for high level clinical and research evidence was still a key issue to address.

The micro-environment of malignant tumors was composed of immune cells, stromal cells, extracellular stromal molecules and inflammatory mediators [16]. The balance alteration of the immune environment resulted in immunosuppression, which allowed tumor avoidance and immune monitoring avoidance [17]. From gene expression profiles, the ESTIMATE algorithm was used to determine stromal or immune cell ratios in tumor samples [18]. The ESTIMATE algorithm has been applied to hepatocellular carcinoma [19], glioma [20] and acute promyelocytic leukemia [21] as a newly developed algorithm based on tumor micro-environment. However, whether the ESTIMATE algorithm can evaluate the prognosis of cervical cancer was still unclear.

In this study, we aimed to obtain a complete genes expression profile and summarize key genes related with micro-environment in order to predict the prognosis of cervical cancer patients.

2. Methods
2.1 Data Acquisition

Based on the Cancer Genome Atlas (TCGA) data, gene expression profiles and clinical profiles were generated from cervical cancer patients ( And the gene expression RNAseq was obtained from Affymetrix HT-HGU133A (March 15, 2020). We created stromal scores and immune scores by downloaded estimation algorithm database. Finally, we used the Gene Expression Omnibus (GEO) datasets to obtain gene expression profiles of cervical cancer patients for verification (

2.2 Identification of Different Expression Genes (DEGs)

The cervical cancers patients were classified into high stromal/immune groups and low stromal/immune groups according Estimate scores. We analyzed differentially expressed genes using the Limma software package. Differentially expressed genes were screened with adj. p < 0.05 and fold change >1.

2.3 Heatmap Analysis and Cluster Analysis

ClustVwas was used to draw heatmaps and cluster plots.

2.4 Protein-protein Interaction (PPI) Network Analysis

The database STRING was used for PPI network analysis, Cytoscape software was used to reconstructed it subsequently. Cytoscape is an open source software platform for visualizing complex networks and integrating these with any type of attribute data. The connectivity degree was analyzed by The MCODE plug-in in Cytoscape of each node of the network.

2.5 Overall Survival (OS) Curve

The association between the DEGs and OS were illustrated by Kaplan-Meier plots. Log-rank test was applied to access the statistical significance.

2.6 DEGs Enrichment Analysis

We use DAVID (The Database for Annotation, Visualization and Integrated Discovery) for functional enrichment analysis of DEGs, which include their biological processes (BP), molecular functions (MF), or cellular components (CC). Meanwhile, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways analysis was also conducted in DAVID. The cut-off of false discovery rate (FDR) was <0.05.

3. Results

Stromal and immune scores were particularly relevant to the clinical characters of cervical cancer. We extracted the full gene expression profiles and clinical information of all 306 cervical cancer patients from the TCGA database. The age of subjects for initial pathological diagnosis was 20~88 years, and the median age was 47.44 years. Histological types of cervical cancer in these patients included adenosquamous (6, 2.0%), squamous cell carcinoma (253, 82.6%), and adenocarcinoma (47, 15.4%). The ESTIMATE algorithm showed that the stromal score ranged from 307.40 to 2568.14, meanwhile, the immune score ranged from 1556.77 to 2086.32. In addition, the scores of stromal and immune were significantly correlated with histological type (Fig. 1A,B).

Fig. 1.

Immune scores were associated with Cervical cancer subtypes and OS. (A,B) Distribution of stromal scores and Immune scores for Cervical cancer subtypes. (C) The higher immune scores were found to be statistical related to longer OS through the Kaplan-Meier survival curves analysis (log-rank test, p = 0.0243). (D) The median OS of low stromal score group were shorter than the high group, but showed no significant difference (log-rank test, p = 0.3041). (E) Distribution of immune scores for different cancers pathological stage (M0: distant transfer; M1: no distant transfer; p = 0.0226). (F) Distribution of stromal scores for different cancers pathological stage (M0: distant transfer; M1: no distant transfer; p = 0.0558).

To investigate possible associations between stromal or immune system scores and OS, we divided 306 cervical cancer patients into higher and lower halves (low score and high score) according to their scores. Forty-three cases with OS less than 30 days or more than 3000 days were excluded. The Kaplan-Meier survival curve (Fig. 1C) indicated that compared to patients with a low immune score, patients with high immune score had a longer median OS time (2520 vs. 2032 days, p = 0.0243). Besides, the median OS time of high stromal group was longer than the low group (2520 vs. 2052 days, p = 0.3041), although it was not statistically significant (Fig. 1D). In addition, the immune score correlated significantly with the metastatic status (Fig. 1E,F).

A recent meta-analysis showed that different histopathological cervical cancer subtypes have different survival prognosis [22]. We made a Pearson correlation analysis in different histopathological cervical cancer subtypes. As shown in Supplementary Fig. 1 and Supplementary Table 1, immune group, stromal group and histological type were statistically correlated (p < 0.05). However, the correlation between immune group and histological type (r value: –0.22), stromal group and histological type (r value: –0.13) was weak. The weak correlation showed that immune group and histological type, stromal group and histological type could be used as a complementary in the prediction of prognosis.

3.1 Comparison of DEGs via Stromal Scores and Immune Scores in Cervical Cancer

To study the relationship between gene expression profile and stromal/immune scores, we evaluated RNA-Seq data from all 306 cervical cancer patients in TCGA. We identified 1225 and 1112 DEGs in accordance with the stromal and Immune scores, respectively (Fig. 2A). We also used the heat map to delineated the DEGs of the low compared to high immune scores/stromal scores groups (Fig. 2B). Besides, we obtained 8 up-regulated genes and 471 down-regulated genes from the immune/stromal subgroup (Fig. 2C). It was worth mentioning that immune scores was superior to stromal scores in the statistically significant differences in clinical characteristics of cervical cancer (Fig. 1). Thus, the DEGs of all subsequent analysis in this study was based on immune scores.

Fig. 2.

Comparison of gene expression profile with stromal scores and Immune scores in Cervical cancer. (A) The differential genes were identified based on stromal score and immune score. DEGs volcano plots of low immune score group and high immune score group/stromal score group. p < 0.05 genes were expressed in red (fold change >1) and green (fold change <1) respectively. The black plots show the remaining genes (no significant difference). (B) Heat maps of the DEGs of upper half (high scores) vs. lower half (low scores) of the immune score/stromal score groups (fold change >1, p < 0.05). Heat maps were made according to the mean linkage method and Pearson distance measurement method. The lower expressed genes were marked in green, while the higher expressed genes were marked in red. (C) Commonly changed DEGs in the immune/stromal groups (8 up- and 471 down-regulated genes). (D–F) Top 10 gene ontology (GO) terms. False discovery rate (FDR) of GO analysis was acquired from DAVID functional annotation tool (p < 0.05).

To further investigate the potential functions of these DEGs, we performed a functional enrichment analysis of 930 genes down-regulated and 295 genes up-regulated in the low-immune scores group (Supplementary Table 2). The top-level GO terms of down-regulated genes identified included plasma membrane, receptor activity, immune and inflammatory responses, and chemokine activity (Fig. 2D–F). However, the functional enrichment analysis of up-regulated genes didn’t meet our requirements of FDR <0.05 (Supplementary Table 3).

3.2 DEGs in Predicting OS Based on TCGA

We used Kaplan-Meier survival curves to analysis the inherent roles of DEGs in predicting the OS in cervical cancer patients. The log-rank test indicated that of 930 down-regulated DEGs in the low-immune scores group, 363 (Supplementary Table 4) were positively in relation with OS (p < 0.05). The characteristic genes were shown in Fig. 3.

Fig. 3.

Conjunction in TCGA between individual DEGs expression and cervical cancer OS. After compared the two gene expression groups (high in red; low in green), we picked up the DEGs, then Kaplan-Meier survival curves were made (log rank test p < 0.05) (log rank test p < 0.05). (A) CD6, (B) MAP4K1, (C) SELP, (D) DPEP2, (E) ZNF831, (F) LOC100188949, (G) CCR7, (H) LILRA4, (I) NLRC3, (J) TTC24, (K) CHIT1 and (L) SNAI3.

3.3 PPI among Prognostic Genes

In order to investigate the interaction between determined 363 DEGs and prognostic value, we built a PPI network, which consists of 230 nodes and 1043 edges based on the STRING tool. As showed in Fig. 4, the tops four significant modules were further accessed. And these modules LCK, CCR7, GIMAP6 and CTLA4 modules were named respectively for convenience of description. For the LCK module (Fig. 4A), it was formed by 201 edges containing 24 nodes, LCK, ZAP70, CD4, CD28, and CD86 were the nodes of note among them because they were most closely connected to others of the module. For the CCR7 module (Fig. 4B), the degree values were higher for CCR7, CXCR5, CCR5, GNGT2 and CXCR6 nodes. As showed in the GIMAP6 module, several immune-related protein (GIMAP) family genes tenanted the center of it, which include GIMAP6, GIMAP5, GIMAP7, GIMAP8, and GIMAP4 (Fig. 4C). In CTLA4 module, CTLA4, CD19, FOXP3 and TYROBP had higher degree values (Fig. 4D).

Fig. 4.

Top 4 PPI networks of LCK, CCR7, GIMAP6 and CTLA4 modules. In the PPI network, the gene expression Z-score was indicated by the node’s color, while the amount of proteins that interacted with a given protein were represented by the node’s size. (A) LCK module, (B) CCR7 module, (C) GIMAP6 module, and (D) CTLA4 module.

3.4 GO and Pathway Enrichment Analyses of Genes with Prognostic Value

The functional aggregation of those 363 genes were also closely related to the immune response. Totally, 10 cellular components, 41 biological processes and 9 molecular functions of GO terms were found to be important (-log FDR >1 or FDR <0.05). As showed in Fig. 5A–C, top terms consisted of external plasma membrane, immune response, adaptive immune response, and receptor activity. As showed in Fig. 5D, we analyzed all pathways through the KEGG, that indicated the correlation between primary immunodeficiency and inflammatory bowel disease.

Fig. 5.

Significant analysis of DEGs for GO term and KEGG pathway. (A) Cellular component, (B) biological process, (C) molecular function, and (D) KEGG pathway (Top pathways with FDR <0.05, -log FDR >1 were shown).

3.5 Validation Used by GEO Database

For determination of the prognostic importance of these genes among TCGA in other cervical cancer cases, we analyzed gene expression data of 300 cervical cancer cases from GEO. Based on the GSE44001 sequence, a total of 9 genes (CX3CL1, SCML4, LYZ, FGD2, SLAMF6, GIMAP7, CCL19, SELP and POU2AF1) have been shown to be significantly associated with prognosis of cervical cancer (Fig. 6).

Fig. 6.

The prognostic value of genes verified in GEO database. To demonstrate the prognostic value of genes (high in red; low in green), Cox Regression survival curves were made by a log-rank test. p < 0.05 were statistical difference. (A) CX3CL1, (B) SCML4, (C) LYZ, (D) FGD2, (E) SLAMF6, (F) GIMAP7, (G) CCL19, (H) SELP and (I) POU2AF1.

4. Discussion

In this study, we analyzed the micro-environment of cervical cancer and identified the relevant genes with prognostic value in TCGA. We respectively identified 1225 and 1112 DEGs through stromal and immune scores. Next, we analyzed these DEGs through GO terms analysis, enrichment analysis of signaling pathway and construction of PPI network. Importantly, we also validated these potential prognostic genes in the GEO database.

First, we identified 1225 differentially expressed genes from low vs. high immune scores groups, partially showed connected with the tumor micro-environment, which was shown in Fig. 2 by GO term analysis. An algorithm named ESTIMATE was used to determine Stromal scores or immune scores for patients with cervical cancer, which indicated that these scores were particularly connect to the categorization of cervical cancer subtypes. Besides, immune scores were significantly correlated to OS in different pathological stages of cancer.

Next, we analyzed the OS of 1225 genes, and found that 363 genes were positively correlated with OS of cervical cancer. In addition, we constructed 4 PPI modules in Fig. 4, which were all associated with immune/inflammatory responses. As previously reported, the nodes in modules which were greatly interconnected that include LCK, CCR7, and GIMAP4, played an vital role in the function of mature T-cells and the selection and maturation of developing T-cells in the thymus [23, 24]. They also were regulatory factors that mediate Epstein-Barr virus (EBV) effecting on B lymphocytes [25, 26]. Moreover, these nodes played a vitally role in the regulation of apoptosis [27, 28].

Finally, we found that gene expression of 9 tumor micro-environment related genes was significantly correlated with prognosis through the validation of GEO in an independent cohort of 300 cervical cancer patients (Fig. 6). Two (LYZ and CCL19) of the 9 identified genes, were reported to be related to cervical cancer proliferation or OS prediction [29], which indicated the predictive values for our big data through analysis TCGA and GEO databases. The remaining 7 genes (CX3CL1, SCML4, FGD2, SLAMF6, GIMAP7, SELP, POU2AF1) were not obviously correlated with the cervical cancer prognosis in previous reports and their roles in cervical cancer deserve further investigation.

5. Conclusions

Conclusively, we conducted an integrated bioinformatics analysis of the cervical cancer, which focused on the immune micro-environment. Meanwhile, we determined, explored, and verified common DEGs to illuminate the prognostic value of these DEGs in cervical cancer patients. The genes that we identified in this study would be potential target for further investigation of cervical cancer. Our results added to the current understanding of the interaction relationship between cervical cancer and the immune micro-environment, which may provide new possibilities for therapeutic and prognostic targets.

Availability of Data and Materials

The gene expression profile and clinical profiles of cervical cancer patients was obtained from the TCGA ( and GEO Datasets (

Author Contributions

Among the authors in the list, PQZ and ML designed the research. PSC, WJW and XXY performed the research and wrote the manuscript. XXY, PSC and MZH improved the draft. And all authors read and approved the final manuscript.

Ethics Approval and Consent to Participate

Ethics approval was not applicable because TCGA and GEO data is publicly available and without specific identifiers.


We would like to express our gratitude to all those who helped us during the writing of this manuscript. Thanks to all the peer reviewers for their opinions and suggestions.


This research received no external funding.

Conflict of Interest

The authors declare no conflict of interest.

Ferlay J, Soerjomataram I, Dikshit R, Eser S, Mathers C, Rebelo M, et al. Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012. International Journal of Cancer. 2015; 136: E359–386.
Small W, Bacon MA, Bajaj A, Chuang LT, Fisher BJ, Harkenrider MM, et al. Cervical cancer: a global health crisis. Cancer. 2017; 123: 2404–2412.
Burd EM. Human Papillomavirus and Cervical Cancer. Clinical Microbiology Reviews. 2003; 16: 1–17.
Chesson HW, Dunne EF, Hariri S, Markowitz LE. The Estimated Lifetime Probability of Acquiring Human Papillomavirus in the United States. Sexually Transmitted Diseases. 2014; 41: 660–664.
Winer RL, Hughes JP, Feng Q, Xi LF, Cherne S, O’Reilly S, et al. Early Natural History of Incident, Type-Specific Human Papillomavirus Infections in Newly Sexually Active Young Women. Cancer Epidemiology Biomarkers & Prevention. 2011; 20: 699–707.
Woodman CBJ, Collins SI, Young LS. The natural history of cervical HPV infection: unresolved issues. Nature Reviews Cancer. 2007; 7: 11–22.
Crosbie EJ, Einstein MH, Franceschi S, Kitchener HC. Human papillomavirus and cervical cancer. The Lancet. 2013; 382: 889–899.
Ghosh I, Mittal S, Banerjee D, Chowdhury N, Basu P. Study of Correlation of Cervical Epithelial Thickness with the Grade of Colposcopic Abnormality. International Journal of Gynecological Pathology. 2016; 35: 269–274.
McCredie MR, Sharples KJ, Paul C, Baranyai J, Medley G, Jones RW, et al. Natural history of cervical neoplasia and risk of invasive cancer in women with cervical intraepithelial neoplasia 3: a retrospective cohort study. The Lancet Oncology. 2008; 9: 425–434.
Wentzensen N, Arbyn M. HPV-based cervical cancer screening- facts, fiction, and misperceptions. Preventive Medicine. 2017; 98: 33–35.
Mezei AK, Armstrong HL, Pedersen HN, Campos NG, Mitchell SM, Sekikubo M, et al. Cost-effectiveness of cervical cancer screening methods in low- and middle-income countries: a systematic review. International Journal of Cancer. 2017; 141: 437–446.
William W, Ware A, Basaza-Ejiri AH, Obungoloch J. A review of image analysis and machine learning techniques for automated cervical cancer screening from pap-smear images. Computer Methods and Programs in Biomedicine. 2018; 164: 15–22.
Kanyina EW, Kamau L, Muturi M. Cervical precancerous changes and selected cervical microbial infections, Kiambu County, Kenya, 2014: a cross sectional study. BMC Infectious Diseases. 2017; 17: 647.
Castellsagué X. Natural history and epidemiology of HPV infection and cervical cancer. Gynecologic Oncology. 2008; 110: S4–S7.
Torre LA, Bray F, Siegel RL, Ferlay J, Lortet-Tieulent J, Jemal A. Global cancer statistics, 2012. CA: A Cancer Journal for Clinicians. 2015; 65: 87–108.
De Nola R, Menga A, Castegna A, Loizzi V, Ranieri G, Cicinelli E, et al. The Crowded Crosstalk between Cancer Cells and Stromal Microenvironment in Gynecological Malignancies: Biological Pathways and Therapeutic Implication. International Journal of Molecular Sciences. 2019; 20: 2401.
Solomon B, Young RJ, Rischin D. Head and neck squamous cell carcinoma: Genomics and emerging biomarkers for immunomodulatory cancer treatments. Seminars in Cancer Biology. 2018; 52: 228–240.
Yoshihara K, Shahmoradgoli M, Martínez E, Vegesna R, Kim H, Torres-Garcia W, et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nature Communications. 2013; 4: 2612.
Pan L, Fang J, Chen M, Zhai S, Zhang B, Jiang Z, et al. Promising key genes associated with tumor microenvironments and prognosis of hepatocellular carcinoma. World Journal of Gastroenterology. 2020; 26: 789–803.
Jia D, Li S, Li D, Xue H, Yang D, Liu Y. Mining TCGA database for genes of prognostic value in glioblastoma microenvironment. Aging. 2018; 10: 592–605.
Yan H, Qu J, Cao W, Liu Y, Zheng G, Zhang E, et al. Identification of prognostic genes in the acute myeloid leukemia immune microenvironment based on TCGA data analysis. Cancer Immunology, Immunotherapy. 2019; 68: 1971–1978.
Yao G, Qiu J, Zhu F, Wang X. Survival of Patients with Cervical Cancer Treated With Definitive Radiotherapy or Concurrent Chemoradiotherapy According to Histological Subtype: A Systematic Review and Meta-Analysis. Frontiers in Medicine. 2022; 9: 843262.
Horkova V, Drobek A, Mueller D, Gubser C, Niederlova V, Wyss L, et al. Dynamics of the Coreceptor-LCK Interactions during T Cell Development Shape the Self-Reactivity of Peripheral CD4 and CD8 T Cells. Cell Reports. 2020; 30: 1504–1514.e7.
Zheng M, Li D, Zhao Z, Shytikov D, Xu Q, Jin X, et al. Protein phosphatase 2A has an essential role in promoting thymocyte survival during selection. Proceedings of the National Academy of Sciences. 2019; 116: 12422–12427.
Pimienta G, Fok V, Haslip M, Nagy M, Takyar S, Steitz JA. Proteomics and Transcriptomics of BJAB Cells Expressing the Epstein-Barr Virus Noncoding RNAs EBER1 and EBER2. PLoS ONE. 2015; 10: e0124638.
Messinger JE, Dai J, Stanland LJ, Price AM, Luftig MA. Identification of Host Biomarkers of Epstein-Barr Virus Latency IIb and Latency III. mBio. 2019; 10: e01006-19.
Heinonen MT, Kanduri K, Lähdesmäki HJ, Lahesmaa R, Henttinen TA. Tubulin- and actin-associating GIMAP4 is required for IFN-gamma secretion during Th cell differentiation. Immunology and Cell Biology. 2015; 93: 158–166.
Harr MW, Caimi PF, McColl KS, Zhong F, Patel SN, Barr PM, et al. Inhibition of Lck enhances glucocorticoid sensitivity and apoptosis in lymphoid cell lines and in chronic lymphocytic leukemia. Cell Death & Differentiation. 2010; 17: 1381–1391.
Wang Q, Li P, Wu W. A systematic analysis of immune genes and overall survival in cancer patients. BMC Cancer. 2019; 19: 1225.
Publisher’s Note: IMR Press stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Back to top