Integrative Analysis of Ferroptosis-Related Genes in Small Cell Lung Cancer for the Identification of Biomarkers and Therapeutic Targets

Background : Ferroptosis is an iron-dependent programmed cell death mode induced by the toxic buildup of phospholipid peroxidation. Although it is known to affect the initiation and growth of tumors, the association between ferroptosis-related genes (FRGs) and small cell lung cancer (SCLC) has yet to be established. Methods : We used the gene expression omnibus (GEO) and ferroptosis database (FerrDb) to acquire information on SCLC and its associated FRGs. Marker genes were subsequently identified using Least Absolute Shrinkage and Selection Operator (LASSO) and support vector machine recursive feature eilmination (SVM-RFE) algorithms and analyzed for single-gene function and pathway enrichment. Using the drug-gene interaction database (DGIdb), we identified forty drugs targeting six marker genes. The competing endogenous RNA (ceRNA) network revealed the regulation pattern for long non-coding RNA (LncRNA)- microRNA (miRNA)-messenger RNA (mRNA) based on marker genes. Results : Six differentially expressed FRGs ( ATG3 , MUC1 , RRM2 , IDH2 , PARP1 , and EZH2 ) were identified as marker genes with accurate diagnostic capabilities. According to single-gene function and pathway enrichment analyses, these marker genes may be involved in immunomodulation and the cell cycle, as well as numerous pathways connected to tumorigenesis, including the JAK-STAT and PPAR signal pathways. In addition, CIBERSORT analysis showed that MUC1 and PARP1 expression may affect the immune microenvironment in SCLC. Conclusions : We confirmed the accuracy of marker genes for the diagnosis of SCLC using a logistic regression model, thus providing further opportunities to study SCLC-related mechanisms. The accuracy of these results for the diagnosis of SCLC must now be confirmed by further research prior to clinical application.


Introduction
Lung cancer is the most lethal cancer type in the world [1]. Small cell lung cancer (SCLC) is a highly malignant tumor type that accounts for about 15-20% of all lung cancers [2,3]. The 5-year survival rate for SCLC is <10% due to its rapid growth and frequent metastasis [4]. Immunotherapy with PD-L1 antibody in combination with platinumetoposide has gradually become the standard first-line treatment for extensive SCLC [5,6]. Although most patients with SCLC initially respond well, they soon develop resistance [7], and hence only a few patients benefit from immunotherapy. Therefore, novel biomarkers that can help to treat SCLC are urgently needed.
Ferroptosis is a newly recognized mode of programmed cell death identified several years ago. It manifests mainly as the accumulation of ferritin and lipid peroxide on the membrane [8]. Ferroptosis differs from apoptosis and necrosis in that it characteristically shows mitochondrial shrinkage and a decrease in the number of mitochondrial cristae [9]. Many diseases have been linked to ferroptosis, including neurological disorders, coronary heart disease, ischemia/reperfusion injury, and different types of tumors [10]. Abnormal metabolism is the most important factor that determines sensitivity to ferroptosis [11,12]. In particular, deficiency of the antioxidant glutathione and inactivation of glutathione peroxidase 4 (GPX4) can increase lipid peroxidation, thereby inducing ferroptosis [13]. Ferroptosis also plays a significant and complex role in cancer cells. Some studies have reported that it can promote tumor growth by inhibiting anti-tumor immunity [14]. However, other authors have shown that ferroptosis could also inhibit tumor growth and induce resistance to chemotherapy drugs [15].
The involvement of ferroptosis in the development of lung cancer has been confirmed. Lung cancer patients have higher serum iron and ferritin levels than healthy patients, as well as higher total iron binding capacity. The serum iron concentration has been positively correlated with the incidence of lung cancer [16]. GPX4 is highly expressed in non-small cell lung cancer (NSCLC) cell lines A549 and NCI-H460, and inhibition of GPX4 can induce ferroptosis in lung cancer cells [17]. In addition, ferroptosis has been associated with some of the signaling pathways related to SCLC. JAK2/STAT3 inactivation could cause GPX4dependent ferroptosis by promoting the expression of p53 [18]. Moreover, the PI3K/AKT/HIF-1α axis can enhance the resistance of glioma to ferroptosis by up-regulating SLC7A11 [19]. However, the specific relationship between ferroptosis and SCLC has not been extensively studied. It is still unknown whether ferroptosis-related genes (FRGs) could help in the diagnosis and treatment of SCLC. Therefore, this bioinformatic study investigated whether FRGs could be novel and effective biomarkers of SCLC. We also evaluated their impact on the immune microenvironment of SCLC.

Data Acquisition
Gene expression information for SCLC and normal samples were gathered from the gene expression omnibus (GEO) database. GSE149507 was selected as the training cohort, with 18 SCLC samples and 18 normal samples. GSE108055 was selected as the test cohort, with 12 SCLC samples and 10 normal samples. FRGs (n = 355) utilized for this research were identified from the ferroptosis database (FerrDb). Supplementary Table 1 shows the gene list. Potential candidate-targeted drugs for marker genes were predicted through the drug-gene interaction database (DGIdb, https://www.dgidb.org/).

Identification of Differentially Expressed Ferroptosis-Related Genes
Expression data on 204 FRGs from SCLC and normal samples were obtained from GSE149507. Differentially Expressed Ferroptosis-Related Genes (DE-FRGs) between SCLC and normal samples were assessed by Student's t-test in R (Supplementary Table 2). Genes with p < 0.05 were deemed to be statistically significant, and relationships between the DE-FRGs were examined using Pearson's correlation analysis.

Function and Pathway Enrichment Analysis
The R language software package V4.2.0 (https://ww w.r-project.org/) and https://www.reactome.org were used to perform gene ontology (GO) and Reactome enrichment, respectively, for analysis of the functions and pathways of DE-FRGs.

Identification of the Best Gene Biomarkers for SCLC Diagnosis
The best diagnostic gene markers for SCLC were identified using support vector machine recursive feature elimination (SVM-RFE) together with the least absolute shrinkage and selection operator (LASSO). LASSO regression algorithm improves the prediction accuracy by using regularization, which was implemented using the "glmnet" R package. The SVM-RFE model was then set up by the SVM package and compared with 10-fold cross-validations by using mean misjudgment rates. The overlapping biomarkers revealed by these two algorithms were determined as the best gene biomarkers for SCLC. The diagnostic efficacy of the best marker genes was assessed from the plot of the receiver operating characteristic (ROC) curve and the com-putation of the area under the curve (AUC). Meanwhile, a logistic regression model with all marker genes was established using "glm" R package to distinguish SCLC samples from normal samples in GSE149507. The ROC curve was used to assess the diagnostic capability of this logistic regression model.

Single-Gene Sets Enrichment Analysis
To study functional enrichment and pathway enrichment, we analyzed the relationship between the 6 candidate marker genes identified previously and the remaining genes in GSE149507 using R package gene sets enrichment analysis (GSEA) (V.4.1.0). All genes from high to low were then sorted according to their correlation with marker genes. The sorted genes were used as gene sets in additional studies. Supplementary Tables 3,4 summarize the GO and KEGG enrichment data for each marker gene, respectively.

Single-Gene Sets Variation Analysis
Gene sets variation analysis (GSVA) analysis was performed using the R package GSVA (V.1.38.0). The difference in GSVA scores between high-expression and lowexpression marker genes was examined using the "Limma" R package. The filter condition was set to |t| > 2, p < 0.05. In the group with high expression, the pathway was deemed active when t > 0, while in the group with low expression, it was deemed active when t < 0.

Analysis of Infiltrating Immune Cells
Immune cells in the tumor microenvironment were evaluated in the GSE149507 dataset using CIBERSORT. The sum of the proportions of 22 infiltrating immune cells determined by CIBERSORT was 1. The association between immune cells and marker genes was subsequently examined using Pearson's correlation analysis. Specific information on the percentage of 22 types of infiltrating immune cells is shown in Supplementary Table 5.

Identification of DE-FRGs in SCLC Samples
A total of 204 DE-FRGs were found between SCLC and normal samples in the GSE149507 dataset, comprising 85 up-regulated genes and 119 down-regulated genes (Supplementary Table 1). A clustering heatmap shows how the expression of DE-FRGs varied across all samples ( Fig. 1), while a correlation heatmap shows how these DE-FRGs were related (Supplementary Fig. 1).

Analysis of the Biological Functions of DE-FRGs
GO function enrichment and Reactome enrichment analyses were performed to investigate DE-FRG-related functions and pathways. GO analysis results showed that BP (Biological Process) was enriched for the functions of: 'response to oxidative stress', 'cellular response to chemical stress', and 'response to nutrient levels'. Enrichment of CC (Cell Component) included 'transcription regulator complex'. Enrichment of MF (Molecular Function) contained 'DNA-binding transcription factor binding' and 'RNA polymerase II-specific DNA-binding transcription factor binding' (Fig. 2A,B). Reactome results showed that DE-FRGs were significantly enriched for 'signaling by Interleukins', 'Interleukin-4 and Interleukin-13 signaling', and 'fatty acid metabolism' (Fig. 2C). The 'autophagy' pathway and 'cellular response to chemical stress' pathway were also enriched. These results suggest that the effect of DE-FRGs on the progression of SCLC may be related to immune regulation, oxidative stress, transcription factors, and autophagy.

Six DE-FRGs May Serve as Potential Markers for SCLC Diagnosis
The aim of this work was to determine whether DE-FRGs could be used for the diagnosis of SCLC. LASSO regression and the SVM-RFE algorithm were performed to find the optimal DE-FRGs for distinguishing SCLC from normal tissue in the GSE149507 dataset. For LASSO regression analysis, the penalty parameters were tuned through ten-fold cross-validation, with eight SCLC characteristic genes identified (Fig. 3A,B). DE-FRGs were then filtered using the SVM-RFE algorithm, and 64 were considered to be characteristic genes (maximal accuracy = 0.942, minimal Root Mean Square Error (RMSE) = 0.0583) (Fig. 3C,D). In the ensuing analysis, six marker genes were identified as common genes by LASSO regression in conjunction with the SVM-RFE algorithm. These were the ATG3, MUC1, RRM2, IDH2, PARP1, and EZH2 genes (Fig. 3E).
Using the "glm" R package, the six marker genes    distinguish SCLC from normal tissue than single marker genes.

Expression Levels of Marker Genes and Confirmation in the Test Cohort
Next, GSE108055 was used as the test cohort to confirm the expression level of the six candidate marker genes. The gene expression patterns for EZH2, IDH2, MUC1, PARP1, and RRM2 observed in the test cohort were consistent with those observed in the training cohort. EZH2 (p = 3.1 × 10 −6 ), IDH2 (p = 2.2 × 10 −5 ), PARP1 (p = 3.1 × 10 −6 ) and RRM2 (p = 3.1 × 10 −6 ) had higher expression levels in SCLC samples than in normal samples, whereas MUC1 (p = 0.025) had lower expression in SCLC samples than in normal samples (Fig. 4A).
We then used the test cohort GSE108055 to confirm the accuracy of our logistic regression model. The ROC curve is shown in Fig. 4B,C. According to this result, the AUCs of EZH2, IDH2, MUC1, PARP1, and RRM2 were all >0.7, while the AUC for the sum of these genes was 1.
This confirms the logistic regression model developed earlier can specifically distinguish between SCLC and normal samples.

Marker Genes were Shown by GSEA and GSVA Analyses to be Tightly Connected to Several SCLC-Associated Pathways
We performed single-gene GSEA-KEGG pathway analysis and GSEA-GO function analysis to further evaluate the potential for the six marker genes to differentiate SCLC samples from normal samples. The top six results with the most significant enrichment are shown in the figures. From the GSEA-KEGG analysis, the six marker genes had strong associations with the cell cycle, cytokinecytokine receptor interactions, lysosome, hematopoietic cell lineage, complement and coagulation cascades, as well as some disease pathways (asthma, autoimmune thyroid disease, viral myocarditis, leishmaniasis infection). Moreover, the results showed that EZH2 and IDH2 were associated with the JAK-STAT signaling pathway, while ATG3 and PARP1 were associated with the intestinal immune network for IgA production (Supplementary Fig. 2A-F).
GSEA-GO function analysis revealed that with the exception of MUC1, the other five marker genes were all related to immune response (adaptive immune response, lymphocyte-mediated immunity, T-cell-mediated immunity, immune receptor activity, humoral immune response, and immune response regulation signal pathway) (Supplementary Fig. 3A-F).
Depending on the expression level of each marker gene, the pathways for different activation levels between SCLC and normal samples were analyzed by GSVA. GSVA-KEGG pathway analysis revealed that EZH2 overexpression was associated with 'ARACHIDONIC ACID METABOLISM', 'JAK STAT SIGNALING PATH-WAY', 'PPAR SIGNALING PATHWAY', and 'TYRO-SINE METABOLISM'. IDH2 up-regulation was associated with 'PPAR SIGNALING PATHWAY' and 'TYROSINE METABOLISM', while PARP up-regulation was linked to the 'JAK STAT SIGNALING PATHWAY' and 'PPAR SIGNALING PATHWAY'. Furthermore, down-regulation of RRM2 was associated with the 'PPAR SIGNALING PATHWAY', 'JAK STAT SIGNALING PATHWAY', and 'ARACHIDONIC ACID METABOLISM'. These pathways were found to have a strong connection to the occurrence and development of lung cancer. Moreover, IDH2 and PARP overexpression could activate the 'NOD-LIKE RECEPTOR SIGNALING PATHWAY' and the 'TOLL-LIKE RECEPTOR SIGNALING PATHWAY', both of which are related to the immune reaction. Down-regulation of RRM2 could also activate these two pathways. All of the above results are shown in Supplementary Fig. 4A-E. GSVA-GO functional analysis showed that overexpression of EZH2, IDH2, PARP1, and RRM2, and down-regulation of MUC1 were all partially related to the immune reaction, including the 'IGA_IMMUNOGLOBULIN_COMPLEX', 'INTERLEUKIN-21_PRODUCTION' and the 'IPAF_IN FLAMMASOME_COMPLEX' (Supplementary Fig.  5A-E).

Analysis of Infiltrating Immune Cells
GSEA and GSVA analysis showed a close connection between the candidate marker genes and immune response. The CIBERSORT algorithm was therefore used to investigate differences in infiltrating immune cells between SCLC and normal samples. As shown in Fig. 5A, SCLC samples had lower proportions of memory-resting CD4 T cells, naive B cells, monocytes, activated dendritic cells, resting mast cells, and neutrophils compared with normal samples. However, the proportion of gamma delta T cells, T follicular helper cells, and M1 macrophages were higher in SCLC than in normal samples.
Associations between immune cells and the six marker genes were subsequently investigated using Pearson's correlation analysis. MUC1 showed a negative correlation with activated mast cells and a positive correlation with resting mast cells. EZH2 was negatively correlated with neutrophils (Fig. 5B). These findings imply that MUC1 and EZH2 may be involved in determining the immune microenvironment of SCLC.

Construction of a Drug Regulatory Network for the Marker Genes
Targeted drugs for the marker genes were predicted by the DGIdb database, and two-parameter interaction relationships were set to defaults (Supplementary Table  6). Cytoscape software was used to visualize the results (Fig. 6). This analysis found 40 drugs that target the marker genes. Amongst these, four were for MUC1, twelve for RRM2, six for IDH2, twelve for PARP1, and seven for EZH2. Unfortunately, none were found that targeted ATG3. Inhibitors of EZH2 included CPI-1205 and TAZEMETOSTAT, while for IDH2, they included ENASI-DENIB. Inhibitors for PARP1 included NIRAPARIB, OLAPARIB, VELIPARIB, INIPARIB, TALAZOPARIB, RUCAPARIB CAMSYLATE, TALAZOPARIB TO-SYLATE, with the antagonist including RUCAPARIB and the binder including NIACINAMIDE. Inhibitors of RRM2 included HYDROXYUREA, FLUDARABINE PHOSPHATE, GALLIUM NITRATE, CLOFARABINE, GEMCITABINE HYDROCHLORIDE, GEMCITABINE, TEZACITABINE, and CLADRIBINE.

Construction of Marker Gene ceRNA Network
In order to construct a ceRNA network, we first used miRanda, miRDB, and Targetscan databases to predict miRNAs that bind to the six marker genes. We then predicted lncRNAs that bind to the miRNAs using the spongeScan database. Finally, a lncRNA-miRNA- messenger RNA (mRNA) ceRNA regulatory network containing 217 nodes and 245 edges was constructed (Supplementary Fig. 6).

Discussion
Compared with NSCLC, SCLC shows rapid tumor cell multiplication, high malignancy, early widespread metastasis, and a tendency for association with an abnormal endocrine syndrome [20]. Although SCLC is initially sensitive to chemotherapeutic drugs, it quickly develops drug resistance because of intratumoral heterogeneity, thereby leading to poor prognosis and recurrence [7]. Although the tumor mutational burden (TMB) of SCLC is very high, lymphocyte infiltration into the tumor microenvironment is less than that observed for other solid tumors. Moreover, the expression level of PDL1 is also low, meaning that patients with SCLC have limited sensitivity to immunotherapy [21]. Some studies have shown that different neuroendocrine (NE) cell subtypes are crucial for the resistance of SCLC to ferroptosis. Non-NE-SCLC is more sensitive to ferroptosis than NE-SCLC [22]. This may be because non-NE-SCLC has a mesenchymal phenotype and expresses EMT-related genes. The mesenchymal expression profile is a signature of the ferroptosis response by inducing lipid peroxidation [23].
Previous researchers built a ferroptosis-related prognostic risk scoring model for SCLC using RNA sequencing and clinical data from the cBioPortal database, and that includes thioredoxin interacting protein (TXNIP) [24]. However, there are still few studies on the impact of ferroptosis on SCLC. The aim of the present study was, therefore, to identify FRGs associated with SCLC through bioinformatics analysis. To increase the reliability of our results, we screened gene chips containing multiple sample data and then used another set of gene chips to verify the results. We identified six DE-FRGs through the use of two algorithms, namely ATG3, MUC1, RRM2, IDH2, PARP1, and EZH2. ROC analysis revealed their AUC was >0.7, indicating they could accurately and specifically distinguish SCLC samples from normal samples. Indeed, it is worth noting the AUC values for RRM2, IDH2, PARP1, and EZH2 were all >0.99. RRM2 is known to regulate nucleotide metabolism, protein expression, and DNA repair [25,26] and is crucial in the development of various tumor types. In liver cancer, RRM2 was shown to inhibit ferroptosis by promoting the expression of glutathione synthetase [27]. IDH2 is thought to induce the production of NADPH, a cofactor in the GSH-related mitochondrial antioxidant defense system [28]. Erastin is a small molecule compound that stimulates mitochondrial voltage-dependent anion channel (VDAC) to activate ferroptosis, thereby causing cysteine deficiency and depletion of GSH. Down-regulation of IDH2 can promote ferroptosis induced by erastin [29]. PARP1 is a member of the PARP (poly ADP-ribose polymerase) DNA repair enzyme family. It is overexpressed in a variety of tu-mors, including SCLC, and participates in DNA damage repair [30]. SCLC is sensitive to PARP1 inhibitors. The efficacy of PARP1 inhibitors when combined with chemotherapy drugs is significantly higher than that of chemotherapy drugs used alone, which may be related to defects in homologous recombination repair [31]. PARP1 inhibitors can also cause inhibition of GPX4 and increased expression of lipid peroxide, thus resulting in ferroptosis [32]. EZH2 is a catalytic component of polycomb-group protein 2 (PRC2), which is thought to be closely associated with the development and growth of various tumor types. EZH2 can also facilitate immune escape by preventing intratumoral antigen presentation, impeding immune cell migration, and increasing the inhibitory activity of CD4+ T regulatory cells (Tregs) [33]. Several studies have shown that EZH2 overexpression in SCLC can inhibit cell apoptosis, promote the cell cycle, and lead to resistance of SCLC to chemotherapy and PDL-1 antibody [34,35]. Moreover, EZH2 can inhibit ferroptosis by increasing the expression of SLC7A11 [36].
We performed GSEA and GSVA enrichment analysis of single genes. KEGG pathway analysis showed that four marker genes were related to the JAK-STAT signaling pathway. This pathway is important in many solid tumor types, including lung cancer. JAK-STAT signaling can also regulate tumor immunity. Moreover, up-regulation of IDH2 and PARP and down-regulation of RRM can activate Tolllike receptor and NOD-like receptor signaling pathways. Toll-like receptors (TLRs) and NOD-like receptors (NLRs) are crucial for the response to infection and non-specific immunity, for the coordination of various tumor cell processes, and for tumor immune surveillance [37]. GO function enrichment also revealed the marker genes were enriched in multiple immune-related functions. Therefore, we performed an analysis of immune cell infiltration in all samples and investigated the correlation between immune cells and marker gene expression. Fewer neutrophils and resting mast cells were observed in SCLC compared to normal tissue. MUC1 expression was positively correlated with resting mast cells and showed low expression in SCLC. EZH2 was negatively correlated with neutrophils and overexpressed in SCLC. These results suggest that EZH2 and MUC1 may be novel targets for improving the immune microenvironment of SCLC.
Finally, to identify new biomarkers and potential therapeutic targets for SCLC, we developed a drug regulatory network and ceRNA network for the marker genes. Some of these drugs have already proven to be useful for the treatment of SCLC, including the PARP1 inhibitor niraparib used in the maintenance treatment of extensive-stage SCLC (ES-SCLC) and platinum-sensitive SCLC [38]. Talazoparib can prevent cancer cells from self-repairing after damage. For the treatment of recurrent SCLC following first-line treatment, talazoparib can be combined with temozolomide (TMZ) to increase the anti-tumor response and thus improve outcome. Moreover, the combined treat-ment of TMZ with veliparib can also improve the objective response rate (ORR) in SCLC [39]. It has also been demonstrated that aberration of the RNA regulatory network is an important factor in the development of malignant tumors and that non-coding RNAs are intricately linked to the development and progression of SCLC. Several of the microRNAs found in our ceRNA network, including miR-7-5p, miR-93-5p, and miR-92b-3p, have previously been reported to affect disease progression and therapy of SCLC [40][41][42]. However, due to the high degree of heterogeneity of SCLC, the prognostic value of the ceRNA network identified here requires further verification in large sample cohorts.
While this research has identified several possible biomarkers in the pathogenesis and treatment of SCLC, there are several limitations to the study. Firstly, the SCLC sample size in the GEO database is limited, which may lead to some errors. Future research studies should include larger sample sizes to improve the accuracy of results. Secondly, this comprehensive bioinformatic analysis was based on database and gene chips. To make the research results more meaningful, in vitro and in vivo validation tests should also be performed.

Conclusions
ATG3, MUC1, RRM2, IDH2, PARP1, and EZH2 were identified as possible marker genes of ferroptosis in SCLC. MUC1 and PARP1 may also have significant effects on the immune microenvironment of SCLC.

Availability of Data and Materials
The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.

Author Contributions
YY and AM conceived and designed the study. YY contributed data analysis. MS provided expertise in design of study and interpretation of data. YY, AM, and MS wrote the manuscript or receive it critically for important intellectual content. All authors contributed to editorial changes in the manuscript. All authors read and approved the final manuscript. All authors have participated sufficiently in the work and agreed to be accountable for all aspects of the work.

Ethics Approval and Consent to Participate
Not applicable.

Acknowledgment
Not applicable.

Funding
This study was supported in part by a grant-in-aid from the Ministry of Education, Culture, Sports, Science, and Technology of Japan (grant no. 20K08552 to A. Miyanaga). The funding body was not involved in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.