1 Department of Obstetrics and Gynecology, Women’s Hospital of Nanjing Medical University, Nanjing Women and Children’s Healthcare Hospital, 210004 Nanjing, Jiangsu, China
Abstract
Ovarian yolk sac tumor (OYST) is a rare subtype of malignant ovarian germ cell tumors (MOGCT). Due to its rarity, few large-scale studies have systematically evaluated the prognostic factors for this tumor type. In the present study, we aimed to identify independent prognostic factors for OYST and develop a nomogram to predict patient survival.
Data from 427 OYST patients diagnosed between 1992 and 2019 were extracted from the Surveillance, Epidemiology, and End Results (SEER) database. Patients were randomly divided into training (n = 299) and validation (n = 128) cohorts. Univariate Cox regression, Least Absolute Shrinkage and Selection Operator (LASSO) regression, and multivariate Cox analysis were used to identify prognostic factors. A nomogram was constructed based on age, American Joint Committee on Cancer (AJCC) stage, regional lymph node status, and liver metastasis. The model’s accuracy and clinical utility were evaluated using the concordance index (C-index), calibration curves, time-dependent receiver operating characteristic (ROC) curves, and decision curve analysis (DCA).
Age, AJCC stage, regional lymph node status, and liver metastasis were identified as independent prognostic factors for OYST. The nomogram demonstrated strong predictive accuracy, with C-indices of 0.868 and 0.813 in the training and validation cohorts, respectively. Calibration curves confirmed the agreement between predicted and observed survival rates. The time-dependent ROC curves showed areas under the curve (AUCs) exceeding 0.8 for 3-, 5-, and 10-year survival predictions. DCA revealed that the nomogram provided a superior net benefit compared to the AJCC staging system. A risk stratification system based on the nomogram effectively differentiated high- and low-risk patients, with Kaplan-Meier survival analysis indicating significantly worse outcomes for high-risk patients.
The nomogram developed in this study provides accurate and clinically relevant predictions for the survival of OYST patients. Furthermore, it offers a valuable tool for individualized prognostic assessment and postoperative decision-making. Prospective, multicenter studies are needed to validate and further refine this model.
Keywords
- ovarian yolk sac tumor
- SEER
- nomogram
- prognostic regression model
Ovarian yolk sac tumor (OYST), often referred to as an endodermal sinus tumor, is a rare subset of ovarian malignancies that comprises approximately 2%–3% of all ovarian tumors [1, 2]. As a malignant ovarian germ cell tumor (MOGCT), it arises from differentiation of the extra-embryonic yolk sac and occurs predominantly in young premenopausal women. Among the subtypes of MOGCT, OYST is the second most prevalent histological type, accounting for roughly 20% of cases [3, 4]. Its prevalence is particularly notable in children, where it represents up to 60% of malignant ovarian tumors. Although yolk sac tumors primarily originate in gonadal tissues, such as the ovaries and testes, they can also develop in extragonadal locations, including the mediastinum, brain, and retroperitoneum [5].
Due to its rarity, current knowledge about OYST is predominantly based on small retrospective studies. However, many of these studies include mixed cohorts of OYST and other MOGCT subtypes [6, 7], which introduces significant confounding factors. Moreover, there is a lack of large-scale or prospective data to comprehensively evaluate the prognostic factors for OYST. In this study, we performed a systematic, retrospective analysis using the Surveillance, Epidemiology, and End Results (SEER) database to explore potential prognostic factors for OYST. Furthermore, we created and validated a nomogram to predict clinical outcomes, with the aim of helping clinicians identify high-risk patients, provide optimized prognostic counseling, and ultimately improve patient management.
Data for patients diagnosed with OYST between 1992 and 2019 were extracted from the publicly accessible SEER database (Fig. 1). Inclusion criteria were: (1) tumors localized to the ovary (ICD-O-3/WHO; 2008; code: C56.9); (2) histological confirmation of yolk sac tumor (ICD-O-3 Hist/Behave; 2000; code: 9071/3); and (3) diagnosis between 1992 and 2019. Cases were excluded if they: (1) involved non-primary tumors; (2) lacked histological confirmation; (3) had unknown survival status; (4) had 0 months of survival; or (5) had missing survival times. Data extraction was conducted using SEER*Stat software version 8.4.4 (https://seer.cancer.gov/seerstat/).
Fig. 1.
Flowchart for sample selection. SEER, the Surveillance, Epidemiology, and End Results; OYST, ovarian yolk sac tumor.
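The inclusion and exclusion criteria described above can be expressed as a simple filter. The following is a minimal sketch only: the `Case` record and its field names are hypothetical and do not correspond to SEER*Stat export column names, and the actual selection in this study was performed in SEER*Stat.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Case:
    """Hypothetical record layout; field names are illustrative only."""
    site: str                        # primary site code, e.g. "C56.9"
    histology: str                   # histology/behavior code, e.g. "9071/3"
    year: int                        # year of diagnosis
    primary: bool                    # True if this is a primary tumor
    histologically_confirmed: bool
    vital_status: Optional[str]      # "alive", "dead", or None if unknown
    survival_months: Optional[int]   # None if the survival time is missing


def is_eligible(c: Case) -> bool:
    """Apply the three inclusion and five exclusion criteria."""
    included = (
        c.site == "C56.9"
        and c.histology == "9071/3"
        and 1992 <= c.year <= 2019
    )
    excluded = (
        not c.primary
        or not c.histologically_confirmed
        or c.vital_status is None
        or c.survival_months is None
        or c.survival_months == 0
    )
    return included and not excluded


# Small illustrative input (made-up records, not SEER data).
cases = [
    Case("C56.9", "9071/3", 2005, True, True, "alive", 60),
    Case("C56.9", "9071/3", 2005, True, True, "dead", 0),    # 0 months of survival
    Case("C56.9", "9071/3", 1990, True, True, "dead", 12),   # outside 1992-2019
]
eligible = [c for c in cases if is_eligible(c)]
```

Keeping the inclusion and exclusion tests as two separate expressions mirrors the way the criteria are listed, which makes each rule easy to audit against the flowchart.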
A total of 13 variables were selected for analysis, representing a comprehensive set of clinical and pathological features. These variables were age, marital status, race, cancer antigen 125 (CA125) level, tumor size, laterality (unilateral or bilateral), grade, SEER stage, American Joint Committee on Cancer (AJCC) stage [8], type of surgery, regional lymph node status, radiotherapy, chemotherapy, systemic therapy, and liver metastasis. Age and tumor size were categorized based on cutoff values identified by X-tile software version 3.6.1 (Yale School of Medicine, New Haven, CT, USA) [9, 10]. Specifically, age at diagnosis was grouped using cutoff points of 23 and 40 years, while tumor size was categorized using a threshold of 195 mm.
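As an illustration of the categorization step, the sketch below bins age and tumor size using the reported cutoffs. It assumes that the age groups are ≤23, 24–39, and ≥40 years, that a tumor of exactly 195 mm falls in the lower size group, and that missing sizes are labeled "unknown"; the function names and group labels are hypothetical.

```python
from typing import Optional


def age_group(age_years: int) -> str:
    """Bin age at diagnosis using the cutoffs of 23 and 40 years."""
    if age_years <= 23:
        return "<=23"
    if age_years <= 39:
        return "24-39"
    return ">=40"


def size_group(size_mm: Optional[float]) -> str:
    """Bin tumor size at 195 mm; the handling of exactly 195 mm is an assumption."""
    if size_mm is None:
        return "unknown"
    return "<=195 mm" if size_mm <= 195 else ">195 mm"


print(age_group(23), age_group(24), age_group(40))
print(size_group(120.0), size_group(210.0), size_group(None))
```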
Patients were randomly allocated into two groups: a training set (70%) and a validation set (30%). The primary outcome measure was overall survival (OS). Within the training set, univariate Cox regression analysis was initially conducted to identify variables associated with OS. To address collinearity among variables, least absolute shrinkage and selection operator (LASSO) regression with 10-fold cross-validation was employed, resulting in an optimal lambda value of 0.0259. The remaining variables were subsequently entered into a multivariate Cox regression model to identify independent prognostic factors for OS and to develop a nomogram.
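A 70%/30% random allocation of this kind can be sketched as follows. The seed value and the function name are assumptions made for reproducibility of the example; the study itself was carried out in R, and the seed used there is not reported.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=2024):
    """Shuffle patient IDs and split them into training and validation sets."""
    shuffled = list(patient_ids)
    random.Random(seed).shuffle(shuffled)  # local generator; seed is hypothetical
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


train_ids, valid_ids = split_cohort(range(427))
print(len(train_ids), len(valid_ids))  # 299 128
```

With 427 patients, rounding 0.7 × 427 = 298.9 gives 299 training cases and leaves 128 for validation, matching the cohort sizes reported in this study.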
The performance of the nomogram was evaluated in both the training and validation sets. Discrimination and predictive accuracy were assessed using the concordance index (C-index), calibration curves, and the area under the receiver operating characteristic curve (AUC). Decision curve analysis (DCA) was also utilized to compare the clinical utility of the nomogram against the AJCC staging system. Based on prognostic scores generated from the nomogram, patients were categorized into high-risk and low-risk groups.
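To make the C-index concrete, the sketch below computes Harrell's concordance index for right-censored data. It is a simplified illustration, not the routine used in this study: a pair is counted as comparable only when the patient with the shorter time had an observed event, tied risk scores count as half-concordant, and tied survival times are ignored. The toy data are made up.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs ranked correctly by risk score.

    times:       follow-up time for each patient
    events:      1 if death was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example: 5 comparable pairs, 4 of them concordant.
toy_c = harrell_c_index(
    times=[5, 10, 15, 20],
    events=[1, 1, 0, 1],
    risk_scores=[0.9, 0.4, 0.6, 0.1],
)
print(toy_c)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the C-indices in this study should be read.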
All analyses and nomogram development were performed using R software version 4.1.3 (R Foundation, Vienna, Austria). Categorical data, including baseline characteristics (e.g., demographics, tumor characteristics), were summarized as frequencies (%) and compared between groups using Chi-square tests. For continuous variables, results were expressed as the mean
A total of 427 patients diagnosed with OYST were retrieved from the SEER database and randomly allocated into two cohorts. Following a 7:3 split, 299 patients were allocated to the training cohort and 128 to the validation cohort. Supplementary Table 1 summarizes the baseline characteristics of the two cohorts. In the overall cohort, the median patient age was 23 years and the median OS was 112 months (range: 1–335 months). The majority of patients were aged
The SEER staging distribution was relatively balanced, with localized, regional, and distant stages accounting for 34.2%, 29%, and 34.2% of cases, respectively. AJCC stage I patients comprised 48.7% of cases, while stages III and IV combined represented 35.6%.
Treatment data revealed that 86.9% of patients underwent chemotherapy, with 52.2% receiving systemic therapy. The most common surgical procedures were unilateral salpingo-oophorectomy (31.1%) and oophorectomy combined with omentectomy (27.6%). Positive regional lymph nodes were reported in only 6.6% of cases, while lymph node status was negative in 44.5%. Liver metastases were rare, occurring in only 3.3% of patients.
Univariate Cox regression analysis identified marital status, age, laterality, AJCC stage, SEER stage, regional lymph node status, type of surgery, and liver metastasis as potential factors associated with OS (p
Fig. 2.
Construction of the LASSO-Cox regression model. (A) LASSO coefficients. (B) Selection of the tuning parameter (
Fig. 3.
Nomogram for predicting 3-, 5-, and 10-year overall survival (OS) in OYST patients. OYST, ovarian yolk sac tumor; AJCC, American Joint Committee on Cancer.
The predictive accuracy of the nomogram was assessed using the concordance index (C-index). The C-index in the training cohort was 0.868 (95% CI: 0.815–0.920), while in the validation cohort it was 0.813 (95% CI: 0.709–0.916), indicating robust discriminatory ability (C-index
Fig. 4.
Calibration curves for the nomogram. (A–C) Calibration curves for 3-, 5-, and 10-year OS in the training cohort. (D–F) Calibration curves for 3-, 5-, and 10-year OS in the validation cohort.
Time-dependent receiver operating characteristic (ROC) curves highlighted the model’s ability to discriminate survival outcomes over time (Fig. 5). In the training cohort, the AUCs for predicting OS at 3, 5, and 10 years were 0.925, 0.862, and 0.857, respectively. Similarly, the AUCs for the corresponding time points in the validation cohort exceeded 0.8 (0.847, 0.808, and 0.833), demonstrating robust predictive performance.
Fig. 5.
Receiver operating characteristic (ROC) curves for the nomogram. (A) ROC curves for 3-, 5-, and 10-year OS in the training cohort. (B) ROC curves for 3-, 5-, and 10-year OS in the validation cohort. ROC-AUC, area under the receiver operating characteristic curve.
DCA further confirmed the clinical utility of the nomogram by consistently demonstrating a higher net benefit compared to the AJCC staging system for predicting 3-, 5-, and 10-year OS in both the training and validation cohorts (Fig. 6).
Fig. 6.
Decision curve analysis (DCA) for the nomogram and the AJCC staging system. (A–C) DCA curves for predicting 3-, 5-, and 10-year OS in the training cohort. (D–F) DCA curves for predicting 3-, 5-, and 10-year OS in the internal validation cohort.
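The net benefit plotted in a decision curve can be illustrated with the standard formula, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The sketch below is a simplification: it treats death by a fixed time point as a binary outcome with no censoring, whereas decision curves for survival data normally adjust for censoring. The function names and toy data are hypothetical.

```python
def net_benefit(predicted_risk, outcome, threshold):
    """Net benefit of flagging patients whose predicted risk >= threshold."""
    n = len(outcome)
    pairs = list(zip(predicted_risk, outcome))
    true_pos = sum(1 for p, y in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in pairs if p >= threshold and y == 0)
    odds = threshold / (1 - threshold)  # weight given to a false positive
    return true_pos / n - false_pos / n * odds


def net_benefit_treat_all(outcome, threshold):
    """Reference strategy in which every patient is flagged as high risk."""
    prevalence = sum(outcome) / len(outcome)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data: predicted risk of death by a fixed time point and observed outcome.
risk = [0.9, 0.8, 0.3, 0.2, 0.1]
died = [1, 0, 1, 0, 0]
nb_model = net_benefit(risk, died, threshold=0.25)
nb_all = net_benefit_treat_all(died, threshold=0.25)
print(nb_model, nb_all)
```

A decision curve repeats this calculation across a range of thresholds for each model; the model whose curve lies higher offers the greater net benefit at that threshold.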
A risk stratification system was developed using the prognostic scores derived from the nomogram. Patients were divided into high- and low-risk groups based on the median score. Kaplan-Meier survival analysis demonstrated significantly worse outcomes for high-risk patients compared to their low-risk counterparts (Fig. 7).
Fig. 7.
Kaplan-Meier curves for OS in different risk groups.
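The median-based stratification and the Kaplan-Meier estimate can be sketched as follows. This is an illustration under stated assumptions: patients whose score equals the median are assigned to the low-risk group, which the text does not specify, and the scores and follow-up data are made up.

```python
from statistics import median


def stratify_by_median(scores):
    """Label each patient 'high' or 'low' risk relative to the median score."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time."""
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


groups = stratify_by_median([10, 40, 55, 120, 200])
km_curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 1, 0, 1, 0])
print(groups)
print(km_curve)
```

Computing one such curve per risk group and comparing them, typically with a log-rank test, yields the kind of comparison shown in the figure.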
MOGCT comprise several histological subtypes, including dysgerminoma, OYST, immature teratoma, and mixed germ cell tumors. Among these, OYST is recognized for its aggressive behavior and poor prognosis [7, 11]. Patients with OYST often lack specific symptoms in early stages, making diagnosis challenging [11, 12]. Most patients present with chronic abdominal pain or pelvic masses, while a minority seek emergency care due to complications such as ascites, tumor rupture, or torsion.
Because of its rarity, research on OYST has predominantly consisted of small-scale retrospective analyses, combined analyses within ovarian germ cell tumors, or case reports [13, 14, 15, 16, 17, 18]. In this study, we conducted a comprehensive analysis using the SEER database. Age, AJCC stage, regional lymph node involvement, and liver metastasis were identified as independent prognostic factors for OYST through LASSO-Cox regression modeling. Importantly, we developed and validated a nomogram to predict 3-, 5-, and 10-year survival rates, making this the first study to provide such a predictive tool for OYST.
Age is a well-recognized determinant of cancer prognosis, often linked to the accumulation of genetic mutations and immune senescence [19, 20]. Previous studies have emphasized its critical role in the survival of ovarian cancer patients [21, 22, 23]. Consistent with an earlier study [24], our analysis revealed that patients aged 24–39 years had a 4.8-fold increased risk of mortality compared to patients aged
AJCC staging was identified as a pivotal prognostic factor, demonstrating the importance of tumor extent in determining survival outcomes [25]. Notably, the HR for AJCC Stage IV (16.69) was accompanied by a wide 95% CI (1.16–240.50), reflecting limited precision due to the small number of Stage IV cases. While statistically significant, the variability in this estimate necessitates cautious interpretation in clinical practice. The wide 95% CI indicates a need for larger studies to validate this finding and improve the precision of risk estimation for advanced-stage OYST. The presence of liver metastasis was identified as a critical prognostic factor in our model, consistent with its association with advanced disease and systemic dissemination [26]. Given its rarity in OYST, liver metastasis highlights the aggressive nature of these cases and the urgent need for effective systemic therapies tailored to high-risk patients.
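The imprecision of the Stage IV estimate can be quantified from the reported interval alone. The sketch below assumes the 95% CI is a Wald interval that is symmetric on the log hazard scale, as is usual for Cox models; under that assumption the standard error of log(HR) is about 1.36, the geometric midpoint of the interval recovers the reported HR, and the implied two-sided p-value is just under 0.05. The function name is hypothetical.

```python
import math


def log_hr_standard_error(ci_lower, ci_upper, z=1.959964):
    """Standard error of log(HR) implied by a symmetric Wald 95% CI."""
    return (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)


hr, lower, upper = 16.69, 1.16, 240.50  # Stage IV values reported in the text

se_log_hr = log_hr_standard_error(lower, upper)
# On the log scale the point estimate sits midway between the two limits.
hr_from_ci = math.exp((math.log(lower) + math.log(upper)) / 2)
# Wald z statistic and two-sided p-value from the normal distribution.
z_stat = math.log(hr) / se_log_hr
p_value = math.erfc(abs(z_stat) / math.sqrt(2))

print(round(se_log_hr, 2), round(hr_from_ci, 2), round(p_value, 3))
```

Because the standard error scales roughly with the inverse square root of the number of events, a substantially narrower interval would require many more Stage IV deaths than are available in this cohort.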
Furthermore, regional lymph node involvement emerged as another significant indicator of poor prognosis. Patients with unknown lymph node status exhibited worse survival outcomes than those with negative lymph nodes, hinting at the potential value of systematic lymphadenectomy in assessing disease burden and suggesting a possible survival benefit. However, the role of lymphadenectomy in the management of OYST patients remains controversial. While retrospective studies and meta-analyses suggest that lymphadenectomy may improve OS in advanced cases, its impact on progression-free survival (PFS) is less clear [27, 28]. In contrast, Harter et al. [29, 30] reported no significant improvement in either PFS or OS from lymphadenectomy in advanced ovarian tumors, while highlighting increased postoperative complications and 60-day mortality. Larger prospective studies are therefore needed to comprehensively assess the benefits and risks of lymphadenectomy in the management of OYST patients.
Our study confirmed the accuracy and clinical utility of the nomogram-based predictive model for OYST prognosis. The C-indices for the training and validation cohorts (0.868 and 0.813, respectively) indicate strong discriminatory power, while calibration curves showed excellent agreement between predicted and observed survival rates. Time-dependent ROC analyses further support the model’s robust predictive capability, with AUC values exceeding 0.8. Moreover, DCA highlighted the superior net benefit of the nomogram in predicting 3-, 5-, and 10-year OS compared to the AJCC staging system. Additionally, the risk stratification system derived from the nomogram effectively classified patients into high- and low-risk groups, with Kaplan-Meier survival curves showing significantly poorer outcomes in the high-risk group.
While our study offers valuable insights, several limitations must be acknowledged. First, as a retrospective analysis, the study is inherently subject to selection bias and lacks prospective validation. Second, detailed treatment data such as chemotherapy regimens, dosages, cycles, and radiotherapy protocols were unavailable, thus limiting our ability to evaluate the impact of specific therapeutic approaches. Third, grading data were missing for 84% of patients, necessitating the exclusion of grade from the analysis and preventing a better understanding of its prognostic role. Fourth, missing data were categorized as “unknown” and were included in the analyses without imputation, potentially affecting the interpretation of certain variables such as regional lymph node status. Fifth, patients with a survival time shorter than one month were excluded, which may introduce survivorship bias by excluding cases of perioperative mortality or acute complications. Finally, external validation using independent datasets was not performed, restricting the generalizability of our findings.
Future multicenter, prospective studies with larger cohorts are essential to address these gaps and allow further refinement of the nomogram. Additionally, the collection of comprehensive treatment and pathological data should improve our understanding of prognostic factors and facilitate the development of tailored therapeutic strategies for OYST.
This study has presented a validated nomogram incorporating age, AJCC stage, regional lymph node status, and liver metastasis as independent prognostic factors. The nomogram provides accurate and clinically applicable survival predictions for OYST patients. This tool offers a visualized and quantitative approach for prognostic assessment, thereby facilitating personalized postoperative management and counseling. Further research is warranted to validate the model and explore its integration into clinical practice, with the ultimate aim of improving outcomes for this rare and aggressive malignancy.
The datasets used in this study are available in the SEER database (https://seer.cancer.gov/), and details of the data analysis process can be obtained from the corresponding author.
HC was responsible for the research design, data acquisition and analysis, data visualization and manuscript writing. DZ and DL were responsible for the research design, data verification and manuscript review. All authors contributed to editorial changes in the manuscript. All authors read and approved the final manuscript. All authors have participated sufficiently in the work and agreed to be accountable for all aspects of the work.
This study used publicly available data, and all original research was ethically approved. Therefore, no additional ethical approval was required for this study.
Not applicable.
This research received no external funding.
The authors declare no conflict of interest.
During the preparation of this work, the authors used ChatGPT-3.5 to check spelling and grammar. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.
Supplementary material associated with this article can be found, in the online version, at https://doi.org/10.31083/CEOG31361.
References
Publisher’s Note: IMR Press stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.







