†These authors contributed equally.
Academic Editor: Graham Pawelec
Background: Challenges in lung cancer screening include limited access to computed tomography (CT) scanners and inter-reader variability, especially in resource-limited areas. Combining mobile CT with deep learning offers a way to address both in routine clinical practice. Methods: This study prospectively recruited participants at two rural sites in western China. A deep learning system was developed to assist clinicians in identifying nodules and evaluating malignancy, and its performance was assessed by recall, free-response receiver operating characteristic (FROC) score, accuracy (ACC), and area under the receiver operating characteristic curve (AUC). Results: The study enrolled 12,360 participants scanned in a mobile CT vehicle and detected pulmonary nodules in 9511 (76.95%) of them. The majority of participants were female (8169, 66.09%) and never-smokers (9784, 79.16%). After 1-year follow-up, 86 patients were diagnosed with lung cancer, of whom 80 (93.03%) had adenocarcinoma and 73 (84.88%) were at stage I. The deep learning system detected nodules (recall of 0.9507; FROC score of 0.6470) and stratified risk (ACC of 0.8696; macro-AUC of 0.8516) automatically. Conclusions: A novel model for lung cancer screening, the integration of mobile CT with deep learning, was proposed. It enabled specialists to increase the accuracy and consistency of their workflow and has the potential to assist clinicians in detecting early-stage lung cancer effectively.
Lung cancer remained the leading cause of cancer-related mortality worldwide in 2020, accounting for 18.4% of overall cancer deaths [1]. The 5-year survival of lung cancer was less than 20% in China owing to delayed diagnosis [2]. Patients with early-stage lung cancer who receive curative treatment have a substantially better prognosis than those with advanced-stage disease [3]. Only 17.3% of patients in China were diagnosed with lung cancer at stage I, a lower proportion than in the United States (25.3%). Further, the proportion of stage I disease was higher in urban (19.5%) than in rural (11.1%) areas in China [4]. Early detection of lung cancer therefore remains a challenge that needs to be addressed in China.
Low-dose computed tomography (LDCT) has been proven to significantly reduce lung cancer mortality [5]. The National Lung Screening Trial (NLST) demonstrated a reduction in lung cancer mortality of about 20% with LDCT screening compared with the chest radiography group [6]. Meanwhile, the Nederlands-Leuvens Longkanker Screenings Onderzoek (NELSON) lung cancer screening trial reported a cumulative rate ratio of 0.76 for death from lung cancer in the computed tomography (CT) arm relative to the no-screening control arm [7]. However, CT scanners are relatively unavailable in resource-constrained areas, which leads to diagnosis at an advanced stage. Moreover, inter-reader variability among radiologists may lead to missed diagnoses and to wasted clinical and financial resources [8, 9, 10]. It is imperative to improve the accessibility and consistency of lung cancer screening, especially in resource-limited sites.
The mobile low-dose whole-body CT screening unit has the potential to increase the availability of lung cancer screening. It not only provides reliable imaging results but also reduces geographic barriers, making lung cancer screening more widely accessible [11]. It has been implemented in Kenya and in communities in Yorkshire and the UK [12, 13]. In addition, the rapid development of deep learning has reshaped clinical routines across a broad variety of tasks, including screening and evaluation of lung cancer, diagnosis of COVID-19, and detection of lymph node metastases in breast cancer [14, 15, 16, 17, 18, 19]. Applied to CT images, deep learning approaches detect nodules across the whole scan and automate standard image analysis. However, previous studies were limited to small, retrospective samples, and the combination of a mobile CT vehicle with a deep learning model still warrants further investigation.
Herein, we report a prospective study of lung cancer screening based on mobile CT in rural areas, together with a deep learning system constructed to assist clinicians in detecting nodules and predicting malignancy risk.
The Natural Population Cohort Study of West China Hospital of Sichuan University is a prospective community-based screening trial in rural areas of Sichuan Province, China, including Mianzhu (Site 1), Longquan (Site 2), Pidu, and Seda, which aims to build a natural population cohort of 80,000 people. The lung cancer screening trial was approved by the ethics committee of West China Hospital of Sichuan University.
From January 2020 to June 2020, the physical examination population was recruited in the corresponding communities. The cohort invited permanent residents over 20 years old. Each participant provided written informed consent and completed a questionnaire covering demographic information, behavioral habits such as smoking, disease history such as chronic obstructive pulmonary disease (COPD), and family history of cancer. People over 40 years old were eligible for lung cancer screening. Exclusion criteria were as follows: previous diagnosis of lung cancer, previous pulmonary surgery, chest CT scan within the past 6 months, or unwillingness to undergo mobile CT examination. The screening program ran from July 2020 to September 2020 in Site 1 and from October 2020 to November 2020 in Site 2. All data passed strict quality control and were stored in an Electronic Data Capture (EDC) system.
Chest CT images were acquired from mobile LDCT scanners (Neusoft Corporation).
Scanning parameters were standardized as follows: tube voltage of 120 kV of,
512
We built a two-stage deep learning model to identify high-risk patients from the population-based cohort (Fig. 1). In the first stage, the model detected suspicious nodule candidates from the LDCT images of the participants. In the second stage, the model took each nodule’s image patches as input and predicted the corresponding probability of malignancy. The malignancy risk of each patient was then determined by the detected nodule with the highest probability of being lung cancer.
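The patient-level aggregation described above can be sketched as follows. This is a minimal illustration only: the function name `patient_risk` and the example probabilities are hypothetical, and returning 0.0 when no nodule candidate is detected is an assumption that the text does not specify.

```python
from typing import Sequence


def patient_risk(nodule_probs: Sequence[float]) -> float:
    """Patient-level malignancy risk = highest nodule-level probability.

    Returning 0.0 for a scan without any detected candidate is an
    assumption made for this sketch, not a rule stated in the study.
    """
    return max(nodule_probs, default=0.0)


# Hypothetical malignancy probabilities for three detected candidates.
example_probs = [0.12, 0.81, 0.34]
print(patient_risk(example_probs))
```

Taking the maximum means that a single suspicious nodule is enough to place a participant in the high-risk group, regardless of how many benign-looking candidates are also present.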
Overall modeling framework. Each enrolled participant underwent mobile CT scanning and completed a demographic questionnaire. The deep learning model takes the CT volume as input, analyzes volumetric ROIs, and outputs the nodule localization and its malignancy risk. Abbreviations: LDCT, low-dose computed tomography; ROI, region of interest.
To detect lung nodules from LDCT images, we constructed the detection model and trained it following the steps of our previous work [20]. Given an LDCT scan, the detection pipeline first applied a series of pre-processing steps (lung segmentation, cropping to the lung field, resizing, dividing into patches, and normalizing) to adapt the scan to the input space of the model. At the end of forward propagation, the region proposal network output predictions comprising the probability of the region being a lung nodule, its 3-dimensional coordinates, and its diameter. A non-maximum suppression algorithm was then used to remove overlapping predictions. To obtain a well-performing detector, we iteratively optimized the model parameters under the guidance of a loss function on the training dataset. The detection loss consisted of two parts: the cross-entropy loss (CEL) for classifying whether a region was a nodule, and the smoothed L1-norm loss for regressing the location and diameter of lung nodules. More details of the training setup can be found in our previous study [20]. In this study, we set smaller anchor sizes than in that work [20], considering that a natural population tends to have smaller lung nodules; the anchor sizes were 5.0 mm, 10.0 mm, and 20.0 mm.
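The non-maximum suppression step can be illustrated with a short sketch. The overlap criterion is not given in the text, so the rule used here (a candidate is suppressed when its center lies within the radius of a higher-scoring kept candidate) and the probability threshold `min_prob` are assumptions, as are the `Candidate` type and the example values.

```python
import math
from typing import List, NamedTuple


class Candidate(NamedTuple):
    prob: float      # predicted probability of being a nodule
    z: float         # center coordinates (assumed to be in mm)
    y: float
    x: float
    diameter: float  # predicted diameter (assumed to be in mm)


def nms(candidates: List[Candidate], min_prob: float = 0.5) -> List[Candidate]:
    """Greedy non-maximum suppression over spherical candidates.

    Assumed overlap rule: a candidate is dropped when its center lies
    inside the radius of an already kept, higher-scoring candidate.
    """
    kept: List[Candidate] = []
    for cand in sorted(candidates, key=lambda c: c.prob, reverse=True):
        if cand.prob < min_prob:
            break  # sorted descending, so the rest are below threshold too
        overlaps = any(
            math.dist((cand.z, cand.y, cand.x), (k.z, k.y, k.x)) < k.diameter / 2
            for k in kept
        )
        if not overlaps:
            kept.append(cand)
    return kept


# Hypothetical predictions: the second one duplicates the first.
preds = [
    Candidate(0.90, 10.0, 10.0, 10.0, 8.0),
    Candidate(0.70, 11.0, 10.0, 10.0, 8.0),
    Candidate(0.80, 40.0, 40.0, 40.0, 5.0),
    Candidate(0.30, 70.0, 70.0, 70.0, 5.0),
]
print(nms(preds))
```

A distance-based rule is used here because nodules are roughly spherical; an implementation based on 3D intersection-over-union would follow the same greedy structure.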
We further constructed a classification model, which predicted probability of
each nodule candidate being malignant. The risk of each participant was
determined by the most-likely cancerous nodule. For each nodule candidate
(detected by the first stage model), the resolution was first resampled to 1.0
The performance of the proposed method was evaluated against the available targets in two ways: (1) evaluating the two models separately, using the nodule-wise location and the corresponding risk; and (2) evaluating the overall performance of the framework, using the patient-wise cancer risk. The dataset was split 7:1.5:1.5 into training, validation, and testing sets regardless of which evaluation approach was used. This ensured that no prior knowledge of nodule location was available on the testing set when the performance of the framework was evaluated. In other words, the overall framework was provided only with LDCT images, and the detection model supplied the locations of nodule candidates to the classification model. The key index of our framework was therefore based on the results of the latter evaluation, since it imitated the real diagnostic procedure in clinical practice.
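A 7:1.5:1.5 split can be sketched as below. The sketch assumes that the split is made at the participant level with a fixed random seed, which the text does not state; the function name `split_dataset` and the participant identifiers are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(
    ids: Sequence[str], seed: int = 0
) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle identifiers and split them 70% / 15% / 15%.

    Integer arithmetic avoids floating-point rounding; any remainder
    goes to the testing set.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 70 // 100
    n_val = len(shuffled) * 15 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical participant identifiers.
participants = [f"P{i:05d}" for i in range(200)]
train_ids, val_ids, test_ids = split_dataset(participants)
print(len(train_ids), len(val_ids), len(test_ids))
```

Splitting by participant rather than by nodule keeps all nodules of one person in the same subset, so the testing set contains no scans seen during training.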
The questionnaire items were strictly defined. A former or current smoker was defined as a person who had smoked more than one cigarette per day for more than 6 months. Occupational exposure history covered exposure to radon, asbestos, arsenic, dust, or oil fumes at work for more than 12 months. Tumor history excluded lung cancer. Family included parents, brothers, sisters, grandparents, grandsons, and other immediate family members.
Remote reading of CT images was performed by experienced radiologists, and nodules were classified according to the four existing Lung-RADS risk categories [21]. Lung-RADS 1, with no nodules or definitely benign nodules, was defined as low risk. Because Lung-RADS 2 and 3 carry a relatively intermediate risk of malignancy, both were grouped into the mid-risk bucket for this experiment. Lung-RADS 4A/B/X, with suspected or confirmed malignancy, was defined as high risk. Clinicians combined the results of the deep learning model with the imaging reports to judge whether a nodule was at high risk and to recommend treatment and follow-up. Patients with lung cancer diagnosed pathologically by surgery or biopsy then received standardized medical treatment according to the Chinese lung cancer guideline [22] and NCCN guidelines [23].
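The grouping of Lung-RADS categories into the three risk buckets used in this experiment can be written as a simple lookup. The dictionary and function names are hypothetical, and the string labels for the categories are an assumed encoding.

```python
# Mapping from Lung-RADS category to the risk bucket described above.
RISK_BUCKET = {
    "1": "low",
    "2": "mid",
    "3": "mid",
    "4A": "high",
    "4B": "high",
    "4X": "high",
}


def risk_bucket(lung_rads: str) -> str:
    """Return the low/mid/high bucket for a Lung-RADS category label."""
    key = lung_rads.strip().upper()
    if key not in RISK_BUCKET:
        raise ValueError(f"Unknown Lung-RADS category: {lung_rads!r}")
    return RISK_BUCKET[key]


print(risk_bucket("3"), risk_bucket("4a"))
```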
Follow-up until December 2021 was conducted by the whole-process management center of West China Hospital of Sichuan University. The clinical information of patients with lung cancer was retrieved from the hospital information system (HIS) or by telephone visit. Patients with low-risk nodules or no abnormalities were recommended to undergo annual screening.
Demographic information was analyzed using Student’s t-test for continuous variables and the chi-square test for categorical variables. Two-sided p
After recruiting 19,281 people, a total of 12,360 participants were enrolled in
this lung cancer screening cohort including 4730 people of Site 1 (Mianzhu) and
7630 people of Site 2 (Longquan) (Fig. 2, Table 1). The median age was 58 years
old. 8169 (66.09%) participants were female, and 9784 (79.16%) participants
were never smokers. The vast majority of people had no history of occupational
exposure (11,075, 89.60%). A few people had a history of COPD (85, 0.69%),
previous tumor history (129, 1.04%), family history of cancer (1169, 9.46%), or
family history of lung cancer (329, 2.66%). Baseline characteristics differed
significantly between the two sites such as age (p
Participant flowchart through the screening pilot in two sites. There were four stages: recruitment, CT screening, nodule detection, and pathological diagnosis of lung cancer patients. Site 1, Mianzhu; Site 2, Longquan, Sichuan Province, China.
Characteristic | Total, N (%) | Site 1, N (%) | Site 2, N (%) | p
 | N = 12,360 | N = 4730 | N = 7630 |
Age (SD) | 58.19 (9.65) | 61.13 (9.83) | 56.36 (9.07) |
Sex | | | |
Male | 4191 (33.91) | 1729 (36.55) | 2462 (32.27) |
Female | 8169 (66.09) | 3001 (63.45) | 5168 (67.73) |
Smoking history | | | |
Current or former | 2576 (20.84) | 1181 (24.97) | 1395 (18.28) |
Never | 9784 (79.16) | 3549 (75.03) | 6235 (81.72) |
Occupational exposure history | | | |
Yes | 1285 (10.40) | 266 (5.62) | 1019 (13.36) |
No | 11,075 (89.60) | 4464 (94.38) | 6611 (86.64) |
COPD | | | | 0.14
Yes | 85 (0.69) | 26 (0.55) | 59 (0.77) |
No | 12,275 (99.31) | 4704 (99.45) | 7571 (99.23) |
Tumor history | | | | 0.87
Yes | 129 (1.04) | 48 (1.01) | 81 (1.06) |
No | 12,231 (98.96) | 4682 (98.99) | 7549 (98.94) |
Family history of cancer | | | |
Yes | 1169 (9.46) | 275 (5.81) | 894 (11.72) |
No | 11,191 (90.54) | 4455 (94.19) | 6736 (88.28) |
Family history of lung cancer | | | |
Yes | 329 (2.66) | 63 (1.33) | 266 (3.49) |
No | 12,031 (97.34) | 4667 (98.67) | 7364 (96.51) |
Patients with pulmonary nodule | | | | 0.70
Yes | 9511 (76.95) | 3649 (77.15) | 5862 (76.83) |
No | 2849 (23.05) | 1081 (22.85) | 1768 (23.17) |
Abbreviation: COPD, chronic obstructive pulmonary disease.
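As a worked example of the chi-square comparison between sites, the COPD row of the table can be recomputed from its counts. This is a sketch under the assumption that a Pearson chi-square test without continuity correction was used, which the text does not state; the function name `chi_square_2x2` is hypothetical. With one degree of freedom, the p-value can be obtained from the complementary error function in the standard library.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns the test statistic and the two-sided p-value for 1 degree
    of freedom, using P(X > s) = erfc(sqrt(s / 2)).
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# COPD counts from the table: Site 1 (yes, no) and Site 2 (yes, no).
stat, p_value = chi_square_2x2(26, 4704, 59, 7571)
print(f"chi-square = {stat:.3f}, p = {p_value:.2f}")
```

Under this assumption the result is close to the p-value of 0.14 reported for COPD in the table.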
During the 1-year follow-up, 86 patients (86/12,360, 0.70%) were diagnosed with lung cancer, including 35 (35/4730, 0.74%) in Site 1 and 51 (51/7630, 0.67%) in Site 2 (Table 2). Most patients were female (59, 68.60%), never-smokers (68, 79.07%), and had no family history of lung cancer (83, 96.51%). The clinical features of the two sites showed almost no statistically significant differences: sex (p = 0.47), smoking history (p = 0.93), family history of lung cancer (p = 1), except age (p
Characteristic | Total, N (%) | Site 1, N (%) | Site 2, N (%) | p
 | N = 86 | N = 35 | N = 51 |
Age (SD) | 59.92 (9.26) | 64.00 (8.69) | 57.12 (8.65) |
Sex | | | | 0.47
Male | 27 (31.40) | 13 (37.14) | 14 (27.45) |
Female | 59 (68.60) | 22 (62.86) | 37 (72.55) |
Smoking history | | | | 0.93
Current or former | 18 (20.93) | 8 (22.86) | 10 (19.61) |
Never | 68 (79.07) | 27 (77.14) | 41 (80.39) |
Family history of lung cancer | | | | 1
Yes | 3 (3.49) | 1 (2.86) | 2 (3.92) |
No | 83 (96.51) | 34 (97.14) | 49 (96.08) |
Treatment | | | |
Surgery | 81 (94.19) | 33 (94.29) | 48 (94.12) |
Chemotherapy | 9 (10.47) | 2 (5.71) | 7 (13.73) |
Radiotherapy | 2 (2.32) | 0 (0) | 2 (3.92) |
Targeted therapy | 5 (5.81) | 0 (0) | 5 (9.80) |
Histopathology | | | | 0.43
Adenocarcinoma | 80 (93.03) | 32 (91.43) | 48 (94.12) |
Squamous cancer | 4 (4.65) | 2 (5.71) | 2 (3.92) |
Small cell lung cancer | 1 (1.16) | 0 (0) | 1 (1.96) |
Sarcomatoid carcinoma | 1 (1.16) | 1 (2.86) | 0 (0) |
Stage | | | | 0.30
I | 73 (84.88) | 31 (88.57) | 42 (82.36) |
II | 6 (6.98) | 1 (2.86) | 5 (9.80) |
III | 5 (5.81) | 3 (8.57) | 2 (3.92) |
IV | 2 (2.33) | 0 (0) | 2 (3.92) |
The deep learning algorithm assisted clinicians in detecting nodules and stratifying risk effectively (Fig. 3, Table 3). For nodule detection, the recall was 0.8953 (95% CI: 0.8693–0.9207) and the FROC score was 0.6359 (95% CI: 0.6354–0.6384) in the validation set; the recall reached 0.9507 (95% CI: 0.9342–0.9679) and the FROC score was 0.6470 (95% CI: 0.6467–0.6495) in the testing set. In terms of risk stratification, the performance of the deep learning model was promising, with ACC of 0.9036 (95% CI: 0.8715–0.9317) and macro-AUC of 0.8798 (95% CI: 0.8304–0.9248) in the validation set, and ACC of 0.8696 (95% CI: 0.8370–0.9022) and macro-AUC of 0.8516 (95% CI: 0.7934–0.9051) in the testing set. For the identification of high-risk groups in particular, the deep learning model performed strongly, with recall of 0.8571 (95% CI: 0.6667–1.0000), precision of 1.0000, F1 of 0.9630 (95% CI: 0.8889–1.0000), and AUC of 0.9894 (95% CI: 0.9702–1.0000) in the validation set, and recall of 0.6923 (95% CI: 0.5000–0.8947), precision of 1.0000, F1 of 0.7619 (95% CI: 0.5600–0.9231), and AUC of 0.9634 (95% CI: 0.9328–0.9906) in the testing set.
Deep learning performance in identifying nodules and predicting their malignancy. Receiver operating characteristic (ROC) curves in (A) the validation set and (B) the testing set. Abbreviations: AUC, area under the receiver operating characteristic curve; CI, confidence interval.
Nodule Detection | ACC | Recall (95% CI) | Precision (95% CI) | F1 (95% CI) | FROC score (95% CI)
Validation set | | 0.8953 (0.8693–0.9207) | – | – | 0.6359 (0.6354–0.6384)
Testing set | | 0.9507 (0.9342–0.9679) | – | – | 0.6470 (0.6467–0.6495)
Risk Classification | ACC (95% CI) | Recall (95% CI) | Precision (95% CI) | F1 (95% CI) | AUC (95% CI)
Validation set | 0.9036 (0.8715–0.9317) | | | |
Low risk | | 0.5833 (0.4483–0.7143) | 0.7241 (0.5833–0.8571) | 0.6462 (0.5172–0.7463) | 0.8215 (0.7437–0.8903)
Mid risk | | 0.9598 (0.9344–0.9805) | 0.9227 (0.8910–0.9517) | 0.9409 (0.9196–0.9590) | 0.8286 (0.7581–0.8936)
High risk | | 0.8571 (0.6667–1.0000) | 1 (1.0000–1.0000) | 0.9630 (0.8889–1.0000) | 0.9894 (0.9702–1.0000)
Testing set | 0.8696 (0.8370–0.9022) | | | |
Low risk | | 0.4706 (0.3243–0.6129) | 0.5517 (0.4000–0.7037) | 0.5079 (0.3704–0.6349) | 0.8051 (0.7174–0.8856)
Mid risk | | 0.9432 (0.9188–0.9661) | 0.9038 (0.8710–0.9367) | 0.9231 (0.9006–0.9441) | 0.7864 (0.7091–0.8630)
High risk | | 0.6923 (0.5000–0.8947) | 1 (1.0000–1.0000) | 0.7619 (0.5600–0.9231) | 0.9634 (0.9328–0.9906)
Abbreviations: ACC, accuracy; FROC, free-response receiver operating characteristic curve; AUC, area under the receiver operating characteristic curve; CI, confidence interval.
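For readers who want to see how the per-class metrics in the table are defined, the sketch below computes one-vs-rest recall, precision, and F1, overall accuracy, and a macro-averaged AUC on toy labels. It uses the standard definitions, which are assumed to match those used in the study; all function names and the toy data are hypothetical, and the confidence intervals are not reproduced here.

```python
from typing import Dict, Sequence


def class_metrics(y_true: Sequence[str], y_pred: Sequence[str], label: str) -> Dict[str, float]:
    """One-vs-rest recall, precision, and F1 for a single class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_auc(pos: Sequence[float], neg: Sequence[float]) -> float:
    """Probability that a positive case outranks a negative one (ties count 0.5)."""
    wins = sum((sp > sn) + 0.5 * (sp == sn) for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(y_true: Sequence[str], probs: Sequence[Dict[str, float]], labels: Sequence[str]) -> float:
    """Unweighted mean of the one-vs-rest AUC of each class."""
    aucs = []
    for label in labels:
        pos = [p[label] for t, p in zip(y_true, probs) if t == label]
        neg = [p[label] for t, p in zip(y_true, probs) if t != label]
        aucs.append(binary_auc(pos, neg))
    return sum(aucs) / len(aucs)


# Toy example: one high-risk case is misclassified as mid risk.
y_true = ["low", "mid", "mid", "high", "high"]
y_pred = ["low", "mid", "mid", "high", "mid"]
print(accuracy(y_true, y_pred), class_metrics(y_true, y_pred, "high"))
```

In the toy example the high-risk class has no false positives but misses one case, which is the same pattern (perfect precision, lower recall) reported for the high-risk group in the table.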
This AI system might be a versatile tool for physicians. As an example of clinical deployment shown in Fig. 4, it could automatically locate pulmonary nodules, predict the degree of risk and recommend treatment for the next step.
Illustration of deep learning system for the detection and diagnosis of lung cancer in clinical application. This system provided localization and risk analysis of nodules on CT images.
A prospective lung cancer screening cohort study was conducted in two rural areas of West China. A total of 12,360 participants were enrolled and scanned in the mobile CT vehicle, and 86 patients were diagnosed with lung cancer after one year of follow-up. In addition, a deep learning model was constructed to help clinicians recognize high-risk nodules. This novel model made lung cancer screening possible in resource-deficient areas.
Participants in this screening group had a high rate of pulmonary nodules (9511/12,360, 76.95%). In previous screening trials, the proportion of patients with lung nodules was 22–59% at baseline [5, 24, 25, 26]. The incidence of lung cancer (86/12,360, 0.70%) was lower than in NLST (first screening, 270/26,309, 1.03%) but higher than the crude incidence of lung cancer reported by the National Cancer Center in 2015 (57.26 per 100,000) [6, 27]. This difference might be due to the different inclusion criteria. Eligible participants in NLST were between 55 and 74 years old and had a heavy cigarette smoking history of at least 30 pack-years, or had quit within the last 15 years. The National Cancer Center collected cancer data from 368 cancer registries across China covering rural and urban areas. Furthermore, the majority of patients were female (59/86, 68.60%) and non-smokers (68/86, 79.07%). Existing screening standards emphasize male sex, a heavy smoking history, and age over 55 years, but lung cancer in women, non-smokers, and younger people should now also be taken seriously [6, 7]. Future precision screening should focus on the subset of individuals at high risk of this particular cancer within the general population.
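The three figures compared above are quoted in different units (percentages and cases per 100,000). A small conversion puts them on the same scale; the function name `rate_per_100k` is hypothetical, and the sketch only restates the numbers given in the text without adjusting for differences in population or observation period.

```python
def rate_per_100k(cases: int, population: int) -> float:
    """Express a proportion as cases per 100,000 people."""
    return cases / population * 100_000


this_cohort = rate_per_100k(86, 12_360)   # this screening cohort
nlst_first = rate_per_100k(270, 26_309)   # NLST, first screening round
registry_2015 = 57.26                     # crude incidence quoted in the text

print(f"This cohort:   {this_cohort:.1f} per 100,000")
print(f"NLST:          {nlst_first:.1f} per 100,000")
print(f"Registry 2015: {registry_2015:.2f} per 100,000")
```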
To our knowledge, this is the largest prospective screening program using mobile CT, involving more than 10,000 people. Although mobile CT overcame the geographical restrictions of remote areas, some residents preferred hospital-based CT during this program. Residents’ health awareness and acceptance of lung cancer screening should be strengthened by primary-care physicians and specialists. Another strength of this study was the application of the deep learning model in routine clinical work. The system could provide nodule localization and malignancy risk estimates from previously acquired images, which improved the efficiency and consistency of the specialists’ workflow. However, the deep learning system also recognized nodules no larger than 3 mm, resulting in a high false-positive rate in the detection task. The cut-off value for positive nodules needs to be further optimized. Nevertheless, the combination of mobile CT and a deep learning model might help to compensate for the lack of facilities and experience in lung cancer screening in remote regions.
Deep learning algorithms have the potential to alter the clinical workflow of lung cancer care [28, 29]. An increasing number of studies have demonstrated the successful application of deep learning in the screening, diagnosis, and prognosis prediction of lung cancer. A previous study developed an end-to-end deep learning algorithm to predict cancerous nodules on the basis of 6716 NLST cases, the performance of which was on par with that of radiologists [15]. Our study achieved state-of-the-art performance in nodule detection, with recall of 0.9507, and in risk classification, with ACC of 0.8696, in a large prospective population. Furthermore, it has become possible to determine adenocarcinoma subtypes, gene mutation status, and prognosis from non-invasive CT images, which is reshaping the selection of treatment strategies [30, 31, 32]. Beyond gains in consistency and accuracy, the capacity of deep learning to leverage diverse information has become of prime importance in improving the efficiency of lung cancer management. Importantly, clinical adoption of these tools requires further verification in external datasets to improve generalizability and effectiveness [33].
This study had several limitations. First, some patients with high-risk pulmonary nodules were lost to follow-up or had not been diagnosed pathologically by the end of follow-up, which introduced unavoidable bias. Second, nodule features such as size, morphology, and growth are of paramount importance for risk stratification but were not included in the predictive model. Finally, this study reports the 1-year follow-up of the cohort; follow-up of 2 years or longer would make the results more convincing. We will continue to manage these participants and validate an accurate lung cancer screening model.
In conclusion, these results represent a large-scale prospective screening study using mobile CT and an automated deep learning system to evaluate pulmonary nodules and lung cancer malignancy. This novel approach may assist clinicians in the early diagnosis of lung cancer, especially in resource-constrained sites.
WML, ZY, DL and BJC designed the research study. JS, CDW, and TBD performed the research. LY, TZL, XYX, GW and JXG analyzed the data. JS, LY and CDW wrote the manuscript. All authors contributed to editorial changes in the manuscript. All authors read and approved the final manuscript.
Ethics approval was obtained from the ethics committee of West China Hospital (Project.2020(232)).
We thank all of the volunteers who participated in the study, and the staff of Mianzhu People’s Hospital, the First People’s Hospital of Longquan District, Chengdu, and West China Hospital of Sichuan University, who were active throughout the screening process.
This research was funded by the National Natural Science Foundation of China (92159302, 82100119), the Science and Technology Project of Sichuan (2022ZDZX0018, 2020YFG0473), the Science and Technology Innovation Project of Guang’an (2020SYF03), and the 1·3·5 Project for Disciplines of Excellence, West China Hospital, Sichuan University (ZYJC18001).
The authors declare no conflict of interest.
Publisher’s Note: IMR Press stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.