Spindle Detection Based on Elastic Time Window and Spatial Pyramid Pooling

Yiting Ou; Fei Wang; Bai Feng; Liren Tang; Jiahui Pan

doi:10.31083/j.jin2307134


Academic Editor

Gernot Riedel

References

[1]Iber C, Ancoli-Israel S, Chesson A. The AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology and Technical Specifications. 2007.
[2]Berry RB, Brooks R, Gamaldo C, Harding SM, Lloyd RM, Quan SF, et al. AASM Scoring Manual Updates for 2017 (Version 2.4). Journal of Clinical Sleep Medicine: JCSM: Official Publication of the American Academy of Sleep Medicine. 2017; 13: 665–666.
[3]Mullins A, Parekh A, Kam K, Bubu O, Schoenholz R, Patel S, et al. 0308 The stability of slow wave sleep and EEG microstructure measures across two consecutive nights of laboratory polysomnography in cognitively normal older adults. SLEEP. 2022; 45: 139–139.
[4]Hefnawy MA, Fadlallah SA, El-Sherif RM, Medany SS. Competition between enzymatic and non-enzymatic electrochemical determination of cholesterol. Journal of Electroanalytical Chemistry. 2023; 930: 117169.
[5]Das PK, Meher S, Panda R, Abraham A. A Review of Automated Methods for the Detection of Sickle Cell Disease. IEEE Reviews in Biomedical Engineering. 2020; 13: 309–324.
[6]Zhu QL, Han F, Wang J, Cheng CH, Cai SJ, Wang QJ, et al. Effect of sleep spindle density on memory function in patients with obstructive sleep apnea hypopnea syndrome. Zhonghua Jie he he Hu Xi Za Zhi. 2023; 46: 466–473. (In Chinese)
[7]Zhang Y, Quiñones GM, Ferrarelli F. Sleep spindle and slow wave abnormalities in schizophrenia and other psychotic disorders: Recent findings and future directions. Schizophrenia Research. 2020; 221: 29–36.
[8]Petit JM, Strippoli MPF, Stephan A, Ranjbar S, Haba-Rubio J, Solelhac G, et al. Sleep spindles in people with schizophrenia, schizoaffective disorders or bipolar disorders: a pilot study in a general population-based cohort. BMC Psychiatry. 2022; 22: 758.
[9]van der Heijden AC, Hofman WF, de Boer M, Nijdam MJ, van Marle HJF, Jongedijk RA, et al. Sleep spindle dynamics suggest over-consolidation in post-traumatic stress disorder. Sleep. 2022; 45: zsac139.
[10]Chatburn A, Lushington K, Kohler MJ. Consolidation and generalisation across sleep depend on individual EEG factors and sleep spindle density. Neurobiology of Learning and Memory. 2021; 179: 107384.
[11]Friedrich M, Mölle M, Friederici AD, Born J. The reciprocal relation between sleep and memory in infancy: Memory-dependent adjustment of sleep spindles and spindle-dependent improvement of memories. Developmental Science. 2019; 22: e12743.
[12]Chawla P, Rana SB, Kaur H, Singh K, Yuvaraj R, Murugappan M. A decision support system for automated diagnosis of Parkinson’s disease from EEG using FAWT and entropy features. Biomedical Signal Processing and Control. 2023; 79: 104116.
[13]Khoshnevis SA, Sankar R. Diagnosis of Parkinson’s disease using higher order statistical analysis of alpha and beta rhythms. Biomedical Signal Processing and Control. 2022; 77: 103743.
[14]Varlı M, Yılmaz H. Multiple classification of EEG signals and epileptic seizure diagnosis with combined deep learning. Journal of Computational Science. 2023; 67: 101943.
[15]Perez-Valero E, Morillas C, Lopez-Gordo MA, Minguillon J. Supporting the Detection of Early Alzheimer’s Disease with a Four-Channel EEG Analysis. International Journal of Neural Systems. 2023; 33: 2350021.
[16]Hori T, Sugita Y, Koga E, Shirakawa S, Inoue K, Uchida S, et al. Proposed supplements and amendments to ‘A Manual of Standardized Terminology, Techniques and Scoring System for Sleep Stages of Human Subjects’, the Rechtschaffen & Kales (1968) standard. Psychiatry and Clinical Neurosciences. 2001; 55: 305–310.
[17]Wendt SL, Welinder P, Sorensen HBD, Peppard PE, Jennum P, Perona P, et al. Inter-expert and intra-expert reliability in sleep spindle scoring. Clinical Neurophysiology: Official Journal of the International Federation of Clinical Neurophysiology. 2015; 126: 1548–1556.
[18]Qiu S, Yang CH, Wu L, Wang KC, Pan JZ. Machine-vision-based Spindle Positioning System of Grinding-wheel-saw Automatic Replacement System. Sensors and materials: An International Journal on Sensor Technology. 2022; 34: 789–801.
[19]Scafa S, Fiorillo L, Lucchini M, Roth C, Agostini V, Vancheri A, et al. Personalized Sleep Spindle Detection in Whole Night Polysomnography. Annual International Conference of the IEEE Engineering in Medicine and Biology Society. 2020; 2020: 1047–1050.
[20]Kinoshita T, Fujiwara K, Kano M, Ogawa K, Sumi Y, Matsuo M, et al. Sleep Spindle Detection Using RUSBoost and Synchrosqueezed Wavelet Transform. IEEE Transactions on Neural Systems and Rehabilitation Engineering: a Publication of the IEEE Engineering in Medicine and Biology Society. 2020; 28: 390–398.
[21]Wang F, Li L, Wan Y, Li Z, Luo L, Hu B, et al. An Efficient Sleep Spindle Detection Algorithm Based on MP and LSBoost. Computers, Materials & Continua. 2023; 76: 2301–2316.
[22]Chen B, Chen H, Li M. Improvement and Optimization of Feature Selection Algorithm in Swarm Intelligence Algorithm Based on Complexity. Complexity. 2021; 2021: 9985185.
[23]Wang K, Kemao Q, Di J, Zhao J. Deep learning spatial phase unwrapping: a comparative review. Advanced Photonics Nexus. 2022; 1: 014001.
[24]You J, Jiang D, Ma Y, Wang Y. SpindleU-Net: An Adaptive U-Net Framework for Sleep Spindle Detection in Single-Channel EEG. IEEE Transactions on Neural Systems and Rehabilitation Engineering: a Publication of the IEEE Engineering in Medicine and Biology Society. 2021; 29: 1614–1623.
[25]Kulkarni PM, Xiao Z, Robinson EJ, Jami AS, Zhang J, Zhou H, et al. A deep learning approach for real-time detection of sleep spindles. Journal of Neural Engineering. 2019; 16: 036004.
[26]Saifutdinova E, Dudysova D, Gerla V, Lhotska L. Improvement of Sleep Spindle Detection by Aggregation Techniques. Mediterranean Conference on Medical and Biological Engineering and Computing. 2020; 76: 226–234.
[27]Thiesse L, Staner L, Bourgin P, Roth T, Fuchs G, Kirscher D, et al. Validation of Somno-Art Software, a novel approach of sleep staging, compared with polysomnography in disturbed sleep profiles. Sleep Advances: a Journal of the Sleep Research Society. 2021; 3: zpab019.
[28]Jeonghee H, Soyoung P, Jeonghee C. Improving Multi-Class Motor Imagery EEG Classification Using Overlapping Sliding Window and Deep Learning Model. Electronics. 2023; 12: 1186.
[29]Fiorillo L, Monachino G, van der Meer J, Pesce M, Warncke JD, Schmidt MH, et al. U-Sleep’s resilience to AASM guidelines. NPJ Digital Medicine. 2023; 6: 33.
[30]Tehrani MJ, Rashidinia A, Amoli FA, Esfandiari A. A rare presentation of orbital spindle cell carcinoma a case report and brief review of the literature. BMC Ophthalmology. 2023; 23: 369.
[31]Jiang Y, Bugby SL, Cosma G. Automatic detection of scintillation light splashes using conventional and deep learning methods. Journal of Instrumentation. 2022; 17: P06021.
[32]Lim JS, Stofa MM, Koo SM, Zulkifley MA. Micro Expression Recognition: Multi-scale Approach to Automatic Emotion Recognition by using Spatial Pyramid Pooling Module. International Journal of Advanced Computer Science and Applications (IJACSA). 2021; 12: 12.
[33]Hong Q, Zhong X, Chen W, Zhang Z, Li B. Hyperspectral Image Classification Network Based on 3D Octave Convolution and Multiscale Depthwise Separable Convolution. ISPRS International Journal of Geo-Information. 2023; 12: 505.
[34]Huo Y, Zhang Q, Jia Y, Liu D, Guan J, Lin G. A Deep Separable Convolutional Neural Network for Multiscale Image-Based Smoke Detection. Fire Technology. 2022; 58: 1445–1468.
[35]Yang B, Li H. A similarity elastic window based approach to process dynamic time delay analysis. Chemometrics & Intelligent Laboratory Systems. 2017; 170: 13–24.
[36]Chen P, Chen D, Zhang L, Tang Y, Li X. Automated sleep spindle detection with mixed EEG features. Biomedical Signal Processing and Control. 2021; 70: 103026.
[37]Liu D, Liu T, Bi H, Zhao Y, Cheng Y. Multiscale Local Feature Fusion: Marine Microalgae Classification for Few-Shot Learning. Water. 2023; 15: 1413.
[38]Zhou W, Lin X, Lei J, Yu L, Hwang JN. MFFENet: Multiscale Feature Fusion and Enhancement Network for RGB-Thermal Urban Road Scene Parsing. IEEE Transactions on Multimedia. 2021; 24: 2526–2538.
[39]Wu D, Zhao J, Wang Z. AM-PSPNet: Pyramid Scene Parsing Network Based on Attentional Mechanism for Image Semantic Segmentation. In International Conference of Pioneering Computer Scientists, Engineers and Educators. Springer: Singapore. 2022.
[40]Zhang R, Chen J, Feng L, Li S, Yang W, Guo D. A Refined Pyramid Scene Parsing Network for Polarimetric SAR Image Semantic Segmentation in Agricultural Areas. IEEE Geoscience and Remote Sensing Letters. 2022; 19: 1–5.
[41]Elizar E, Zulkifley MA, Muharar R, Zaman MHM, Mustaza SM. A Review on Multiscale-Deep-Learning Applications. Sensors (Basel, Switzerland). 2022; 22: 7384.
[42]He J, Wang X, Song Y, Xiang Q. A multiscale intrusion detection system based on pyramid depthwise separable convolution neural network. Neurocomputing. 2023; 530: 48–59.
[43]Li G, Zhang J, Zhang M, Wu R, Cao X, Liu W. Efficient depthwise separable convolution accelerator for classification and UAV object detection. Neurocomputing. 2022; 490: 1–16.
[44]Lou Y, He Y, Wang L, Chen G. Predicting Network Controllability Robustness: A Convolutional Neural Network Approach. IEEE Transactions on Cybernetics. 2022; 52: 4052–4063.
[45]Shen Y, Zhu S, Chen C, Du Q, Xiao L, Chen J. Efficient Deep Learning of Nonlocal Features for Hyperspectral Image Classification. IEEE Transactions on Geoscience and Remote Sensing. 2021; 59: 6029–6043.
[46]Tripathi S, Singh SK, Kuan LH. Bag of Visual Words (BoVW) with Deep Features–Patch Classification Model for Limited Dataset of Breast Tumours. ArXiv. 2022. (preprint)
[47]Yee PS, Lim KM, Lee CP. DeepScene: Scene classification via convolutional neural network with spatial pyramid pooling. Expert Systems with Applications. 2022; 193: 116382.
[48]Sriram S, Vinayakumar R, Sowmya V, Alazab M, Soman KP. Multi-scale Learning based Malware Variant Detection using Spatial Pyramid Pooling Network. IEEE INFOCOM 2020-IEEE conference on computer communications workshops (INFOCOM WKSHPS) (pp. 740–745). IEEE. 2020.
[49]Wu C, Lou Y, Wang L, Li J, Li X, Chen G. SPP-CNN: An Efficient Framework for Network Robustness Prediction. IEEE Transactions on Circuits and Systems I: Regular Papers. 2023; 70: 4067–4079.
[50]Msonda P, Uymaz SA, Karaağaç SS. Spatial Pyramid Pooling in Deep Convolutional Networks for Automatic Tuberculosis Diagnosis. Traitement du Signal. 2020; 37: 1075–1084.
[51]Devuyst S, Dutoit T, Stenuit P, Kerkhofs M. Automatic sleep spindles detection–overview and development of a standard proposal assessment method. Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference. 2011; 2011: 1713–1716.
[52]Krieter S, Thüm T, Schulze S, Saake G, Leich T. YASA: yet another sampling algorithm. In VaMoS ‘20: Proceedings of the 14th International Working Conference on Variability Modelling of Software-Intensive Systems. 2020.
[53]Barakat ABP. Convergence and Dynamical Behavior of the ADAM Algorithm for Nonconvex Stochastic Optimization. SIAM Journal on Optimization: A Publication of the Society for Industrial and Applied Mathematics. 2021; 31: 244–274.
[54]Hubar S, Koulovatianos C, Li J. Fitting Parsimonious Household-Portfolio Models to Data. Social Science Electronic Publishing. 2014; 1: 37–39.
[55]Sharma R, Sircar P, Pachori RB. Automated focal EEG signal detection based on third order cumulant function. Biomedical Signal Processing and Control. 2020; 58: 101856.1–101856.8.
[56]Lacourse K, Delfrate J, Beaudry J, Peppard P, Warby SC. A sleep spindle detection algorithm that emulates human expert spindle scoring. Journal of Neuroscience Methods. 2019; 316: 3–11.
[57]Sun X, Qi Y, Wang Y, Pan G. Convolutional Multiple Instance Learning for Sleep Spindle Detection With Label Refinement. IEEE Transactions on Cognitive and Developmental Systems. 2023; 15: 272–284.
[58]Jiang D, Ma Y, Wang Y. A robust two-stage sleep spindle detection approach using single-channel EEG. Journal of Neural Engineering. 2021; 18: 026–026.
[59]Demšar J, Schuurmans D. Statistical Comparisons of Classifiers over Multiple Data Sets. Journal of Machine Learning Research. 2006; 7: 1–30.
[60]Atkinson G, Metsis V. Identifying label noise in time-series datasets. Adjunct Proceedings of the 2020 ACM International Joint Conference on Pervasive and the 2020 ACM International Symposium on Wearable Computers (pp. 238–243). 2020.


Open Access, 17 Jul 2024, Original Research


Yiting Ou ¹, Fei Wang ¹,²,*, Bai Feng ¹, Liren Tang ¹, Jiahui Pan ¹,²,*

Affiliations


¹ School of Software, South China Normal University, 528200 Foshan, Guangdong, China

² Research Center for Brain–Computer Interface, Pazhou Laboratory, 510330 Guangzhou, Guangdong, China

*Correspondence: fwang@scnu.edu.cn (Fei Wang); panjh82@qq.com (Jiahui Pan)

Abstract

Background: Sleep spindles have emerged as valuable biomarkers for assessing cognitive abilities and related disorders, underscoring the importance of their detection in clinical research. However, template matching-based algorithms that use fixed templates may not fully adapt to spindles of different durations. Multiscale feature extraction, inspired by its use in image analysis, can better accommodate spindles of different frequencies and durations. Methods: This study proposes a novel automatic spindle detection algorithm based on elastic time windows and spatial pyramid pooling (SPP) for extracting multiscale features. The algorithm uses elastic time windows to segment electroencephalogram (EEG) signals, enabling the extraction of features across multiple scales. This approach accommodates significant variations in spindle duration and polarization positioning across EEG epochs. Additionally, SPP is integrated into a depthwise separable convolutional (DSC) network to perform multiscale pooling on the segmented spindle signal features at different scales. Results: Compared with existing template matching algorithms, the spindle wave polarization positioning of this algorithm is more consistent with the real situation. Experiments on the public DREAMS dataset show that the average accuracy of this algorithm reaches 95.75%, with an average negative predictive value (NPV) of 96.55%, indicating its advanced performance. Conclusions: The effectiveness of each module was verified through thorough ablation experiments. More importantly, the algorithm shows strong robustness to variation across experimental subjects. This robustness makes the algorithm more accurate at identifying sleep spindles, and it is expected to help experts automatically detect spindles in sleep EEG signals, reducing the workload and time of manual detection and improving efficiency.

Keywords

sleep spindle
multiscale
elastic time window
spatial pyramid pooling

1. Introduction

Sleep spindles are sinusoidal periodic pulse trains that occur during sleep with a frequency range of 11–16 Hz. The American Academy of Sleep Medicine (AASM) [1] noted that sleep spindles are the hallmark feature of non-rapid eye movement (NREM) sleep [2, 3]. Identifying spindles is of utmost importance in sleep staging research and clinical disease diagnosis [4, 5]. Extensive studies have revealed significant correlations between the quantity and density of spindles and a range of diseases, including obstructive sleep apnea hypopnea syndrome [6] and schizophrenia [7, 8]. Moreover, sleep spindles are closely related to sleep maintenance, learning, and memory [9, 10, 11]. Electroencephalograms (EEGs) carry important information about brain electrical activity and may reveal many pathologies. EEG analysis has proven to be a powerful tool in sleep research and neurological disease diagnosis [12]; the signal is commonly decomposed by frequency into rhythms, such as alpha [13] and beta [14], and into events, such as K-complexes [15] and sleep spindles. Current approaches to detecting sleep spindles from high-level features of EEG signals can be roughly divided into three categories.

In clinical diagnosis, spindle detection relies mainly on experienced doctors who identify spindles by visual inspection, counting how many times the oscillating waveform reaches a peak within a period of time according to the definition of spindle waves [16]; this is the so-called gold standard. However, manual detection is costly and time-consuming, and its accuracy relies heavily on the subjective experience of doctors [17]; therefore, detecting spindles with the gold standard is highly challenging.

Presently, the predominant approach for automatic spindle detection relies on template matching rules [18]. These rules identify the amplitude or time-frequency characteristics of spindle events and apply fixed or adaptive thresholds for prediction. Scafa et al. [19] proposed a personalized sleep spindle detection (PSSD) program that determines the optimal set of spindle features and uses a support vector machine to distinguish spindles from nonspindles; on the DREAMS dataset, the sensitivity and specificity reached 98.6% and 98.1%, respectively. Kinoshita et al. [20] proposed a sleep spindle detection method that combines the synchrosqueezed wavelet transform (SST) with random undersampling boosting (RUSBoost); on the MASS dataset, the sensitivity was 76.9% and the accuracy was 61.2%. Wang et al. [21] proposed an automatic spindle detection algorithm based on matching pursuit (MP) and least squares boosting (LSBoost), which achieved an accuracy of 94.7% with high sensitivity on the DREAMS dataset.

In recent years, deep learning has become increasingly prominent in big data analysis. Deep learning methods are highly dependent on data and have low interpretability [22]; they also require careful algorithm selection, model parameter tuning, and network layer design, which takes considerable time and demands extensive knowledge from developers. Nevertheless, deep learning remains a mainstream approach to classification problems because of its strong generalization and adaptive capabilities [23]. You et al. [24] proposed a spindle detection method with an attention module (SpindleU-Net) based on the U-Net framework, which achieved a sensitivity of 86.6%. With the development of convolutional neural network (CNN) technology, CNNs have been widely used in speech recognition, image processing, machine vision, natural language processing and other fields. In automatic sleep spindle detection, Kulkarni et al. [25] fused a CNN and a recurrent neural network (RNN) and proposed a deep learning method (SpindleNet) for real-time sleep spindle detection. On the MASS dataset, the sensitivity was 90.07% and the specificity was 96.19%.

When template matching rules are used to detect spindles [26, 27], a fixed-length time window is usually used to extract EEG segments. However, because spindles vary in duration and form, both unpredictably and across individuals, a fixed-length window may fail to locate and completely segment every spindle, resulting in missed or false detections. Elastic time windows that extract multiscale features [28] and process inputs of arbitrary length [29] can adapt the window length to the local characteristics of the signal, decompose the signal into subsignals of different scales, and extract the corresponding features at each scale, thereby describing the time-frequency characteristics of the signal more comprehensively and improving the sensitivity and accuracy of spindle detection. In addition, spindles are rare events, and nonspindle segments account for the majority of the EEG signal [30]. Automatic detection of sleep spindles with deep learning methods may therefore be affected by imbalanced data [31], resulting in poor detection results. The multiscale pooling of spatial pyramid pooling [32] can increase the attention given to spindle signals and improve the accuracy of spindle detection. The multiscale depthwise separable convolutional network structure [33] has a strong representation learning ability and can automatically learn more discriminative feature representations from data.

Huo et al. [34] used a multiscale depthwise separable convolution (DSC) network with spatial pyramid pooling (SPP) for smoke image detection and achieved good results, which suggests that the multiscale DSC-SPP method performs well in image recognition. Building on this work, we apply the multiscale DSC-SPP method to spindle detection and propose a spindle detection algorithm that combines elastic time windows and spatial pyramid pooling to extract multiscale features. Elastic time windows accommodate the widely varying durations of spindles in EEG, and DSC with spatial pyramid pooling then extracts fixed-length deep features from EEG epochs of variable length. Experimental results on the publicly available DREAMS dataset demonstrate the effectiveness of the proposed method for automatic sleep spindle detection. Specifically, our contributions in this article can be summarized as follows.

• We design elastic time windows to extract multiscale features of the EEG signals, reducing the damage to the overall structural information of the spindles.

• We integrate spatial pyramid pooling into depthwise separable convolutions and perform multiscale pooling on spindle features to enhance their representation, addressing the imbalance between spindle and nonspindle samples in EEG signals.

• We propose the multiscale DSC-SPP method for automatic sleep spindle detection and validate its effectiveness on the widely used DREAMS dataset.

2. Methods

In this section, we present an overview of the multiscale DSC-SPP architecture, which consists of several interrelated modules that work together to extract EEG signal features from multiscale inputs using spatial pyramid pooling. The DSC extracts the deep features of sleep spindles and achieves excellent spindle detection performance.

2.1 Architecture Overview

The proposed multiscale DSC-SPP architecture is shown in Fig. 1. First, multiscale feature extraction is performed on the preprocessed EEG signals through 0.5 s, 0.75 s and 1 s elastic time windows, and the resulting segments are fed into depthwise separable convolutional networks with spatial pyramid pooling to obtain deep features at different scales. The fused features generated by the concatenate layer are then passed to the sigmoid layer for spindle detection. In the Depthwise Separable Net (DSNet), we incorporate dense connections, which shorten the path between the input and output layers. This facilitates smoother propagation of gradients and feature information throughout the network. In addition, when training data are scarce, dense connections also help regularize the model and reduce the risk of overfitting. The following sections describe the proposed fusion architecture and its modules in detail.

Fig. 1.

The framework of the proposed multiscale DSC-SPP model. The network consists of three DSC networks that process different context input sizes in parallel. The output of each branch is passed through a max pooling layer and regularization. The SPP layer is placed between the convolutional layer and the fully connected layer. The three feature outputs are then combined at the end of the network to obtain rich spindle features. Finally, a sigmoid function in the last fully connected layer determines whether a spindle is present in the features. DSC, depthwise separable convolutional; SPP, spatial pyramid pooling.
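The fusion step described in this caption can be sketched in a few lines of plain Python. This is an illustration only: the function names, the single linear layer in front of the sigmoid, the random untrained weights, and the toy feature length of 4 per branch are all assumptions, not the authors' implementation.

```python
import math
import random


def sigmoid(z):
    """Logistic function mapping a real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fuse_and_classify(branch_features, weights, bias):
    """Concatenate per-branch feature vectors, then apply a linear layer and sigmoid."""
    # Concatenate layer: join the three branch outputs into one vector.
    fused = [value for branch in branch_features for value in branch]
    if len(fused) != len(weights):
        raise ValueError("one weight is needed per fused feature")
    score = sum(w * v for w, v in zip(weights, fused)) + bias
    return sigmoid(score)


# Toy stand-ins for the three branch outputs. After SPP each branch has a
# fixed length (4 here, purely illustrative) whatever the window length was.
rng = random.Random(0)
branches = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
weights = [rng.uniform(-1, 1) for _ in range(12)]  # untrained, random weights
probability = fuse_and_classify(branches, weights, bias=0.0)
print(f"spindle probability: {probability:.3f}")
```

Because every branch ends in a fixed-length vector, the fused vector has the same length for every input, which is what allows a single fully connected layer to follow the concatenation.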

2.2 Data Preprocessing

The sliding window is a common data processing technique that is easy to implement and understand. By defining the window size and sliding step, the spindle signal can be segmented into multiple windows for processing. Feature extraction algorithms can be applied to each window to extract spindle-related features such as frequency, amplitude, and duration. However, when using sliding windows for spindle detection, it is important to select an appropriate window size and sliding step, as the duration of spindles can vary, typically ranging from 0.5 seconds to 3 seconds. This ensures that the windows adequately cover the time period of the spindle. The computational complexity of sliding windows increases with increasing window size.

In time series data, the elastic time window [35, 36] can be viewed as a sliding window of variable length. Unlike traditional fixed-size windows, elastic time windows adapt their length to the characteristics of the data, thereby better capturing key information. This approach splits the same recording into consecutive time windows using different window sizes and strides, thus creating datasets at multiple scales. The elastic time window is calculated as follows:

(1) $window\_size=M*K$

(2) $stride=S*K$

where M and S are the fixed window size and step size, respectively, and K is the coefficient adjusted according to different scales.

Multiple datasets can be generated by slicing and segmenting the raw signal and labels using different step sizes and window sizes. Specifically, for the i-th scale with coefficient $K_{i}$, the window size is $M*K_{i}$ and the step size is $S*K_{i}$.
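The slicing defined by Eqs. 1 and 2 can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `elastic_windows`, the base stride S = 50, the 200 Hz toy recording, and the majority-vote labelling rule are hypothetical choices made for the example and are not taken from the paper.

```python
def elastic_windows(signal, labels, M, S, K):
    """Slice `signal` into windows of M*K samples taken every S*K samples."""
    window_size = round(M * K)  # Eq. (1)
    stride = round(S * K)       # Eq. (2)
    if window_size < 1 or stride < 1:
        raise ValueError("window size and stride must be at least one sample")
    windows, window_labels = [], []
    for start in range(0, len(signal) - window_size + 1, stride):
        stop = start + window_size
        windows.append(signal[start:stop])
        # Assumed labelling rule (not from the paper): a window is positive
        # when more than half of its samples are annotated as spindle.
        window_labels.append(int(sum(labels[start:stop]) > window_size / 2))
    return windows, window_labels


# Toy data: 10 s at an assumed 200 Hz, with a spindle annotated from 2 s to 3 s.
signal = [0.0] * 2000
labels = [0] * 2000
labels[400:600] = [1] * 200

for K in (1.0, 1.5, 2.0):
    X, y = elastic_windows(signal, labels, M=100, S=50, K=K)
    print(f"K={K}: {len(X)} windows of {len(X[0])} samples, {sum(y)} positive")
```

With M = 100, the three coefficients give window lengths of 100, 150 and 200 samples, and the stride scales by the same factor so the relative overlap between consecutive windows stays constant across scales.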

In the process of data slicing and scaling, the time interval between windows is controlled by adjusting the step size, thereby achieving elastic partitioning of the dataset. Different step sizes will affect the degree of overlap between windows and thus affect the training effect of the model. The elastic time window can help the model capture more time series features and improve its generalizability.

2.3 Multiscale DSC-SPP Architecture

2.3.1 Multiscale Feature Extraction

According to recent research, multiscale feature learning [37] has demonstrated significant potential in diverse applications, including scene parsing and medical diagnosis [38, 39]. The fundamental concept behind multiscale feature learning involves building multiple neural networks simultaneously with varying context input sizes. The features extracted from these models are subsequently combined in a fully connected layer [40]. By analyzing kernels at different scales, multiscale feature learning aims to capture a broader range of pertinent features and estimate the spatial map associated with the input image [41].

To construct feature maps of different scales as network inputs, this paper uses elastic time windows to segment the EEG signals. Slices with different step sizes (0.5 s, 0.75 s, and 1 s) are used to generate three training sets of different scales, with window sizes of 100, 150 and 200, respectively.

2.3.2 Multiscale DSC

Depthwise separable convolution has proven to be a successful technique in neural image classification because it helps eliminate redundant features and significantly reduces the number of parameters required [42]. Unlike standard convolution, depthwise separable convolution breaks down the feature extraction process into two simpler steps: depthwise convolution and pointwise convolution [43]. The entire process is illustrated in Fig. 2.

Fig. 2.

Depthwise separable convolution. DConv, Depthwise Convolution; PConv, Pointwise Convolution.
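The two-step factorization in Fig. 2 can be made concrete with a small sketch. It assumes a 1-D, stride-1, unpadded, bias-free layer and uses plain Python lists; the function names are hypothetical, and the paper does not specify these details.

```python
def standard_conv_params(kernel_size, c_in, c_out):
    """Weights in a standard 1-D convolution (bias omitted)."""
    return kernel_size * c_in * c_out


def separable_conv_params(kernel_size, c_in, c_out):
    """Weights in a depthwise separable 1-D convolution (bias omitted)."""
    depthwise = kernel_size * c_in  # one kernel per input channel
    pointwise = c_in * c_out        # 1x1 convolution that mixes channels
    return depthwise + pointwise


def depthwise_separable_conv1d(x, depthwise_kernels, pointwise_weights):
    """Apply DConv then PConv.

    x: C_in channels, each a list of T samples.
    depthwise_kernels: C_in kernels of length k.
    pointwise_weights: C_out rows of C_in weights.
    Returns C_out channels of length T - k + 1.
    """
    k = len(depthwise_kernels[0])
    out_len = len(x[0]) - k + 1
    # Depthwise step: each channel is filtered by its own kernel
    # (cross-correlation, as in common deep learning frameworks).
    depthwise_out = [
        [sum(channel[t + j] * kernel[j] for j in range(k)) for t in range(out_len)]
        for channel, kernel in zip(x, depthwise_kernels)
    ]
    # Pointwise step: a 1x1 convolution mixes channels at every time step.
    return [
        [sum(w * depthwise_out[c][t] for c, w in enumerate(row)) for t in range(out_len)]
        for row in pointwise_weights
    ]


print(standard_conv_params(3, 64, 64), separable_conv_params(3, 64, 64))
```

For a kernel of length 3 with 64 input and 64 output channels, the standard layer needs 3 * 64 * 64 weights, while the separable layer needs 3 * 64 + 64 * 64, which illustrates the parameter reduction described above.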

Convolutional layers that operate in a sliding window fashion are adaptable to varying input sizes, whereas fully connected layers mandate fixed-size inputs to function properly. Due to this limitation, traditional DSC usually requires that the input images be the same size when processing image data. To meet this requirement, images are typically resized using resampling operations such as compression or stretching so that they have the same dimensions [44]. However, this operation introduces certain errors and leads to the loss of useful information in the original image, which may affect the model’s recognition of the image.

To solve the above problems, this paper improves the DSC network structure by adding an SPP layer before the fully connected layer. The SPP layer can extract spatial feature information at different scales, increasing the robustness of the model to the spatial layout and object deformation of the image.

By combining elastic time windows with SPP layers, our model can adapt to inputs of different sizes, better handle time series data, and improve performance in the spindle detection task.

2.3.3 Spatial Pyramid Pooling

In object recognition tasks, SPP [45, 46] has shown significant advantages in practice. SPP generates a fixed-length output regardless of the size and dimensions of the input image. Compared with the traditional sliding window method, SPP applies a multilevel spatial pyramid that considers object information at different scales and improves detection and recognition performance, adapting particularly well to targets of different scales. In addition, an SPP network can generate representations from images of any size at test time and accepts images of different sizes and aspect ratios during training. Training with variable-sized inputs therefore improves the model’s invariance to input size and reduces the risk of overfitting.

Furthermore, the SPP can combine features obtained at variable scales with the flexibility of the input scale. By embedding SPP [47] before the fully connected layer in DSC, the input feature map is divided into several parts, and features are extracted from each part. As shown in Fig. 3, any given feature map can be split into n * n subsets using spatial bins of size n, and a fixed-size vector is generated by selecting the maximum value from each spatial bin. Each feature map is then pooled multiple times, and its output vectors are concatenated to produce a one-dimensional output vector of the feature map. The key principle is to assign different numbers of spatial bins to input feature maps of different sizes. The spatial bin-pooled features from all the filters are flattened and connected to create a final feature representation of consistent length. This approach enables the model to generate fixed-length representations irrespective of the size or scale of the input features [36].

Fig. 3.

Spatial pyramid pooling diagram. The SPP module captures information from various subregions at different scales. By fusing the information from various subregions within these receptive fields, a more robust representation can be obtained. Here, 64 refers to the number of filters in the final convolutional layer.

Assuming that $n_{f}$ represents the number of input feature maps, $n_{b}$ represents the number of spatial bins, and $b_{i}$ represents the i-th bin, the expression of the output vector size ( $V_{s}$ ) of the SPP is as shown in Eqn. 3 [48].

(3) $V_{s}=n_{f}*\sum\nolimits_{i=1}^{n_{b}}\operatorname{size}\left(b_{i}\right)^{2}$
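
As a worked instance of Eqn. 3, assume the values that appear elsewhere in this paper: $n_{f}=64$ filters in the final convolutional layer (Fig. 3) and three pyramid levels with bin sizes 1, 2 and 4. Under these assumptions, the output length would be

$V_{s}=64*\left(1^{2}+2^{2}+4^{2}\right)=64*21=1344$

which does not depend on the height or width of the input feature map.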

In the SPP operation, the input feature map is divided into multiple subregions, and each subregion is pooled independently to form a fixed-size output vector.

As shown in Fig. 4, this paper uses three pyramid levels with sizes of 1 * 1, 2 * 2 and 4 * 4. For an input image of size N * N, L feature maps of size N’ * N’ are obtained in the last convolutional layer, where L is the number of filters [49]. To process input images of different sizes, the SPP layer divides each feature map into three levels of spatial bins (1 * 1, 2 * 2, and 4 * 4) and applies max pooling to each bin. A representation vector of length pL is then generated as the output of the SPP layer, where L and p are predefined hyperparameters (p is the total number of spatial bins, 1 + 4 + 16 = 21 here). Therefore, regardless of the size of the input image, a fixed-length pL vector is generated as the input of the fully connected layer. It has been reported in the literature that the performance of the SPP layer is not sensitive to different settings of the spatial bins [50].
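
The pooling described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `spp_max_pool`, the floor/ceil rule for bin edges, and the random feature maps are assumptions made for the example. It shows that feature maps of different widths yield vectors of the same length.

```python
import math
import numpy as np


def spp_max_pool(feature_maps, levels=(1, 2, 4)):
    """Max-pool (L, H, W) feature maps into a vector of length L * sum(n * n)."""
    num_filters, height, width = feature_maps.shape
    pooled = []
    for n in levels:
        for row in range(n):
            # Floor/ceil bin edges keep every bin non-empty and make the
            # bins cover the whole map, whatever its size (assumed rule).
            top = math.floor(row * height / n)
            bottom = math.ceil((row + 1) * height / n)
            for col in range(n):
                left = math.floor(col * width / n)
                right = math.ceil((col + 1) * width / n)
                region = feature_maps[:, top:bottom, left:right]
                pooled.append(region.max(axis=(1, 2)))  # one value per filter
    return np.concatenate(pooled)


rng = np.random.default_rng(0)
small = spp_max_pool(rng.standard_normal((64, 13, 25)))  # placeholder input
large = spp_max_pool(rng.standard_normal((64, 13, 50)))  # twice as wide
print(small.shape, large.shape)  # both (1344,), i.e. 64 * 21
```

With 64 filters and 21 bins, both calls return 1344 values, so the fully connected layer that follows can keep a fixed input size.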

Fig. 4.

DSC network structure with the SPP layer. In the SPP layer, the feature map is partitioned into three levels of spatial bins with sizes of 1 * 1, 2 * 2 and 4 * 4. Each bin is then subjected to max pooling with the corresponding size. The SPP layer produces a representation vector of length pL as its output.

3. Experiments and Results

3.1 Dataset

We validated our approach on the DREAMS dataset (https://zenodo.org/records/2650142#.YRtw6o4zY2w) from the Sleep Laboratory of André Vésale Hospital in Belgium [51]. The DREAMS dataset comprises 30-minute polysomnography (PSG) excerpts obtained from eight subjects, including four males and four females aged 45.88 $\pm{}$ 7.87 years. These individuals exhibit various sleep pathologies, such as dysomnia, restless legs syndrome, insomnia, and apnea/hypopnea syndrome. The PSG excerpts in the DREAMS dataset consisted of three EEG channels (FP1-A1, C3-A1, O1-A1), two electro-oculogram (EOG) channels, and one electromyography (EMG) channel. Two sleep experts independently scored the excerpts based on the C3 channel. The sleep spindles were manually tagged by the two experts on either the C3-A1 or CZ-A1 channel. The annotations combined under the “OR” criterion were used as the reference for determining the start and end points of the spindles. Further details can be found in Table 1.

Table 1.Details of the DREAMS dataset.

	Tagged channel	Length (s)	Sampling rate (Hz)	Nr. Spindles labeled by expert 1	Nr. Spindles labeled by expert 2	Nr. Interannotator-agreed spindles
Excerpt 1	C3-A1	1800	100	52	115	135
Excerpt 2	CZ-A1	1800	200	60	52	77
Excerpt 3	C3-A1	1800	50	5	44	44
Excerpt 4	CZ-A1	1800	200	44	25	63
Excerpt 5	CZ-A1	1800	200	56	86	103
Excerpt 6	CZ-A1	1800	200	72	87	117
Excerpt 7	CZ-A1	1800	200	18	-	-
Excerpt 8	CZ-A1	1800	200	48	-	-

Table 1 shows that each excerpt in the DREAMS dataset is 1800 s long and that most excerpts are sampled at 200 Hz. In our work, we resampled excerpt 1 (100 Hz) and excerpt 3 (50 Hz) to 200 Hz using a configurable sampling algorithm for t-wise interaction sampling [52], so that every excerpt comprises 1800 * 200 data points. For training and validation, we utilized the first six excerpts from the dataset.

3.2 Implementation Details

In the proposed framework, the Adam optimizer is used to optimize the model with an initial learning rate of 0.0001 and a batch size of 64 [53]. Additionally, dropout and early stopping techniques are implemented to prevent overfitting. The dropout strategy has demonstrated superior performance compared to other regularization methods [54]. A specified percentage of the input units in each layer (excluding the first layer) is randomly set to zero. Throughout our experiments, we empirically established a dropout rate of 0.2.

In our approach, we adopt a subject-independent methodology for training and testing the model. This implies that the data in the training set and the data in the test set are sourced from different subjects. To ensure nonoverlapping testing, we employed a 6-fold cross-validation scheme in which each fold involved the use of data from different subjects for testing, while the remaining subjects’ data were used for training. This process allowed us to thoroughly evaluate the model’s performance across all subjects in the dataset. As part of the 6-fold cross-validation, we divided the dataset such that 70% of all the samples were used for training and the remaining 30% were used for validation in each fold. During training, we also included 30% of the data as independent validation samples for early stopping purposes. We then averaged the results obtained from each fold to evaluate the overall performance of the model. This approach is described in detail in third order cumulant function (ToC) [55].
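
One possible reading of this scheme is sketched below: each fold holds out one subject for testing, and the samples of the remaining subjects are split 70%/30% into training and validation sets. The function name `subject_independent_folds`, the random shuffling, and the toy data are assumptions made for illustration and are not taken from the paper.

```python
import random


def subject_independent_folds(samples_by_subject, val_fraction=0.3, seed=0):
    """Yield (test_subject, train, val) with one held-out subject per fold."""
    rng = random.Random(seed)
    for test_subject in sorted(samples_by_subject):
        # Pool samples from every subject except the held-out one.
        pool = [
            (subject, sample)
            for subject, samples in sorted(samples_by_subject.items())
            if subject != test_subject
            for sample in samples
        ]
        rng.shuffle(pool)
        n_val = int(round(val_fraction * len(pool)))
        yield test_subject, pool[n_val:], pool[:n_val]


# Toy stand-in: six excerpts with ten placeholder epoch indices each.
toy = {f"excerpt{k}": list(range(10)) for k in range(1, 7)}
folds = list(subject_independent_folds(toy))
for test_subject, train, val in folds:
    print(test_subject, len(train), len(val))
```

Because the held-out subject never contributes to the training or validation pool, the test data in every fold come from a subject the model has not seen.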

The model utilizes the sigmoid function to produce a probability ranging from 0 to 1, indicating whether a given EEG epoch belongs to a spindle. During the training phase, the probability obtained from the model is applied to each data point within the epoch. In the testing phase, the EEG signal is segmented three times, each time with a different time window. The time windows for segmentation were 0.5 seconds, 0.75 seconds, and 1 second. As a result, each data point has three distinct probabilities. For a given excerpt x consisting of n data points, the final output for each data point is the highest of the three probabilities associated with that data point:

(4) $P\left(x_{i}\right)=\max\left(P_{0.5}\left(x_{i}\right),P_{0.75}\left(x_{i}\right),P_{1}\left(x_{i}\right)\right)\ (1\leq i\leq n)$

where $P_{0.5}(x_{i})$ , $P_{0.75}(x_{i})$ , and $P_{1}(x_{i})$ represent the probability values assigned to the i-th data point within their respective elastic time windows. By comparing these probabilities to a predetermined threshold value, continuous data points that exceed the threshold can be merged to create an estimated spindle wave. Thresholds were obtained through web searches.
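
The fusion in Eqn. 4 and the merging of consecutive supra-threshold points can be sketched as follows. This is a hedged illustration only: the helper names, the non-overlapping windows, the per-window probabilities, and the threshold of 0.5 are all assumptions chosen for the example, since the paper does not state these details.

```python
import numpy as np

FS = 200  # sampling rate in Hz after resampling (Section 3.1)


def window_to_point_probs(window_probs, window_sec, n_points, fs=FS):
    """Copy each window's probability to every data point inside that window."""
    win = int(round(window_sec * fs))
    point_probs = np.repeat(np.asarray(window_probs, dtype=float), win)[:n_points]
    # Points beyond the last complete window receive probability 0.
    return np.pad(point_probs, (0, n_points - point_probs.size))


def fuse_and_merge(probs_by_window, n_points, threshold, fs=FS):
    """Apply Eqn. 4, then merge consecutive supra-threshold points into events."""
    stacked = np.stack([
        window_to_point_probs(p, sec, n_points, fs)
        for sec, p in probs_by_window.items()
    ])
    fused = stacked.max(axis=0)  # Eqn. 4: point-wise maximum over windows
    above = np.concatenate(([0], (fused > threshold).astype(int), [0]))
    edges = np.flatnonzero(np.diff(above))  # alternating start/end indices
    events = [(int(s), int(e)) for s, e in zip(edges[::2], edges[1::2])]
    return fused, events


# Made-up probabilities for a 3 s toy signal (600 data points at 200 Hz).
toy_probs = {
    0.5: [0.1, 0.2, 0.9, 0.3, 0.1, 0.1],
    0.75: [0.1, 0.8, 0.2, 0.1],
    1.0: [0.1, 0.4, 0.1],
}
fused, events = fuse_and_merge(toy_probs, n_points=600, threshold=0.5)
print(events)  # [(150, 300)], end index exclusive
```

In this toy case the 0.75 s window flags points 150 to 299 and the 0.5 s window flags points 200 to 299, so the fused estimate is a single event spanning 0.75 s to 1.5 s.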

3.3 Evaluation Methods

The occurrence of spindles in the DREAMS dataset was distributed across stage 2 or stage 3 NREM (N2, N3) sleep. The density and duration of these spindles varied depending on the subject and their respective sleep pathology. Due to the different lengths of the spindles, the predicted spindles and the annotated spindles are not accurately matched. When the overlap between the predicted spindles and the annotated spindles significantly exceeded a specific threshold, the study treated them as spindles. Intersection-over-union (IoU) is utilized to quantify the overlap between predicted and annotated spindles, providing a measure of their similarity.

(5) $IoU=\frac{N_{x_{\text{predicted}}\cap x_{\text{annotated}}}}{N_{x_{\text{predicted}}\cup x_{\text{annotated}}}}$

where $N_{x_{\text{predicted}}\cap x_{\text{annotated}}}$ and $N_{x_{\text{predicted}}\cup x_{\text{annotated}}}$ represent the number of data points in the intersection and in the union, respectively, of the predicted spindles and annotated spindles, as shown in Fig. 5. Predicted spindles with IoUs greater than the threshold $\delta$ are marked as true positives (TPs). In this study, $\delta$ defaults to 0.25, following the common definition from previous work [56].

Fig. 5.

A simplified diagram illustrating the ground truth and detection process. The “ground truth” corresponds to the “annotated”, while “detection” corresponds to the “predicted”. The overlap between the ground truth and detection data is referred to as the “intersection”, while the span covered by either of the two is called the “union”. Predicted spindle waves with an IoU greater than the threshold are labeled “TP”. TP, true positive; IoU, intersection-over-union.

We marked true positives by calculating the IoU overlap between the predicted spindle waves and expert-labeled spindle waves. Therefore, our model labels the predicted spindle waves and outputs their number rather than a spindle length identifier.
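
The IoU criterion of Eqn. 5 can be illustrated with a short sketch. The interval representation (start and end data-point indices, end exclusive), the function names, and the rule of matching each prediction to its best-overlapping annotation are assumptions made for this example; the intervals are made up.

```python
def interval_iou(predicted, annotated):
    """IoU of two spindles given as (start, end) data-point indices, end exclusive."""
    inter = max(0, min(predicted[1], annotated[1]) - max(predicted[0], annotated[0]))
    union = (predicted[1] - predicted[0]) + (annotated[1] - annotated[0]) - inter
    return inter / union if union > 0 else 0.0


def count_true_positives(predicted_events, annotated_events, delta=0.25):
    """Count predicted spindles whose best IoU with any annotation exceeds delta."""
    return sum(
        1 for p in predicted_events
        if max((interval_iou(p, a) for a in annotated_events), default=0.0) > delta
    )


predicted = [(150, 300), (900, 960)]    # hypothetical detections
annotated = [(200, 320), (1500, 1650)]  # hypothetical expert labels
print(round(interval_iou(predicted[0], annotated[0]), 3))  # 0.588
print(count_true_positives(predicted, annotated))          # 1
```

The first prediction shares 100 of 170 data points with an annotation (IoU of about 0.588, above 0.25) and counts as a TP; the second overlaps no annotation and does not.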

When working with imbalanced datasets, a high accuracy may not necessarily indicate superior performance of the spindle detection model due to the larger proportion of negative samples. We therefore calculate four commonly used evaluation indicators, namely accuracy, precision, F1-score and negative predictive value (NPV), where TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives, and sensitivity is TP/(TP+FN):

(6) $\text{ Accuracy }=\frac{TP+TN}{N},N=TP+TN+FP+FN$

(7) $\text{F1-score}=2\cdot\frac{\text{Sensitivity}\cdot\text{Precision}}{\text{Sensitivity}+\text{Precision}}$

(8) $\text{Precision}=\frac{TP}{TP+FP}$

(9) $NPV=\frac{TN}{TN+FN}$
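
Eqns. 6 to 9 translate directly into code. The sketch below uses made-up confusion counts and an assumed function name, takes sensitivity as TP/(TP+FN), and omits guards against zero denominators for brevity.

```python
def detection_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, precision, F1-score and NPV from confusion counts."""
    n = tp + tn + fp + fn
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / n,                                        # Eqn. 6
        "sensitivity": sensitivity,
        "precision": precision,                                           # Eqn. 8
        "f1": 2 * sensitivity * precision / (sensitivity + precision),    # Eqn. 7
        "npv": tn / (tn + fn),                                            # Eqn. 9
    }


# Hypothetical counts with many more negatives than positives.
metrics = detection_metrics(tp=40, tn=900, fp=40, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these toy counts the accuracy is 0.94 although the precision is only 0.50, which illustrates why accuracy alone can be misleading on imbalanced data.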

3.4 Comparison Method

(1) Advanced Performance Solutions: In recent years, many scholars have proposed effective spindle detection algorithms based on the DREAMS dataset. Likewise, we further propose a new spindle detection method for multiscale feature extraction. Compared with the solutions of other scholars, our proposed solution has the best overall performance.

(a) Sun et al. [57] proposed a multi-instance learning framework based on convolutional neural networks, CNN-Multiple Instance Learning (CNN-MIL). This framework assumes that only a portion of each labeled spindle segment contains real spindle patterns and learns spindle-related features by distinguishing informative instances in the feature learning stage.

(b) Jiang et al. [58] proposed a two-stage method. In the predetection stage, the Teager energy operator with adaptive parameters is used to discover as many candidate sleep spindles as possible, and a classifier is used to further identify the true sleep spindles among all the candidate spindles.

(c) Kulkarni et al. [25] proposed a deep learning strategy, SpindleNet. The fixed-scale features learned by the CNN are further passed to the RNN, and the RNN output (from 50 time steps) of subnet 1 is combined with the RNN output feature of subnet 2 to achieve spindle multiscale feature fusion.

(d) Chen et al. [36] proposed a method of mixing EEG signal features using an elastic time window to adapt to significant changes in spindle duration and mixing depth features with the information entropy of EEG signals for spindle detection.

(2) Baseline models: For more effective verification, we implemented three baseline models to further compare with published solutions and to benchmark our proposed solutions.

(a) In the pre-detection stage, we use 0.5 s, 0.75 s and 1 s elastic time windows to segment the EEG signals to obtain EEG signals of different scales, which will be used to obtain multiscale characteristic spindles.

(b) We add the SPP layer in the feature fusion stage to fuse inputs of different scales into a fixed-size feature matrix. Specifically, three spatial bins of different sizes (1 * 1, 2 * 2, and 4 * 4) are set in the SPP layer, generating a 21 * N output.

(c) We use the DSC variant as the basic model of the CNN. The network consists of three repeated DSC layers, a concatenate layer, an SPP layer, and a fully connected layer. Specifically, for each DSC, 3 * 3 kernels are used, ReLU activation is used with a stride of 1, and the output range is set between 0 and 1 through the sigmoid function.
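
The elastic-window segmentation in item (a) can be sketched as below. This is an illustration under stated assumptions: windows are taken as non-overlapping, an incomplete tail is dropped, the function name `elastic_segments` is hypothetical, and the signal is random placeholder data with the length of one resampled excerpt (1800 s at 200 Hz).

```python
import numpy as np


def elastic_segments(signal, fs=200, window_secs=(0.5, 0.75, 1.0)):
    """Return {window length in s: array of shape (n_windows, samples_per_window)}."""
    signal = np.asarray(signal, dtype=float)
    segments = {}
    for sec in window_secs:
        win = int(round(sec * fs))
        n_windows = signal.size // win  # drop the incomplete tail, if any
        segments[sec] = signal[: n_windows * win].reshape(n_windows, win)
    return segments


eeg = np.random.default_rng(0).standard_normal(1800 * 200)  # placeholder signal
segments = elastic_segments(eeg)
for sec, seg in segments.items():
    print(sec, seg.shape)
```

The three window lengths yield segments of 100, 150 and 200 samples, which is why a size-agnostic layer such as SPP is needed before the fully connected layer.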

3.5 Ablation Experiment and Analysis

To gain a better understanding of the contributions of each component in our framework, we conducted a series of ablation and analysis experiments. These experiments aimed to determine the impact of different modules on the performance of the final model. Overall, our research findings provide evidence for the effectiveness of the proposed framework and emphasize the importance of considering multiple scales in spindle wave detection.

3.5.1 Multiscale Feature Extraction

The performance of a model utilizing an elastic time window design is evaluated and compared to that of a model with a fixed-length window (0.5 s) in this series of experiments. As shown in Table 2, compared to the method using a fixed window, the model using an “elastic” time window has higher accuracy.

Table 2.Results of the ablation experiment based on the DREAMS dataset.

	Accuracy (%)	Sensitivity (%)	Precision (%)	F1-score (%)	p value
DSC	92.84	49.90	48.46	37.17	0.1093
DSC+Multiscale	95.28	49.61	49.08	49.31	0.0352
DSC+SPP	94.87	49.99	49.85	49.46	0.0462
DSC+Multiscale+SPP	95.75	50.02	50.14	49.47	-

Moreover, to observe the impact of the IoU threshold on performance, we used fixed windows and elastic windows to test the proportion of spindles detected under different thresholds. As shown in Fig. 6, when the threshold is greater than 0.3, most spindles can be predicted, and the proportion of spindles with IoU $>$ 0.5 is 11% greater for the elastic time window than for the fixed window. These results indicate that the elastic time window design has advantages in detecting the onset and location of spindle waves, as it can better adapt to different temporal and spectral features as well as duration variations, which improves the detection performance for spindle waves.

Fig. 6.

Distribution of IoU on the predicted spindle epochs. The pink bar represents the elastic window, while the yellow bar represents the fixed window. The proportions of detected spindle waves were tested under different IoU ranges.

3.5.2 Multiscale Pooling

In these experiments, the performance of spindle multiscale pooling at the SPP layer is evaluated and compared with that of the DSC model without the SPP. As shown in Table 2, compared with a single DSC model, the DSC model based on spatial pyramid pooling can enhance the focus on spindle wave signals and improve the overall detection accuracy.

Finally, we verify the effectiveness of the spindle detection model based on DSC with an elastic time window and the SPP in this paper. The model simultaneously considers both the temporal and spectral features, captures complete spindle wave signal segments through the elastic time window, and enhances the focus on spindle wave signals using spatial pyramid pooling, thereby improving detection accuracy.

The results are shown in Table 2. Compared with the single DSC, the improved model performs better in terms of accuracy, precision and F1-score, which increased by approximately 3, 2 and 12 percentage points, respectively. The effectiveness of the elastic time window and SPP in the automatic spindle detection model was thus preliminarily verified.

To validate the significance of the improvements, we employed the Friedman test, a nonparametric statistical method recommended by Demšar [59]. The Friedman test compares multiple groups of paired samples: by calculating the ranks of each group, we determined whether there were significant differences among the evaluation metrics.

We calculated the p value for each of DSC, DSC+Multiscale, and DSC+SPP separately against DSC+Multiscale+SPP. Table 2 shows that the p values for DSC+Multiscale and DSC+SPP are both less than 0.05, indicating that these improvements are statistically significant, whereas the p value for DSC alone (0.1093) exceeds 0.05.

3.5.3 Detection Results with Annotations

Fig. 7 illustrates the outcomes of the detection approach by employing expert-annotated events as the ground truth in the DREAMS dataset. Fig. 7a shows that within the 20-second raw EEG segment, the experts identified a total of 3 spindles. On the other hand, Fig. 7b shows the spindle candidates detected using the elastic time window after data filtering (the brown area represents the detected spindles). Compared with the spindle waves (red rectangular area segments) marked by experts, 5 spindle waves were detected in this step. Fig. 7c shows the final recognition result after the SPP combines the spindle wave features of each candidate. The final detection results are approximately consistent with the spindles annotated by experts. A comparison of the results in Fig. 7b,c reveals that multiscale feature extraction in elastic time windows can be used to search for ambiguous candidate spindles, whereas SPP performs multiscale feature fusion to identify accurate results.

Fig. 7.

The detection results of each step are analyzed using the proposed method. (a) The blue data represent a sample of unfiltered 20-second raw EEG segments, where the red rectangular window refers to the expert-annotated spindles. (b) The blue data represent filtered EEG data, the brown rectangle represents a large number of sleep spindle candidate data points searched in the elastic time window, and the red rectangle represents the real spindle. (c) The brown rectangle represents the sleep spindle identified by the SPP, and the red rectangle represents the real situation. EEG, electroencephalogram.

3.6 Comparison and Results

To assess the performance of our proposed method, we compare it with four state-of-the-art methods. The compared methods include the multi-instance learning framework (CNN-MIL), a deep learning strategy (SpindleNet), a two-stage method (Two-Stage), and label noise identification (Labelfix).

When using other machine learning-based methods, all training, validation, and test sets in each experiment were the same as those used by this model to ensure a fair comparison. Table 3 (Ref. [25, 57, 58, 60]) shows the evaluation results of our proposed method and the abovementioned methods on the DREAMS dataset.

Table 3.Comparison among different methods and results obtained on the DREAMS dataset. “-” indicates the corresponding value not provided in the comparison model.

Paper	Method	Cross-validation	Accuracy (%)	Precision (%)	NPV (%)	F1-score (%)
Proposed Method	Multiscale DSC-SPP	Sixfold	95.75 $\pm$ 0.17	50.14 $\pm$ 8.14	96.55 $\pm$ 0.33	49.47 $\pm$ 1.14
Sun et al. [57]	CNN-MIL	Fivefold	95.38 $\pm$ 0.77	47.65 $\pm$ 9.35	98.36 $\pm$ 0.43	47.28 $\pm$ 5.07
Kulkarni et al. [25]	SpindleNet	Fivefold	95.37 $\pm$ 0.69	44.26 $\pm$ 7.93	98.12 $\pm$ 0.49	44.02 $\pm$ 4.42
Jiang et al. [58]	Two-Stage	Sixfold	95.11 $\pm$ 1.17	45.37 $\pm$ 8.05	98.07 $\pm$ 0.37	42.46 $\pm$ 4.20
Atkinson and Metsis [60]	Labelfix	-	95.24 $\pm$ 1.14	47.60 $\pm$ 10.67	97.67 $\pm$ 0.51	40.09 $\pm$ 6.36

NPV, negative predictive value; CNN, convolutional neural network; MIL, multi-instance learning framework.

Fig. 8 shows several examples of detection results from different methods. In each subfigure, the first row is a 20 s EEG signal, and the shaded rectangle represents the ground truth. As shown in Fig. 8a,b, the predictions of our method include all true spindles, and the localization of spindle waves is more consistent with the ground truth than is that of other methods.

Fig. 8.

Visual comparison of the spindle detection results with existing methods. The shaded rectangle and the first black line represent the adopted ground truth (GT), which comes from expert annotations. The colored lines correspond to the occurrence of spindles predicted by the automatic method. (a) The EEG signal between 7935s and 7955s. (b) The EEG signal between 13,560s and 13,580s.

4. Discussion

In this study, we develop a multiscale feature extraction method for spindle detection and validate it on the DREAMS dataset with simulated and real annotations. First, our method can more comprehensively capture the characteristics of spindle signals by using elastic time windows to extract multiscale features from filtered EEG data. Then, multilevel pooling is performed in the spatial pyramid pool to enhance the attention given to the spindle signal, which helps to better capture the temporal and spatial variations in spindle signals. This multiscale feature extraction strategy allows us to segment spindle signals more accurately. Subplots (b) and (c) in Fig. 7 show the complementary advantages of the two components. Comparing the two subplots, the elastic time window detects more spindle wave candidates but has a higher rate of false positives. To ensure the reliability of spindle identification, the SPP technique is used to eliminate nonspindle waves. This helps in filtering out false positives and enhancing the accuracy of spindle detection.

Because spindles vary in duration and position, both within and across individuals, a fixed-length time window may not accurately locate and completely segment all spindles, resulting in missed or false detections. Multiscale feature extraction of spindle waves is therefore advantageous. We compared the impacts of the fixed time window and the elastic time window on the proportion of ground-truth spindles detected under different IoU thresholds. The results are shown in Fig. 6. The proportion of spindles with IoU $>$ 0.5 is 11% greater for the elastic time window than for the fixed window. Moreover, Fig. 8 shows that for the same piece of data, the number and start time of spindle waves detected by different methods differ considerably from those annotated by experts, whereas the spindle waves detected by our method and their localization are more accurate.

In addition, our method generalizes across subjects: we trained it on 6 subjects in DREAMS and evaluated the model through sixfold cross-validation to guard against overfitting and underfitting, thereby reducing the adverse effects of unbalanced data partitioning in a single partition. As shown in Table 3, compared to the state-of-the-art spindle wave detection algorithm based on template matching, our model combines multiscale and SPP techniques to fully utilize the multilevel feature information in the data. It also incorporates an elastic time window approach to capture complete spindle wave signal segments, thereby avoiding false detections or missed detections caused by incomplete signal fragments. This approach achieves an accuracy of 95.75% and the highest F1-score (49.47%) among the compared methods.

Our method can be used in clinical applications, such as detecting other EEG signature waves and predicting epilepsy. In these cases, labels divided due to fixed windows may be mistaken for learned ineffective features. Through our method, labels of features at different scales can be learned, allowing more complete and accurate features to be extracted.

5. Limitations

Although multiscale DSC-SPP has shown promising results, the individual differences in spindle waves and the various factors that can influence these differences (e.g., age, sex, sleep stage, and sleep quality) limit the adaptability of the algorithm. Our study is based on the DREAMS dataset, which still has a relatively small sample size, further limiting the adaptability of the algorithm. Therefore, to improve the adaptability and robustness of the algorithm, our next step could be to validate it on both self-collected experimental data and online datasets. In addition, spindles in subjects with sleep diseases or sleep disorders may differ greatly in duration and frequency, which is also an issue worth exploring in the future when determining the threshold for continuous data points on spindles.

6. Conclusions

We propose a spindle detection algorithm that combines an elastic time window and spatial pyramid pooling to extract multiscale features. The elastic time window allows for the extraction of multiscale features from the EEG signal, addressing the issue of fixed time windows being unable to fully segment spindle waves. The multiscale pooling of spatial pyramid pooling increases the focus on spindle wave signals and improves the accuracy of spindle wave detection. Additionally, the depthwise separable convolutional network architecture has strong representation learning capabilities, as it automatically learns more discriminative feature representations from the data. The experimental results on the DREAMS dataset for automatic spindle wave detection demonstrate that the proposed method is comparable to state-of-the-art approaches. In the future, this method is expected to be applied to clinical sleep research related to spindle characteristics.

Abbreviations

DSC, depthwise separable convolution; SPP, spatial pyramid pooling; EEG, electroencephalogram; EOG, electro-oculogram; EMG, electromyography; NPV, Negative Predictive Value; IoU, Intersection-over-union; NREM, non-rapid eye movement.

Availability of Data and Materials

The datasets generated or analyzed during the current study are available: https://zenodo.org/records/2650142#.YRtw6o4zY2w.

Author Contributions

YO, JP and FW designed and conceptualized the research project. BF and LT analyzed and interpreted the data. YO, BF and LT drafted the manuscript. JP and FW played a supportive role throughout the study, offering professional optimization and guidance related to the manuscript. All the authors contributed to editorial changes in the manuscript. All the authors read and approved the final manuscript. All the authors have participated sufficiently in the work and agreed to be accountable for all the aspects of the work.

Ethics Approval and Consent to Participate

Not applicable.

Acknowledgment

We sincerely appreciate the publicly available DREAMS dataset provided by the Sleep Laboratory of André Vésale Hospital in Belgium. We thank all of the subjects, other than the authors, from the School of Software at South China Normal University.

Funding

This research was funded by the Guangdong Basic and Applied Basic Research Foundation (Grant No. 2021A1515011853), the National Natural Science Foundation of China (Grant No. 61906019 and 62006082), STI 2030-Major Projects 2022ZD0208900.

Conflict of Interest

The authors declare no conflict of interest. Jiahui Pan is serving as one of the Guest editors of this journal. We declare that Jiahui Pan had no involvement in the peer review of this article and has no access to information regarding its peer review. Full responsibility for the editorial process for this article was delegated to Gernot Riedel.

References

[1] Iber C, Ancoli-Israel S, Chesson A. The AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology and Technical Specifications. 2007.
[2] Berry RB, Brooks R, Gamaldo C, Harding SM, Lloyd RM, Quan SF, et al. AASM Scoring Manual Updates for 2017 (Version 2.4). Journal of Clinical Sleep Medicine: JCSM: Official Publication of the American Academy of Sleep Medicine. 2017; 13: 665–666.
[3] Mullins A, Parekh A, Kam K, Bubu O, Schoenholz R, Patel S, et al. 0308 The stability of slow wave sleep and EEG microstructure measures across two consecutive nights of laboratory polysomnography in cognitively normal older adults. SLEEP. 2022; 45: 139–139.
[4] Hefnawy MA, Fadlallah SA, El-Sherif RM, Medany SS. Competition between enzymatic and non-enzymatic electrochemical determination of cholesterol. Journal of Electroanalytical Chemistry. 2023; 930: 117169.
[5] Das PK, Meher S, Panda R, Abraham A. A Review of Automated Methods for the Detection of Sickle Cell Disease. IEEE Reviews in Biomedical Engineering. 2020; 13: 309–324.
[6] Zhu QL, Han F, Wang J, Cheng CH, Cai SJ, Wang QJ, et al. Effect of sleep spindle density on memory function in patients with obstructive sleep apnea hypopnea syndrome. Zhonghua Jie he he Hu Xi Za Zhi. 2023; 46: 466–473. (In Chinese)
[7] Zhang Y, Quiñones GM, Ferrarelli F. Sleep spindle and slow wave abnormalities in schizophrenia and other psychotic disorders: Recent findings and future directions. Schizophrenia Research. 2020; 221: 29–36.
[8] Petit JM, Strippoli MPF, Stephan A, Ranjbar S, Haba-Rubio J, Solelhac G, et al. Sleep spindles in people with schizophrenia, schizoaffective disorders or bipolar disorders: a pilot study in a general population-based cohort. BMC Psychiatry. 2022; 22: 758.
[9] van der Heijden AC, Hofman WF, de Boer M, Nijdam MJ, van Marle HJF, Jongedijk RA, et al. Sleep spindle dynamics suggest over-consolidation in post-traumatic stress disorder. Sleep. 2022; 45: zsac139.
[10] Chatburn A, Lushington K, Kohler MJ. Consolidation and generalisation across sleep depend on individual EEG factors and sleep spindle density. Neurobiology of Learning and Memory. 2021; 179: 107384.
[11] Friedrich M, Mölle M, Friederici AD, Born J. The reciprocal relation between sleep and memory in infancy: Memory-dependent adjustment of sleep spindles and spindle-dependent improvement of memories. Developmental Science. 2019; 22: e12743.
[12] Chawla P, Rana SB, Kaur H, Singh K, Yuvaraj R, Murugappan M. A decision support system for automated diagnosis of Parkinson’s disease from EEG using FAWT and entropy features. Biomedical Signal Processing and Control. 2023; 79: 104116.
[13] Khoshnevis SA, Sankar R. Diagnosis of Parkinson’s disease using higher order statistical analysis of alpha and beta rhythms. Biomedical Signal Processing and Control. 2022; 77: 103743.
[14] Varli M, Ylmaz H. Multiple classification of EEG signals and epileptic seizure diagnosis with combined deep learning. Journal of Computational Science. 2023; 67: 101943.
[15] Perez-Valero E, Morillas C, Lopez-Gordo MA, Minguillon J. Supporting the Detection of Early Alzheimer’s Disease with a Four-Channel EEG Analysis. International Journal of Neural Systems. 2023; 33: 2350021.
[16] Hori T, Sugita Y, Koga E, Shirakawa S, Inoue K, Uchida S, et al. Proposed supplements and amendments to ‘A Manual of Standardized Terminology, Techniques and Scoring System for Sleep Stages of Human Subjects’, the Rechtschaffen & Kales (1968) standard. Psychiatry and Clinical Neurosciences. 2001; 55: 305–310.
[17] Wendt SL, Welinder P, Sorensen HBD, Peppard PE, Jennum P, Perona P, et al. Inter-expert and intra-expert reliability in sleep spindle scoring. Clinical Neurophysiology: Official Journal of the International Federation of Clinical Neurophysiology. 2015; 126: 1548–1556.
[18] Qiu S, Yang CH, Wu L, Wang KC, Pan JZ. Machine-vision-based Spindle Positioning System of Grinding-wheel-saw Automatic Replacement System. Sensors and Materials: An International Journal on Sensor Technology. 2022; 34: 789–801.
[19] Scafa S, Fiorillo L, Lucchini M, Roth C, Agostini V, Vancheri A, et al. Personalized Sleep Spindle Detection in Whole Night Polysomnography. Annual International Conference of the IEEE Engineering in Medicine and Biology Society. 2020; 2020: 1047–1050.
[20] Kinoshita T, Fujiwara K, Kano M, Ogawa K, Sumi Y, Matsuo M, et al. Sleep Spindle Detection Using RUSBoost and Synchrosqueezed Wavelet Transform. IEEE Transactions on Neural Systems and Rehabilitation Engineering: a Publication of the IEEE Engineering in Medicine and Biology Society. 2020; 28: 390–398.
[21] Wang F, Li L, Wan Y, Li Z, Luo L, Hu B, et al. An Efficient Sleep Spindle Detection Algorithm Based on MP and LSBoost. Computers, Materials & Continua. 2023; 76: 2301–2316.
[22] Chen B, Chen H, Li M. Improvement and Optimization of Feature Selection Algorithm in Swarm Intelligence Algorithm Based on Complexity. Complexity. 2021; 2021: 9985185.
[23] Wang K, Kemao Q, Di J, Zhao J. Deep learning spatial phase unwrapping: a comparative review. Advanced Photonics Nexus. 2022; 1: 014001.
[24] You J, Jiang D, Ma Y, Wang Y. SpindleU-Net: An Adaptive U-Net Framework for Sleep Spindle Detection in Single-Channel EEG. IEEE Transactions on Neural Systems and Rehabilitation Engineering: a Publication of the IEEE Engineering in Medicine and Biology Society. 2021; 29: 1614–1623.
[25] Kulkarni PM, Xiao Z, Robinson EJ, Jami AS, Zhang J, Zhou H, et al. A deep learning approach for real-time detection of sleep spindles. Journal of Neural Engineering. 2019; 16: 036004.
[26] Saifutdinova E, Dudysova D, Gerla V, Lhotska L. Improvement of Sleep Spindle Detection by Aggregation Techniques. Mediterranean Conference on Medical and Biological Engineering and Computing. 2020; 76: 226–234.
[27] Thiesse L, Staner L, Bourgin P, Roth T, Fuchs G, Kirscher D, et al. Validation of Somno-Art Software, a novel approach of sleep staging, compared with polysomnography in disturbed sleep profiles. Sleep Advances: a Journal of the Sleep Research Society. 2021; 3: zpab019.
[28] Jeonghee H, Soyoung P, Jeonghee C. Improving Multi-Class Motor Imagery EEG Classification Using Overlapping Sliding Window and Deep Learning Model. Electronics. 2023; 12: 1186.
[29] Fiorillo L, Monachino G, van der Meer J, Pesce M, Warncke JD, Schmidt MH, et al. U-Sleep’s resilience to AASM guidelines. NPJ Digital Medicine. 2023; 6: 33.
[30] Tehrani MJ, Rashidinia A, Amoli FA, Esfandiari A. A rare presentation of orbital spindle cell carcinoma a case report and brief review of the literature. BMC Ophthalmology. 2023; 23: 369.
[31] Jiang Y, Bugby SL, Cosma G. Automatic detection of scintillation light splashes using conventional and deep learning methods. Journal of Instrumentation. 2022; 17: P06021.
[32] Lim JS, Stofa MM, Koo SM, Zulkifley MA. Micro Expression Recognition: Multi-scale Approach to Automatic Emotion Recognition by using Spatial Pyramid Pooling Module. International Journal of Advanced Computer Science and Applications (IJACSA). 2021; 12: 12.
[33] Hong Q, Zhong X, Chen W, Zhang Z, Li B. Hyperspectral Image Classification Network Based on 3D Octave Convolution and Multiscale Depthwise Separable Convolution. ISPRS International Journal of Geo-Information. 2023; 12: 505.
[34] Huo Y, Zhang Q, Jia Y, Liu D, Guan J, Lin G. A Deep Separable Convolutional Neural Network for Multiscale Image-Based Smoke Detection. Fire Technology. 2022; 58: 1445–1468.
[35] Yang B, Li H. A similarity elastic window based approach to process dynamic time delay analysis. Chemometrics & Intelligent Laboratory Systems. 2017; 170: 13–24.
[36] Chen P, Chen D, Zhang L, Tang Y, Li X. Automated sleep spindle detection with mixed EEG features. Biomedical Signal Processing and Control. 2021; 70: 103026.
[37] Liu D, Liu T, Bi H, Zhao Y, Cheng Y. Multiscale Local Feature Fusion: Marine Microalgae Classification for Few-Shot Learning. Water. 2023; 15: 1413.
[38] Zhou W, Lin X, Lei J, Yu L, Hwang JN. MFFENet: Multiscale Feature Fusion and Enhancement Network for RGB-Thermal Urban Road Scene Parsing. IEEE Transactions on Multimedia. 2021; 24: 2526–2538.
[39] Wu D, Zhao J, Wang Z. AM-PSPNet: Pyramid Scene Parsing Network Based on Attentional Mechanism for Image Semantic Segmentation. In International Conference of Pioneering Computer Scientists, Engineers and Educators. Springer: Singapore. 2022.
[40] Zhang R, Chen J, Feng L, Li S, Yang W, Guo D. A Refined Pyramid Scene Parsing Network for Polarimetric SAR Image Semantic Segmentation in Agricultural Areas. IEEE Geoscience and Remote Sensing Letters. 2022; 19: 1–5.
[41] Elizar E, Zulkifley MA, Muharar R, Zaman MHM, Mustaza SM. A Review on Multiscale-Deep-Learning Applications. Sensors (Basel, Switzerland). 2022; 22: 7384.
[42] He J, Wang X, Song Y, Xiang Q. A multiscale intrusion detection system based on pyramid depthwise separable convolution neural network. Neurocomputing. 2023; 530: 48–59.
[43] Li G, Zhang J, Zhang M, Wu R, Cao X, Liu W. Efficient depthwise separable convolution accelerator for classification and UAV object detection. Neurocomputing. 2022; 490: 1–16.
[44] Lou Y, He Y, Wang L, Chen G. Predicting Network Controllability Robustness: A Convolutional Neural Network Approach. IEEE Transactions on Cybernetics. 2022; 52: 4052–4063.
[45] Shen Y, Zhu S, Chen C, Du Q, Xiao L, Chen J. Efficient Deep Learning of Nonlocal Features for Hyperspectral Image Classification. IEEE Transactions on Geoscience and Remote Sensing. 2021; 59: 6029–6043.
[46] Tripathi S, Singh SK, Kuan LH. Bag of Visual Words (BoVW) with Deep Features–Patch Classification Model for Limited Dataset of Breast Tumours. ArXiv. 2022. (preprint)
[47] Yee PS, Lim KM, Lee CP. DeepScene: Scene classification via convolutional neural network with spatial pyramid pooling. Expert Systems with Applications. 2022; 193: 116382.
[48] Sriram S, Vinayakumar R, Sowmya V, Alazab M, Soman KP. Multi-scale Learning based Malware Variant Detection using Spatial Pyramid Pooling Network. IEEE INFOCOM 2020-IEEE conference on computer communications workshops (INFOCOM WKSHPS) (pp. 740–745). IEEE. 2020.
[49] Wu C, Lou Y, Wang L, Li J, Li X, Chen G. SPP-CNN: An Efficient Framework for Network Robustness Prediction. IEEE Transactions on Circuits and Systems I: Regular Papers. 2023; 70: 4067–4079.
[50] Msonda P, Uymaz SA, Karaağaç SS. Spatial Pyramid Pooling in Deep Convolutional Networks for Automatic Tuberculosis Diagnosis. Traitement du Signal. 2020; 37: 1075–1084.
[51] Devuyst S, Dutoit T, Stenuit P, Kerkhofs M. Automatic sleep spindles detection–overview and development of a standard proposal assessment method. Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference. 2011; 2011: 1713–1716.
[52] Krieter S, Thüm T, Schulze S, Saake G, Leich T. YASA: yet another sampling algorithm. In VaMoS ’20: Proceedings of the 14th International Working Conference on Variability Modelling of Software-Intensive Systems. 2020.
[53] Barakat A, Bianchi P. Convergence and Dynamical Behavior of the ADAM Algorithm for Nonconvex Stochastic Optimization. SIAM Journal on Optimization: A Publication of the Society for Industrial and Applied Mathematics. 2021; 31: 244–274.
[54] Hubar S, Koulovatianos C, Li J. Fitting Parsimonious Household-Portfolio Models to Data. Social Science Electronic Publishing. 2014; 1: 37–39.
[55] Sharma R, Sircar P, Pachori RB. Automated focal EEG signal detection based on third order cumulant function. Biomedical Signal Processing and Control. 2020; 58: 101856.1–101856.8.
[56] Lacourse K, Delfrate J, Beaudry J, Peppard P, Warby SC. A sleep spindle detection algorithm that emulates human expert spindle scoring. Journal of Neuroscience Methods. 2019; 316: 3–11.
[57] Sun X, Qi Y, Wang Y, Pan G. Convolutional Multiple Instance Learning for Sleep Spindle Detection With Label Refinement. IEEE Transactions on Cognitive and Developmental Systems. 2023; 15: 272–284.
[58] Jiang D, Ma Y, Wang Y. A robust two-stage sleep spindle detection approach using single-channel EEG. Journal of Neural Engineering. 2021; 18: 026–026.
[59] Demšar J, Schuurmans D. Statistical Comparisons of Classifiers over Multiple Data Sets. Journal of Machine Learning Research. 2006; 7: 1–30.
[60] Atkinson G, Metsis V. Identifying label noise in time-series datasets. Adjunct Proceedings of the 2020 ACM International Joint Conference on Pervasive and the 2020 ACM International Symposium on Wearable Computers (pp. 238–243). 2020.

Publisher’s Note: IMR Press stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
