Background: To enhance the information transfer rate (ITR) of a steady-state visual evoked potential (SSVEP)-based speller, more characters with flickering symbols should be used. Increasing the number of symbols might reduce the classification accuracy. A hybrid brain-computer interface (BCI) improves the overall performance of a BCI system by taking advantage of two or more control signals. In a simultaneous hybrid BCI, various modalities work with each other simultaneously, which enhances the ITR. Methods: In our proposed speller, simultaneous combination of electromyogram (EMG) and SSVEP was applied to increase the ITR. To achieve 36 characters, only nine stimulus symbols were used. Each symbol allowed the selection of four characters based on four states of muscle activity. The SSVEP detected which symbol the subject was focusing on and the EMG determined the target character out of the four characters dedicated to that symbol. The frequency rate for character encoding was applied in the EMG modality and latency was considered in the SSVEP modality. Online experiments were carried out on 10 healthy subjects. Results: The average ITR of this hybrid system was 96.1 bit/min with an accuracy of 91.2%. The speller speed was 20.9 char/min. Different subjects had various latency values. We used an average latency of 0.2 s across all subjects. Evaluation of each modality showed that the SSVEP classification accuracy varied for different subjects, ranging from 80% to 100%, while the EMG classification accuracy was approximately 100% for all subjects. Conclusions: Our proposed hybrid BCI speller showed improved system speed compared with state-of-the-art systems based on SSVEP or SSVEP-EMG, and can provide a user-friendly, practical system for speller applications.
Patients with severe motor paralysis caused by brain stroke or amyotrophic lateral sclerosis (ALS) [1] may lose the ability to use their peripheral nervous system responsible for controlling voluntary muscle contractions. The brain-computer interface (BCI) enables a communication pathway between the patient’s brain and their surroundings [2, 3].
Brain activity can be measured using several techniques such as electroencephalography, magnetoencephalography (MEG), and electrocorticography (ECoG) [4, 5, 6]. The electroencephalogram (EEG) is used as the input in most BCI systems. BCI systems can be based on brain patterns such as event-related desynchronization/synchronization [7], steady-state visual evoked potentials (SSVEP) [8], the P300 component of event-related potentials (ERP) [9], and slow cortical potentials [10]. BCI systems have been used in various applications such as diagnosis [11], rehabilitation and restoration [12, 13], smart environments [14], games [15], and entertainment [16]. Spelling is one of the most common applications of these systems.
Different BCI spellers have been developed based on various control signals such as P300 and SSVEP [17, 18, 19]. One of the primary BCI spellers is the P300-based speller [20, 21, 22, 23], in which P300 responses are quantified to select the characters. These spellers have been developed based on various kinds of stimuli, including visual, auditory, and tactile [20, 21, 22]. The most common P300 BCI is the visual-based BCI. In the visual paradigm, characters flash randomly in different patterns. The flashing pattern may be single character (SC), row-column (RC), or region-based (RB) [24]. Recently, SSVEP-based BCI spellers have attracted increasing attention [25] in comparison with other modalities. This modality has several benefits, including a high information transfer rate [26], desirable accuracy, the use of fewer channels, and a short training time [27, 28, 29, 30]. In the simplest structure of SSVEP-based BCI spellers, the number of symbols and stimulation frequencies must equal the number of characters [31, 32, 33]. A large number of flickers on the screen requires the symbols to be smaller and arranged more closely to each other, and neighboring flickers have a destructive effect on SSVEP frequency recognition [34, 35]. In addition, the stimulation frequency range in which the SSVEP response has a higher amplitude is limited. Fitting many frequencies into this narrow range therefore requires a smaller frequency step, which in turn increases the detection error. The resulting decrease in classification accuracy has a negative impact on the information transfer rate (ITR). These spellers are therefore usually developed with fewer flickers and a tree structure, in which a character is selected hierarchically over several consecutive steps. Various spellers have been implemented with a 2-level [36, 37, 38] or 3-level [39, 40, 41] structure. Despite favorable accuracy, these systems are slow and so often cannot achieve a high ITR.
A hybrid brain-computer interface (HBCI) takes advantage of two or more control signals with the aim of improving the system performance. In an HBCI system, a BCI control signal is combined (simultaneously or sequentially) with another BCI control signal or with a human-machine interface biological signal [42]. One of the most widely used combinations in spellers is that of SSVEP and P300 [43, 44, 45]. P300-based BCI systems are time consuming, as they require many trials to reach desirable accuracy, which decreases the ITR [46, 47, 48]. In addition, both SSVEP and P300 require visual stimulation, which causes visual fatigue. To overcome these disadvantages, researchers have used residual abilities in other organs and combined BCI control signals with other biological signals. Lin et al. [34, 49] achieved desirable accuracy and ITR by combining the SSVEP and the electromyogram (EMG). Other studies have also combined high-frequency SSVEP and surface EMG for spelling applications [50, 51].
SSVEP and EMG interact little with each other, and neither requires training. The signal-to-noise ratio (SNR) of EMG is high and its classification procedure is simple, so it can be carried out quickly and with high accuracy [34]. The combination of these signals is therefore advantageous. By combining these two signals simultaneously and using fewer flickering symbols, we can increase the ITR while maintaining the desired accuracy.
The character encoding technique is an important issue in BCI spellers [52]. The occurrence probability of different characters affects the character encoding: assigning easier codes to characters with higher occurrence probability increases the accuracy. In this regard, some hierarchical spellers have assigned a distinct frequency to the ‘Delete’ key as the most widely used key, but the usage rate of other characters has not been considered [37, 39, 41]. It is evident that various characters appear with different frequencies [52]. However, few studies have assessed the real occurrence probability of characters [53, 54, 55, 56, 57]. An encoding procedure based on the character frequency rate can therefore have a major impact on system performance.
The SSVEP potential takes multiple cycles to reach a stable state [58]. Thalamocortical oscillations must reach a synchronous state in which the classifier attains an acceptable accuracy level [59]. There will therefore be latency between the onset of cue flickering and satisfactory classification. SSVEP oscillation in the target symbol does not disappear immediately [35]. Some studies have applied the first 100–150 ms duration of the trial as the latency of the brain to SSVEP stimulation [60], so this period of the signal is not used in data processing [61, 62]. Considering SSVEP latency in the analysis of SSVEP can improve the classification accuracy [61].
In the proposed speller in the present study, we combined the SSVEP and EMG simultaneously. The idea of using the frequency rate for character encoding was considered in the EMG modality. For this purpose, less muscle activity was assigned to the group of characters with higher occurrence probability. We selected a processing time window of 2 s after the latency. Ten subjects participated in online experiments and results were reported for each subject separately.
Ten subjects (mean age 30 years
Various muscles can be used to record the EMG signal. In some studies, the EMG signal has been recorded from the forearm [34, 49, 68], and in others facial muscles have been used [69, 70, 71, 72, 73]. The number of repetitions also differs between studies. In many studies, various commands have been implemented using only one muscle and a different number of repetitions of muscle activity [34, 49]. Previous studies have shown a declining trend in accuracy as the number of repetitions increases. In our study, the flexor carpi radialis muscles of both arms were used, with only one wrist flexion per selection, to achieve more commands. To record the EMG of each muscle, two electrodes were used. The first electrode served as the re-reference, while the second was used to record the EMG. Fig. 1 shows the channel placement for the EEG and EMG electrodes.
Placement of three EEG electrodes SSVEP channels (blue), ground and reference (black) and four EMG electrodes (green). EEG, electroencephalogram; SSVEP, steady-state visual evoked potential; EMG, electromyogram; REF, reference; GND, ground.
A 15.6-inch light-emitting diode (LED) monitor of a laptop (ideapad, Lenovo, Beijing, China) with a refresh rate of 60 Hz was utilized. The stimulus presentation was managed using the psychophysics toolbox of Matlab (URL: http://psychtoolbox.org) [74, 75], which provides precise stimuli [76]. Nine flashing symbols were used on the screen. The flash frequencies were 5.88, 6.25, 6.66, 7.14, 7.69, 8.33, 9.09, 10, and 11.11 Hz. This frequency range has a strong SSVEP response [77, 78].
Various studies have assessed the occurrence probability of letters and sorted them based on the frequency rate [79, 80, 81, 82, 83]. This issue was investigated more comprehensively in [79], but the ‘Space’ was not included. In [80], the frequency rate was reported for the ‘Space’, whereas it was not reported for any other characters. We recalculated the frequency rate of 26 Latin letters with consideration of the ‘Space’, as shown in Table 1. First, we considered the frequency rate of the ‘Space’ equal to 18.43% based on [79, 80, 81, 82, 83]. To calculate the frequency rate of other letters, we multiplied the frequency rate of each letter reported in [79, 80, 81, 82, 83] by 81.57%.
Character | Frequency rate |
Space | 18.43% |
E | 10.24% |
T | 7.28% |
A | 6.96% |
O | 6.35% |
N | 6.00% |
I | 5.99% |
S | 5.53% |
R | 5.47% |
H | 3.91% |
L | 3.38% |
D | 3.13% |
C | 2.59% |
U | 2.13% |
M | 1.93% |
F | 1.71% |
P | 1.66% |
G | 1.59% |
Y | 1.40% |
W | 1.34% |
B | 1.14% |
V | 0.86% |
K | 0.60% |
X | 0.16% |
Z | 0.08% |
J | 0.08% |
Q | 0.06% |
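The rescaling described above can be checked with a short sketch. It copies the values of Table 1, confirms that ‘Space’ plus the 26 letters sum to 100%, and reverses the 81.57% scaling to recover letters-only rates. The function names are illustrative and not part of the original study.

```python
# Frequency rates (percent) copied from Table 1.
SPACE_RATE = 18.43
LETTER_RATES = {
    "E": 10.24, "T": 7.28, "A": 6.96, "O": 6.35, "N": 6.00, "I": 5.99,
    "S": 5.53, "R": 5.47, "H": 3.91, "L": 3.38, "D": 3.13, "C": 2.59,
    "U": 2.13, "M": 1.93, "F": 1.71, "P": 1.66, "G": 1.59, "Y": 1.40,
    "W": 1.34, "B": 1.14, "V": 0.86, "K": 0.60, "X": 0.16, "Z": 0.08,
    "J": 0.08, "Q": 0.06,
}


def rescale(letters_only_rate, space_rate=SPACE_RATE):
    """Scale a letters-only rate (percent) so letters plus 'Space' total 100%."""
    return letters_only_rate * (100.0 - space_rate) / 100.0


def undo_rescale(table_rate, space_rate=SPACE_RATE):
    """Recover the letters-only rate (percent) from a Table 1 value."""
    return table_rate * 100.0 / (100.0 - space_rate)


# Total including 'Space'; should be 100% up to rounding.
total_with_space = SPACE_RATE + sum(LETTER_RATES.values())

# Letters-only total after reversing the 81.57% scaling.
letters_only_total = sum(undo_rescale(r) for r in LETTER_RATES.values())
```

Both totals should come out at 100% within rounding, which is consistent with the letters having been scaled by 81.57% to make room for the ‘Space’ rate of 18.43%.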
The proposed system consists of 36 characters (26 Latin letters, eight punctuation characters, ‘Space’, and ‘Delete’). By assigning 36 characters to nine symbols, each symbol represents four characters. Characters were categorized into nine groups based on SSVEP frequency and four subgroups based on muscle activity. The four subgroups correspond to inactivity of both wrists, right wrist flexion, left wrist flexion, and flexion of both wrists. Characters were assigned to subgroups based on the character frequency rate [79, 80]. No muscle activity was assigned to the more commonly used characters (characters of the first line: “A”, “O”, “T”, “E”, ‘Space’, ‘Delete’, “S”, “I”, and “N”). The other characters were assigned to the groups with one and two muscle activities in decreasing order of frequency rate. For characters with one muscle activity (right or left), we assigned adjacent characters in alphabetical order to one symbol. For example, we assigned “B” and “C” to the same symbol. This was done with the aim of making the system more user-friendly. Flexion of both wrists was assigned to the least used characters (characters of the fourth line: “Z”, “Q”, “V”, “J”, “K”, “;”, “X”, “(”, and “)”). Fig. 2 shows the speller interface and the distribution of the characters in each symbol on the screen. For example, to select “A” the subject should gaze at the upper right symbol without any muscle activity. To select “B”, “C”, or “Z”, the subject should gaze at that symbol while performing right wrist flexion, left wrist flexion, or flexion of both wrists, respectively.
The speller interface. (a) Character encoding. No muscle activity was assigned to characters of the first line: “A”, “O”, “T”, “E”, ‘Space’, ‘Delete’, “S”, “I”, and “N”. The adjacent characters “B” and “C” were assigned to one symbol. Flexion of both wrists was assigned to characters of the fourth line: “Z”, “Q”, “V”, “J”, “K”, “;”, “X”, “(”, and “)”. (b) Distribution of the characters in each symbol on the screen. For example, to select “A” the subject should gaze at the upper right symbol without any muscle activity. To select “B”, “C”, or “Z”, the subject should gaze at that symbol and at the same time perform right wrist flexion, left wrist flexion, or flexion of both wrists, respectively.
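The two-stage selection (SSVEP picks the symbol, EMG picks one of its four characters) can be illustrated with a minimal sketch. Only the symbol containing “A”, “B”, “C”, and “Z” is fully specified in the text, so the layout below holds that one symbol; the symbol index `0` and all names are hypothetical.

```python
from enum import Enum


class EmgState(Enum):
    """The four muscle-activity states used to pick a character within a symbol."""
    NONE = 0   # both wrists inactive
    RIGHT = 1  # right wrist flexion
    LEFT = 2   # left wrist flexion
    BOTH = 3   # flexion of both wrists


# Hypothetical layout: index 0 stands for the symbol that holds "A".
# The remaining eight symbols are omitted because the text does not list them.
LAYOUT = {
    0: {
        EmgState.NONE: "A",
        EmgState.RIGHT: "B",
        EmgState.LEFT: "C",
        EmgState.BOTH: "Z",
    },
}


def emg_state(right_active, left_active):
    """Map the two per-wrist activity flags onto one of the four states."""
    if right_active and left_active:
        return EmgState.BOTH
    if right_active:
        return EmgState.RIGHT
    if left_active:
        return EmgState.LEFT
    return EmgState.NONE


def decode(symbol_index, state, layout=LAYOUT):
    """Return the character for the SSVEP-detected symbol and the EMG state."""
    return layout[symbol_index][state]
```

For example, `decode(0, emg_state(True, False))` returns `"B"`, matching the selection rule described for right wrist flexion.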
Each subject participated in one offline training experiment (five sessions) and one online testing experiment (10 sessions). In each session, the subject was required to spell ‘BRAIN COMPUTER INTERFACE’, which contains 24 characters. One session consisted of several trials. In each trial, the nine symbols flickered for a 3-s period. During the flickering period, subjects were required to gaze at the target symbol and perform wrist flexion at any desired moment. The subject’s target symbol was detected by SSVEP and the subgroup was determined by EMG. The EMG activity recognition and SSVEP detection were performed simultaneously, immediately after the end of each trial. Fig. 3 shows the general outline of this system. Feedback of the recognized character was provided in real time. A 0.5-s period was used as the rest period. If the recognized character was correct, the subject proceeded to the next target; otherwise, they selected ‘Delete’ to clear the mistake. Fig. 4 shows the spelling procedure for the characters ‘BRA’. In order to select ‘B’ and ‘R’, the subject was to gaze at the upper right symbol and at the same time perform right wrist flexion and left wrist flexion, respectively. To select ‘A’, the subject was to gaze at that symbol without any muscle activity.
Flowchart outlining the detection algorithm.
Character selection pattern for spelling “BRA”.
In a BCI system, a longer time window generally increases accuracy, which improves the ITR. However, a longer window also lowers the ITR directly, because each selection takes more time, resulting in a trade-off between window length and accuracy [84]. Previous studies have indicated that a 2-s window provides a good trade-off between these two parameters [38, 49]. Accordingly, we chose a time window length of 2 s.
In the data analysis, latency was taken into account when determining the time window. This was done with the aim of increasing the SSVEP classification accuracy, which could lead to an increase in the ITR. To this end, processing was performed on a 2-s time window that started after the latency. Fig. 5 shows the timeline of a single trial in this experiment. Individual differences in mental reaction time are due to the apparent latency of the recorded SSVEP [85]. In other words, the SSVEP recovery time differs among individuals.
A single trial timeline. Latency varies from 0 to 1 s with a step of 0.1 s. The highlighted 2-s time window represents the window corresponding to the determined value of latency.
Reported latency is between 600 and 800 ms in most previous studies. In some cases, visual fatigue due to the repetitive SSVEP has also increased this value to 1000 ms [35]. Accordingly, we determined the value of 1 s as the maximum latency. During subsequent analysis on values from 0 to 1 s with a step of 0.1 s, the optimal value was determined.
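The latency-shifted window can be sketched as a simple index computation. The sampling rate is not stated in this section, so `fs_hz = 250` is a hypothetical value used only for illustration; the 2-s window, 1-s maximum latency, and 0.1-s step come from the text.

```python
def candidate_latencies(max_latency_s=1.0, step_s=0.1):
    """Latency values from 0 to max_latency_s (inclusive) in steps of step_s."""
    n_steps = round(max_latency_s / step_s)
    return [round(i * step_s, 10) for i in range(n_steps + 1)]


def window_bounds(latency_s, fs_hz, window_s=2.0):
    """Sample indices [start, stop) of the analysis window that begins after the latency."""
    start = round(latency_s * fs_hz)
    stop = start + round(window_s * fs_hz)
    return start, stop


FS_HZ = 250        # hypothetical sampling rate, not given in the text
TRIAL_S = 3.0      # flickering period of one trial

# Every candidate window must fit inside the 3-s flickering period.
windows = {lat: window_bounds(lat, FS_HZ) for lat in candidate_latencies()}
```

With a maximum latency of 1 s, the latest window ends exactly at the end of the 3-s flickering period, so all eleven candidate windows are available within a single trial.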
Canonical correlation analysis [86] was used for SSVEP classification. This method applies reference signals composed of sine and cosine pairs at the same frequency of the stimulation frequency and its optional harmonics. As the dynamics of the brain act as a low-pass filter, the high harmonic components are removed [87]. In [88], no noticeable changes in accuracy were found by increasing the number of harmonics from two to three. In the present study, we applied reference signals at the same frequencies as the stimulation frequencies, considering two harmonics.
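A minimal sketch of CCA-based frequency recognition is given below. It builds sine and cosine references with two harmonics for each stimulation frequency and picks the frequency with the largest canonical correlation. The QR/SVD computation of the canonical correlation, the sampling rate of 250 Hz, and the synthetic three-channel signal are assumptions made for illustration and are not taken from the study.

```python
import numpy as np

# Stimulation frequencies (Hz) listed in the text.
STIM_FREQS = [5.88, 6.25, 6.66, 7.14, 7.69, 8.33, 9.09, 10.0, 11.11]


def reference_signals(freq_hz, n_samples, fs_hz, n_harmonics=2):
    """Sine/cosine pairs at the stimulation frequency and its harmonics."""
    t = np.arange(n_samples) / fs_hz
    columns = []
    for h in range(1, n_harmonics + 1):
        columns.append(np.sin(2 * np.pi * h * freq_hz * t))
        columns.append(np.cos(2 * np.pi * h * freq_hz * t))
    return np.column_stack(columns)


def max_canonical_corr(x, y):
    """Largest canonical correlation between x and y (rows are samples)."""
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    qx, _ = np.linalg.qr(x)  # orthonormal basis of the column space of x
    qy, _ = np.linalg.qr(y)
    singular_values = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return float(min(singular_values[0], 1.0))


def classify_ssvep(eeg, fs_hz, freqs=STIM_FREQS):
    """Return the stimulation frequency with the highest canonical correlation."""
    scores = [
        max_canonical_corr(eeg, reference_signals(f, eeg.shape[0], fs_hz))
        for f in freqs
    ]
    return freqs[int(np.argmax(scores))], scores


# Synthetic check: three noisy channels carrying an 8.33 Hz oscillation.
FS_HZ = 250  # hypothetical sampling rate
rng = np.random.default_rng(0)
t = np.arange(2 * FS_HZ) / FS_HZ  # 2-s analysis window
eeg = np.column_stack(
    [np.sin(2 * np.pi * 8.33 * t + phase) for phase in (0.0, 0.5, 1.0)]
)
eeg = eeg + 0.5 * rng.standard_normal(eeg.shape)
predicted, scores = classify_ssvep(eeg, FS_HZ)
```

Because each reference set contains both sine and cosine terms, the correlation does not depend on the phase of the recorded oscillation, which is why the three phase-shifted channels can all contribute to the same score.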
EMG activity was recognized whenever the signal envelope exceeded the threshold [49]. To determine the muscle envelope, the signal of the re-reference electrode was first subtracted from the EMG signal. Next, a 50 Hz notch filter and a 10–450 Hz sixth-order band-pass Butterworth filter were applied to remove noise. Finally, the signal was rectified and a 3-Hz low-pass finite impulse response (FIR) filter was applied.
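The envelope extraction steps can be sketched with SciPy as follows. Several details are assumptions, since the text does not state them: the sampling rate (1000 Hz here, which must exceed 900 Hz for a 450 Hz band edge), the notch quality factor, the number of FIR taps, and the use of zero-phase filtering, which suits offline analysis. A design order of 3 yields a sixth-order band-pass filter in SciPy; whether this matches the original design is not stated.

```python
import numpy as np
from scipy import signal


def emg_envelope(emg, reference, fs_hz, notch_q=30.0, fir_taps=501):
    """Envelope of one EMG channel: re-reference, notch, band-pass, rectify, low-pass."""
    # 1. Subtract the re-reference electrode.
    x = np.asarray(emg, dtype=float) - np.asarray(reference, dtype=float)

    # 2. 50 Hz notch filter (quality factor is an assumed value).
    b_notch, a_notch = signal.iirnotch(50.0, notch_q, fs=fs_hz)
    x = signal.filtfilt(b_notch, a_notch, x)

    # 3. 10-450 Hz Butterworth band-pass (N=3 gives a sixth-order band-pass).
    sos = signal.butter(3, [10.0, 450.0], btype="bandpass", fs=fs_hz, output="sos")
    x = signal.sosfiltfilt(sos, x)

    # 4. Full-wave rectification.
    x = np.abs(x)

    # 5. 3 Hz low-pass FIR filter (number of taps is an assumed value).
    taps = signal.firwin(fir_taps, 3.0, fs=fs_hz)
    return signal.filtfilt(taps, [1.0], x)
```

In an online setting, causal filtering (for example `signal.sosfilt` and `signal.lfilter`) would replace the zero-phase calls, at the cost of a filter delay.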
The threshold was determined in an offline experiment. To this end, EMG data of both muscles were recorded for 10 s while subjects performed a maximum voluntary contraction (MVC). This was repeated three times and the EMG envelopes were averaged over the trials. The threshold was set to 20% of the mean value of the averaged envelope [89].
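The threshold rule can be written as a short sketch. The function names are illustrative, and the decision rule in `is_active` (the envelope exceeds the threshold at any point in the trial) is an assumption about how the comparison is applied over a trial.

```python
def mvc_threshold(mvc_envelopes, fraction=0.20):
    """Threshold as a fraction of the mean of the trial-averaged MVC envelope.

    mvc_envelopes: list of equal-length envelopes, one per MVC repetition.
    """
    n_trials = len(mvc_envelopes)
    n_samples = len(mvc_envelopes[0])
    # Average the envelopes sample by sample over the repetitions.
    averaged = [
        sum(trial[i] for trial in mvc_envelopes) / n_trials
        for i in range(n_samples)
    ]
    # Take the given fraction of the mean of the averaged envelope.
    return fraction * sum(averaged) / n_samples


def is_active(envelope, threshold):
    """Assumed rule: the muscle counts as active if the envelope ever exceeds the threshold."""
    return max(envelope) > threshold
```

For three toy MVC envelopes with constant values 1, 2, and 3, the averaged envelope is 2 and the threshold is 0.4.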
The accuracy and reliability of the information transmission are two important factors in a real-time biomedical system. In [90], the Gabor prototype frames have been optimized to provide accurate and reliable information transmission in a real-time biomedical sensor. We evaluated the overall system performance by the quantitative efficiency parameters such as accuracy and ITR. The accuracy was determined as Eqn. 1.
where
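There are several methods to estimate the ITR [92]. In the present study, Wolpaw’s definition was used (Eqn. 2):

$$\mathrm{ITR} = \frac{60}{T}\left[\log_2 N + P\log_2 P + (1-P)\log_2\frac{1-P}{N-1}\right] \qquad (2)$$

where N is the number of selectable characters, P is the classification accuracy, and T is the time in seconds required for one selection.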
System speed (character per minute) is the other important parameter in speller evaluation, which is calculated as Eqn. 3 [93].
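As a worked check of these measures, the sketch below computes the ITR with the standard Wolpaw formula and the typing speed as characters divided by total time, scaled to one minute. It uses N = 36 characters and T = 2.7 s (2-s window, 0.5-s rest, 0.2-s latency), as described elsewhere in the paper; the function names are illustrative.

```python
import math


def wolpaw_itr(n_targets, accuracy, selection_time_s):
    """Wolpaw information transfer rate in bit/min for 0 < accuracy <= 1."""
    p = accuracy
    bits = math.log2(n_targets)
    if p < 1.0:
        bits += p * math.log2(p) + (1.0 - p) * math.log2((1.0 - p) / (n_targets - 1))
    return bits * 60.0 / selection_time_s


def typing_speed(n_characters, total_time_s):
    """Typing speed in characters per minute."""
    return n_characters / total_time_s * 60.0


# Subject 3 in Table 2: 99.6% accuracy, reported ITR of 113.6 bit/min.
itr_subject_3 = wolpaw_itr(36, 0.996, 2.7)

# Subject 1 in Table 2: 24 characters in 68.58 s, reported speed of 20.99 char/min.
speed_subject_1 = typing_speed(24, 68.58)
```

Both computed values agree with the corresponding entries of Table 2 to within rounding, which suggests that the reported ITR values were obtained with T = 2.7 s.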
First, the effect of considering latency on SSVEP classification accuracy was evaluated. To this end, we computed the mean accuracy and ITR for two time windows, with and without latency compensation. The first window was a 2-s window starting after the determined latency. The second window started at stimulation onset and therefore spanned the latency duration plus the subsequent 2 s of signal. Fig. 6 shows the mean accuracy and ITR for these two windows as a function of latency, averaged across the 10 subjects. The curve for the first window is always above that of the second, which means that excluding the latency period increased both the classification accuracy and the ITR. The accuracy curve also shows that increasing the latency improves the classification accuracy. The second window becomes longer as the latency grows, which would be expected to increase the accuracy, but including the initial segment of the signal instead reduced it, whereas removing this segment in the first window increased the accuracy. At the same time, a longer trial decreases the ITR, as the ITR curve shows. Our results showed that the best trade-off between accuracy and ITR occurs at a latency of 0.2 s, where the mean ITR is highest. Next, the accuracy and the ITR of the proposed speller were determined for each subject individually. Fig. 7 illustrates the outcomes for each subject and the mean results for different latency values. The accuracy was calculated across 10 sessions and the ITR was estimated from the accuracy. The value of T was calculated as the sum of the 2-s time window, the 0.5-s rest time, and the latency.
Fig. 7 shows that the optimal latency is subject dependent. The mean curves show that the accuracy increased with increasing latency, whereas the ITR increased up to a latency of 0.2 s and then decreased. The highest mean ITR was therefore obtained at a latency of 0.2 s, and our results are presented based on this value.
Mean classification accuracy and ITR based on latency, for time windows with and without latency. Results are plotted for different latencies. The first time window was selected as a 2-s window after the determined latency. The second time window started immediately at stimulation onset and comprised the latency plus 2 s of the subsequent signal.
Mean accuracy and ITR based on the latency across all subjects. Values were plotted for different latencies.
Table 2 shows the online classification accuracy, ITR, total time, and system speed for a latency of 0.2 s. To determine the system speed, the total time to complete the task was recorded, which also included the time needed to clear errors. As shown in Table 2, subject 3 exhibited the best performance, with an ITR of 113.6 bit/min while maintaining 99.6% accuracy. The mean accuracy and ITR were 91.2% and 96.1 bit/min, respectively. The average spelling time to spell the phrase was 68.9 s and so the proposed speller achieved a speed of 20.9 char/min.
Subject ID | Classification accuracy (%) | ITR (bit/min) | Total time (s) | Typing speed (char/min) |
1 | 96.4 | 105.8 | 68.58 | 20.99 |
2 | 83.6 | 81.9 | 80.46 | 17.89 |
3 | 99.6 | 113.6 | 65.34 | 22.03 |
4 | 94.5 | 101.8 | 70.74 | 20.35 |
5 | 96.4 | 105.8 | 68.04 | 21.16 |
6 | 86.6 | 87 | 77.22 | 18.64 |
7 | 84.1 | 82.7 | 79.38 | 18.14 |
8 | 85.3 | 84.7 | 78.84 | 18.26 |
9 | 98 | 109.5 | 66.42 | 21.68 |
10 | 87 | 87.7 | 75.06 | 19.18 |
Mean | 91.2 | 96.1 | 73.01 | 19.83 |
All parameters in this table were calculated for a latency of 0.2 s. For each subject, the results are averages over 10 sessions. Total time is the time each subject spent spelling the complete phrase. The system speed was obtained by dividing the total number of characters by the total time. Subject ID, subject identification code; SD, standard deviation; ITR, information transfer rate.
Table 3 (Ref. [34, 36, 37, 39, 40, 41, 45, 49, 94, 95]) compares the results of the present study in terms of mean accuracy, ITR, and typing speed, with the findings of other state-of-the-art studies. Depending on the spelling task, studies were divided into two categories, cue-based and copy-spelling. In cue-based spelling, subjects were required to choose the target character in random order. In the copy-spelling task, subjects were required to spell a predetermined sentence.
Spelling task | Reference | Modality | Classification accuracy (%) | ITR (bit/min) | Typing speed (char/min) |
Cue-based spelling | [94] | SSVEP | 96 | - | 13 |
Cue-based spelling | [36] | SSVEP | 98.8 | 61.6 | - |
Cue-based spelling | [49] | SSVEP-EMG | 80.5 | 83.2 | - |
Cue-based spelling | [34] | SSVEP-EMG | 85.8 | 90.9 | - |
Copy-spelling | [39] | SSVEP | 92.3 | 37.6 | - |
Copy-spelling | [95] | SSVEP | 90.8 | 21.9 | - |
Copy-spelling | [37] | SSVEP | 84.6 | 20.5 | - |
Copy-spelling | [40] | SSVEP | 92.8 | 11.2 | 2 |
Copy-spelling | [41] | SSVEP | 97.9 | 23.8 | 2 |
Copy-spelling | [45] | SSVEP | 93 | 31.8 | - |
Copy-spelling | [34] | SSVEP-EMG | 82.6 | - | 7.8 |
Copy-spelling | Proposed system | SSVEP-EMG | 91.2 | 96.1 | 20.9 |
Studies were divided into two categories based on the spelling task. In cue-based spelling, subjects were asked to select target characters in random order and a cue indicated the target character that subjects were required to type. In the copy-spelling task, subjects were required to spell a sentence.
Although some spellers achieved a high classification accuracy, their speed and ITR were often low, indicating that accuracy alone is not a suitable criterion for evaluating the performance of a system. The ITR represents a trade-off between the time required to transmit a character, the amount of information per character, and the probability of transmitting and receiving it correctly. This parameter can therefore be used on its own to compare the performance of two systems.
It should also be mentioned that many studies have calculated the ITR, while others have reported the speed of the system. For better comparison, we reported results for both of these evaluation criteria. As shown in Table 3, our proposed speller showed a considerable enhancement in overall system efficiency compared with state-of-the-art studies based on SSVEP or SSVEP-EMG.
To measure the efficiency of each modality, accuracy results of the SSVEP and EMG were reported individually. Fig. 8 illustrates the results of each subject considering a latency of 0.2 s. As shown in this figure, the SSVEP accuracy varies for different subjects, ranging from 80 to 100%. However, the EMG classification accuracy was approximately 100% for all subjects.
Classification accuracy of each modality. Results are presented for a latency of 0.2 s.
Excessive workload causes human errors, which leads to a decrease in system performance. To estimate the workload of BCI systems, the National Aeronautics and Space Administration-Task Load Index (NASA-TLX) multidimensional questionnaire has been widely applied. In the raw form of this questionnaire, six equally weighted subscales are used to calculate the subjective workload. Each subject scores each subscale from 0 to 100 in 5-point steps, and the total workload is determined by averaging the scores of all subscales. Each user in our study completed the raw NASA-TLX questionnaire and the scores are shown in Table 4. The mean workload score was 15.8, which indicated that the proposed speller was acceptable to all subjects.
Subject | Mental demand | Physical demand | Temporal demand | Performance | Effort | Frustration | Workload |
1 | 10 | 15 | 25 | 10 | 15 | 20 | 15.8 |
2 | 35 | 15 | 30 | 15 | 30 | 25 | 25 |
3 | 15 | 15 | 5 | 5 | 5 | 5 | 8.3 |
4 | 45 | 30 | 20 | 5 | 20 | 5 | 20.8 |
5 | 70 | 10 | 40 | 20 | 30 | 10 | 30 |
6 | 15 | 20 | 25 | 10 | 10 | 20 | 16.7 |
7 | 10 | 10 | 5 | 5 | 10 | 15 | 9.2 |
8 | 15 | 15 | 10 | 10 | 5 | 10 | 10.8 |
9 | 5 | 10 | 10 | 10 | 5 | 5 | 7.5 |
10 | 45 | 5 | 5 | 20 | 5 | 5 | 14.2 |
Mean | 26.5 | 14.5 | 17.5 | 11 | 13.5 | 12 | 15.8 |
NASA-TLX, National Aeronautics and Space Administration-Task Load Index.
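The raw NASA-TLX workload is the unweighted mean of the six subscale scores. The sketch below reproduces the workload values of two subjects from the table above; the function name is illustrative.

```python
def raw_tlx(subscale_scores):
    """Raw NASA-TLX workload: unweighted mean of the six subscale scores."""
    if len(subscale_scores) != 6:
        raise ValueError("raw NASA-TLX expects exactly six subscale scores")
    return sum(subscale_scores) / 6


# Subscale order: mental, physical, temporal, performance, effort, frustration.
workload_subject_1 = raw_tlx([10, 15, 25, 10, 15, 20])  # table reports 15.8
workload_subject_2 = raw_tlx([35, 15, 30, 15, 30, 25])  # table reports 25
```

The same average applied to each column of per-subject scores gives the values in the mean row of the table.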
To evaluate the effect of increasing the text length as well as the impact of different characters (especially characters with lower frequency rates), a second phrase that contains every letter at least once was evaluated. To this end, subjects were asked to type the phrase ‘THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG’. The system performance is shown in Table 5 and the raw scores of the NASA-TLX questionnaire are reported in Table 6. The mean accuracy and ITR were 90.9% and 95.4 bit/min, respectively. The average time to spell the phrase was 126.7 s; therefore, our proposed speller achieved a speed of 20.4 char/min. The mean workload score was 14.8, which showed that the system was also acceptable to all subjects while spelling the new phrase.
Subject ID | Classification accuracy (%) | ITR (bit/min) | Total time (s) | Typing speed (char/min) |
1 | 95.4 | 103.7 | 123.6 | 20.9 |
2 | 88.4 | 90.2 | 134.5 | 19.2 |
3 | 98.5 | 110.7 | 119.7 | 21.6 |
4 | 87.2 | 88.0 | 128.6 | 20.1 |
5 | 98.0 | 109.5 | 121.7 | 21.2 |
6 | 96.1 | 105.2 | 118.1 | 21.8 |
7 | 82.0 | 79.3 | 131.8 | 19.6 |
8 | 90.3 | 93.6 | 129 | 20 |
9 | 81.1 | 77.8 | 131.8 | 19.6 |
10 | 91.8 | 96.4 | 127.8 | 20.2 |
Mean | 90.9 | 95.4 | 126.7 | 20.4 |
Subject | Mental demand | Physical demand | Temporal demand | Performance | Effort | Frustration | Workload |
1 | 10 | 10 | 10 | 5 | 15 | 5 | 9.2 |
2 | 30 | 15 | 20 | 15 | 30 | 25 | 22.5 |
3 | 15 | 20 | 5 | 10 | 5 | 10 | 10.8 |
4 | 45 | 30 | 20 | 25 | 20 | 5 | 24.2 |
5 | 40 | 10 | 30 | 20 | 30 | 10 | 23.3 |
6 | 15 | 20 | 25 | 10 | 10 | 20 | 16.7 |
7 | 25 | 15 | 5 | 15 | 10 | 15 | 14.2 |
8 | 15 | 15 | 10 | 10 | 5 | 10 | 10.8 |
9 | 20 | 15 | 5 | 10 | 10 | 5 | 10.8 |
10 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
Mean | 22 | 15.5 | 13.5 | 12.5 | 14 | 11 | 14.8 |
An HBCI is used to enhance overall efficiency by combining two physiological signals. To improve ITR, we developed an HBCI speller with 36 characters, using only nine symbols and the combination of SSVEP and EMG. In our proposed speller, three main factors influenced the speller efficiency.
The first factor is the simultaneous combination of SSVEP and EMG applied in the proposed structure. In this 36-character speller, the number of flickering symbols on the screen was decreased from 36 to nine by combining these two signals simultaneously. Reducing the number of flickering symbols decreases the frequency recognition error by reducing the adverse effects of neighboring flickers, which cause user fatigue [96]. With fewer symbols, the stimulation frequency step can also be larger, which further decreases the error. Together, these effects increase the recognition accuracy.
The second factor is the use of the character encoding scheme. The frequency distribution of the character set, including ‘Space’ and the 26 Latin letters, provided a clearer picture of how the characters differ in usage. Characters were therefore categorized based on their frequency rate, and less muscular activity was assigned to more commonly used characters. As a result, selecting these characters required less dual attention, which reduced the recognition error and thereby enhanced the ITR.
The third factor is the use of latency in the SSVEP signal processing, which improved the accuracy. In [49], it was suggested that the ITR would be improved by decreasing the rest time to less than 1 s, provided the accuracy remained stable. In the present study we showed that, as SSVEP requires several cycles to reach steady state [58], it is possible to reduce the subject’s rest time and use this time to compensate for the SSVEP latency. In other words, by reducing the rest time from 1 to 0.5 s, assigning this time to the next stimulation, and taking the latency into account in the analysis of each trial, both the accuracy and the ITR were improved. According to the findings of the present study, the optimal latency varies among individuals. A value of 0.2 s was determined as the optimal latency, averaged over 10 subjects.
We have built a comprehensive speller containing all Latin letters, with which any desired phrase can be typed using only nine flickers on the screen. This makes it more widely applicable than P300-based systems that use more flickers to provide a limited number of predetermined control commands [97]. Our speller also improved the ITR compared with SSVEP-based spellers with tree structures, which require multiple steps to select each character, and greatly enhanced the efficiency compared with the P300-EMG speller that uses the EMG only to correct spelling errors [63]. The system also performs better than other SSVEP-EMG HBCI spellers [34, 49].
It should also be mentioned that in [34], the ITR and speed results were reported for two different experimental conditions. In that study, the ITR was calculated in the random order-spelling task, while the system speed was reported for the copy-spelling task. The experimental plans of these two tasks generally differ: in the copy-spelling task, additional time is added after each character selection stage to help the subject focus, so the total duration of a single trial is longer. Because our spelling task is similar, it is more accurate to compare our findings with the results of the copy-spelling task. On this basis, our proposed speller considerably improves typing speed. This speed enhancement arose because we devoted less time between two consecutive trials. The shorter time required to select two consecutive characters was due to the character encoding scheme. Assigning less muscular activity to the most commonly used characters meant that the subject used less muscle activity and in many cases did not need to use any at all. This considerably reduced the attention required to select the next character and therefore the subjective workload. Further experiments on the effect of increasing the text length and of including different characters showed that our proposed system can be used effectively to type all kinds of sentences.
Our results clearly demonstrate that the EMG classification accuracy was very high, which could be due to the small number of muscle activity repetitions required to select a character. Previous studies found that accuracy decreased as the number of muscle activity repetitions increased. Increasing the number of characters also requires more repetitions of muscle activity, which is time-consuming and reduces the ITR [34, 49]. The use of only one wrist flexion decreased the recognition error. Furthermore, a single wrist flexion occurs within a short time period, and both SSVEP and EMG signals can be detected within short time periods. Therefore, in future studies, improving the extracted features and applying novel classification methods could shorten each trial and thereby improve the ITR.
User-friendliness is a crucial factor in evaluating system usability [98]. In an SSVEP-based system, users are influenced by adjacent flickers, which might cause fatigue [96]. In this study, reducing the number of symbols on the screen decreased the subject’s eye fatigue and thus made the speller more user-friendly. Likewise, decreasing the subjective workload by reducing the required EMG activity decreased the subject’s muscle fatigue. Both features made the system more convenient to use. Pattern learning was the only training required and, given the low number of symbols, it took little time.
We could further develop our proposed speller by simply increasing the number of characters. To this end, we could increase the number of symbols on the screen. Furthermore, the number of EMG-based clusters could also be increased. This could be done by utilizing the various types of residual muscular capabilities of disabled people. We could also apply various types of muscular activities with different intensities, to increase the number of characters.
Although this HBCI speller may be non-functional for completely paralyzed patients, some patients might still be able to control one hand or other muscles such as facial muscles. For patients with the ability to control only one hand, the EMG activities could be generalized to joint flexion and extension [99, 100, 101]. Our proposed speller is also applicable for Parkinson’s disease patients, as they might not be able to control a real keyboard but could perform wrist flexion.
To improve system efficiency in future studies, it would be effective to first calculate the subject’s optimal latency during the offline test, and then perform the online experiment based on the optimal latency of that subject.
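A per-subject latency selection of this kind could be sketched as follows. This is only an illustration under stated assumptions: the offline results are taken to be a mapping from each candidate latency (in seconds) to a list of per-trial correctness flags, and the function name `select_latency` is hypothetical.

```python
def select_latency(offline_results):
    """Pick the candidate latency with the highest offline accuracy.

    offline_results: dict mapping latency in seconds to a list of
    booleans (True if the offline trial was classified correctly).
    Ties are resolved in favour of the shortest latency.
    """
    def accuracy(latency):
        outcomes = offline_results[latency]
        return sum(outcomes) / len(outcomes)

    # max() keeps the first maximal item, so sorting ascending
    # makes the shortest latency win when accuracies are equal.
    return max(sorted(offline_results), key=accuracy)
```

The selected value would then replace the across-subject average latency in that subject's online session.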
An online hybrid speller based on the simultaneous combination of SSVEP and EMG was designed in this study. This speller provided 36 characters using only nine symbols. The SSVEP was used to determine the subgroup containing the target character, and the EMG was used to determine the target character within that subgroup. Using a character encoding scheme based on frequency rate and accounting for latency enhanced accuracy. The mean accuracy and ITR were 91.2% and 96.1 bit/min, respectively. The speed of the proposed system was 20.9 char/min, which is significantly higher than that reported in state-of-the-art studies based on SSVEP or SSVEP-EMG. Overall, speller performance was improved compared with previously reported BCI speller systems.
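How accuracy, the number of selectable characters, and selection speed combine into an ITR can be illustrated with the widely used Wolpaw definition. The sketch below assumes that definition; this section does not state which formula was used, and the function names are illustrative.

```python
import math


def bits_per_selection(n_targets, accuracy):
    """Wolpaw information per selection, in bits.

    B = log2(N) + P*log2(P) + (1 - P)*log2((1 - P) / (N - 1))
    """
    if n_targets < 2 or not 0.0 <= accuracy <= 1.0:
        raise ValueError("need n_targets >= 2 and 0 <= accuracy <= 1")
    bits = math.log2(n_targets)
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_targets - 1))
    return bits


def itr_bits_per_min(n_targets, accuracy, selections_per_min):
    """ITR in bit/min for a given selection rate."""
    return bits_per_selection(n_targets, accuracy) * selections_per_min


# Example call with the group means quoted above (36 characters,
# 91.2% accuracy, 20.9 char/min).
example_itr = itr_bits_per_min(36, 0.912, 20.9)
```

A value computed from group means in this way need not equal the reported average ITR, because an ITR averaged over subjects is generally different from the ITR of the averaged accuracy and speed.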
ALS, Amyotrophic Lateral Sclerosis; BCI, Brain-Computer Interface; bit/min, Bits per Minute; char/min, Chars per Minute; cm, Centimeter; ERP, Event Related Potential; Eq, Equation; Fig, Figure; HBCI, Hybrid Brain-Computer Interface; ECoG, Electrocorticography; EEG, Electroencephalogram; EMG, Electromyogram; GND, ground; ITR, Information Transfer Rate; k
The data sets generated during the current study are available on the homepage of the Biomedical Signal Recording and Processing Laboratory of Semnan University [https://amaleki.profile.semnan.ac.ir/#downloads].
AM designed and conceptualized the research study and supervised the research process. SS performed the research, conducted experiments, analyzed the data, and wrote the original manuscript. Both authors contributed to editorial changes in the manuscript. Both authors read and approved the final manuscript. Both authors have participated sufficiently in the work and agreed to be accountable for all aspects of the work.
All subjects gave their informed consent for inclusion before they participated in the study. The study was conducted in accordance with the ethical principles and the national norms and standards approved by the Ethics Committee of Semnan University of Medical Sciences and Health Services (approval number: IR.SEMUMS.REC.1398.133).
Not applicable.
This research received no external funding.
The authors declare no conflict of interest.
Publisher’s Note: IMR Press stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.