†These authors contributed equally.
The cingulo-opercular network (CON), dorsal attention network (DAN), and ventral attention network (VAN) are prominently activated during attention tasks. The functions of these task-positive networks and the mechanisms by which they interact are central issues in understanding how the human brain manipulates attention to better adapt to the external environment. This study aimed to clarify the functional hierarchy of the CON, DAN, and VAN by assessing their causal interactions. We analyzed functional magnetic resonance imaging (fMRI) data from human participants performing a visual-spatial attention task and correlated Granger causal influences with behavioral performance. The results revealed that the CON exerts behavior-enhancing influences upon the DAN and VAN, indicating a higher level of the CON in top-down attention control. By contrast, the VAN exerts a behavior-degrading influence on the CON, indicating external disruption of the CON’s control set.
A set of attention-related brain regions has been suggested to be important for many cognitive/behavioral functions such as navigating the environment, filtering external information, focusing on goals during tasks, working memory, self-regulation, and volitional control [1, 2, 3, 4, 5]. These task-positive brain regions have been suggested to comprise several segregated, but cooperative, intrinsic functional networks that support attention and cognitive control related tasks [3, 6, 7, 8, 9, 10, 11]. One of these networks is the cingulo-opercular network (CON) [7, 8], comprising the dorsal anterior cingulate cortex (dACC) and bilateral anterior insula (AI). The other is the frontoparietal attention system, which can be further divided into the dorsal attention network (DAN) [6, 12, 13], anchored in the bilateral frontal eye field (FEF) and intraparietal sulcus (IPS), and the ventral attention network (VAN) [6, 12, 13], anchored in the right middle frontal gyrus (MFG) and right temporoparietal junction (TPJ). Although these cortical networks are frequently mentioned in attention and cognitive control related task-activation studies and resting-state analyses [14, 15, 16], their functional roles and how they interact with each other still need further elucidation. For example, whether the top-down control signals come from the DAN or the CON remains debated [17].
An attention networks hypothesis proposes that the top-down control signals from DAN enable better processing of current focus, while VAN works as a filter sending bottom-up interference when distractors are salient or behaviorally relevant and may cause attention reorientation [17, 18, 19]. This hypothesis was supported by studies with various methodologies such as lesion investigation [20], effective connectivity analysis based on both functional magnetic resonance imaging (fMRI) [21, 22, 23] and electroencephalographic (EEG) source localization [24].
In addition to the interaction within the frontoparietal network, the salience network hypothesis proposes that the CON underlies saliency detection, which regulates both stimulus selection and the focusing of attention [25, 26]. On the other hand, Dosenbach and colleagues emphasized the goal-directed aspect of the CON and proposed that it works as a core control center for implementing top-down task control in all kinds of attention-demanding tasks [7]. Structural and functional connectivity studies on both brain-damaged patients [27, 28] and normal participants performing highly demanding tasks [29, 30] showed that interference with the CON contributes to inferior behavioral performance, indicating that the CON may underlie the top-down control that regulates task-negative activities to prevent internal interference with attention. It is worth noting that Corbetta et al. [17] speculated that the top-down attention control signals might come from the CON in addition to the DAN, and that the interference signal from the VAN might even interfere with the task control maintained by the CON. However, how the CON interacts with the DAN and VAN remains unclear.
In summary, the relationship between these task-positive networks in attention needs to be further elucidated. In particular, many efforts have been made to explore how the DAN and VAN interact in attention, but whether and how the third network, the CON, which is also frequently activated in attention tasks, interacts with the DAN and VAN remains largely unknown. To address these questions, we directly examined the inter-network causal influences between the CON and the DAN/VAN in attention tasks to infer the source of top-down regulation and that of bottom-up interference. The current analysis, together with our previous findings regarding DAN-VAN interaction [22], may help to elucidate the functional hierarchical structure of the task-positive networks. Specifically, we analyzed fMRI data recorded from healthy human participants taking part in an experiment that included multiple sessions of visual-spatial attention tasks (the same dataset analyzed in Wen et al., 2012 [22]). We applied General Linear Modeling (GLM) to identify regions of interest (ROIs) in the CON, DAN, and VAN, then assessed the directional influences between the CON and the DAN/VAN using Granger causality (GC), and correlated those influences with behavioral performance to address the functional significance of the directional connections. By adding the CON to the analysis, the current work may expand the proposed attention network interaction model from a DAN-VAN model to a CON-DAN-VAN model.
Here we provide an outline of our experimental framework. The current study used the same dataset analyzed in our previous study [22], specifically designed for Granger causality and behavior joint analysis:
i. We preprocessed the images and carried out a GLM analysis to identify the regions of interest (ROIs).
ii. We extracted and preprocessed the fMRI time series of each ROI and used Granger causality analysis (GCA) to assess the ROIs’ directional connections.
iii. We calculated the correlation between the connection strength and behavioral performance to assess the functional significance of the inter-ROI interactions.
iv. We combined and averaged the foregoing inter-ROI causal influences to yield inter-network causal influences, whose behavioral significance was assessed by correlating them with behavioral performance.
The third and fourth steps aimed to intuitively depict the task-positive networks’ functional hierarchy during this visual-spatial attention task.
The detailed information of the experiment and analysis is provided below.
The current study used the same dataset analyzed in our previous study [22].
Specifically, twenty young, healthy right-handed human participants with normal
or corrected-to-normal vision participated in the experiment. All participants
had no history of taking medicine, psychiatric illness, or any brain surgery that
might affect their central neural system. Because the attention task was very
demanding and we wanted to observe the natural fluctuation of behavior mainly
related to the varying attention level instead of other factors, the participants
had to be well trained and well prepared for the six runs of task-scanning. Each
participant’s training procedure included multiple out-scanner training sessions
and one in-scanner training session in the 2-4 days before the MRI scanning day. Each of
these training sessions lasted 0.5-1 hour. All participants
underwent a screening and a short warm-up session before the MRI sessions. Only
13 of them finished the whole six runs. One of them was later excluded from the
analysis because of severe image artifacts, which reflected shadow artifacts on
both sides of the brain. The included participants were 24
The current study adopted a mixed blocked/event-related design. The task protocol contained six runs, each with four blocks balanced in an ABBA or BAAB arrangement to counterbalance the temporal confound. The experimental timeline is schematically illustrated in Fig. 1A. Each block lasted one minute and was followed by a 20-s fixation period. The attention (A) blocks and passive-view (B) blocks shared the same timeline and stimuli except for the color of the crosshair at the fixation point (light red and light green, balanced across participants). In each trial of the attention (A) blocks, participants were cued to direct and maintain covert attention to the left or right hemifield. Following a 2500 ms delay, a standard or a target stimulus of 100 ms in duration appeared either in the attended hemifield (valid trial) or the unattended hemifield (invalid trial). The standard stimulus was a circular checkerboard. The target stimulus was also a circular checkerboard but slightly smaller than the standard stimulus (10% smaller in radius). The standard stimulus appeared 80% of the time with 50% validity, and the target stimulus appeared 20% of the time with 50% validity. The trials were pseudo-randomly arranged so that the valid trials and the invalid trials in each block were evenly matched. Participants were required to make a speeded keypress response only to the valid targets. Since valid targets made up only a small proportion (10%) of all stimuli, motor processing activation was weak at the block level. It was therefore unlikely to affect the activation results of the attention task, and it also limited the motor-related components that might otherwise contaminate the block-level time series used for GC analysis. In passive-view (B) blocks, the stimulus presentation schedule remained the same, but neither attention nor response was required. The participant just maintained fixation.
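The stimulus proportions above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the variable names are hypothetical, and the 15 trials per block and 12 attention blocks are taken from the block description and the design summary elsewhere in this paper.

```python
from fractions import Fraction

P_TARGET = Fraction(1, 5)   # targets make up 20% of stimuli
P_VALID = Fraction(1, 2)    # 50% validity for both stimulus types
TRIALS_PER_BLOCK = 15       # trials in one 60 s block
ATTENTION_BLOCKS = 12       # six runs, two attention blocks per run

# Expected proportion of each trial type.
trial_types = {
    "valid standard": (1 - P_TARGET) * P_VALID,
    "invalid standard": (1 - P_TARGET) * (1 - P_VALID),
    "valid target": P_TARGET * P_VALID,
    "invalid target": P_TARGET * (1 - P_VALID),
}

# Only valid targets require a keypress.
expected_responses_per_block = trial_types["valid target"] * TRIALS_PER_BLOCK
expected_responses_total = expected_responses_per_block * ATTENTION_BLOCKS

for name, proportion in trial_types.items():
    print(f"{name}: {float(proportion):.0%}")
print("expected keypresses per block:", float(expected_responses_per_block))
```

Under these numbers, a block is expected to contain only one or two response trials, which is the basis for the statement that motor-related activity is weak at the block level.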
Each block lasted ~60 s (15 trials), with 20 s fixation periods inserted between successive blocks. The participants had to be well trained and well prepared to maintain their performance during the six runs of task-scanning, which lasted about an hour. The training procedure of each participant included multiple out-scanner training sessions and one in-scanner training session in the 2-4 days before the scanning day. Each of these training sessions lasted 0.5-1 hour. The participants were fully instructed and went through the out-scanner training to become familiar with the task.

Experimental paradigm: region of interest (ROI) defined using
task activation and mean ROI blood oxygen level-dependent (BOLD) changes. (A):
An example of the stimulus and timeline for one of the trials in an attention
block. (B): The task-positive networks activated in the current task (T
All MRI data were acquired using a 3T Magnetom Trio whole-body MRI system
(Siemens AG, Erlangen, Germany) at Beijing Normal University MRI Center. The
functional scanning was performed using a T2*-weighted echo-planar imaging sequence
(echo time, 30 ms; repetition time, 2000 ms; flip angle, 90
The fMRI data were preprocessed using SPM8 software
(http://www.fil.ion.ucl.ac.uk/spm). The preprocessing protocol included slice
timing, motion correction, anatomical co-registration, normalizing to a Montreal
Neurological Institute (MNI) space (voxel size, 3 mm
The foregoing preprocessed data with spatial smoothing were fed to a GLM for activation analysis and ROI selection. A second version of the data was preprocessed using the same steps but without spatial smoothing. This unsmoothed version was fed to GCA for network analysis.
To define the ROIs for network analysis, we conducted a GLM analysis using SPM8
software. For first-level analysis, the regressor was generated by convolving the
rectangular function representing the block sequence with a canonical hemodynamic
response function (HRF). The individual activation maps were generated using the
contrasts of attention condition against passive-view condition. For second-level
random effect analyses, the individual contrast maps were fed to a one-sample
t-test to yield a group-level activation map. False discovery rate (FDR)
control was applied to correct for multiple comparisons (t
To avoid false-positive results that might confound the ROI selection, and to make
sure the defined ROIs matched the well-established CON, DAN, and VAN networks, we
first compared the GLM activation results with the spatial pattern of the region
associated with “spatial attention” according to the online meta-analysis
(https://neurosynth.org/, 147 studies, uniformity test, P
To elucidate the functional hierarchy of the task-positive networks, a directed
network model would be meaningful. We chose GCA, which is widely used to
examine directed influences between time series, to accomplish this goal [15, 22, 34, 35, 36]. The fundamental idea of GC is that if the history of time series X
facilitates the prediction of time series Y’s future, then we say there is a
Granger causal influence from X to Y [37]. GC value of X
One mathematical realization of Granger causality compares the prediction performance of a univariate autoregressive (AR) model with that of a multivariate autoregressive (MVAR) model. For example, the GC from X to Y can be defined as
where
Assuming X is one ROI and Y is another, our GC calculation between X and Y
has three major steps. 1) Extracting the time series: the time series of each voxel
in X and Y were extracted from the preprocessed functional images without spatial
smoothing. The time series were then converted to percentage BOLD signal changes
by subtracting the mean signal value during the inter-block baseline fixation
period and then dividing the difference by the mean value. 2) Making the time
series stationary and zero-mean: for each voxel, the percentage-change signals of
the blocks were averaged within each condition (attend or passive-view) to yield the
block-wise BOLD response. For the attention condition, the block-wise response
was subtracted from the percentage BOLD change in each block and each voxel to
yield residual BOLD time series. For each block, the first five time points of the
residual BOLD time series were discarded to eliminate transient effects. The
temporal mean of the remaining time points of each block was removed to meet the
zero-mean requirement assumed by autoregressive model estimation in GCA [34]. 3)
Calculating GC values: each voxel in X was paired with a voxel in Y, and GC
values were calculated for each voxel pair and averaged across all pairs to yield
overall GC values between X and Y, including GC of X
Investigating the change of GC values across different conditions at the group level is more meaningful than merely observing the raw GC values at the individual level. The former may reveal the cognitive significance of the directional connections and mitigate the confounds caused by individual differences and background noise [15, 29, 39]. Accordingly, we employed a framework to correlate behavioral performance with the causal influence between brain regions. The framework formed the foundation of the network construction based on behavior-correlated inter-ROI connections. Specifically, for each subject, the GC values and behavioral performance (either accuracy or response time (RT)) for each attention block were converted into z-scores. For the convenience of combining the two behavioral measures, we multiplied the RT z-score by -1 so that larger scores for both measures indicated better performance. The attention blocks were then sorted according to their z-scores and assigned to 10 levels, each containing three neighboring blocks. The sorting ensured that the first level denoted the worst performance (lowest accuracy or longest RT), and the last level the best performance (highest accuracy or shortest RT). For each voxel pair and level, the GC values of the three blocks were averaged to represent the GC strength corresponding to the performance at that level. Spearman’s rank correlation analysis was then performed to examine the relationship between level-GC and level-performance, assessed using either accuracy or RT.
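The level-wise correlation procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the original analysis code: the function names are hypothetical, the block values are synthetic, and the 10 levels are read here as overlapping windows of three neighboring blocks over 12 sorted attention blocks (12 - 3 + 1 = 10), which is our interpretation of the description.

```python
def zscores(values):
    """Convert values to z-scores using the sample standard deviation."""
    mean = sum(values) / len(values)
    sd = (sum((v - mean) ** 2 for v in values) / (len(values) - 1)) ** 0.5
    return [(v - mean) / sd for v in values]

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return result

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5

def spearman(a, b):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))

def level_correlation(gc_values, performance, window=3):
    """Correlate level-wise GC with level-wise performance.

    `performance` must be oriented so that larger means better
    (for RT, pass the negated values).
    """
    gc_z, perf_z = zscores(gc_values), zscores(performance)
    order = sorted(range(len(perf_z)), key=lambda i: perf_z[i])  # worst first
    level_gc, level_perf = [], []
    for start in range(len(order) - window + 1):
        idx = order[start:start + window]
        level_gc.append(sum(gc_z[i] for i in idx) / window)
        level_perf.append(sum(perf_z[i] for i in idx) / window)
    return spearman(level_gc, level_perf), len(level_gc)

# Synthetic example: 12 attention blocks for one voxel pair.
gc_per_block = [0.02 * i for i in range(12)]
accuracy = [0.60 + 0.03 * i for i in range(12)]
rt_ms = [600.0 - 12.0 * i for i in range(12)]

rho_acc, n_levels = level_correlation(gc_per_block, accuracy)
# Negating RT before z-scoring is equivalent to multiplying its z-score by -1.
rho_rt, _ = level_correlation(gc_per_block, [-r for r in rt_ms])
```

In this synthetic case GC rises monotonically with performance, so both correlations are positive, which corresponds to a behavior-enhancing connection in the terminology used here.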
Both the activation analysis and GCA were performed at the block level. Therefore, all events were included in the GLM and GC estimation. When calculating the block-level mean RT, responses to standard stimuli (false alarms) were excluded. However, the RT of missed-target trials was set to twice the participant’s average hit RT, because previous behavioral studies have shown that in visual-motor experiments, RT in target-absent trials is usually twice as long as the average RT of hit trials [43, 44].
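The block-level RT rule can be illustrated with a short sketch. The trial representation and function name are hypothetical, the example values are made up, and the code assumes that only valid-target trials enter the mean (hits with their measured RT, misses with the substituted value).

```python
def block_mean_rt(trials, participant_mean_hit_rt):
    """Mean RT of one attention block.

    Each trial is a dict with:
      "valid_target": True if the stimulus was a valid target
      "rt_ms": response time in ms, or None if no response was made
    """
    rts = []
    for trial in trials:
        if not trial["valid_target"]:
            continue  # responses to non-targets (false alarms) are excluded
        if trial["rt_ms"] is None:
            # Missed target: substitute twice the participant's mean hit RT.
            rts.append(2.0 * participant_mean_hit_rt)
        else:
            rts.append(trial["rt_ms"])
    return sum(rts) / len(rts) if rts else None

example_block = [
    {"valid_target": True, "rt_ms": 400.0},    # hit
    {"valid_target": True, "rt_ms": None},     # miss
    {"valid_target": False, "rt_ms": 350.0},   # false alarm, ignored
    {"valid_target": False, "rt_ms": None},    # correct rejection
]
example_mean_rt = block_mean_rt(example_block, participant_mean_hit_rt=420.0)
```

With a hypothetical mean hit RT of 420 ms, the missed target contributes 840 ms and the block mean is (400 + 840) / 2 = 620 ms.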
To identify whether a directional connection, for example, A
The GC values of all cross-network ROI-pairs were calculated and averaged to
yield the network-pairs’ GC values. For example, let A and B be two separate
networks and a
Twelve subjects performed the experiment according to instructions. For each
subject, reaction time and response accuracy varied from block to block. The mean
reaction time was 426.80
The GLM analysis yielded an activation map highlighting three major
task-positive networks: CON, DAN, and VAN (t
Network | ROI Name | t value | P (FDR) | MNI x (mm) | MNI y (mm) | MNI z (mm)
CON | dACC | 14.64 | | 6 | 12 | 48
CON | r AI | 16.57 | | 36 | 27 | 6
CON | l AI | 14.11 | | -30 | 21 | 0
DAN | r FEF | 10.68 | | 30 | 0 | 57
DAN | l FEF | 9.63 | | -30 | 3 | 54
DAN | r IPS | 18.38 | | 42 | -48 | 51
DAN | l IPS | 12.90 | | -30 | 63 | 54
VAN | r pMFG | 16.56 | | 45 | 15 | 27
VAN | r aMFG | 11.48 | | 42 | 51 | 21
VAN | r TPJ | 9.88 | | 42 | -48 | 36
Abbreviations: AI, anterior insula; aMFG, anterior middle frontal gyrus; CON, cingulo-opercular network; dACC, dorsal anterior cingulate cortex; DAN, dorsal attention network; FEF, frontal eye field; IPS, intraparietal sulcus; l, left; pMFG, posterior middle frontal gyrus; r, right; TPJ, temporoparietal junction; VAN, ventral attention network.

Comparing the group activation results of the current
spatial-attention task with the spatial pattern of the regions associated with
“spatial attention” according to the online meta-analysis
(https://neurosynth.org/, 147 studies, uniformity test, P
To assess the causal connections between different networks’ ROIs and the
behavioral significance of those connections, we calculated the GC values between
the ROIs and correlated these values with behavioral performance (see Methods
section for details). Because we previously investigated the causal interactions
between DAN and VAN, we did not repeat these calculations [22]. The current work
mainly focused on the connections between CON and DAN/VAN, which were not
assessed in our previous studies and were less often discussed in other studies.
The causal connections significantly correlated with behavioral performance are
shown in Fig. 3A. Generally, most of the behaviorally significant causal
connections from the CON ROIs to the DAN and VAN ROIs were behavior-enhancing. In
the opposite direction, the behavioral significance became inconsistent across
the ROI-pairs. The connections from the bilateral FEF or raMFG to the dACC and
those from the left IPS (lIPS) or the right temporoparietal junction (rTPJ) to
the rAI were behavior-degrading, while most of the connections from the bilateral
FEF or rIPS to the AI were behavior-enhancing. Fig. 3B illustrated an example of
identifying behaviorally significant connections between dACC and rFEF in which
the causal influence of dACC

Behaviorally significant causal connections at both the region of interest (ROI) and network levels. (A): A schematic summary of the behaviorally significant connections between CON and DAN/VAN ROIs. Each cell refers to a directional connection. The abbreviations in the cell indicate the behavioral measures significantly correlated with the causal influence. AC, accuracy; RT, response time. Red denotes positive correlation (behavior enhancement), while blue denotes negative correlation (behavior degradation). (B): Inter-network causal influences as functions of accuracy. The conventions are the same as above. (C): A schematic depiction of the behaviorally significant causal interactions between the CON and the other two task-positive networks. Red denotes behavior enhancement, and blue denotes behavior degradation. The previously reported [22] causal interactions between DAN and VAN and their behavioral significance are also displayed.
The inter-ROI causal influences were averaged within four categories
CON
We used fMRI and a Granger-causality-based network-behavioral joint analysis to examine the functional hierarchy of the networks related to visual-spatial attention. Our examination of the block-level activation showed sustained activation of the three proposed networks: the cingulo-opercular network (CON), dorsal attention network (DAN), and ventral attention network (VAN). The results were consistent with previous literature [6, 7, 17] and enabled the subsequent Granger-causality-based neural-behavioral analysis. By assessing the causal influences between the CON and the DAN/VAN and correlating those influences with behavioral performance, we observed behavior-enhancing influences from the CON to the DAN and VAN and behavior-degrading influences from the VAN to the CON at both the inter-ROI and inter-network levels.
During attention blocks, the participants frequently directed attention according to the cue and maintained attention for seconds (top-down attention control), and were frequently affected by invalid stimuli (bottom-up interference), which caused interference and even reorienting. The three processes have been proposed to activate the CON, DAN, and VAN. By contrasting the attention blocks against the passive-view blocks, our primary purpose was to validate the activation of the three networks and provide reliable ROIs for GC analysis, rather than to inspect complex visual-spatial components at the trial level, which was not the focus of the current study and has been done in many previous studies. To avoid defining ROIs based on false-positive activation results, we carefully compared the current study’s activation map with those independently reported in previous literature, especially those dedicated to defining the classical DAN, VAN, and CON [6, 7, 17, 18, 45]. Our activation results were consistent with the previous studies. Further comparison with the meta-analysis could be regarded as a double-check of the ROI selection (Fig. 2).
Visual attention is considered to be controlled by the frontoparietal attention
system, including the DAN and VAN. The CON is thought to integrate internal and
extrapersonal information to regulate other brain areas and guide behavior,
which is crucial for task-control [8, 46, 47]. On the other hand, the DAN
initiates and maintains goal-directed top-down signals important for attention
control [17, 48, 49]. The findings that stronger CON
The finding that stronger CON
Our results also showed bidirectional interactions between VAN and CON. Emerging
evidence has shown that attention reoriented to a new source output from the VAN
interrupts ongoing selection in the DAN, which, in turn, shifts the attention
toward the novel object of interest [15, 22]. Our results showed that a stronger
VAN
Our results demonstrated the relationship between the CON and the frontoparietal attention system, and we further propose a functional hierarchy model for the visual attention task. Our previous study emphasized the relationship between the DAN and VAN [22]. Combined with the present results, the CON appears to be the highest-ranking network in the hierarchy, suggesting that it may regulate the frontoparietal attention network by transmitting top-down signals to accomplish attention goals. Generally, the top-down processing from the CON to the DAN specializes in selecting and linking stimuli and responses to guarantee attention performance. In contrast, top-down signals from the CON to the VAN may help prevent interference from the VAN so that distraction can be ignored. The VAN may also break the attentional set maintained by the CON to enable attentional reorienting. These findings add to our understanding of the brain’s functional hierarchy from the perspective of network connectivity.
Granger causality analysis was employed in the present study to construct a functional hierarchy network from BOLD fMRI data acquired during the attention task. Granger causality analysis is an exploratory approach that does not require preselecting the interacting regions or making assumptions about the structure and direction of the connections between brain regions. Therefore, GCA (unlike SEM and dynamic causal modeling (DCM)) appears to be less prone to model misspecification and the inaccurate results that may follow from it [39]. While the applicability of Granger causality to fMRI data is debated [54], it is noteworthy that from a statistical point of view, if a time series is analyzable by functional connectivity measures such as temporal correlation and coherence [55], it is analyzable by Granger causality. Recent work shows that both resting-state and task-state fMRI data are well described by autoregressive models, which are the basis for deriving Granger causality [9, 40, 41, 56].
Admittedly, GCA has been suggested to be vulnerable to noise [54] and is sometimes limited by the AR model order. However, by applying GCA, the information streams between ROIs can be measured more directly and displayed more intuitively. When ROI A influences ROI B, there is a steady information stream from A to B. Thus, when predicting ROI B’s future activity, the activity of ROI A contributes to the prediction beyond what the history of ROI B alone provides. By applying GCA, such an information stream can be quantified by the coefficient of the history of A on the future of B, which reveals the synchrony of activation and reflects dependencies between ROIs. In the present study, GCA was used to build the hierarchy of task-positive networks, which not only depicted the organization of the structure but also demonstrated the information streams and provided the basis for further analyses of the functions of the ROIs (i.e., target brain regions) in a visual-spatial attention task.
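To make this prediction argument concrete, the sketch below computes a Granger causality value in its common log-ratio form: the residual variance of B predicted from its own past is compared with the residual variance when the past of A is added. It is an illustration only and not the estimation code used in this study: the AR model order of 1, the function name, and the synthetic time series are all assumptions.

```python
import math
import random

def _dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def granger_causality(x, y):
    """GC from x to y using AR models of order 1 (assumed order).

    Both series are assumed to be (approximately) zero-mean, so the
    models are fitted without an intercept.
    """
    y_now, y_past, x_past = y[1:], y[:-1], x[:-1]
    n = len(y_now)

    # Restricted model: y[t] = a * y[t-1] + e1[t]
    a = _dot(y_now, y_past) / _dot(y_past, y_past)
    var_restricted = sum(
        (yn - a * yp) ** 2 for yn, yp in zip(y_now, y_past)
    ) / n

    # Full model: y[t] = a2 * y[t-1] + b2 * x[t-1] + e2[t],
    # solved through the 2x2 least-squares normal equations.
    syy = _dot(y_past, y_past)
    sxx = _dot(x_past, x_past)
    sxy = _dot(x_past, y_past)
    c1 = _dot(y_now, y_past)
    c2 = _dot(y_now, x_past)
    det = syy * sxx - sxy ** 2
    a2 = (c1 * sxx - c2 * sxy) / det
    b2 = (c2 * syy - c1 * sxy) / det
    var_full = sum(
        (yn - a2 * yp - b2 * xp) ** 2
        for yn, yp, xp in zip(y_now, y_past, x_past)
    ) / n

    # Larger values mean the past of x improves the prediction of y more.
    return math.log(var_restricted / var_full)

# Synthetic check: x drives y with a one-sample lag, but not the reverse.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
y = [0.0] + [0.8 * x[t - 1] + 0.5 * rng.gauss(0.0, 1.0) for t in range(1, 500)]

gc_x_to_y = granger_causality(x, y)
gc_y_to_x = granger_causality(y, x)
print(f"GC x->y: {gc_x_to_y:.3f}, GC y->x: {gc_y_to_x:.3f}")
```

In this synthetic example the value from x to y is clearly larger than the value in the reverse direction, mirroring the case in which ROI A carries information about ROI B's future that B's own history does not.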
It is worth noting that avoiding false-positive results in the first step of ROI definition using the GLM activation results is crucial for the subsequent analysis [57]. Therefore, we did not define the ROIs by considering only the task GLM activation results from the current dataset. In practice, we carefully compared the current study’s activation map with those independently reported in previous literature and with the activation map from the meta-analysis (see Methods). As mentioned, our activation results were consistent with the previous studies in the literature and with the meta-analysis, which reduced the risk of false-positive confounds.
Besides, our GC analysis did not rely on the same information that the GLM
analysis relied on. Traditionally, we represent the regional BOLD activity as
y(t) = x(t) +
Although the number of participants was a limitation of our study, we adopted a mixed blocked/event-related design, which allowed us to reveal the hierarchy of the target networks. The experiment was designed specifically for GC-behavior joint analysis, which is suitable for evaluating the behavioral significance of effective connectivity. First, the experiment had to contain a sufficient number of blocks for a single condition (12 attention blocks in six runs in our case); the more blocks, the better. Second, each block needed to be sufficiently long for GC estimation; the longer, the better. Third, because the attention task was very demanding and we wanted to observe the natural fluctuation of behavior mainly related to the varying strength of GC rather than other factors, the participants had to be well trained and well prepared to maintain their performance during the six runs of task-scanning, which lasted for approximately 1 hour. The first and second methodological demands, together with the third practical issue, greatly limited our ability to employ this kind of design in large-scale recording studies.
The second limitation is that we could only perform the GC-behavior joint analysis on block-level signals, which contain frequent trial events. By itself, the current GC-behavior joint analysis cannot distinguish whether a control signal is related to task-level or trial-level control. The CON is considered to operate at the task level [7], whereas the interaction between the DAN and VAN is discussed at the trial level [6, 17]. It is therefore possible that the causal influences exerted by the CON reflect task-level control signals, while those from the DAN and VAN reflect trial-level control signals. A more sophisticated framework that can resolve control signals at the trial level is still needed in future studies.
It is worth noting that the current study only considered a specific visual spatial-attention task paradigm. Whether this functional hierarchy survives other attention paradigms such as visual feature attention paradigm, auditory attention paradigm, or even other attention-demanding paradigms remains unclear. Therefore, studies using more sophisticated designs, a larger sample size, and more attention paradigms should be carried out to elucidate further the generality of the functional hierarchy of CON, DAN, and VAN in the future.
This study obtained significant findings in its assessment of the functional hierarchy of the CON and frontoparietal attention network in the context of behavior-correlated causal interactions. We found that the CON and the DAN may regulate the VAN activity by top-down signals, whereas the VAN exerted a bottom-up influence on the activity of the other two networks. Based on fMRI data with Granger causality analysis, our findings suggested a hierarchy of behavior-correlated causal influence among CON, DAN, and VAN.
AC, accuracy; AI, anterior insula; aMFG, anterior middle frontal gyrus; BOLD, blood oxygen level-dependent; CON, cingulo-opercular network; CSF, cerebrospinal fluid; dACC, dorsal anterior cingulate cortex; DAN, dorsal attention network; FDR, false discovery rate; FEF, frontal eye field; fMRI, functional magnetic resonance imaging; GC, Granger causality; GCA, Granger causality analysis; GLM, General Linear Modeling; HRF, hemodynamic response function; IPS, intraparietal sulcus; pMFG, posterior middle frontal gyrus; ROI, region of interest; RT, response time; TPJ, temporoparietal junction; VAN, ventral attention network.
W.XT. conceived and designed the experiments; Z.P. and Y.R. analyzed the data; W.XT., Z.P., Y.R., L.Z., L.Y., D.M., L.R., and W.X. wrote the paper.
All participants provided written informed consent beforehand, in accordance with the Declaration of Helsinki, and all research activities were authorized by the Brain Imaging Center at Beijing Normal University.
Thanks to all the peer reviewers and editors for their opinions and suggestions.
The present study was supported by the fund for building world-class universities (disciplines) of the Renmin University of China.
The authors declare no conflict of interest.