Abstract
Objective assessment of the brain’s responsiveness in comatose patients on Extracorporeal Membrane Oxygenation (ECMO) support is essential to clinical care, but current approaches are limited by subjective methodology and inter-rater disagreement. Quantitative electroencephalogram (EEG) algorithms could potentially assist clinicians, improving diagnostic accuracy. We developed a quantitative, stimulus-based algorithm to assess EEG reactivity features in comatose patients on ECMO support. Patients underwent a stimulation protocol of increasing intensity (auditory, peripheral, and nostril stimulation). A total of 129 20-second EEG epochs were collected from 24 patients (age 56.9±15.1, 10 females, 14 males) on ECMO support with a Glasgow Coma Scale < 8. EEG reactivity scores (R-scores) were calculated using aggregated spectral power and permutation entropy for each of five frequency bands (δ, θ, α, β, γ). Parameter estimation techniques were applied to R-scores to identify properties that replicate the decision process of experienced clinicians performing visual analysis. Spectral power changes from audio stimulation were concentrated in the β band, whereas peripheral stimulation elicited an increase in spectral power across multiple bands, and nostril stimulation changed the entropy of the γ band. The findings of this pilot study on R-Score lay a foundation for a future prediction tool with clinical applications.
Keywords: Quantitative EEG, EEG reactivity, Disorder of consciousness, Coma, ECMO, regression analysis
1. Introduction
The use of extracorporeal membrane oxygenation (ECMO) to provide mechanical circulatory support for treatment of patients with refractory cardiopulmonary failure is rapidly increasing1. Patients on ECMO support commonly develop brain injuries that lead to coma and other Disorders of Consciousness (DOC), signifying serious neurologic injury2. Persistence of DOC has become one of the major reasons for clinicians to pursue withdrawal of life-sustaining treatment (WLST)3,4, which in turn makes it the leading antecedent factor in deaths of patients on ECMO support5,6.
Determining a patient’s level of consciousness is of tremendous significance, both practically and ethically, in directing the course of treatment during ECMO support7,8. Although functional MRI and PET (both resting-state and stimulus-based) can contribute to an assessment of the functional integrity of brain networks in patients with DOC, neuroimaging studies are often not feasible in critically ill patients in the ICU, especially patients on ECMO support, as the ECMO circuit is not compatible with MRI. EEG offers a unique benefit when evaluating this patient population by bringing a functional neurological monitoring tool to the bedside.
Electroencephalographic Reactivity (EEG-R) is defined as a change in cerebral EEG activity, such as in amplitude or frequency, due to external stimulation9,10. Prior studies have found that a lack of EEG-R may be associated with increased mortality and unfavorable outcomes in patients with DOC11,12,13. Ref. 14 studied 50 patients who were unconscious after traumatic brain injury, cerebrovascular disease, or anoxia, finding that 92% of patients who showed EEG-R recovered consciousness within five months14. While the absence of EEG-R may be a marker of poor outcome, Ref. 15 studied 28 patients with unresponsive wakefulness syndrome after acute brain injury and showed a reappearance of EEG-R at 6 months despite the absence of EEG-R early in the course of acute brain injury. This led the authors to suggest that there may be remodeling of the organization of electrical brain activity due to plastic adaptations that attempt to restore brain connections and functions after a severe brain injury15.
Despite the potential of EEG-R as a marker of consciousness recovery, the current assessment methods for EEG-R are limited by heterogeneous approaches and lack of standardization10,11. The primary assessment method for EEG-R in current clinical practice is visual analysis of reactivity (VAR); however, this method is subjective and limited by the examiner’s experience and skill14,15. Furthermore, VAR only provides a qualitative answer as to whether the patient exhibits EEG reactivity or not. Another complicating issue is the wide variability in the type and intensity of stimulation provided. An objective method to measure EEG-R (triggered by a pre-defined and graded stimulation) is necessary to provide a more robust assessment.
To develop a reliable method of assessing reactivity that gives a more detailed and precise assessment of level of consciousness, we undertook this pilot study to develop an objective method of assessing patients with DOC by applying quantitative EEG (qEEG) techniques to investigate the decision-making process clinicians use when performing VAR. The application of qEEG to consciousness of patients with DOC has not been sufficiently studied6,11, especially regarding the changes in different frequency bands following different types of stimulation in patients with DOC16. While attempts at developing algorithms to quantify EEG-R have been made6,11, these algorithms remain unused in clinical practice as they do not provide an interpretation of their findings that aligns with the clinical understanding of patient issues. Herein we seek to address this concern with a bottom-up approach that seeks not to quantify the clinical data in isolation, but rather to analyze the decisions made by practicing clinicians using that clinical data. This will both preserve the clinical relevance of the analysis and improve the reliability of the clinical approach. We have two aims:
To characterize the frequencies of EEG that occur in response to three different types of stimuli – audio, peripheral, and nostril.
To develop an objective and graded assessment of EEG-R in comatose patients on ECMO support.
To accomplish these goals, we developed a stimulus-based qEEG algorithm that uses spectral power and permutation entropy to evaluate the level of EEG response in different frequency bands. These responses are then compared against the interpretations of EEG reactivity of experienced clinicians to generate a model for the probability of the presence of EEG-R in a given trial’s EEG. Based on the features selected during the model generation, we have identified features of significance to identifying EEG-R under pre-defined graded stimulation conditions.
2. Methods
2.1. Subjects and study protocol
This study was undertaken with the approval of the Institutional Review Board of Johns Hopkins University. Subjects were enrolled in a prospective observational study of ECMO patients in whom continuous EEG was conducted as part of standard clinical care17. This study was a multidisciplinary endeavor coordinating efforts from the cardiac intensive care, cardiovascular surgical intensive care, medical intensive care, and neurocritical care units, and the neurophysiology department. Adult ECMO patients (age >18 years) with Glasgow Coma Scale score (GCS) of 3–8 were enrolled from November 2017 to April 2020. As part of routine multidisciplinary clinical care, all patients were followed by a neurocritical care physician from time of ECMO cannulation until hospital discharge or death. A baseline neurological examination of each patient was performed on day one of ECMO cannulation. Cessation or reduction of sedative medications (“sedation holidays”) was routinely performed for each patient as deemed safe by the treating physician(s). Sedation holidays were usually undertaken between days three and five of ECMO support, and EEG was placed to allow an EEG assessment while sedation was minimized or absent. GCS was collected at time of routine clinical exams when off sedation. Continuous EEG was performed as a clinical standard of care for patients with DOC and collected in a clinical cEEG database. EEG data were de-identified and transferred for further analysis.
2.2. EEG data collection
Continuous EEG was recorded at a sampling frequency of 200 Hz using a montage of 19 electrodes placed according to the international 10–20 system. Studies were performed using Nihon Kohden digital EEG systems (Nihon Kohden, Tokyo, Japan). The collected EEG signals were de-identified and stored in the European Data Format (EDF). VAR and qEEG analysis were then conducted independently.
2.3. Stimulation trial
The stimulation protocol was conducted by an experienced clinical team of fellows and EEG technicians as part of clinical care. Three types of stimulation were administered to patients: auditory, peripheral, and nostril. A set of stimulation trials was performed in the same order each time (with increasing intensity of stimulation):
Auditory Stimulation (AUD): with increasing loudness, calling of the patient’s name followed by clapping near their ear
Peripheral Stimulation (PER): pinching the finger and toe nailbeds bilaterally
Nostril Stimulation (NOS): inserting a cotton swab deep into the nasal cavity
Each individual stimulation trial was performed at least once and was followed by a rest period to allow for separation of the effects from each individual stimulation. Each subject had at least one set of stimulation trials using all three stimuli, and some subjects had more than one. Those patients that had more than one set of stimulation trials had at least one day in between each set. As EEG features and patterns may change due to the dynamic nature of acute critical illness18,19, each stimulation trial was classified as a separate EEG sample, and samples from the same patient were treated as independent for purposes of analysis.
When stimulation was performed, annotations and timing marks were added to the EEG signal to synchronize the onset of stimulation and recorded EEG. Subjects’ responses were observed and noted by the clinical team.
2.4. Visual analysis of EEG-R
All EEG data were reviewed retrospectively by one neurologist (S.-M.C.) and one neurophysiology specialist (E.K.R.). EEG data were de-identified during VAR, such that clinicians were blinded to which of the patients’ EEG they were examining, nor were they aware of the survival or current state of the patient whose EEG they were examining. Each sample of EEG data was classified, following standard clinical practice of consensus discussion, with “presence of visual reactivity” (R+) or “absence of visual reactivity” (R−) based on previously established criteria23. R+ indicates that a subject demonstrated EEG-R in response to at least one type of stimulation, while R− indicates that a patient did not show a response to any stimulation. To prevent biasing of patient outcomes, determination from visual analysis by these clinicians was not used to guide patient treatment, nor were the clinicians informed of the progression of the patient’s condition.
2.5. Artifact detection and data preprocessing
Data were processed in MATLAB (version 2020a on macOS Big Sur, MathWorks, Natick, MA) using both scripts developed by the authors and scripts included in the publicly available software package EEGLAB20.
A finite impulse response band-pass filter (0.1 to 50 Hz) was used to remove ambient electrical noise. An independent component analysis was performed to identify common artifacts, such as muscle activity, eye movements, or body movements, and data containing any of these artifacts were excluded from analysis. To remove eye movement (electro-ocular, EOG) and muscle and motion (electro-muscular, EMG) artifacts, the EEG data were processed by a second-order blind identification algorithm from the Automatic Artifact Removal (AAR) toolbox in EEGLAB21,22.
An epoch consisting of two consecutive ten-second periods of EEG recording, corresponding to the time before and after a stimulation, was extracted from each trial. Each ten-second period was further divided into five two-second periods (sub-epochs) for processing. Each stimulation period lasted 3–5 seconds and was not included in the epoch. Figure 1 depicts the stimulation protocol and data processing methodology. Following EEG data preprocessing, the spectral power and permutation entropy of the frequency bands [δ (0.5–4 Hz), θ (4–8 Hz), α (8–13 Hz), β (13–30 Hz), γ (30–50 Hz)] were computed, resulting in ten metrics.
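The epoch layout described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB code: the function name `extract_epoch` and the use of sample indices for stimulation onset and offset are assumptions made for the example.

```python
FS = 200          # sampling frequency in Hz, as stated for the recordings
PERIOD_S = 10     # length of the pre- and post-stimulation periods (seconds)
SUB_EPOCH_S = 2   # length of each sub-epoch (seconds)

def extract_epoch(signal, stim_onset, stim_offset, fs=FS):
    """Return (pre, post), each a list of five 2-second sub-epochs.

    signal: samples of one channel; stim_onset and stim_offset are sample
    indices bounding the stimulation, which is left out of the epoch.
    """
    n_period = PERIOD_S * fs
    n_sub = SUB_EPOCH_S * fs
    if stim_onset - n_period < 0 or stim_offset + n_period > len(signal):
        raise ValueError("not enough data around the stimulation")
    pre = signal[stim_onset - n_period:stim_onset]
    post = signal[stim_offset:stim_offset + n_period]

    def split(segment):
        # Cut a 10-second period into consecutive 2-second sub-epochs.
        return [segment[i:i + n_sub] for i in range(0, n_period, n_sub)]

    return split(pre), split(post)
```

With a 4-second stimulation, the ten seconds before onset and the ten seconds after offset each yield five sub-epochs of 400 samples.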
2.6. Quantitative EEG measures
2.6.1. Spectral measures
Following artifact filtering and epoch separation, discrete Fourier transform was applied to the data over a rectangular window. Using the transformed signal, the spectral power for each EEG band in each time segment was computed as the average power in each frequency range over each epoch.
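As an illustration of the band-power computation, the sketch below applies a plain discrete Fourier transform (rectangular window, no tapering) to one sub-epoch and averages the power of the bins in a band. The power normalization and the half-open band edges `[low, high)` are assumptions of this example; the text does not specify them.

```python
import cmath
import math

# Band limits in Hz, as listed in the preprocessing section.
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 50)}

def band_power(samples, fs, band):
    """Average DFT power of `samples` over the bins inside `band`."""
    n = len(samples)
    low, high = band
    powers = []
    for k in range(n // 2 + 1):
        freq = k * fs / n
        if low <= freq < high:  # half-open interval: an assumption
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(samples))
            powers.append(abs(coeff) ** 2 / n ** 2)  # assumed normalization
    return sum(powers) / len(powers) if powers else 0.0
```

For a two-second sub-epoch at 200 Hz the bin spacing is 0.5 Hz, so a pure 10 Hz sinusoid contributes power only to the alpha band.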
2.6.2. Complexity measures
Permutation entropy (PE) is used to detect dynamic EEG changes in patients with DOC23. PE is a quantitative measure of the complexity of a dynamic system24, chosen here for its relatively high resistance to noise in comparison to other signal aggregation measures. The probability of the appearance of each symbol in the EEG is estimated according to Shannon’s entropy formula25:
$$PE = -\sum_{i} p_i \log p_i \qquad \text{(Eq. 1)}$$
where pi is the probability of a symbol. In this study, an embedding dimension of 4 and a value of τ = [12, 6, 4, 2, 1] for delta, theta, alpha, beta and gamma, respectively, were used for the permutation entropy algorithm. In order to reduce aliasing effects when τ > 1, a low pass filter with cutoff frequency 50/τ Hz was applied. As with spectral power, PE was computed for each EEG channel in each frequency band in each time segment.
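A minimal sketch of permutation entropy with a configurable embedding dimension and lag follows. It is a generic textbook implementation rather than the authors' code; the optional normalization by log(order!) is an assumption and is off by default, and the low-pass anti-aliasing step described above is not included.

```python
import math
from collections import Counter

def permutation_entropy(samples, order=4, lag=1, normalize=False):
    """Shannon entropy of the ordinal patterns found in `samples`.

    order: embedding dimension (4 in the study)
    lag:   delay tau between the points of one pattern
    """
    n_vectors = len(samples) - (order - 1) * lag
    if n_vectors <= 0:
        raise ValueError("signal too short for this order and lag")
    counts = Counter()
    for start in range(n_vectors):
        window = [samples[start + j * lag] for j in range(order)]
        # The symbol is the index ordering that sorts the window.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        counts[pattern] += 1
    entropy = -sum((c / n_vectors) * math.log(c / n_vectors)
                   for c in counts.values())
    if normalize:  # optional scaling to [0, 1]; not stated in the text
        entropy /= math.log(math.factorial(order))
    return entropy
```

A monotone ramp produces a single pattern and therefore zero entropy, while irregular signals approach the maximum of log(4!) for order 4.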
2.7. Computation of reactivity score
To reduce both the potential effect of noise from individual channels and the dimensionality of the data, each metric (the spectral power or permutation entropy in a given frequency band) was aggregated across all 19 EEG channels. The default aggregation method was the mean of the channels; however, other methods were also applied (see Algorithm Permutations). The computation of reactivity score was then based on a standard score (SS):
$$SS_{metric,post} = \frac{metric_{post} - mean_{pre}}{std_{pre}} \qquad \text{(Eq. 2)}$$
where SSmetric,post is the standard score of metric in a particular post-stimulation sub-epoch post; metricpost is the aggregate value of metric (either spectral power or permutation entropy) across all EEG channels at time post; and meanpre and stdpre are the mean and standard deviation of the aggregate value of metric over the pre-stimulation period. This standardizes the time series of each metric in the post-stimulation period against the pre-stimulation period. Thus, a positive SSmetric,post in the post-stimulation period indicates an increase from the pre-stimulation level while a negative SSmetric,post indicates a decrease. The value of a metric was considered to be indicative of the presence of EEG-R if it had an absolute normalized value higher than the threshold of 1.96, chosen as the approximate value of the 97.5 percentile point of the standard normal distribution. Alternative threshold values for this criterion were also tested (see Algorithm Permutations).
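The standardization step can be sketched as follows, assuming the default mean aggregation across channels. The function names are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which estimator was used.

```python
import statistics

Z_THRESHOLD = 1.96  # default criterion from the text

def aggregate_channels(channel_values):
    """Default aggregation: mean of one metric across the EEG channels."""
    return statistics.mean(channel_values)

def standard_scores(pre_values, post_values):
    """Standardize post-stimulation values against the pre-stimulation period.

    pre_values / post_values: channel-aggregated metric per sub-epoch.
    """
    mean_pre = statistics.mean(pre_values)
    std_pre = statistics.stdev(pre_values)  # sample SD: an assumption
    if std_pre == 0:
        raise ValueError("pre-stimulation period has zero variance")
    return [(value - mean_pre) / std_pre for value in post_values]

def meets_criterion(score, threshold=Z_THRESHOLD):
    """True when a standard score deviates beyond the threshold either way."""
    return abs(score) > threshold
```

Because the criterion uses the absolute value, both increases and decreases relative to the pre-stimulation level count as candidate reactivity.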
In order to aggregate the values of SS across the five time segments of the post-stimulation period, the mean value of the difference between the criterion and any metric in any time segment that met the criterion was computed (see Algorithm Permutations). The resulting Reactivity Score (R-score) then becomes:
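$$R_{metric} = \operatorname{mean}_{post}\left( \left| SS_{metric,post} \right| - \mathrm{threshold} \right) \qquad \text{(Eq. 3)}$$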
where SSmetric,post is any standard score in the time series of a metric that meets the criterion for EEG-R; Rmetric is the R-score for a specific metric (i.e. sub-R-score). This results in ten sub-R-scores - one for each of the metrics calculated for the sub-epoch. The R-score for an epoch is then the sum of all of the sub-R-scores.
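One possible reading of the R-score aggregation is sketched below. Taking the excess of the absolute standard score over the threshold, and assigning a sub-R-score of zero when no sub-epoch meets the criterion, are assumptions of this example rather than details confirmed by the text.

```python
def sub_r_score(ss_series, threshold=1.96):
    """Mean excess over the threshold among sub-epochs meeting the criterion."""
    excess = [abs(s) - threshold for s in ss_series if abs(s) > threshold]
    return sum(excess) / len(excess) if excess else 0.0  # zero if none: assumed

def r_score(ss_by_metric, threshold=1.96):
    """Sum of the sub-R-scores over all metrics of an epoch.

    ss_by_metric: mapping from metric name (for example "SP beta") to the
    list of standard scores of its post-stimulation sub-epochs.
    """
    return sum(sub_r_score(series, threshold)
               for series in ss_by_metric.values())
```

In this reading, an epoch with ten metrics contributes ten sub-R-scores, and metrics that never cross the threshold add nothing to the total.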
2.8. Algorithm permutations
A subset of parameters of the algorithm used to calculate the R-scores from the EEG were identified for analysis by permutation. Parameter permutations were included based solely on mathematical validity (rather than underlying physiology) to limit bias regarding potential neurological mechanisms of effectiveness:
Baseline high end limit of frequencies
Aggregation across multiple EEG channels
Threshold of standard deviations from baseline
Aggregation of peak values post-stimulation
Aggregation of multiple SS readings
All combinations of these permutations of the algorithm were used to calculate R-scores that were fed into the modeling stage. The goal of this permutational approach was to blindly assess which combination provided the best approximation of the conclusions drawn by the expert neurologists, and thus reverse engineer through algorithmic means the methods being applied by clinicians.
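Enumerating every combination of algorithm settings is a Cartesian product, which can be sketched as below. The option lists are illustrative placeholders: the channel aggregations, thresholds, and peak options echo names used elsewhere in this paper, but the frequency-limit and reading-aggregation options are hypothetical, and this grid is not claimed to reproduce the number of models the study generated.

```python
from itertools import product

# Illustrative option lists only; not the study's actual settings.
OPTIONS = {
    "frequency_limit": ["full_range", "reduced_range"],        # hypothetical
    "channel_aggregation": ["mean", "median", "max", "upper_quartile",
                            "std", "iqr", "rms"],
    "ss_threshold": [1.64, 1.96, 2.58],
    "peak_aggregation": ["mean_first", "mean_all", "max_first", "max_all"],
    "reading_aggregation": ["mean", "max"],                    # hypothetical
}

def enumerate_permutations(options):
    """Return one settings dictionary per combination of option values."""
    names = list(options)
    return [dict(zip(names, combo))
            for combo in product(*(options[name] for name in names))]

permutations = enumerate_permutations(OPTIONS)
```

Each resulting dictionary would parameterize one run of the R-score algorithm, whose scores are then passed to the modeling stage.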
2.9. Baseline comparison
An EEG epoch during which no stimulation was performed (baseline) was taken from each patient and the reactivity scores for each of these epochs were computed. The baseline scores were divided into two cohorts: those judged by the expert clinicians to contain signs of reactivity (R+) and those that were not (R−). The groups were then compared by a Cramer–von Mises two-sample test at α = 0.01 under the null hypothesis that the baseline EEG-R scores for the groups were the same. Verifying that the baseline values are of equivalent underlying distribution is key to validating the algorithm.
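For readers who want to reproduce the baseline comparison, the two-sample Cramer–von Mises statistic can be computed from the ranks of the pooled observations. The sketch below uses the standard rank formula and assumes no tied values; the critical value must be supplied by the caller, for example from published tables.

```python
def cramer_von_mises_2samp(x, y):
    """Two-sample Cramer-von Mises statistic T (assumes no tied values)."""
    n, m = len(x), len(y)
    pooled = sorted([(value, 0) for value in x] + [(value, 1) for value in y])
    ranks_x, ranks_y = [], []
    for rank, (_, group) in enumerate(pooled, start=1):
        (ranks_x if group == 0 else ranks_y).append(rank)
    # U compares each sample's pooled ranks with its within-sample ranks.
    u = n * sum((r - i) ** 2 for i, r in enumerate(ranks_x, start=1))
    u += m * sum((s - j) ** 2 for j, s in enumerate(ranks_y, start=1))
    return u / (n * m * (n + m)) - (4 * m * n - 1) / (6 * (m + n))

def null_not_rejected(x, y, critical_value):
    """True when the statistic stays below the supplied critical value."""
    return cramer_von_mises_2samp(x, y) < critical_value
```

Well-separated samples give a larger statistic than interleaved ones, so a small value is consistent with the two baseline cohorts sharing a distribution.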
2.10. Statistical modeling
Statistical analyses were performed using MATLAB (version 2021a on 64-bit Windows 10, MathWorks, Natick, MA). In order to avoid bias, different investigators performed the reactivity score algorithm permutations (Y.Z.) and the statistical modeling (A.W.). Details of permutations used were anonymized prior to being transmitted between investigators and thus the statistical analysis was blinded to which permutations were used in each set of reactivity scores.
In order to convert from the metric space defined by the algorithm for scoring the presence of reactivity in the filtered EEG into a probability space that could be used for prediction of patient outcome in a graded and intuitive manner, each of the five sub-bands of spectral power and permutation entropy, as well as the total values thereof, were analyzed in a paradigm of univariate and bivariate stepwise logistic regression models. Models were generated for four binary outcome classes: one for each class of stimulation (AUD, PER, NOS), and one in which all stimulation classes were merged (MER). Models were provided a set of twelve potential predictors (five sub-band R-scores for spectral power, one total R-score for spectral power, five sub-band R-scores for permutation entropy, one total R-score for permutation entropy) and one response variable (the VAR decision). Models were considered successful if both the overall model and all coefficients fulfilled a P < 0.10 threshold. Generated models were then pruned using a χ2 goodness of fit test against a baseline (predictor-less) model; deviance value comparison for selection between nested or branched models; and magnitude and significance of penultimate step coefficients to limit overfitting. In order to account for the restricted sample size and correct for potential population bias in our use of a sample of convenience, a King-Zeng Prior Correction (KZPC) factor26 was implemented during the model generation process, selecting τ = 0.48 based on the clinical population occurrence rate of patient EEG-R reported by two recent multi-center studies11,12. This logistic regression model produces the target graded probability value of the clinician judgement for the presence of EEG-R for each epoch. In order to provide a baseline measure of differentiation to illustrate the potential capabilities of the approach, the output of the models was dichotomized with a 50% threshold.
These dichotomized example models were ranked by sensitivity and specificity against the outcome criterion and only those models performing at greater than fifty percent on both were kept.
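The scoring path from R-scores to a kept or discarded example model can be sketched as follows. The intercept adjustment is the standard King-Zeng prior correction formula; whether the authors applied it in exactly this form is an assumption, and the fitted values and labels in the usage lines are hypothetical.

```python
import math

def prior_corrected_intercept(intercept, tau, sample_rate):
    """King-Zeng prior correction of a fitted logistic intercept.

    tau: assumed population fraction of R+ cases (0.48 in the text)
    sample_rate: fraction of R+ cases in the fitting sample
    """
    odds_ratio = ((1 - tau) / tau) * (sample_rate / (1 - sample_rate))
    return intercept - math.log(odds_ratio)

def predict_probability(intercept, coefficients, r_scores):
    """Logistic probability of R+ for one epoch's predictor values."""
    z = intercept + sum(b * x for b, x in zip(coefficients, r_scores))
    return 1.0 / (1.0 + math.exp(-z))

def sensitivity_specificity(probabilities, labels, cutoff=0.5):
    """Dichotomize at `cutoff` and score against visual-analysis labels."""
    predicted = [p >= cutoff for p in probabilities]
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both R+ and R- labels are required")
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    tn = sum(1 for p, y in zip(predicted, labels) if not p and not y)
    return tp / positives, tn / negatives

# Hypothetical fitted values and labels, for illustration only.
b0 = prior_corrected_intercept(-2.0, tau=0.48, sample_rate=0.20)
probs = [predict_probability(b0, [0.5], [x]) for x in (0.0, 1.0, 6.0, 8.0)]
sens, spec = sensitivity_specificity(probs, [False, False, True, True])
keep = sens > 0.5 and spec > 0.5  # retention rule described in the text
```

When R+ cases are rarer in the sample than in the assumed population, the correction raises the intercept, which shifts every predicted probability upward.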
3. Results
In this pilot study, we analyzed twenty-four (24) subjects (age 56.9±15.1, 10 females, 14 males) as the final study cohort. Fifteen (15) additional subjects were initially enrolled but were removed prior to analysis due to insufficient compliance with the study protocol (material interruptions or incomplete trials) or predominance of artifacts in a large percentage of epochs. Table 1 summarizes the patient characteristics. Seventeen (71%) of these patients had a GCS of three. All subjects had a GCS less than eight at time of study enrollment; however, certain subjects later showed improvement in GCS, and the GCS values listed in Table 1 are values at conclusion of the study.
Table 1.
Characteristic | No. (%) | |
---|---|---|
Age, mean (SD) | 57 (15.1) | |
Sex | ||
Female | 10 (41.7%) | |
Male | 14 (58.3%) | |
Glasgow Coma Scale (GCS) | ||
Three | 17 (70.8%) | |
Four | 1 (4.2%) | |
Five | 2 (8.3%) | |
Six | 2 (8.3%) | |
Eleven | 2 (8.3%) | |
ECMO Cannulation Method | ||
VA-ECMO | 21 (87.5%) | |
VV-ECMO | 3 (12.5%) | |
Number of Trials Received by Patient | ||
One | 19 | |
Two | 4 | |
Four | 1 | |
Visual Analysis of Reactivity Results | ||
R+ | 5 (21%) | |
R− | 19 (79%) | |
Clinical Outcome | ||
Survived | 3 (12.5%) | |
Deceased | 21 (87.5%) | |
Withdrawal of Life-Sustaining Treatment | 18 (75%) |
Nineteen (79%) of the patients whose EEG were visually examined by the experienced clinicians were rated as lacking evidence of reactivity (R−). A total of 153 epochs of EEG samples were analyzed, 24 of which were during periods of no stimulation for comparison of subject baselines. The remaining 129 were epochs following stimulation: 20 epochs of auditory stimulation, 87 of peripheral stimulation (approx. four per patient), and 22 of nostril stimulation. We removed certain epochs from each category due to artifacts based on our pre-defined approach.
As part of the experimental approach of this pilot study, we investigated whether (1) certain frequency bands of EEG and (2) certain algorithmic methods of data aggregation can more accurately describe the presence of EEG-R post-stimulation. To the first question, our approach to stepwise logistic regression model generation found a strong response: out of the thirty-three example statistical models we identified across all stimulation categories, nearly half (16 of 33) used the total spectral power for their estimation of the presence of reactivity. The next most common metrics identified by our algorithm were the spectral power in the beta and gamma bands, at six (18%) and seven (21%) models, respectively. The most commonly identified permutation entropy metrics were the total and the alpha band at three models each (9%); notably, for identification of reactivity in the event of nostril stimulation, permutation entropy in the gamma band was the only predictor identified other than total spectral power.
3.1. Baseline comparison
The cohort sizes for the Cramer-von Mises two-sample test were NBaseline,R+=5 and NBaseline,R−=21, with a computed test statistic of 0.1848. Given the critical value for these cohort sizes at α = 0.01 is 0.7322, we thus fail to reject the null hypothesis that the baseline values are similarly distributed. This indicates that the algorithm is identifying qualities particular to the post-stimulation period rather than qualities intrinsic to the EEG of the reactive and unreactive cohorts.
3.2. Stimulation Prediction Outcomes
Of the 20, 87, 22, and 129 epochs analyzed for AUD, PER, NOS, and MER, 13 (65%), 74 (85%), 16 (73%), and 103 (80%) were from patients rated R− on visual analysis of the EEG.
Due to the large number of permutations tested on the algorithm used to generate the reactivity, a similarly large number of statistical models needed to be generated to cover all of the scores. Those example models that met the earlier stated threshold for p-value and pruning criteria were then ranked, separately and equally weighted, on sensitivity (percent of true positives correctly predicted) and specificity (percent of true negatives correctly predicted); positive and negative in this experiment correspond to whether the clinicians identified the patient as R+ or R−. As the outcome in this case was binary, both sensitivity and specificity needed to be higher than 50% to outperform chance. Specifically, 8400 regression models were generated for each stimulation type; the vast majority of these did not reach statistical significance or were otherwise pruned. Out of the total 8400 potential models generated for each of AUD, PER, NOS, and MER stimulation, only 65 (0.77%), 272 (3.2%), 55 (0.65%), and 994 (11.8%), respectively, were judged as acceptable by the inclusion criteria.
For AUD example models (Table 2.A), nine achieved a sensitivity higher than 50%, and six of these achieved a specificity of 100%; all but one of them identified the spectral power in the beta band as the sole predictor, while the last identified total spectral power.
Table 2.
Group | ID | Sens | Spec | β0 | Predictor 1 | β1 | Predictor 2 | β2 | Predictor 3 | β3 |
---|---|---|---|---|---|---|---|---|---|---|
A | A17N141 | 0.75 | 1 | −5.8048 | SP β | 0.50557 | | | | |
A | A17N241 | 0.75 | 1 | −5.53 | SP β | 0.50957 | | | | |
A | A17N341 | 0.75 | 1 | −5.9588 | SP β | 0.50481 | | | | |
A | A25N115 | 0.75 | 1 | −7.3088 | Σ SP | 0.4283 | | | | |
A | A25N233⇂ | 0.75 | 1 | −4.8534 | SP β | 0.94765 | | | | |
B.i. | P11P233 | 1 | 0.905 | −4.3646 | Σ SP | 0.87268 | | | | |
B.i. | P16P123 | 1 | 0.905 | −8.9812 | Σ SP | 1.7075 | | | | |
B.i. | P16P223 | 1 | 0.905 | −7.2785 | Σ SP | 1.7332 | | | | |
B.i. | P16P224 | 1 | 0.905 | −5.2171 | Σ SP | 0.8784 | | | | |
B.i. | P16P323 | 1 | 0.952 | −11.2081 | Σ SP | 1.93491 | | | | |
B.i. | P24P315 | 1 | 0.905 | −4.3227 | PE β | 2.512 | | | | |
B.ii. | P13N314 | 0.8 | 1 | −9.9757 | Σ SP | 1.1336 | | | | |
B.ii. | P13N344 | 0.8 | 1 | −10.0246 | Σ SP | 0.687891 | | | | |
B.ii. | P16N113 | 0.8 | 1 | −9.6192 | Σ SP | 1.4901 | | | | |
B.ii. | P16N143 | 0.8 | 1 | −8.767 | Σ SP | 0.89944 | | | | |
B.ii. | P16N243 | 0.8 | 1 | −8.1398 | Σ SP | 0.99405 | | | | |
B.ii. | P16N313 | 0.8 | 1 | −10.6728 | Σ SP | 1.51366 | | | | |
B.ii. | P26N233 | 0.8 | 1 | −4.9966 | PE α | 5.1785 | | | | |
C | N11N213 | 0.75 | 1 | −3.5383 | PE γ | 2.949 | | | | |
C | N11N215 | 0.75 | 1 | −3.3905 | PE γ | 3.126 | | | | |
C | N14N221 | 0.75 | 1 | −5.4054 | Σ SP | 0.688 | | | | |
D.i. | M27P323 | 0.77 | 0.93 | −2.9508 | Σ SP | 0.21612 | Σ PE | 0.40031 | | |
D.i. | M21P323 | 0.77 | 0.91 | −2.8731 | Σ SP | 0.23945 | Σ PE | 0.3944 | | |
D.i. | M24P115 | 0.77 | 0.93 | −1.212 | SP γ | 0.80736 | | | | |
D.i. | M24P125 | 0.77 | 0.93 | −1.3519 | SP γ | 0.84225 | | | | |
D.i. | M24P325⇂ | 0.77 | 0.91 | −1.3851 | SP γ | 0.71274 | | | | |
D.ii. | M11P245 | 0.85 | 0.86 | −3.235 | SP β | 0.51628 | PE α | 3.0789 | PE α^2 | −0.5699 |
D.ii. | M17P324 | 0.85 | 0.86 | −3.0437 | Σ SP | 0.091721 | Σ PE | 0.42878 | | |
D.ii. | M27P132 | 0.85 | 0.84 | −1.684 | SP γ | 0.13735 | PE δ | 1.3705 | PE θ | 1.1538 |
D.ii. | M25P124 | 0.85 | 0.84 | −1.1798 | SP θ | −1.5418 | SP γ | 0.38649 | PE δ | 1.5527 |
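As a worked reading of one row of Table 2, consider the single-predictor model A17N141. Assuming its coefficients apply directly on the logit scale to the sub-R-score x for spectral power in the β band, the modeled probability of R+ and the score at which it crosses the 50% dichotomization threshold are:

$$p(x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}, \qquad p(x) \ge 0.5 \iff x \ge -\frac{\beta_0}{\beta_1} = \frac{5.8048}{0.50557} \approx 11.48$$

Under that assumption, an auditory epoch would be classed as R+ by this example model only when its β-band spectral power sub-R-score is roughly 11.5 or higher.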
For PER example models (Table 2.B.i), six achieved a sensitivity of 100% with a specificity higher than 90%. All but one of these models identified the total value of spectral power as its predictor, while the last identified the permutation entropy of the beta band. Conversely, seven models (Table 2.B.ii) achieved 100% specificity with a sensitivity of 80%; all but one of which identified the total value of spectral power, while the last identified the permutation entropy in the alpha band.
For NOS example models (Table 2.C), four achieved a sensitivity higher than 50%, with three of these achieving a specificity of 100%. Two of them identified the permutation entropy in the gamma band as the sole predictor, while the other identified total spectral power. While the gamma band has the most potential for contamination from EMG artifacts, we are confident that our noise and artifact reduction protocol removed unrelated activity from the gamma band based on the existing evidence of these techniques in other applications21,22. The R-scores for reactive and non-reactive patients differed substantially in the gamma band, with many of the latter showing minimal or no gamma band activity.
Finally, for MER example models (Table 2.D.i), seven models performed well on both measures, achieving higher than 75% sensitivity and higher than 90% specificity. Two of these models used the combination of the total spectral power and total permutation entropy, with a stronger weight placed on the latter, while the other five used solely spectral power in the gamma band. Alternatively, four models (Table 2.D.ii) achieved higher than 84% on both sensitivity and specificity; however, each model identified different predictors.
3.3. Successful Permutations
When evaluating the breakdown of permutations by examining their prevalence in distinct categories of stimulation example models (Figure 2), one can clearly see that the majority of these used the full range of frequencies, as is to be expected given the prevalence of beta and gamma frequency ranges and total sums in the identified predictors. There is a similar pattern for the AUD and NOS categories on the standard deviation threshold permutation; these permutations involve changing the initial threshold (1.96) at which a post-stimulation baseline deviation is recognized as displaying reactivity to be more (1.64) or less (2.58) permissive. Both AUD and NOS evaluation showed increased accuracy when assigned the latter permutation, indicating that their epochs have a lower signal-to-noise ratio in comparison to the PER epochs, which would lead to more false positive deviations. Conversely, the PER and merged categories included more permissive models with lower thresholds.
Regarding aggregation of EEG channels, models for AUD and PER stimulation found measures of dispersion (StdDev, IQR, RMS) to be more useful than measures of central tendency (Mean, Median) or position (Max, Upper Quartile) in determining presence of reactivity; further, all such models found a positive association between the predictor and the response. As broader dispersion indicates greater differences between input factors, this suggests that these two types of stimulation may have a differential effect on different brain regions. Conversely, example models for NOS used measures of central tendency, indicating an overall strong response from all regions. The merged category showed, as intuition would suggest, a balance of both groups. None of the example models found the maximum value of the EEG channels to be useful in predicting decision of reactivity, providing support for the need to examine multiple electrode channels.
Regarding aggregation when examining post-stimulation peaks, which used a two-factor matrix for its permutations (mean vs max, first peak vs all peaks), there was a more mixed response. The example models for the merged class identified the mean of the first peak as the predominantly useful method in making prediction, whereas the AUD class did not use this in any of its example models, and the PER class was split between the mean of the first peak and the mean of all peaks. The AUD and PER classes found similar levels of utility in the application of extrema, to either set of peaks.
Finally, for aggregation of multiple successful readings, the PER category predominantly applies the mean of the readings and thus provides a more stable reading, whereas the AUD and MER classes found the measures of extrema to be more accurate. This is, however, expected given that the PER category had more trials.
4. Discussion
In this pilot study, our application of parameter estimation methods to EEG in reference to clinical decisions found that, for comatose ECMO patients identified as displaying EEG-R, audio stimulation induced spectral power changes concentrated in the β band, whereas peripheral stimulation was found to elicit an increase in spectral power across multiple bands, and nostril stimulation induced changes in the permutation entropy of the γ band. Focusing on these changes allowed for generation of proof-of-concept logistic regression models that could potentially be used to provide a graded probability of the presence of EEG-R.
We determined these using a novel methodology that quantifies changes in aggregate EEG metrics following a graded stimulation protocol and compares the resulting scores against a reference method of visual analysis using a broad array of parameter estimation techniques. The proposed stimulus-based qEEG-R score and accompanying algorithms were able to provide a probabilistically graded measure of EEG-R in comatose patients on ECMO support. Through systematic analysis of the concordance of these responses, we also determined that different types of stimulation induce different changes in the EEG, supporting earlier findings that different algorithmic approaches can function better on different types of stimulation18,19.
The total spectral power was found to be a strong predictor, which is intuitive in that it is partially definitional to EEG-R; prior studies on EEG-R have similarly found strongly discriminatory responses from the beta27 and gamma28 ranges post-stimulation. Our approach to algorithmic permutation, however, was able to ascertain additional details that have not been previously investigated. These include the increased impact of signal dispersion on AUD and PER stimulation and the difference in noise tolerance between the different stimulation classes.
Some studies suggest8,11 that the presence of EEG-R is associated with better outcomes; however, the process by which to assess and measure EEG-R lacks consensus. It is our goal that the results of this pilot study will lead to the development of a reliable and efficient assessment of EEG reactivity in individual patients. With proper validation, our methodology may enhance EEG reactivity assessment in areas where experienced clinicians may not be available to provide a traditional visual assessment. As part of this process, it is necessary that we identify an appropriate translational paradigm in which this algorithm could be successfully applied.
It is important to highlight that further research with long-term neurological outcome data is necessary to evaluate this qEEG method in clinical practice, as the high rate of WLST limits accurate interpretation of the data in any neurological prognostication study28. As this protocol was implemented in a critical care environment with a high-risk patient population, the ability to perform a large number of precise trials on patients was limited by the need to avoid disrupting patient care. Reflecting the current state of practice, we used visual analysis of reactivity by experienced clinicians as the reference; although performed routinely, this approach is acknowledged in the field to have limited inter-rater reliability16,28. To preserve the clinical potential of this approach, the stimulation protocol was also performed in accordance with existing clinical standards, rather than with the higher standard of precision more typical of laboratory research. Nonetheless, even with this more open approach, and without optimization of predictive levels, the example models predicted the expert interpretation of reactivity at a higher rate than prior research has found for inter-expert agreement29; that is to say, the algorithm aligned more closely with the clinicians’ interpretations than the interpretations of different clinicians typically align with one another. This is important because it shows the level of accuracy that can be achieved with the limited quantity of data and variables available, given that the goal of this algorithm was to identify the parameters used by clinicians.
Separate from the design concerns, non-random stimulus artifacts, such as EMG, may have contaminated the recordings and mimicked brain reactivity. Such contamination would especially affect the results when changes in higher-frequency components are taken as evidence of reactivity, as was the case in those models for which our algorithm identified the gamma band for prediction. We are confident, however, that our noise reduction and artifact separation techniques eliminated the majority of the potential for confusion in this domain.
To the best of our knowledge, this pilot study is the first focused on the ECMO patient population to apply qEEG to measure EEG-R in comatose patients. The methods used in this study will next need to be validated and replicated in other healthy and comatose patient populations; the parameters that showed greater effectiveness can provide further direction on how to refine this algorithm to better identify reactivity for specific classes of stimulation and neurological injury. The parameters identified herein could also be applied to filter EEG prior to display to clinicians in order to improve the efficiency of VAR. Further, while prospective studies of EEG-R and its effects on different measures are typically conducted on a population basis, the method proposed herein could be applied not only to an individual patient, but also in real time as stimulation is provided. By basing our qEEG examination on parameters of EEG-R that align with the techniques currently used by physicians, we aim to position this algorithm as a complement to the clinician that enhances clinical accuracy, reliability, and efficiency.
5. Conclusion
The qEEG algorithm and the R-scores generated from it identified EEG features that reproduce a majority of the conclusions reached by experienced clinicians, even with minimal optimization effort. These findings support our contention that a quantitative method for grading EEG-R in real time at the individual patient level is technically feasible in an ECMO patient population experiencing long-term DOC.
Acknowledgements
This research was supported in part by the grant R01HL071568-15 (NVT, RG).
Contributor Information
AUTUMN WILLIAMS, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA.
YINUO ZENG, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA.
ZIWEI LI, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA.
NITISH THAKOR, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA.
ROMERGRYKO G. GEOCADIN, Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
JAY BRONDER, Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
NIRMA CARBALLIDO MARTINEZ, Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
EVA K. RITZL, Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
SUNG-MIN CHO, Department of Neurology, Johns Hopkins University School of Medicine, 600 N. Wolfe Street, Phipps 455, Baltimore, MD, USA.
References
- 1. McCarthy FH, McDermott KM, Kini V, Gutsche JT, Wald JW, Xie D, Szeto WY, Bermudez CA, Atluri P, Acker MA, et al., “Trends in US extracorporeal membrane oxygenation use and outcomes: 2002–2012”, in Seminars in thoracic and cardiovascular surgery, Vol. 27, 2 (Elsevier, 2015), pp. 81–88.
- 2. Thiagarajan RR, Barbaro RP, Rycus PT, Mcmullan DM, Conrad SA, Fortenberry JD, Paden ML, et al., “Extracorporeal life support organization registry international report 2016”, ASAIO Journal 63, 60–67 (2017).
- 3. Carlson J, Enriquez C, Whitman G, Choi D, Geocadin R, and Cho S-M, Predictors of withdrawal of life sustaining treatments in ECMO patients (4792), 2020.
- 4. Turgeon AF, Lauzier F, Simard J-F, Scales DC, Burns KE, Moore L, Zygun DA, Bernard F, Meade MO, Dung TC, et al., “Mortality associated with withdrawal of life-sustaining therapy for patients with severe traumatic brain injury: a Canadian multicentre cohort study”, CMAJ 183, 1581–1588 (2011).
- 5. Luyt C-E, Bréchot N, Demondion P, Jovanovic T, Hékimian G, Lebreton, Nieszkowska A, Schmidt, Trouillet J-L, Leprince P, et al., “Brain injury during venovenous extracorporeal membrane oxygenation”, Intensive care medicine 42, 897–907 (2016).
- 6. Cho S-M, Choi CW, Whitman G, Suarez JI, Martinez NC, Geocadin RG, and Ritzl EK, “Neurophysiological findings and brain injury pattern in patients on ECMO”, Clinical EEG and neuroscience, 1550059419892757 (2019).
- 7. Williams SB and Dahnke MD, “Clarification and mitigation of ethical problems surrounding withdrawal of extracorporeal membrane oxygenation”, Critical care nurse 36, 56–65 (2016).
- 8. Amorim R. L. O. d., Nagumo MM, Paiva WS, Andrade A. F. d., and Teixeira MJ, “Current clinical approach to patients with disorders of consciousness”, Revista da Associação Médica Brasileira 62, 377–384 (2016).
- 9. Hirsch L, LaRoche S, Gaspard N, Gerard E, Svoronos A, Herman S, Mani R, Arif H, Jette N, Minazad Y, et al., “American clinical neurophysiology society’s standardized critical care EEG terminology: 2012 version”, Journal of clinical neurophysiology 30, 1–27 (2013).
- 10. Kondziella D, Bender A, Diserens K, van Erp W, Estraneo A, Formisano R, Laureys S, Naccache L, Ozturk S, Rohaut B, et al., “European academy of neurology guideline on the diagnosis of coma and other disorders of consciousness”, European journal of neurology 27, 741–756 (2020).
- 11. Benghanem S, Paul M, Charpentier J, Rouhani S, Salem OBH, Guillemet L, Legriel S, Bougouin W, Pène F, Chiche JD, et al., “Value of EEG reactivity for prediction of neurologic outcome after cardiac arrest: insights from the Parisian registry”, Resuscitation 142, 168–174 (2019).
- 12. Azabou E, Navarro V, Kubis N, Gavaret M, Heming N, Cariou A, Annane D, Lofaso F, Naccache L, and Sharshar T, “Value and mechanisms of EEG reactivity in the prognosis of patients with impaired consciousness: a systematic review”, Critical Care 22, 1–15 (2018).
- 13. Admiraal MM, van Rootselaar A-F, and Horn J, “Electroencephalographic reactivity testing in unconscious patients: a systematic review of methods and definitions”, European journal of neurology 24, 245–254 (2017).
- 14. Logi F, Pasqualetti P, and Tomaiuolo F, “Predict recovery of consciousness in post-acute severe brain injury: the role of EEG reactivity”, Brain injury 25, 972–979 (2011).
- 15. Bagnato S, Boccagni C, Prestandrea C, Fingelkurts AA, Fingelkurts AA, and Galardi G, “Changes in standard electroencephalograms parallel consciousness improvements in patients with unresponsive wakefulness syndrome”, Archives of Physical Medicine and Rehabilitation 98, 665–672 (2017).
- 16. Gerber PA, Chapman KE, Chung SS, Drees C, Maganti RK, Ng Y.-t., Treiman DM, Little AS, and Kerrigan JF, “Interobserver agreement in the interpretation of EEG patterns in critically ill adults”, Journal of Clinical Neurophysiology 25, 241–249 (2008).
- 17. Cho S-M, Ziai W, Mayasi Y, Gusdon AM, Creed J, Sharrock M, Stephens RS, Choi CW, Ritzl EK, Suarez J, et al., “Noninvasive neurological monitoring in extracorporeal membrane oxygenation”, ASAIO Journal 66, 388–393 (2020).
- 18. Wiley SL, Razavi B, Krishnamohan P, Mlynash M, Eyngorn I, Meador KJ, and Hirsch KG, “Quantitative EEG metrics differ between outcome groups and change over the first 72 h in comatose cardiac arrest patients”, Neurocritical care 28, 51–59 (2018).
- 19. Spalletti M, Carrai R, Scarpino M, Cossu C, Ammannati A, Ciapetti M, Buoninsegni LT, Peris A, Valente S, Grippo A, et al., “Single electroencephalographic patterns as specific and time-dependent indicators of good and poor outcome after cardiac arrest”, Clinical Neurophysiology 127, 2610–2617 (2016).
- 20. Delorme A and Makeig S, “EEGLab: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis”, Journal of neuroscience methods 134, 9–21 (2004).
- 21. Ferdousy R, Choudhory AI, Islam MS, Rab MA, and Chowdhory MEH, “Electro-oculographic and electromyographic artifacts removal from EEG”, in 2010 2nd international conference on chemical, biological and environmental engineering (IEEE, 2010), pp. 163–167.
- 22. Tang AC, Sutherland MT, and McKinney CJ, “Validation of SOBI components from high-density EEG”, NeuroImage 25, 539–553 (2005).
- 23. Cao Y, Tung, Gao J, Protopopescu VA, and Hively LM, “Detecting dynamical changes in time series using the permutation entropy”, Physical review E 70, 046217 (2004).
- 24. Henry M and Judge G, “Permutation entropy and information recovery in nonlinear dynamic economic time series”, Econometrics 7, 10 (2019).
- 25. Fellinger R, Klimesch W, Schnakers C, Perrin F, Freunberger R, Gruber W, Laureys S, and Schabus M, “Cognitive processes in disorders of consciousness as revealed by EEG time–frequency analyses”, Clinical Neurophysiology 122, 2177–2184 (2011).
- 26. King G and Zeng L, “Logistic regression in rare events data”, Political analysis 9, 137–163 (2001).
- 27. Chen W, Liu G, Sui Y, Zhang Y, Lin Y, Jiang M, Huang H, Ren G, and Yan J, “EEG signal varies with different outcomes in comatose patients: A quantitative method of electroencephalography reactivity”, J. Neu. Meth 342, 108812 (2020).
- 28. Bai Y, Lin Y, and Zieman U, “Managing disorders of consciousness: the role of electroencephalography”, J Neurol (2020).
- 29. Hermans MC, Westover MB, van Putten MJ, Hirsch LJ, and Gaspard N, “Quantification of EEG reactivity in comatose patients”, Clinical neurophysiology 127, 571–580 (2016).