Information-Theoretic Analysis of Refractory Effects in the P300 Speller

Vaishakhi Mayya∗, Boyla Mainsah∗, and Galen Reeves∗†
∗Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA
†Department of Statistical Science, Duke University, Durham, NC, USA

arXiv:1701.03313v1 [cs.IT] 12 Jan 2017

Abstract: The P300 speller is a brain-computer interface that enables people with neuromuscular disorders to communicate based on eliciting event-related potentials (ERP) in electroencephalography (EEG) measurements. One challenge to reliable communication is the presence of refractory effects in the P300 ERP, which induce temporal dependence in the user's EEG responses. We propose a model for the P300 speller as a communication channel with memory. By studying the maximum information rate on this channel, we gain insight into the fundamental constraints imposed by refractory effects. We construct codebooks based on the optimal input distribution, and compare them to existing codebooks in the literature.

I. INTRODUCTION

A brain-computer interface (BCI) is a system that monitors electrophysiological signals and translates the information encoded in these signals into commands that are relayed to a computer [1]. The P300 speller, developed by Farwell and Donchin [2], is a BCI that provides an alternative means of communication for individuals with severe neuromuscular diseases, such as amyotrophic lateral sclerosis [3], that impair neural pathways that control muscles. In the extreme case of locked-in syndrome, individuals lose all voluntary muscle control and are unable to communicate verbally or via gestures.

The P300 speller relies on eliciting and detecting event-related potentials (ERP) embedded in electroencephalography (EEG) data. These ERPs are elicited in response to specific stimulus events within the context of an oddball paradigm. The user is presented with a sequence of stimulus events that fall into one of two classes: a rare oddball, i.e. target stimulus, and a frequent non-target stimulus [4]. The presentation of the rare target stimulus event elicits an ERP response that includes a distinct positive deflection called the P300 signal.

The P300 speller allows a user to communicate one character at a time. In a visual P300 speller, the user is presented with an array of choices on a screen, such as the grid shown in Figure 1. While the user focuses on a target character, subsets of characters, called flash groups, are sequentially illuminated on the grid. In this context, the illumination of a flash group is a stimulus event. Under ideal conditions, a P300 ERP is elicited each time that the target character is presented. The elicited ERPs are embedded in noisy EEG data. Following each stimulus event, a time window of the EEG waveform is analyzed to determine the likelihood that the stimulus event contains the target character. After a sufficient amount of data has been collected, the target character is estimated based on the observed responses.

Fig. 1: Example of a P300 speller visual interface. The flash group is the set of characters that are illuminated.

The order in which the target and non-target stimulus events are presented plays a significant role in the ERP elicitation process. This is due, in part, to refractory effects [5], where the ability to elicit a strong ERP response to every target stimulus event presentation is affected by the target-to-target interval (TTI), which is the amount of time between successive target stimulus events. If a P300 ERP is elicited following the presentation of a target, the amplitude of successive P300 ERPs elicited in response to subsequent target events with low TTI may be attenuated or distorted [6]. The precise behavior of the refractory effects is not well understood and can vary depending on the user and the type of system [6].

In this paper, we develop and analyze a communication model for the P300 speller that allows us to understand some of the fundamental constraints imposed by refractory effects. Our channel model consists of a finite state machine (FSM) followed by a memoryless noisy channel; together they form a finite state channel (FSC). The FSM uses $L+1$ states to model a refractory effect that lasts for $L$ time steps. The memoryless noisy channel describes the mapping between an ERP and the response generated by processing the EEG measurements. We study properties of the distributions that maximize the mutual information rate on this channel. We provide an explicit characterization of the optimal input distribution in the noiseless case, and we use the generalized Blahut-Arimoto algorithm (GBAA) [7], [8] to compute the optimal input distributions numerically for the noisy case. We then use the optimal distribution to design codebooks (i.e. sequences of flash groups) that can be used for the P300 speller. Performance is assessed using numerical simulations.

The rest of this document is organized as follows. Section II reviews the previous approaches to designing flash groups for the P300 speller and provides relevant background concerning channels with memory. Section III introduces our channel model and a method to design flash groups using this model. The results are presented in Section IV.

II. BACKGROUND

A. Current codebooks used for the P300 speller

Within the context of the P300 speller, a codebook $C$ is a $W \times N$ binary matrix that indicates which of the $W$ characters are flashed across $N$ trials. Each row of the matrix corresponds to the flash pattern (or codeword) of a specific character. The columns correspond to flash groups. Given that a user's target character is $w$, the entry $C(w,n)$ indicates whether the target character is flashed in the $n$-th trial.

Within this setting, previous work has focused on the design of codebooks in order to increase the accuracy of the P300 speller [2], [9]–[13]. In the row-column paradigm (RCP) [2], the codebook is generated using a random permutation of the characters in rows and columns in a grid layout, such as the one shown in Figure 1. Due to the randomized order of presentation of the row and column flash groups in the RCP, characters are often flashed twice consecutively.

The checkerboard paradigm (CBP) [9] was developed to mitigate refractory effects by imposing a minimum time interval between target character presentations. Due to the method of construction of the codebook in the CBP, the duration of the minimum target interval depends on the specific geometry of the grid layout.

Other approaches have used ideas from coding theory to construct codebooks, such as maximizing the minimum Hamming distance [10], [11], e.g. the D10 codebook [10]. However, in online studies, these codebooks resulted in similar or worse performance when compared to the RCP. One possible explanation for this behavior is that the design of these codebooks did not account for refractory effects.

The explicit connection between the BCI and an information channel has been considered previously [10], [14]. For example, Omar et al. [14] represented the BCI-based communication (based on motor imagery) as a memoryless binary symmetric channel. However, in a P300 speller context, a memoryless channel assumption does not account for system memory due to refractory effects.

B. Information rates for finite-state channels

The channel model we study is an indecomposable finite-state channel (IFSC) [15]. The channel input is a discrete-time process $\{X_n\}$ supported on a finite alphabet $\mathcal{X}$. The channel state at time $n$ is modeled by a random variable $S_n \in \{1, \cdots, k\}$. The channel output at time $n$ is a random variable $Y_n \in \mathcal{Y}$ whose distribution is a function of the input $X_n$ and state $S_{n-1}$. The channel is defined by the conditional distribution

$$P(Y_n, S_n \mid X_n, S_{n-1}) = P(Y_n \mid X_n, S_{n-1}) \, P(S_n \mid X_n, S_{n-1}).$$

A channel is said to be an intersymbol interference (ISI) channel if the state transitions and the output of the channel depend on current and previous inputs. In this case, it is possible to describe a state sequence that is uniquely determined by the input sequence, i.e. $P(Y_n \mid X_n, S_{n-1}) = P(Y_n \mid S_n, S_{n-1})$.

We focus on the setting where the channel input $\{X_n\}$ is an $r$-th order, time-invariant Markov process. Following Kavčić [7], the distribution on $\{X_n\}$ is parametrized using an $|\mathcal{X}|^r \times |\mathcal{X}|^r$ transition matrix $P$ and the corresponding mutual information rate is defined according to

$$R(P) = \lim_{n \to \infty} \frac{1}{n} I(X_1^n; Y_1^n \mid S_0), \tag{1}$$

where $X_1^n = [X_1, \cdots, X_n]$. For an IFSC, the limit exists and is independent of the distribution on $S_0$. The maximum information rate over $r$-th order Markov sources is defined according to

$$R_r = \max_{P \in \mathcal{P}_r} R(P), \tag{2}$$

where $\mathcal{P}_r$ is the set of all transition matrices for an $r$-th order Markov source. A distribution $P^* \in \mathcal{P}_r$ is said to be optimal if it achieves the maximum in (2). Note that $R_r$ provides a lower bound on the capacity of the channel.

Kavčić [7] provides a stochastic method for solving the optimization problem (2) for ISI IFSCs based on a generalization of the Blahut-Arimoto algorithm. This method is extended to a larger class of IFSCs by Vontobel et al. [8].

III. PROPOSED CHANNEL MODEL AND CODEBOOK DESIGN

A. P300 speller channel model

The P300 speller is modeled as a cascade of the FSM and a noisy memoryless channel, as shown in Figure 2. The states of the FSM model the memory in the channel induced by the refractory period. Throughout this paper, the time step refers to the duration between the presentation of successive flash groups. A refractory period that lasts $L$ time steps is modeled using $L+1$ states.

[Figure 2, block diagram: $W$ → Encoder → $X_1^n$ → FSM → $Z_1^n$ → Memoryless Channel → $Y_1^n$ → Decoder → $\widehat{W}$. The FSM and the memoryless channel together form the finite state channel.]

Fig. 2: Model of the P300 speller communication channel. The target character, $W$, is encoded with a codeword, $X_1^n$, which is transmitted through a cascade of a finite state machine (FSM), with an intermediate output, $Z_1^n$, and a noisy memoryless channel. The output sequence, $Y_1^n$, is used to obtain an estimate of the target character, $\widehat{W}$.

The channel input, $X_n$, is a binary variable that indicates whether the target character is present ($X_n = 1$) or not present ($X_n = 0$) in the flash groups in the $n$-th trial. The state, $S_n$, represents whether the channel is in the ground state, or one of $L$ possible refractory states. The transitions between states are determined according to

$$S_n = \begin{cases} G, & \text{if } X_n = 0,\ S_{n-1} = G \text{ or } S_{n-1} = R_L \\ R_l, & \text{if } X_n = 0,\ S_{n-1} = R_{l-1},\ l \in \{2, \cdots, L\} \\ R_1, & \text{if } X_n = 1, \end{cases}$$

and are illustrated in Figure 3. Note that the state transition is a deterministic function of the previous $L$ channel inputs.

Fig. 3: Model of the P300 ERP elicitation process as an FSM. The input, $X_n$, which governs the state transitions is marked over the arrows. $G$ is the ground state and $R_l$, $l = 1, 2, \ldots, L$ are refractory states.

The intermediate output, $Z_n$, is a binary variable that is equal to one if and only if the input is one and the channel is not in a refractory state, i.e.

$$Z_n = \begin{cases} 1, & \text{if } X_n = 1,\ S_{n-1} = G \\ 0, & \text{if } X_n = 1,\ S_{n-1} = R_l,\ l \in \{1, 2, \cdots, L\} \\ 0, & \text{if } X_n = 0. \end{cases}$$

When the channel is in one of the refractory states, $Z_n = 0$, independently of the input. The output, $Y_n$, is the observed response that depends only on the intermediate output $Z_n$.

In the context of the P300 speller, the distribution on $X_1^n$ is induced by the codebook, $C$, and the distribution over the target character. The intermediate output $Z_n$ indicates whether a P300 ERP was elicited for the $n$-th trial. The probabilistic mapping from $Z_n$ to $Y_n$ models the noise induced by the EEG measurement and classification process.

The case $L = 1$ was studied in our previous work [16]. In this paper, we study the behavior of channels with general $L$.

B. Codebook design

Our approach to codebook design is to find distributions on the input sequence $\{X_n\}$ that maximize the mutual information rate in the channel, and then design codebooks that approximate these distributions. For the P300 speller, the channel input is a function of the user's target character and the codebook, $C(w,n)$. For a fixed codebook of length $n$, the randomness in the input is due to the randomness in the target character. In order to design codebooks with good properties, we use a random construction in which the rows of the codebook are drawn i.i.d. from the distribution that maximizes the mutual information. We refer to this construction as a memory-based codebook for the model with refractory period of length $L$ (MBC($L$)).

IV. RESULTS

A. Analysis of the optimal input distribution

This section studies properties of the maximum information rate in the noiseless case, where the observed response $Y_n$ is equal to the intermediate output $Z_n$. In this setting, the optimization problem can be expressed in terms of maximizing entropy rates of run-length limited sequences [17], [18].

To facilitate our analysis, we find it useful to introduce a constrained class of Markov sources. Specifically, we define $\widetilde{\mathcal{P}}_r$ to be the set of all $r$-th order Markov processes of the form:

$$P(X_n = 1 \mid X_{n-r}^{n-1}) = \begin{cases} a, & \text{if } X_{n-r}^{n-1} = \mathbf{0} \\ 0, & \text{otherwise.} \end{cases} \tag{3}$$

Note that every sequence drawn according to this distribution has at least $r$ zeros between ones. In the rest of the paper, we will focus exclusively on the setting where the memory in the source is matched to the memory in the channel, i.e. $r = L$.

Observe that if the channel input is drawn according to a distribution $P$ in the constrained set $\widetilde{\mathcal{P}}_L$, then the mapping between the input and noiseless output is invertible, and thus the mutual information is equal to the entropy of the input:

$$I(X_1^n; Y_1^n \mid S_0) = H(X_1^n \mid S_0) - H(X_1^n \mid Y_1^n, S_0) = H(X_1^n \mid S_0).$$

Therefore, the maximum information rate in the constrained setting can be expressed as

$$\max_{P \in \widetilde{\mathcal{P}}_L} R(P) = \max_{P \in \widetilde{\mathcal{P}}_L} \lim_{n \to \infty} \frac{1}{n} H(X_1^n \mid S_0) = \max_{a \in [0,1]} \frac{H_b(a)}{1 + La}, \tag{4}$$

where $H_b(p) = -p \log p - (1-p) \log(1-p)$ is the binary entropy function. Moreover, by differentiation it can be verified that the maximum in (4) is achieved by the unique solution $a^* \in [0,1]$ to the equation

$$a = (1-a)^{L+1}. \tag{5}$$

In the case of $L = 1$, the optimal value can be computed explicitly as $a^* = \frac{3 - \sqrt{5}}{2}$. As pointed out in [19], $1 - a^*$ is the inverse of the golden ratio.

The next result shows that this rate also provides an upper bound on the mutual information rate.

Proposition 1. Consider the P300 speller channel model in Figure 2, with $L$ refractory states. For any distribution on the channel input $\{X_n\}$ and initial state $S_0$, the mutual information satisfies

$$\limsup_{n \to \infty} \frac{1}{n} I(X_1^n; Y_1^n \mid S_0) \le R_L^{\mathrm{UB}}, \tag{6}$$

where

$$R_L^{\mathrm{UB}} = \max_{a \in [0,1]} \frac{H_b(a)}{1 + La}. \tag{7}$$

Proof: Starting with the data processing inequality, we have

$$\frac{1}{n} I(X_1^n; Y_1^n \mid S_0) \le \frac{1}{n} I(X_1^n; Z_1^n \mid S_0) = \frac{1}{n} H(Z_1^n \mid S_0),$$

where the second step follows from the chain rule for mutual information and the fact that $Z_n$ is a deterministic function of the channel input. Next, we note that the IFSC must transition through all $L$ refractory states between successive ones, and thus $\{Z_n\}$ is an $(L, \infty)$ constrained binary sequence [17], [18]. Therefore, by [17, Theorem 1], it follows that

$$\limsup_{n \to \infty} \frac{1}{n} H(Z_1^n \mid S_0) \le \max_{a \in [0,1]} \frac{H_b(a)}{1 + La}. \tag{8}$$

In light of Proposition 1, we see that as far as the noiseless case is concerned, we can find an optimal distribution by restricting our attention to the constrained set $\widetilde{\mathcal{P}}_L$.

Proposition 2. Consider the P300 speller channel model in Figure 2, with no noise (i.e. $Y_n = Z_n$) and $L$ refractory states. Let $P^*$ be the optimal distribution in the constrained set $\widetilde{\mathcal{P}}_L$ with parameter $a^*$ defined by (5). Then, $P^*$ achieves the maximum information rate for the channel, i.e.

$$R_L(P^*) = \max_{P \in \widetilde{\mathcal{P}}_L} R_L(P) = \max_{P \in \mathcal{P}_L} R_L(P). \tag{9}$$

We remark that the distribution described in Proposition 2 might not be the only distribution that achieves the maximum. In [16], we showed that in a channel with $L = 1$, when the input is a first order Markov source, there are at least two input transition matrices for which the maximum information rate is achieved.

In the presence of noise, the problem of optimizing the information rate $R(P)$ over $r$-th order Markov sources $P \in \mathcal{P}_r$ can be solved numerically using the GBAA [8].

B. Simulation results

This section uses numerical simulations to compare the performance of our memory-based codebook design with the RCP, CBP, and D10 codebooks. We estimate accuracy as a function of the channel noise parameter and the number of states in the channel. Following the setup in [20], these simulations apply to the P300 speller layout shown in Figure 1. The noise is modeled using an additive white Gaussian noise (AWGN) channel with noise power $\sigma^2$.

Using the GBAA, we find the optimal input transition matrix for general $\sigma^2$. We then generate codebooks that are optimized for channel noise and memory, MBC($L$), as described in Section III.

In simulations, we select one of 36 characters uniformly as the target character. The codeword associated with that character is transmitted across the channel. The received sequence is used to estimate the target character using an optimal decoder. The accuracy is the percentage of characters that are correctly estimated over 1000 runs.

Figure 4 compares the performance of the codebooks for channels with $L = 1, 2$ refractory states, where the corresponding MBC($L$) is generated based on the number of channel states. In both cases, the MBC($L$) performs better than all the other codebooks for a given channel.

[Figure 4, two panels: (a) "FSC with one refractory state", curves for MBC(1), CBP, D10, and RCP; (b) "FSC with two refractory states", curves for MBC(2), CBP, D10, and RCP. Horizontal axis: noise power $\sigma^2$, from 0 to 3. Vertical axis: accuracy.]

Fig. 4: Codebook performance as a function of channel noise parameter, $\sigma^2$, for the P300 speller channel with (a) $L = 1$ and (b) $L = 2$ refractory states. The MBC($L$) outperforms the other codebooks in both channel models.

Figure 5 shows the performance of MBC($L$) associated with a channel with $L = 1, 2, 3$ refractory states. The figure illustrates the decrease in the performance of MBC($L$) as $L$ increases.

[Figure 5, one panel: "Accuracy with different refractory states", curves for $L = 1$, $L = 2$, and $L = 3$. Horizontal axis: noise power $\sigma^2$, from 0 to 3. Vertical axis: accuracy.]

Fig. 5: Performance of the MBC($L$) as $L$ increases.

These results illustrate the benefits that can be achieved when the codebooks are designed based on the process underlying the generation of refractory effects. The extent to which our model is representative of refractory effects in the P300 speller is an important direction for future work.

V. CONCLUSION

This paper develops a communication model to represent the ERP elicitation process in the P300 speller and then uses this model to design codebooks that are optimized as a function of the length of the refractory period and channel noise. Simulation results suggest that this flexible framework for codebook design could lead to improved performance in settings where refractory effects have a significant impact on the accuracy of the P300 speller. The performance of the memory-based codebook needs to be verified with EEG data and validated with online implementation.

REFERENCES

[1] J. Wolpaw, N. Birbaumer, D. McFarland, G. Pfurtscheller, and T. Vaughan, "Brain-computer interfaces for communication and control," Clinical Neurophysiology, vol. 113, no. 6, pp. 767–91, 2002.
[2] L. A. Farwell and E. Donchin, "Talking off the top of your head: Toward a mental prosthesis utilizing event-related brain potentials," Electroencephalogr Clin Neurophysiol, vol. 70, no. 6, pp. 510–523, 1988.
[3] E. W. Sellers, T. M. Vaughan, and J. R. Wolpaw, "A brain-computer interface for long-term independent home use," Amyotrophic Lateral Sclerosis, vol. 11, no. 5, pp. 449–455, 2010.
[4] S. Sutton, M. Braren, J. Zubin, and E. R. John, "Evoked-potential correlates of stimulus uncertainty," Science, vol. 150, no. 3700, pp. 1187–8, 1965.
[5] J. Jin, E. W. Sellers, and X. Wang, "Targeting an efficient target-to-target interval for P300 speller brain–computer interfaces," Medical & Biological Engineering & Computing, vol. 50, no. 3, pp. 289–296, 2012.
[6] S. Martens, N. Hill, J. Farquhar et al., "Overlap and refractory effects in a brain computer interface speller based on the visual P300 event-related potential," Journal of Neural Engineering, vol. 6, no. 2, p. 026003, 2009.
[7] A. Kavčić, "On the capacity of Markov sources over noisy channels," in Global Telecommunications Conference, 2001. GLOBECOM'01. IEEE, vol. 5, 2001, pp. 2997–3001.
[8] P. O. Vontobel, A. Kavčić, D. M. Arnold, and H.-A. Loeliger, "A generalization of the Blahut–Arimoto algorithm to finite-state channels," IEEE Transactions on Information Theory, vol. 54, no. 5, pp. 1887–1918, 2008.
[9] G. Townsend, B. K. LaPallo, C. B. Boulay, D. J. Krusienski, G. E. Frye, C. K. Hauser, N. E. Schwartz, T. M. Vaughan, J. R. Wolpaw, and E. W. Sellers, "A novel P300-based brain-computer interface stimulus presentation paradigm: Moving beyond rows and columns," Clin Neurophysiol, vol. 121, no. 7, pp. 1109–20, 2010.
[10] J. Hill, J. Farquhar, S. Martens, F. Biessmann, and B. Schölkopf, "Effects of stimulus type and of error-correcting code design on BCI speller performance," in Advances in Neural Information Processing Systems 21. Curran Associates, Inc., 2009, pp. 665–672.
[11] J. Geuze, J. D. Farquhar, and P. Desain, "Dense codes at high speeds: Varying stimulus properties to improve visual speller performance," Journal of Neural Engineering, vol. 9, no. 1, p. 016009, 2012.
[12] R. Ma, N. Aghasadeghi, J. Jarzebowski, T. Bretl, and T. P. Coleman, "A stochastic control approach to optimally designing hierarchical flash sets in P300 communication prostheses," Neural Systems and Rehabilitation Engineering, IEEE Transactions on, vol. 20, no. 1, pp. 102–112, 2012.
[13] G. Cuntai, M. Thulasidas, and W. Jiankang, "High performance P300 speller for brain-computer interface," in Biomedical Circuits and Systems, Dec 2004.
[14] C. Omar, A. Akce, M. Johnson, T. Bretl, R. Ma, E. Maclin, M. McCormick, and T. Coleman, "A feedback information-theoretic approach to the design of brain-computer interfaces," International Journal of Human-Computer Interaction, vol. 27, pp. 5–23, 2011.
[15] R. G. Gallager, Information Theory and Reliable Communication. New York, NY, USA: John Wiley & Sons, Inc., 1968.
[16] V. Mayya, B. Mainsah, and G. Reeves, "Modeling the P300-based brain-computer interface as a channel with memory," September 2016, unpublished conference paper at: 54th Annual Allerton Conference on Communication, Control, and Computing, Allerton.
[17] E. Zehavi and J. K. Wolf, "On runlength codes," IEEE Transactions on Information Theory, vol. 34, no. 1, pp. 45–54, January 1988.
[18] K. A. S. Immink, P. H. Siegel, and J. K. Wolf, "Codes for digital recorders," IEEE Transactions on Information Theory, vol. 44, no. 6, pp. 2260–2299, October 1998.
[19] H. Permuter, P. Cuff, B. Van Roy, and T. Weissman, "Capacity of the trapdoor channel with feedback," IEEE Transactions on Information Theory, vol. 54, no. 7, 2008.
[20] B. O. Mainsah, L. M. Collins, and C. S. Throckmorton, "Using the detectability index to predict P300 speller performance," Journal of Neural Engineering, vol. 13, no. 6, p. 066007, 2016.
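
As a rough illustration of the channel model in Section III-A and the noiseless analysis in Section IV-A, the following Python sketch steps the refractory FSM and solves equation (5) numerically. It is a minimal sketch under stated assumptions, not the implementation used for the reported results: the function names (fsm_step, fsm_run, optimal_a, noiseless_rate), the integer state labels (0 for the ground state G, l for R_l), the bisection solver, and the use of base-2 logarithms (rates in bits) are all choices made here for illustration.

```python
import math

GROUND = 0  # assumed label for G; refractory states R_1..R_L are labelled 1..L


def fsm_step(state, x, L):
    """One transition of the refractory FSM; returns (next_state, z)."""
    # A P300 is elicited (z = 1) only when a target is flashed in the ground state.
    z = 1 if (x == 1 and state == GROUND) else 0
    if x == 1:
        next_state = 1                # any target flash moves the channel to R_1
    elif state == GROUND or state == L:
        next_state = GROUND           # G stays in G; R_L recovers to G
    else:
        next_state = state + 1        # R_{l-1} -> R_l on a non-target flash
    return next_state, z


def fsm_run(xs, L, state=GROUND):
    """Intermediate output sequence Z for an input sequence X."""
    zs = []
    for x in xs:
        state, z = fsm_step(state, x, L)
        zs.append(z)
    return zs


def binary_entropy(p):
    """Binary entropy H_b(p) in bits (log base 2 is an assumption here)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def optimal_a(L, tol=1e-12):
    """Solve a = (1 - a)**(L + 1) on [0, 1] by bisection, as in equation (5)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid < (1 - mid) ** (L + 1):
            lo = mid                  # the root lies to the right of mid
        else:
            hi = mid                  # the root lies at or to the left of mid
    return 0.5 * (lo + hi)


def noiseless_rate(a, L):
    """Objective of equation (4): H_b(a) / (1 + L * a)."""
    return binary_entropy(a) / (1 + L * a)
```

For L = 1, optimal_a(1) agrees with the closed form (3 − √5)/2 ≈ 0.382, and noiseless_rate(optimal_a(1), 1) is approximately 0.694 bits per trial, which is the base-2 logarithm of the golden ratio. Bisection is used only because the left side of (5) is increasing in a while the right side is decreasing, so the root is bracketed by [0, 1].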
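
The random codebook construction of Section III-B can be sketched for the noiseless case, where rows are drawn i.i.d. from the constrained Markov source in equation (3) with r = L. This is only an illustrative sketch: the names sample_codeword, sample_codebook, and min_gap_between_ones are hypothetical, the all-zero initial history and the seed are assumptions, and the parameter a is passed in by the caller. For noisy channels the paper obtains the input transition matrix with the GBAA, which is not reproduced here.

```python
import random


def sample_codeword(n, L, a, rng):
    """Draw n symbols from the constrained L-th order Markov source in (3)."""
    xs = []
    zeros_run = L  # assumption: start as if the previous L inputs were all zero
    for _ in range(n):
        # A one is allowed only after at least L zeros, and then with probability a.
        if zeros_run >= L and rng.random() < a:
            xs.append(1)
            zeros_run = 0
        else:
            xs.append(0)
            zeros_run += 1
    return xs


def sample_codebook(W, N, L, a, seed=0):
    """W x N binary codebook whose rows are i.i.d. draws from the source."""
    rng = random.Random(seed)
    return [sample_codeword(N, L, a, rng) for _ in range(W)]


def min_gap_between_ones(xs):
    """Smallest number of zeros between consecutive ones (None if fewer than two ones)."""
    positions = [i for i, x in enumerate(xs) if x == 1]
    gaps = [q - p - 1 for p, q in zip(positions, positions[1:])]
    return min(gaps) if gaps else None


if __name__ == "__main__":
    # 36 characters as in the simulations; N = 24 trials is a hypothetical length.
    a_star = (3 - 5 ** 0.5) / 2  # closed-form optimum for L = 1
    codebook = sample_codebook(W=36, N=24, L=1, a=a_star, seed=0)
    for row in codebook[:3]:
        print("".join(map(str, row)))
```

Every row produced this way has at least L zeros between ones, so in the noiseless model each target flash elicits a P300 and the intermediate output equals the input. The sketch does not check for duplicate rows, which a practical codebook would need to avoid.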