IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, VOL. 18, NO. 5, SEPTEMBER/OCTOBER 2021 1645

EEG-Based Brain-Computer Interfaces (BCIs): A Survey of Recent Studies on Signal Sensing Technologies and Computational Intelligence Approaches and Their Applications

Xiaotong Gu, Zehong Cao, Alireza Jolfaei, Peng Xu, Dongrui Wu, Tzyy-Ping Jung, and Chin-Teng Lin

Abstract—Brain-computer interfaces (BCIs) enhance the capability of human brain activities to interact with the environment. Recent advancements in technology and machine learning algorithms have increased interest in electroencephalographic (EEG)-based BCI applications. EEG-based intelligent BCI systems can facilitate continuous monitoring of fluctuations in human cognitive states under monotonous tasks, which is beneficial both for people in need of healthcare support and for general researchers in different domain areas. In this review, we survey the recent literature on EEG signal sensing technologies and computational intelligence approaches in BCI applications, compensating for the gaps in the systematic summary of the past five years. Specifically, we first review the current status of BCI and signal sensing technologies for collecting reliable EEG signals. Then, we demonstrate state-of-the-art computational intelligence techniques, including fuzzy models and transfer learning in machine learning and deep learning algorithms, to detect, monitor, and maintain human cognitive states and task performance in prevalent applications. Finally, we present a couple of innovative BCI-inspired healthcare applications and discuss future research directions in EEG-based BCI research.

• Xiaotong Gu is with the University of Tasmania, TAS 7005, Australia.
• Zehong Cao is with STEM, the University of South Australia, Adelaide, Australia.
• Alireza Jolfaei is with the Macquarie University, NSW 2109, Australia.
• Peng Xu is with the University of Electronic Science and Technology of China, 610054, China.
• Dongrui Wu is with the Huazhong University of Science and Technology, 430074, China.
• Tzyy-Ping Jung is with the University of California, San Diego, CA 92093 USA.
• Chin-Teng Lin is with the University of Technology Sydney, NSW 2007, Australia.

Manuscript received 30 May 2020; revised 4 Dec. 2020; accepted 22 Dec. 2020. Date of publication 19 Jan. 2021; date of current version 7 Oct. 2021. (Corresponding author: Zehong Cao.) Digital Object Identifier no. 10.1109/TCBB.2021.3052811

1545-5963 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

1 INTRODUCTION

1.1 An Overview of Brain-Computer Interfaces (BCIs)

1.1.1 What is a BCI?

The first research papers on brain-computer interfaces (BCIs) were released in the 1970s. These works addressed an alternative transmission channel that does not depend on the normal peripheral nerve and muscle output paths of the brain [1]. The earliest concept of a BCI proposed measuring and decoding brainwave signals to control a prosthetic arm and carry out a desired action [2]. Later, a formal definition of the term 'BCI' was interpreted as a direct communication pathway between the human brain and an external device [3]. In the past decade, human BCIs have attracted substantial attention due to their extensive research potential.

The corresponding human BCI systems aim to translate human cognition patterns using brain activities. They use recorded brain activity to communicate with a computer to control external devices or environments in a manner that is compatible with the intentions of humans [4], such as controlling a wheelchair or robot as shown in Fig. 1.

Fig. 1. Framework of a brain-computer interface (BCI).

There are two primary types of BCIs. The first type is active and reactive BCIs. An active BCI derives patterns from brain activity, which is directly and consciously controlled by the user independent of external events to control a device [5]. A reactive BCI extracts outputs from brain activities in reaction to external stimulation, which is indirectly modulated by the user to control an application. The second type is passive BCIs, which explore the user's perception, awareness, and cognition without the purpose of voluntary control, resulting in an enriched human-computer interaction (HCI) with implicit information [6].

1.1.2 Application Areas

The promising future of BCIs has encouraged the research community to interpret brain activities to establish various research directions for BCIs. General researchers from a wide range of areas of expertise, including computer science, data analysis, engineering, education, health, and psychology, can find related content and research trends relevant to their respective areas in this survey paper. Here, we address the best-known application areas in which BCIs have been widely explored and applied:

(1) A BCI is recognised to offer the potential for an approach that uses intuitive and natural human mechanisms of cognitive processing to facilitate interactions [7]. Since the common methods of conventional HCI are mostly restricted to manual interfaces and the majority of other designs are not being extensively adopted [8], BCIs change how HCIs could be used in complex and demanding operational environments. BCIs could revolutionise mainstream HCIs for use in different areas such as computer-aided design (CAD) [9]. BCIs are also widely used to monitor user states for intelligent assistive systems in entertainment and health areas [10].

(2) Another area where BCI applications are broadly used is as game controllers for entertainment. Some BCI devices are inexpensive, easily portable and easy to equip, which makes them feasible for broad use in entertainment communities. The compact and wireless BCI headsets developed for the gaming market are flexible, mobile, and require little effort to set up. Though they are not as accurate as other BCI devices used in medical areas, they are still practical for game developers and have been successfully commercialised for the entertainment market. Some specific models [11] are combined with sensors to detect more signals such as facial expression, features from which can upgrade the usability for entertainment applications.

(3) BCIs have also been fulfilling significant roles in neurocomputing for pattern recognition and machine learning based on brain signals and analysis of computational expert knowledge. Recently, studies have shown [12], [13], [14] that network neuroscience approaches have been used to quantify brain network reorganisation from different varieties of human learning. The results of these studies indicate optimisation of adaptive BCI architectures and the prospect of revealing the neural basis and future performance of BCI learning.

(4) In the healthcare field, the brainwave headset, which can collect expressive information with the software development kit provided by the manufacturer, has been utilised to facilitate effective control of a robot by severely disabled people through subtle movements such as neck motion and blinking [15]. BCIs have also been used in assisting people who have lost the muscular capacity to restore communication and control over devices. A broadly investigated clinical area is focused on implementing BCI spelling devices. One well-known application of this domain is a P300-based speller. By building upon the P300-based speller and using a BCI2000 platform [16], [17], the BCI speller achieves a positive result with respect to non-experienced users using this brain-controlled spelling tool. Another application area of BCI is the authentication process within the cybersecurity field [18], which utilises EEG-based systems to decide whether to reject or accept the claimed identity of a subject. The P300-BCI-based authentication system proposed by Yu et al. [19] showed the good potential of BCI for highly secured authentication systems.

Overall, BCIs have contributed to various fields of research. As briefly shown in Fig. 2, they are involved in game interaction entertainment, robot control, emotion recognition, fatigue detection, sleep quality assessment, and clinical fields, such as abnormal brain disease detection and prediction including applications to seizure, Parkinson's disease, Alzheimer's disease, and schizophrenia.

Fig. 2. BCI contributes to various fields of research.

1.1.3 Brain Imaging Techniques

Brain-sensing devices for BCIs can be categorised into three groups: invasive, partially invasive, and non-invasive [20]. In invasive and partially invasive devices, brain signals are collected from intracortical and electrocorticography (ECoG) electrodes with sensors tapping directly into the brain's cortex. As invasive devices insert electrodes into the brain cortex, each electrode of the intracortical collection technique records spiking activity, from which the time-evolving output pattern of the neuronal population is derived. However, only a small sample of the complete set of neurons in the bounded regions is represented, because microelectrodes can only detect spiking when they are in the proximity of a neuron. In this case, ECoG, as an extracortical invasive electrophysiological monitoring method, uses electrodes attached under the skull. With lower surgical risk, a rather high signal-to-noise ratio (SNR), and a higher spatial resolution compared with intracortical signals of invasive devices, ECoG offers superior potential in the medical area. In particular, ECoG has a wider bandwidth to gather significant information from functional brain areas to train a high-frequency BCI system, as well as high-SNR signals that are less prone to artefacts arising from, for instance, muscle movements and eye blinks.

Even though reliable information on cortical and neuronal dynamics can be provided by invasive or partially invasive BCIs, when considering everyday applications, the potential benefit of increased signal quality is neutralised by the surgical risks and the need for long-term implantation of invasive devices [21]. Recent studies have focused on investigating non-invasive technology, which uses external neuroimaging devices to record brain activity, including functional near-infrared spectroscopy (fNIRS), functional magnetic resonance imaging (fMRI), and electroencephalography (EEG).

Specifically, fNIRS uses near-infrared (NIR) light to assess the aggregation level of oxygenated haemoglobin (Hb) and deoxygenated haemoglobin (deoxy-Hb). fNIRS depends on the haemodynamic response or blood-oxygen-level-dependent (BOLD) response to formulate functional neuroimages [22]. Because of the power limits of light and spatial resolution, fNIRS cannot be employed to measure cortical activity located deeper than 4 cm in the brain. Additionally, because blood flow changes more slowly than electrical or magnetic signals, Hb and deoxy-Hb exhibit slow and steady variations, so the temporal resolution of fNIRS is comparatively lower than that of electrical or magnetic signals.

fMRI monitors brain activities by assessing changes related to blood flow in brain areas, and it relies on the magnetic BOLD response, which enables fMRI to have a higher spatial resolution and collect brain information from deeper areas than fNIRS, since magnetic fields have better penetration than NIR light. However, similarly to fNIRS, the drawback of low temporal resolution is evident for fMRI because of the blood flow speed constraint. Despite the merits of relying on the magnetic response, the fMRI technique has another flaw: the magnetic fields are more prone to be distorted by deoxy-Hb than by Hb molecules. The most significant disadvantages of the use of fMRI in different scenarios are that it requires an expensive and heavy scanner to generate magnetic fields and that the scanner is not portable and requires substantial effort to move.

1.1.4 EEG-Based BCIs

EEG signals, featuring direct measurements of cortical electrical activity and high temporal resolution, have been pursued extensively in many recent BCI studies [23], [24]. As the most generally used non-invasive technique, EEG electrodes can be installed in a headset device to collect EEG signals; this device is generally referred to as an EEG-based BCI system. Considering the relative increases in signal quality, reliability and mobility compared with other imaging approaches, non-invasive EEG-based devices have been used as the most popular modality for real-world BCIs and clinical use [25].

EEG headsets can collect signals in several non-overlapping frequency bands (e.g., Delta, Theta, Alpha, Beta, and Gamma). This is based on the powerful intra-band connection with distinct behavioural states [26], and the different frequency bands can present diverse corresponding characteristics and patterns. Furthermore, the temporal resolution is exceedingly high, reaching the millisecond level, and the risk for subjects is very low compared with invasive and other non-invasive techniques that require high-intensity magnetic field exposure. In this survey, we discuss different highly portable and comparatively inexpensive EEG devices. A drawback of the EEG technique is that the signals have a low spatial resolution because of the limited number of electrodes; however, the temporal resolution is considerably high. When using EEG signals for BCI systems, the inferior SNR needs to be considered because objective factors such as environmental noise and subjective factors such as fatigue status might contaminate the EEG signals. The recent research conducted to cope with this disadvantage of EEG technology is also discussed in our survey.

A BCI makes it possible to observe a specific brain's response to specific stimuli events by recording small potential changes in EEG signals immediately after visual or audio stimuli appear. This phenomenon is formally called event-related potentials (ERPs), defined as slight voltages originating in the brain as responses to specific stimuli or events [27], which are then separated into visual evoked potentials (VEPs) and auditory evoked potentials (AEPs). For EEG-based BCI studies, the P300 wave is a representative potential response of an ERP elicited in the brain cortex of a monitored subject, presenting as a positive deflection in voltage with a latency of roughly 250 to 500 ms [28]. In VEP tasks, rapid serial visual presentation (RSVP), the process of continuously presenting multiple images per second at high display rates, is considered to have potential in enhancing human-machine symbiosis [29]. Steady-state visual evoked potentials (SSVEPs) are a resonance phenomenon originating mainly in the visual cortex when a person is focusing visual attention on a light source flickering with a frequency above 4 Hz [30]. In addition, the psychomotor vigilance task (PVT) is a sustained-attention, reaction-timed task that measures the speed with which subjects respond to a visual stimulus, and it correlates with the assessment of alertness, fatigue, or psychomotor skills [31].

1.2 Our Motivations and Contributions

1.2.1 Motivations

Recent (2015-2019) EEG survey articles focus primarily on separately summarising statistical features or patterns, collecting classification algorithms, or introducing deep learning models. For example, a recent survey [32] provided a comprehensive outline of the latest classification algorithms used in EEG-based BCIs, which include adaptive classifiers, transfer and deep learning, matrix and tensor classifiers, and several other miscellaneous classifiers. Although the authors of [32] argue that deep learning methods have not demonstrated convincing enhancement over some state-of-the-art BCI methods, the results reviewed recently in [33] illustrated that some deep learning methods, for instance, convolutional neural networks (CNNs), generative adversarial networks (GANs), and deep belief networks (DBNs), have achieved outstanding performance in classification accuracy. However, no comparison has been drawn between deep learning neural networks and conventional machine learning methods to prove the improvement of modern neural network algorithms in EEG-based BCIs. Another recent survey [34] systematically reviewed articles published between 2010 and 2018 that applied deep learning to EEG in diverse domains by extracting and analysing various datasets to identify the research trend(s), but it did not include information about EEG sensors or hardware devices that collect EEG signals. Additionally, an up-to-date survey article released in early 2019 [35] reviewed brain signal categories for BCIs and deep learning techniques for BCI applications, with a discussion of applied areas for deep-learning-based BCIs. While this survey provided a systematic summary of relevant publications between 2015 and 2019, it did not thoroughly investigate machine learning, deep transfer learning, and fuzzy models that are used for non-stationary and non-linear EEG signal processing.

1.2.2 Contributions

The recent review articles mentioned above lack a comprehensive survey of recent EEG sensing technologies, signal enhancement, relevant machine learning algorithms with transfer learning and fuzzy models, and deep learning methods for specific BCI applications, especially for healthcare systems. To develop and disseminate a BCI in real-world applications, we need to cover all of these technologies as opposed to focusing on signal processing or machine learning alone. In our survey, we aim to address all of these limitations and include the recently published BCI studies from 2015-2019. The main contributions of this study are summarised as follows:

• Advances in sensors and sensing technologies (Section 2).
• Characteristics of signal enhancement and online processing (Section 3).
• Recent machine learning algorithms of transfer learning and fuzzy models for BCI applications (Section 4).
• Recent deep learning algorithms and combined approaches for BCI applications (Section 5).
• Evolution of healthcare systems and applications in BCIs (Section 6).

In terms of the relationships among the contributions, we focus on EEG-based BCI research and discuss the pathway from signal processing to applications by first introducing the EEG signal and recent advancements in sensors and devices for non-invasive measurement (Section 2). Considering the importance of noise removal for EEG signal processing, we then review the recent studies on EEG signal enhancement and artefact handling technologies (Section 3). Regarding the recent development of feature extraction from processed EEG signals and analysis, we survey the recent studies employing machine learning (Section 4) and several well-recognised deep learning algorithms with diverse BCI applications (Section 5). Among all of these EEG-based BCI applications, we concentrate on reviewing the recent studies regarding the transformation of healthcare systems and BCI healthcare applications (Section 6), as we believe that these represent the most relevant sections and trends of EEG-based BCI research.

2 ADVANCES IN SENSING TECHNOLOGIES

2.1 An Overview of EEG Sensors/Devices

Advanced sensor technology has enabled the development of smaller and smarter wearable EEG devices for lifestyle and related medical applications. In particular, recent advances in EEG monitoring technologies pave the way for wearable, wireless EEG monitoring devices with dry sensors. In this section, we summarise the advances in EEG devices with wet or dry sensors. We also list and compare some commercially available EEG devices, as shown in Fig. 3 and Table 1, with a focus on the number of channels, sampling rate, stationary or portable nature of the devices, and coverage of companies that can cater to the specific needs of EEG users.

Fig. 3. Commercialised EEG devices for BCI applications.

2.1.1 Wet Sensor Technology

For non-invasive EEG measurements, wet electrode caps are attached to users' scalps with gels as the interface between sensors and scalps. The wet sensors relying on electrolytic gels provide a clean conductive path. The gel interface is applied to decrease the skin-electrode contact impedance, but it could be uncomfortable and inconvenient for users and too time-consuming and laborious for everyday use [36]. Without the conductive gel, the electrode-skin impedance cannot be measured, and the quality of measured EEG signals could be compromised.

2.1.2 Dry Sensor Technology

Because the use of wet electrodes for collecting EEG data requires attaching sensors over the experimenter's skin, which is not desirable in real-world applications, the development of dry sensors of EEG devices has been dramatically enhanced over the past several years [37]. One of the major advantages of dry sensors, compared with wet counterparts, is that they substantially enhance system usability [38]; the headset is also very easy to wear and remove, which even allows skilled users to utilise it by themselves in a short period of time. For example, Siddharth et al. [39] designed biosensors to measure physiological activities to overcome the limitations of wet-electrode EEG equipment. These biosensors are dry-electrode based, and the signal quality is comparable to that obtained with wet-electrode systems, but without the need for skin abrasion or preparation or the use of gels. In a follow-up study [39], novel dry EEG sensors that could actively filter the EEG signal from ambient electrostatic noise were designed and evaluated with an ultra-low-noise and high-sampling analogue-to-digital converter module. The study compared the proposed sensors with commercially available EEG sensors (Cognionics Inc.) in an SSVEP BCI task, and the SSVEP detection accuracy was comparable between the two sensors, with 74.23 percent average accuracy across all subjects.

Furthermore, regarding the trend of wearable biosensing devices, Chi et al. [40], [41] reviewed and designed wireless devices with dry and non-contact EEG electrode sensors.
Chi et al. [41] developed a new integrated sensor controlling the sensitive input node to attain prominent input impedance, with complete shielding of the input node from the active transistor by bond pads, to the specially built chip package. Their experiment results, using data collected from a non-contact electrode on the top of the hair, demonstrated a maximum information transfer rate with 100 percent accuracy, thus indicating the promising future for dry and non-contact electrodes as viable tools for EEG applications and mobile BCIs.

The augmented BCI (ABCI) concept was proposed in [38] for everyday environments, with which signals are recorded via biosensors and processed in real time to monitor human behaviour. An ABCI comprises non-intrusive and quick-setup EEG solutions, which require minimal or no training, to accurately collect long-term data with the benefits of comfort, stability, robustness, and longevity. In their study of a broad range of approaches to ABCIs, developing portable EEG devices using dry electrode sensors is identified as a significant target for mobile human brain imaging, and future ABCI applications will be based on biosensing technology and devices.

2.2 Commercialised EEG Devices

Table 1 lists 21 products of 17 brands along with eight attributes to provide a basic overview of EEG headsets. The attribute 'Wearable' indicates whether the monitored human subjects could wear the devices and move around without movement constraints, which partially depends on the transmission types and whether the headset devices are connected to software via wireless technologies such as Wi-Fi or Bluetooth or via tethered connections. The numbers of channels of each EEG device can be categorised into three groups: a low-resolution group with 1 to 32 channels, a medium-resolution group with 33 to 128 channels, and a high-resolution group with more than 128 channels. Most brands offer more than one device, and therefore, the numbers of channels in Table 1 exhibit a wide range. The low-resolution devices mainly cover the frontal and temporal locations, some of which also deploy sensors to collect EEG signals from five locations, while the medium- and high-resolution groups can cover scalp locations more comprehensively. The numbers of channels also affect the EEG signal sampling rate of each device, with the low- and medium-resolution groups having a general sampling rate of 500 Hz and the high-resolution group achieving a sampling rate of higher than 1,000 Hz. Fig. 3 presents all commercialised EEG devices listed in Table 1.

TABLE 1
Overview of EEG Devices

| Brand | Website | Product | Wearable | Sensor type | Channel no. | Locations | Sampling rate | Transmission | Weight |
|---|---|---|---|---|---|---|---|---|---|
| NeuroSky | http://neurosky.com/ | MindWave | Yes | Dry | 1 | F | 500 Hz | Bluetooth | 90 g |
| Emotiv | https://www.emotiv.com/ | EPOC(+) | Yes | Dry | 5-14 | F, C, T, P, O | 500 Hz | Bluetooth | 125 g |
| Muse | https://choosemuse.com/ | Muse 2 | Yes | Dry | 4-7 | F, T | | Bluetooth | |
| OpenBCI | https://openbci.com/ | EEG Electrode Cap Kit | Yes | Wet | 8-21 | F, C, T, P, O | | Cable | |
| Wearable Sensing | https://wearablesensing.com/ | DSI 24; NeuSenW | Yes | Wet; Dry | 7-21 | F, C, T, P, O | 300/600 Hz | Bluetooth | 600 g |
| ANT Neuro | https://www.ant-neuro.com/ | eego mylab / eego sports | Yes | Dry | 32-256 | F, C, T, P, O | Up to 16 kHz | Wi-Fi | 500 g |
| Neuroelectrics | http://www.neuroelectrics.com/ | STARSTIM; ENOBIO | Yes | Dry | 8-32 | F, C, T, P, O | 125-500 Hz | Wi-Fi; USB | |
| G.tec | http://www.gtec.at/ | g.NAUTILUS series | Yes | Dry | 8-64 | F, C, T, P, O | 500 Hz | Wireless | 140 g |
| Advanced Brain Monitoring | https://www.advancedbrainmonitoring.com/ | B-Alert | Yes | Dry | 10-24 | F, C, T, P, O | 256 Hz | Bluetooth | 110 g |
| Cognionics | https://www.cognionics.net/ | Quick | Yes | Dry | 8-30; (64-128) | F, C, T, P, O | 250 Hz/500 Hz/1 kHz/2 kHz | Bluetooth | 610 g |
| mBrainTrain | https://mbraintrain.com/ | Smarting | Yes | Wet | 24 | F, C, T, P, O | 250-500 Hz | Bluetooth | 60 g |
| Brain Products | https://www.brainproducts.com/productdetails.php?id=63 | LiveAmp | Yes | Dry | 8-64 | F, C, T, P, O | 250 Hz/500 Hz/1 kHz | Wireless | 30 g |
| Brain Products | https://www.brainproducts.com/productdetails.php?id=63 | AntiCHapmp | Yes | Dry | 32-160 | F, C, T, P, O | 10 kHz | Wireless | 1.1 kg |
| BioSemi | https://www.biosemi.com/ | ActiveTwo | No | Wet (Gel) | 280 | F, C, T, P, O | 2/4/8/16 kHz | Cable | 1.1 kg |
| EGI | https://www.egi.com/ | GES 400 | No | Dry | 32-256 | F, C, T, P, O | 8 kHz | Cable | |
| Compumedics Neuroscan | http://compumedicsneuroscan.com/ | Quick-Cap + Grael 4k | No | Wet | 32-256 | F, C, T, P, O | | Cable | |
| Mitsar | http://www.mitsar-medical.com/ | Smart BCI EEG Headset | Yes | Wet | 24-32 | F, C, T, P, O | 2 kHz | Bluetooth | 50 g |
| Mindo | http://mindo.com.tw/ | Mindo series | Yes | Dry | 4-64 | F, C, T, P, O | | Wireless | |

Abbreviation: Frontal (F), Central (C), Temporal (T), Parietal (P), and Occipital (O)

3 SIGNAL ENHANCEMENT AND ONLINE PROCESSING

3.1 Artefact Handling

Based on a broad category of unsupervised learning algorithms for signal enhancement, blind source separation (BSS) estimates original sources and parameters of a mixing system and removes the artefact signals, such as eye blinks and movement, present in the sources [42]. There are several prevalent BSS algorithms in BCI research, including principal component analysis (PCA), canonical correlation analysis (CCA) and independent component analysis (ICA). PCA, one of the simplest BSS techniques, converts correlated variables to uncorrelated variables, named principal components (PCs), by an orthogonal transformation. However, the artefact components are usually correlated with EEG data, and the drift potentials are similar to EEG data, both of which would cause PCA to fail to separate the artefacts [43]. CCA separates components from uncorrelated sources and detects a linear correlation between two multidimensional variables [44], and it has been applied in muscle artefact removal from EEG signals. ICA decomposes observed signals into independent components (ICs) and reconstructs clean signals by removing the ICs that contain artefacts. ICA is the most utilised approach for artefact removal from EEG signals, so we review the methods that utilise ICA to support signal enhancement in the following sections.

3.1.1 Eye Blinks and Movements

Eye blinks are more prevalent when eyes are open, while rolling of the eyes might influence the eyes-closed condition. Additionally, the signals of eye movements located in the frontal area can affect further EEG analysis. To minimise the influence of eye contamination in EEG signals, visual inspection of artefacts and data augmentation approaches are often used to remove eye contamination.

Artefact subspace reconstruction (ASR) is an automatic component-based mechanism used as a pre-processing step to effectively remove large-amplitude or transient artefacts that contaminate EEG data. ASR has several limitations. It effectively removes artefacts from EEG signals collected with a standard 20-channel EEG device, but it cannot be applied to single-channel EEG recordings. Furthermore, without substantial cut-off parameters, the effectiveness of removing regularly occurring artefacts, such as eye blinks and eye movements, is limited. A possible enhancement that uses an ICA-based artefact removal mechanism as a complement to ASR cleaning has been proposed [45]. A recent study in 2019 [46] also considered ICA and used an automatic IC classifier as a quantitative measure to separate brain signals and artefacts for signal enhancement. The above studies extended Infomax ICA [47], and the results showed that ASR removes more eye artefacts than brain components by using an optimal ASR parameter between 20 and 30.

For online processing of EEG data in near real time to remove artefacts, a combination method of online ASR, online recursive ICA, and an IC classifier was proposed in [45] to remove large-amplitude transients as well as to compute, categorise, and remove artefactual ICs. For the eye movement-related artefacts, the results of their proposed methods have a fluctuating saccade-related IC EyeCatch score, and the altered version of EyeCatch in their study is still not an ideal method of eliminating eye-related artefacts.

3.1.2 Muscle Artefacts

Contamination of EEG data by muscle activity is a well-recognised and challenging problem. These artefacts can be generated by any muscle contraction or stretch in proximity to the recording sites, such as when the subject talks, sniffs, swallows, etc. The degree of muscle contraction or stretch will affect the amplitude and waveform of artefacts in EEG signals. Common techniques that have been used to remove muscle artefacts include regression methods, CCA, empirical mode decomposition (EMD), BSS, and EMD-BSS [43]. A combination of the ensemble EMD (EEMD) and CCA, named EEMD-CCA, was proposed in [48] to remove muscle artefacts. By testing on real-life, semi-simulated, and simulated datasets under single-channel, few-channel, and multichannel settings, the study results indicate that the EEMD-CCA method can effectively and accurately remove muscle artefacts, and it is an efficient signal processing and enhancement tool in healthcare EEG sensor networks. Other approaches combine typical methods, such as using BSS-CCA followed by spectral-slope rejection to reduce high-frequency muscle contamination [49] and independent vector analysis (IVA), which takes advantage of both ICA and CCA by exploiting higher-order statistics (HOS) and second-order statistics (SOS) simultaneously to achieve high performance in removing muscle artefacts [50]. A more extensive survey on muscle artefact removal from EEG can be found in [43].

3.1.3 Introducing a Signal Enhancement Toolbox

EEGLAB is one of the most widely used MATLAB toolboxes for processing of EEG and other electrophysiological data. It was developed by the Swartz Center for Computational Neuroscience and provides an interactive graphic user interface for users to apply ICA, time/frequency analysis (TFA) and standard averaging methods to the recorded brain signals. EEGLAB extensions, previously called EEGLAB plugins, are toolboxes that provide data processing and visualisation functions for EEGLAB users to process EEG data. At the time of writing, there are 106 extensions available on the EEGLAB website, with a broad functional range including data importing, artefact removal, and feature detection algorithms.

Many extensions have been developed for artefact removal. For instance, the automatic artefact removal (AAR) toolbox is for automatic ocular and muscular artefact removal from EEG; cochlear implant artefact correction (CIAC), as its name implies, is an ICA-based tool particularly for correcting electrical artefacts arising from cochlear implants; the multiple artefact rejection algorithm (MARA) toolbox uses EEG features in the temporal, spectral and spatial domains to optimise a linear classifier to solve the component-reject versus component-accept problem and to remove loose electrodes; and the 'icablinkmetrics' toolbox aims at selecting and removing ICA components associated with eye blink artefacts using time-domain methods. Some of the toolboxes have more than one major function, such as artefact rejection and pre-processing: "clean_rawdata" is a suite of pre-processing methods including ASR for correcting and cleaning continuous data. ARfitStudio can be applied to quickly and intuitively correct event-related spiky artefacts as the first step of data pre-processing using "ARfit". ADJUST identifies and removes artefactual independent components automatically without affecting neural sources or data. A more comprehensive list of toolbox functions can be found on the EEGLAB website.

3.2 EEG Online Processing

For neuronal information processing and BCIs, the ability to monitor and analyse cortico-cortical interactions in real time is one of the trends in BCI research, along with the development of wearable BCI devices and effective approaches to remove artefacts. It is challenging to create a reliable real-time system that can collect, extract and pre-process dynamic data with artefact rejection and rapid computation. In the model proposed by [51], EEG data collected from a wearable, high-density (64-channel), and dry EEG device are first reconstructed by a 3,751-vertex mesh, anatomically constrained low resolution electrical tomographic analysis (LORETA), singular value decomposition (SVD)-based reformulation and automated anatomical labelling (AAL) before being forwarded to the Source Information Flow Toolbox (SIFT) and vector autoregressive (VAR) model. By applying regularised logistic regression and testing on both simulated and real data, their proposed system is shown to be capable of real-time EEG data analysis. Later, Mullen et al. [37] expanded their prior study by incorporating ASR for artefact removal; they implemented the anatomically constrained LORETA to localise sources and added the alternating direction method of multipliers and cognitive-state classification. The evaluation results of the proposed framework on simulated and real EEG data demonstrate the feasibility of the real-time neuronal system for cognitive state classification and identification. A subsequent study [52] aimed to present data for instantaneous incremental convergence. Online recursive least squares (RLS) whitening and an optimised online recursive ICA algorithm (ORICA) were validated for separating the blind sources from high-density EEG data. The experimental results prove the proposed algorithm's capability to detect non-stationary high-density EEG data and to extract artefact and principal brain sources quickly. The open-source Real-time EEG Source-mapping Toolbox (REST), which provides support for online artefact rejection and feature extraction, is available to inspire more real-time BCI research in different domain areas.

4 MACHINE LEARNING AND FUZZY MODELS IN BCI APPLICATIONS

In this section, as shown in Fig. 4, we introduce two research areas, machine learning (ML) and fuzzy models, and review the recent research outcomes of BCI applications. Machine [...]

Fig. 4. Structure of machine learning, deep learning, and fuzzy models.

[...] used when the data used for training are neither classified nor labelled. It involves only input data and refers to a probability function to describe a hidden structure, such as grouping or clustering of data points.

Performing conventional machine learning requires the creation of a model for training purposes. In EEG-based BCI applications, various types of models have been used and developed for machine learning. In the last ten years, the leading families of models used in BCIs include linear classifiers, neural networks, non-linear Bayesian classifiers, nearest neighbour classifiers, and classifier combinations [54]. The linear classifiers, such as linear discriminant analysis (LDA), regularised LDA, and support vector machine (SVM), classify discriminant EEG patterns using linear decision boundaries between feature vectors for each class. Neural networks assemble layered artificial neurons to approximate any non-linear decision boundaries, where the most common type in BCI applications is the multilayer perceptron (MLP) that typically uses only one or two hidden layers.
Moving to a non-linear Bayesian classifier, learningisasubsetofcomputationalintelligencethatcom- suchasaBayesianquadraticclassifierandahiddenMarkov prisesnumerousresearchareas,includingtransferlearning model (HMM), the probability distribution of each class is (TL)anddeeplearning(DL).InSection5,weintroduceand modelled,andBayesrulesareusedto selectthe classto be survey some state-of-the-art architectures of deep learning assignedtotheEEGpatterns.Consideringthephysicaldis- neural networks, including CNNs, GANs, recurrent neural tancesofEEGpatterns,nearestneighbourclassifiers,suchas networks (RNNs), long short-term memory (LSTM), deep theknearestneighbour(kNN)algorithm,proposetoassign transfer learning (DTL), as well as their applications and aclasstotheEEGpatternsbasedonthenearestneighbour. recentresearchesinBCI.Fuzzymodels,whichapplyfuzzy Finally,classifiercombinationscombinetheoutputsofmul- rules,fuzzylogic,andfuzzymeasuretheory(suchasfuzzy tipleclassifiersaboveortraintheminawaythatmaximises integrals) to a fuzzy inference system (FIS), are preferable theircomplementarity.Theclassifiercombinationsusedfor forprocessingnon-linearandnon-stationaryEEGsignalsin BCIapplicationscanbeenhanced,voting,orstackedcombi- BCI research. We also introduce fuzzy neural networks nationapproaches. (FNNs), which are a combination of FIS and deep learning Additionally, to apply machine learning algorithms to neuralnetworks,andFNNBCIapplications. EEG data, we need to pre-process EEG signals and extract features from the raw data, such as frequency band power 4.1 AnOverviewofMachineLearning features and connectivity features between two channels Machinelearningreliesongeneralpatternsofreasoningby [55].Fig.5demonstratesanEEG-baseddatapre-processing, computersystemstoexploreaspecifictaskwithoutprovid- patternrecognitionandmachinelearningpipelinetoextract ing explicit coded instructions. 
Machine learning tasks are featuresthatrepresentEEGdatainacompactandrelevant generallyclassifiedintoseveraltypes,includingsupervised way. learningandunsupervisedlearning[53].Intermsofsuper- vised learning, it usually divides the data into two subsets 4.2 TransferLearning duringthelearningprocess:atrainingset(adatasettotrain 4.2.1 WhatisTransferLearning amodel)andatestset(adatasettotestthetrainedmodel). Supervised learning can be used for classification and Transfer learning is a set of methodologies to enhance the regression tasks by applying what has been learned in the performanceofaclassifierinitiallytrainedononetask(also training stage using labelled examples to test the new data extended to one session or subject) based on information (testing data) in order to classify types of events or predict gainedwhilelearninganothertask.Transferringknowledge futureevents.Incontrast,unsupervisedmachinelearningis from the source domain to the target domain can act as a 1652 IEEE/ACMTRANSACTIONSONCOMPUTATIONALBIOLOGYANDBIOINFORMATICS,VOL.18,NO.5,SEPTEMBER/OCTOBER 2021 above, is that the training data used to train the classifier andthetestdatausedtoevaluatetheclassifierbelongtothe same feature space and follow the same probability distri- bution. However, this hypothesis is often violated in many applicationsbecauseofhumanvariability[58].Forexample, achangeintheEEGdatadistributiontypicallyoccurswhen data are acquired from different subjects or across sessions and time for the same subject. Additionally, since the EEG signals are variant and not static, extensive BCI sessions exhibitinconsistencyintheclassificationperformance[59]. Thus,transferlearningaimsatcopingwithdatathatvio- latethishypothesisbyexploitingknowledgeacquiredwhile learningagiventaskforsolvingadifferentbutrelatedtask. 
The advances of transfer learning can relax the restrictions of BCIs since there is no need to calibrate from the begin- ning point, there is less noise for transferred information, and transfer learning relies on previously usable data to increasethesizeofdatasets.Alllearningalgorithmstransfer knowledge to different tasks/domains, in which situation theskillsshouldbetransferredtoenhancetheperformance andavoidanegativetransfer. Fig. 5. Data pre-processing, pattern recognition and machine learning pipelineinBCIs. 4.2.3 WheretoTransferinBCIs InBCIs,discriminativeandstationaryinformationcouldbe bias or a regulariser for solving the target task. Here, we transferredacrossdifferentdomains.Theselectionofwhich provideadescriptionoftransferlearningbasedonthesur- types of information to transfer is based on the similarity veyofPanandYang[56].Thesourcedomainisknown,and between the target and source domains. If the domains are thetargetdomaincanbeinductive(known)ortransductive very similarand the datasample is small,then discrimina- (unknown). Transfer learning is classified under three sub- tive information should be transferred; if the domains are settings in accord with the source and target tasks and different while there could be common information across domains: inductive transfer learning, transductive transfer the target and source domains, thenstationary information learning,andunsupervisedtransferlearning. should be transferred to establish more invariable systems In inductive transfer learning, available labelled data of [60],[61]. thetargetdomainarerequired,whilethetasksofthesource Domainadaptation,arepresentativemethodoftransduc- and target can be different regardless of their domains. tive transfer learning, attempts to find a strategy for trans- Inductive transfer learning can be further categorised into forming the data space in which the decision rules will two cases based on the availability of the labelled data. 
If classifyalldatasets.Covariateshiftingisatechniquethatis data are available, a multitask learning method will be highlyrelatedtodomainadaptation,whichisthemostfre- applied to the target and source tasks to learn the model quent situation encountered in BCIs. In covariate shifting, simultaneously.Otherwise,aself-taughtlearningtechnique the input distributions of the training and test samples are shouldbedeployed.Intermsoftransductivetransferlearn- different, while the conditional distributions of output val- ing, the labelled data are available in the source domain uesarethesame[62].Thereexistsanessentialassumption: instead of the target domain, while the target and source the marginal distribution of data changes from subjects (or tasks are identical regardless of the domains (of the target sessions)tosubjects(orsessions),andthedecisionrulesfor and source). Transductive transfer learning can be further thismarginaldistributionremainunchanged.Thisassump- categorised into two cases based on whether the feature tionallowsustore-weightthetrainingdatafromothersub- spacesbetweenthesourcedomainandtargetdomainarethe jects (or previous sessions) to correct the differences in the same. If they are the same, then the sample selection bias/ marginaldistributionsforthedifferentsubjects(orsessions). covariance shift method should be applied. Otherwise, a Naturally, the effectiveness of transfer learning strongly domainadaptationtechniqueshouldbedeployed.Unsuper- depends on how closely the two circumstances are related. vised transfer learning is applied if labelled data are not TransferlearninginBCIapplicationscanbeusedtotransfer availableineitherthesourceortargetdomainwhilethetar- information(a)fromtaskstotasks,(b)fromsubjectstosub- get and source tasks are related but different. The target of jects, and (c) from sessions to sessions. As shown in Fig. 
6, unsupervisedtransferlearningistoresolveclustering,den- givenatrainingdataset(e.g.,sourcetask,subject,orsession), sity estimation, and dimensionality reduction tasks in the weattempttofindatransformationspaceinwhichtraining targetdomain[57]. themodelwillimprovetheclassificationorpredictionofsam- plesfromanewdataset(e.g.,targettask,subject,orsession). 4.2.2 WhyDoWeNeedTransferLearning Transfer From Tasks to Tasks. In the BCI domain, where Oneofthemajorhypothesesinconventionalmachinelearn- EEGsignalsarecollectedforanalysisofsubjects,themental ing, such as the supervised learning techniques described tasks and the operational tasks could be different but GUETAL.:EEG-BASEDBRAIN-COMPUTERINTERFACES(BCIS):ASURVEYOFRECENTSTUDIESONSIGNALSENSING... 1653 Their empirical results showed the potential of transfer learningfrom subjects to subjectsin anunsupervised EEG- basedBCI.In[69],HeandWuproposedanoveldifferent-set domainadaptationapproachfortask-to-taskandsubject-to- subjecttransfer,whichconsiderstheverychallengingcasein whichthesourcesubjectandthetargetsubjecthavepartially orcompletelydifferenttasks.Forexample,thesourcesubject mayperformleft-handandright-handMIs,whereasthetar- get subject may perform feet and tongue MIs. They intro- ducedapracticalsettingofdifferentlabelsetsforBCIsand proposedanovellabelalignment(LA)approachtoalignthe sourcelabelspacewiththetargetlabelspace.LAonlyneeds as few as one labelled sample from each class of the target subject,inwhichlabelalignmentcanbeusedasapre-proc- essingstepbeforedifferentfeatureextractionandclassifica- tion algorithms and can be integrated with other domain Fig.6.TransferlearninginBCIs. adaptation approachesto achieve even better performance. ForapplyingtransferlearninginBCIs,especiallyEEG-based interdependentinsomesituations.Forinstance,inalabora- BCIs, subject-to-subject transfer among the same tasks is toryenvironment,amentaltaskmightbetoassessadevice morefrequentlyinvestigated. 
action, such as mental subtraction or motor imagery (MI), Forsubject-to-subjecttransferinsingle-trialERPclassifi- whiletheoperationaltaskisthedeviceactionitselfortheper- cation, Wu [70] proposed both online and offline weighted formanceofthedeviceactionarisingfromERPs.Transferring adaptation regularisation (wAR) algorithms to reduce the decisionrulesbetweendifferenttaskswouldintroducenovel calibration effort. Experiments on a VEP oddball task and signalvariationsandaffecttheerror-relatedpotential,which threedifferentEEGheadsetsdemonstratedthatbothonline presents as a response recognised as an error by users [63]. andofflinewARalgorithmsareeffective.Wualsoproposed Thestudyof[64]showedthatthesignalvariationsoriginat- asourcedomainselectionapproach,whichselectsthemost ingfromtask-to-tasktransfersubstantiallyinfluencedclassi- beneficial source subjects for transfer. This approach can fication feature distribution and the classifier performance. reduce the computational cost of wAR by approximately Otherresultsoftheirstudyarethattheaccuracybasedonthe 50 percent without sacrificing the classification performance, baseline decreased when operational tasks and subtasks thusmakingwARmoresuitableforreal-timeapplications. were generalised, while the differences in features were Veryrecently,Cuietal.[71]proposedanovelapproach, largercomparedwithnon-errorresponses. feature weighted episodic training (FWET), to completely Transfer From Subjects to Subjects. For EEG-based BCIs, eliminate the calibration requirement in subject-to-subject before applying features learned by the conventional transferinEEG-baseddriverdrowsinessestimation.Itinte- approaches to different subjects, a training period of pilot gratesfeatureweightingtolearntheimportanceofdifferent data on each new subject is required due to inter-subject features and episodic training for domain generalisation. 
variability[65].Inthedrivingdrowsinessdetectionstudyof FWET does not need any labelled or unlabelled calibration Weietal.[66],inter-andintra-subjectvariabilitywereevalu- datafromthenewsubjectandhencecouldbeveryusefulin ated, and the feasibility of transferring models was vali- plug-and-playBCIs. dated by implementing hierarchical clustering in a large- TransferFromSessionstoSessions.Theassumptioninses- scale EEG dataset collected from many subjects. The pro- sion-to-session transfer learning in BCIs is that features posed subject-to-subject transfer framework comprises a extracted by the training module and algorithms can be large-scale model pool, which ensures that sufficient data appliedtoadifferentsessionofasubjectforthesametask. areavailableforpositivemodeltransfertoobtainprominent Itisimportanttoevaluatewhatisincommonamongtrain- decoding performance and small-scale baseline calibration ing sections to optimise the decision distributions among datafromthetargetsubjectasaselectorofdecodingmodels differentsessions. in the model pool. Without jeopardising the performance, Alamgir et al. [72] reviewed several transfer learning their driving drowsiness detection results demonstrated methodologies in BCIs that explore and utilise common 90percentcalibrationtimereduction. training data structures of several sessions to reduce the InBCIs,cross-subjecttransferlearning,suchastheleast- trainingtimeandenhancetheperformance.Buildingonthe squares transformation (LST) method proposed by Chiang comparisonandanalysisofothermethodsintheliterature, etal.[67],couldbeusedtodecreasethetrainingdatacollec- Alamgir et al. proposed a general framework for transfer tion time. 
The experiments conducted to validate the LST learning in BCIs, which is in contrast to a general transfer method performance of cross-subject SSVEP data demon- learning study that focuses on domain adaptation where strated the capability of reducing the number of training individual session feature attributes are transferred. Their templatesforanSSVEPBCI.Inter-andintra-subjecttransfer framework regards decision boundaries as random varia- learning is also applied under unsupervised conditions bles, so the distribution of decision boundaries could be when no labelled data are available. He and Wu [68] pre- obtainedfromprevioussessions.Withanalteredregression sentedamethodtoalignEEGtrialsdirectlyintheeuclidean method and the consideration of feature decomposition, space across different subjects to increase the similarity. their experiments on amyotrophic lateral sclerosis patients 1654 IEEE/ACMTRANSACTIONSONCOMPUTATIONALBIOLOGYANDBIOINFORMATICS,VOL.18,NO.5,SEPTEMBER/OCTOBER 2021 usinganMIBCIrevealeditseffectivenessinlearningstruc- ture.Therearealsosomeproblematicconditionsofthepro- posed transfer learning method, including the difficulty in balancingtheinitialisationofspatialweightsandtheneces- sityofaddinganextraloopinthealgorithmfordetermining thespectralandspatialcombination. Inoneofthelatestparadigmsstudyingimaginedspeech, inwhichahumansubjectimaginesutteringawordwithout e physical sound or movement, GarcA-a-Salinas et al. [73] proposed a method to extract code words related to the EEG signals. After presenting a new imagined word along with the EEG signals of characteristic code words, the new word was merged with the prior class’s histograms and a classifier for transferlearning. This study implies ageneral trend of applying session-based transfer learning to an Fig. 7. Fuzzy sets, fuzzy rules, and fuzzy integrals for EEG signal imaginedspeechdomaininEEG-basedBCIs. classification. Transfer From Headset to Headset. 
Apart from the above cases of transfer learning in BCIs, ideally, a BCI system should be completely independent of any specific EEG headset such that the user can replace or upgrade his/her headset freely without re-calibration. This should greatly facilitate real-world applications of BCIs. However, this goal is very challenging to achieve. One step towards it is to use historical data from the same user to reduce the calibration effort for the new EEG headset.

Wu et al. [74] proposed active weighted adaptation regularisation (AwAR) for headset-to-headset transfer. It integrates wAR, which uses labelled data from the previous headset and handles class imbalance, and active learning, which selects the most informative samples from the new headset to label. Experiments on single-trial ERP classification showed that AwAR could significantly reduce the calibration data required for the new headset.

4.3 Fuzzy Models

4.3.1 Why Do We Need Fuzzy Models?

Currently, the classification processes of most machine learning methods are not easily interpretable or traceable. By comparison, fuzzy-rule based classification systems have developed sensible rules to process EEG activities based on our knowledge of neurophysiology and neuroscience, which are thus explainable. Additionally, conventional neural network classification methods, such as an MLP model, do not take into account the fact that EEG signals are non-linear and non-stationary. In comparison, a fuzzy set does not have the firm boundaries (the boundary transition is characterised by a membership function [75]) used for exploiting approximation techniques in neural networks, which makes neuro-fuzzy models preferable for extracting intrinsic EEG activities. Exploring these fuzzy models may be useful for understanding and improving BCIs learned automatically from EEG signals or possibly gaining new insights into BCIs. Here, we collect some fuzzy models from fuzzy sets and systems to estimate or accelerate BCI applications.

4.3.2 EEG-Based Fuzzy Models

A fuzzy inference system (FIS) can be used in BCI applications to automatically extract fuzzy "If-Then" rules from the data that describe which input feature values correspond to which output category [76]. Such fuzzy rules confer the advantage of using flexible boundary conditions for BCI applications, EEG pattern classification, and interpreting what the FIS has learned, as shown in Fig. 7. Furthermore, fuzzy measure theory, such as a fuzzy integral, is suitable for applications where data fusion is required to consider possible data interactions [77], such as the fuzzy fusion of multiple information sources.

Fuzzy models integrated with neural networks have also been applied in BCI systems. For example, fuzzy neural networks (FNNs) combine the advantages of neural networks and FIS. This architecture is similar to that of neural networks, and the input (or weight) is fuzzified [78]. The FNN recognises the fuzzy rules and adjusts the membership function by tuning the connection weights. In particular, a self-organising neural fuzzy inference network (SONFIN) was proposed by [79] using a dynamic FNN architecture to create a self-adaptive architecture for the identification of the fuzzy model. Such a hybrid structure is more explainable while still utilising the learning capability of the neural network.

Here, we summarise up-to-date fuzzy model solutions for EEG-based BCI systems and applications. By using fuzzy sets, Wu et al. [80] proposed to extend the multiclass EEG common spatial pattern (CSP) filters from classification to regression in a large-scale sustained-attention PVT and later further integrated them with Riemannian tangent space features for improved PVT reaction time estimation performance [81]. Considering the advantages of the fuzzy membership degree, [82] used a fuzzy membership function instead of a step function, which decreased the sensitivity of entropy values to noisy EEG signals. This improved the EEG complexity evaluation in resting state and SSVEP sessions [83], which are associated with healthcare applications [84]. By integrating fuzzy sets with EEG-based BCI domain adaptation, Wu et al. [85] proposed an online weighted adaptation regularisation for regression (OwARR) algorithm to reduce the amount of subject-specific calibration of the EEG data. Furthermore, by integrating fuzzy rules with domain adaptation, Chang et al. [86] generated a fuzzy rule-based brain-state-drift detector using Riemann-metric-based clustering, allowing the data distribution to be observable. By adopting fuzzy integrals [87], a motor-imagery-based BCI exhibited robust performance for offline single-trial classification and real-time control of a robotic