Linköping Studies in Science and Technology. Dissertations. No. 1012

Regressor and Structure Selection
Uses of ANOVA in System Identification

Ingela Lind

Department of Electrical Engineering
Linköpings universitet, SE–581 83 Linköping, Sweden
Linköping 2006

The cover picture is a little-leaf linden, Tilia cordata, which is named "lind" in Swedish.

Regressor and Structure Selection: Uses of ANOVA in System Identification

© 2006 Ingela Lind

[email protected]
www.control.isy.liu.se

Division of Automatic Control
Department of Electrical Engineering
Linköpings universitet
SE–581 83 Linköping
Sweden

ISBN 91-85523-98-4
ISSN 0345-7524

Printed by LiU-Tryck, Linköping, Sweden 2006

To Mattias

Abstract

Identification of nonlinear dynamical models of a black-box nature involves both structure decisions (i.e., which regressors to use and the selection of a regressor function) and the estimation of the parameters involved. The typical approach in system identification is often a mix of all these steps, which for example means that the selection of regressors is based on the fits that are achieved for different choices. Alternatively, one could then interpret the regressor selection as based on hypothesis tests (F-tests) at a certain confidence level that depends on the data. It would in many cases be desirable to decide which regressors to use independently of the other steps. A survey of regressor selection methods used for linear regression and nonlinear identification problems is given.

In this thesis we investigate what the well-known method of analysis of variance (ANOVA) can offer for this problem. System identification applications violate many of the ideal conditions for which ANOVA was designed, and we study how the method performs under such non-ideal conditions. It turns out that ANOVA gives better and more homogeneous results compared to several other regressor selection methods. Some practical aspects are discussed, especially how to categorise the data set for the use of ANOVA, and whether or not to balance the data set used for structure identification.
An ANOVA-based method, Test of Interactions using Layout for Intermixed ANOVA (TILIA), for regressor selection in typical system identification problems with many candidate regressors is developed and tested with good performance on a variety of simulated and measured data sets.

Typical system identification applications of ANOVA, such as guiding the choice of linear terms in the regression vector and the choice of regime variables in local linear models, are investigated.

It is also shown that the ANOVA problem can be recast as an optimisation problem. Two modified, convex versions of the ANOVA optimisation problem are then proposed, and it turns out that they are closely related to the non-negative garrote and wavelet shrinkage methods, respectively. In the case of balanced data, it is also shown that the methods have a nice orthogonality property, in the sense that different groups of parameters can be computed independently.

Acknowledgments

First of all, I would like to thank my supervisor, Professor Lennart Ljung, for letting me join the nice, enthusiastic and ambitious researchers in the Automatic Control group, and for suggesting such an interesting topic for research. He has shown honourable patience with delays due to maternity leaves, and has also been very encouraging when needed. Without his excellent guidance and support this thesis would not exist.

An important part of the work, for me, is teaching. I can sincerely say that without the support of Professor Svante Gunnarsson, I would not have considered starting, or continuing, graduate studies. Ulla Salaneck, who somehow manages to keep track of all practical and administrative details, also deserves a special thanks. Thank you for maintaining such a welcoming atmosphere.

I have spent lots of time working together with (or eating in the company of) Jacob Roll during these years. He has been, and is, a good friend as well as a working partner. Thank you. I would also like to thank all the other people previously or presently in the group, for their cheerful attitude, and for their unbelievable ability to spawn detailed discussions of anything between heaven and earth during the coffee breaks.
A number of people have been a great help during the thesis writing. I would like to thank Gustaf Hendeby and Dr. Martin Enquist for providing the style files used, and Gustaf also for all his help with LaTeX issues. Henrik Tidefelt has helped me with the pictures in the Introduction. The following people (in alphabetical order) have helped me by proofreading parts of the thesis: Daniel Ankelhed, Marcus Gerdin, Janne Harju, Dr. Jacob Roll, Dr. Thomas Schön and Johanna Wallén. They have given many insightful comments, which have improved the work considerably. Thank you all.

This work has been supported by the Swedish Research Council (VR) and by the graduate school ECSEL (Excellence Center in Computer Science and Systems Engineering in Linköping), which are gratefully acknowledged.

I also want to thank my extended family for their love and support. Special thanks to my parents for always encouraging me and trusting my ability to handle things on my own, to my husband Mattias for sharing everything and trying to boost my sometimes low self-confidence, to my parents-in-law for making me feel part of their family, and finally to my daughters Elsa and Nora for giving me perspective on the important things in life.

Last here, but most central to me, I would like to thank Jesus Christ for his boundless grace and love.

Contents

1 Introduction
  1.1 System Identification
  1.2 Regressor Selection
  1.3 Model Type Selection
  1.4 Parameter Estimation
  1.5 Contributions
  1.6 Thesis Outline

2 Survey of Methods for Finding Significant Regressors in Nonlinear Regression
  2.1 Background in Linear Regression
    2.1.1 All Possible Regressions
    2.1.2 Stepwise Regression
    2.1.3 Backward Elimination
    2.1.4 Non-Negative Garrote
    2.1.5 Lasso
    2.1.6 ISRR
    2.1.7 LARS
  2.2 Nonlinear Methods
    2.2.1 Comparison of Methods
    2.2.2 Exhaustive Search
    2.2.3 Non-Parametric FPE
    2.2.4 Stepwise Regression of NARMAX Models using ERR
    2.2.5 Bootstrap-Based Confidence Intervals
    2.2.6 (Partial) Lag Dependence Function
    2.2.7 Local Conditional Mean and ANOVA
    2.2.8 Local Conditional Variance
    2.2.9 False Nearest Neighbours
    2.2.10 Lipschitz Quotient
    2.2.11 Rank of Linearised System
    2.2.12 Mutual Information
    2.2.13 MARS
    2.2.14 Supanova

3 The ANOVA Idea
  3.1 Background
    3.1.1 Origin and Use of ANOVA
    3.1.2 Sampling Distributions
  3.2 Two-Way Analysis of Variance
    3.2.1 Model
    3.2.2 ANOVA Tests
    3.2.3 ANOVA Table
    3.2.4 Assumptions
  3.3 Random Effects and Mixed Models
  3.4 Significance and Power of ANOVA
  3.5 Unbalanced Data Sets
    3.5.1 Proportional Data
    3.5.2 Approximate Methods
    3.5.3 Exact Method

4 Determine the Structure of NFIR Models
  4.1 Problem Description
    4.1.1 Systems
    4.1.2 Inputs
  4.2 Structure Identification using ANOVA
    4.2.1 ANOVA
    4.2.2 Checks of Assumptions and Corrections
    4.2.3 Analysis of the Test Systems with Continuous-Level Input
  4.3 Validation Based Exhaustive Search Within ANN Models
  4.4 Regressor Selection using the Gamma Test
  4.5 Regressor Selection using the Lipschitz Method
  4.6 Regressor Selection using Stepwise Regression and ERR
  4.7 Test Results
    4.7.1 Fixed-Level Input Signal
    4.7.2 Continuous-Level Input Signal
    4.7.3 Correlated Input Signal
  4.8 Conclusions

5 Practical Considerations with the Use of ANOVA
  5.1 Which Variant of ANOVA Should be Used?
  5.2 Categorisation
    5.2.1 Independent Regressors
    5.2.2 Correlated Regressors