ebook img

$D$-optimal saturated designs: a simulation study PDF

0.07 MB·
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview $D$-optimal saturated designs: a simulation study

Chapter 1 D-optimal saturated designs: a simulation study RobertoFontana,FabioRapalloandMariaPieraRogantin 4 1 0 2 n a J Abstract In this work we focus on saturated D-optimal designs. Using recent re- 4 sults,weidentifyD-optimaldesignswiththesolutionsofanoptimizationproblem withlinearconstraints.Weintroducenewobjectivefunctionsbasedonthegeomet- ] E ric structure of the design and we compare them with the classical D-efficiency M criterion. We performa simulation study. In all the test cases we observe that de- signs with high values of D-efficiency have also high values of the new objective . at functions. t s [ 1 1.1 Introduction v 4 4 The optimality of an experimental design depends on the statistical model that is 8 assumedandisassessedwithrespecttoastatisticalcriterion.Amongthedifferent 0 criteria,inthischapterwefocusonD-optimality. . 1 Widely used statistical systems like SAS and R have proceduresfor finding an 0 optimaldesignaccordingtotheuser’sspecifications.Proc OptexofSAS/QC[5] 4 searchesforoptimalexperimentaldesignsinthefollowingway.Theuserspecifies 1 anefficiencycriterion,asetofcandidatedesignpoints,amodelandthesizeofthe : v designtobefound,andtheproceduregeneratesasubsetofthecandidatesetsothat i X thetermsinthemodelcanbeestimatedasefficientlyaspossible. r a RobertoFontana DepartmentDISMA,PolitecnicodiTorino,CorsoDucadegliAbruzzi24,10127TORINO,Italy, e-mail:[email protected] FabioRapallo DepartmentDISIT,Universita`delPiemonteOrientale,VialeTeresaMichel11,15121ALESSAN- DRIA,Italy,e-mail:[email protected] MariaPieraRogantin Department DIMA, Universita` di Genova, Via Dodecaneso 35, 16146 GENOVA, Italy, e-mail: [email protected] 1 2 RobertoFontana,FabioRapalloandMariaPieraRogantin There are several algorithmsfor searching for D-optimaldesigns. They have a commonstructure.Indeed,theystartfromaninitialdesign,randomlygeneratedor userspecified,andmove,in a finitenumberofsteps, to abetterdesign.Allofthe search algorithms are based on adding points to the growing design and deleting pointsfromadesignthatistoobig.Mainreferencestooptimaldesignsinclude[1], [4],[7],[8],[9]and[11]. Inthiswork,weperformasimulationstudytoanalyzeadifferentapproachfor describingD-optimaldesignsinthecaseofsaturatedfractions.Saturatedfractions, orsaturateddesigns,containanumberofpointsthatisequaltothe numberofes- timable parameters of the model. It follows that saturated designs are often used in place ofstandarddesigns,suchas orthogonalfractionalfactorialdesigns,when thecostofeachexperimentalrunishigh.Weshowhowthegeometricstructureof a fraction is in relation with its D-optimality, using a recent result in [3] that al- lowsustoidentifysaturateddesignswiththepointswithcoordinatesin{0,1}ofa polytope,beingthepolytopedescribedbyasystemoflinearinequalities.Thelinear programmingproblemisbasedonacombinatorialobject,namelythecircuitbasisof themodelmatrix.Sincethecircuitsyieldageometriccharacterizationofsaturated fractions, we investigate here the connections between the classical D-optimality criterionandthepositionofthedesignpointswithrespecttothecircuits. In this way the search for D-optimal designs can be stated as an optimization problemwhere the constraintsare a system oflinear inequalities.Within the clas- sical framework the objective function to be maximized is the determinantof the information matrix. In our simulations, we define new objective functions, which take into account the geometric structure of the design points with respect to the circuitsoftherelevantdesignmatrix.Westudythebehaviorofsuchobjectivefunc- tionsandwecomparethemwiththeclassicalD-efficiencycriterion. Thechapterisorganizedasfollows.InSect.1.2webrieflydescribetheresultsof [3]andinparticularhowsaturateddesignscanbeidentifiedwith{0,1}pointsthat satisfyasystemoflinearinequalities.TheninSect.1.3wepresenttheresultsofa simulationstudyinwhich,usingsometestcases,weexperimentdifferentobjective functionsandweanalyzetheirrelationshipwiththeD-optimalcriterion.Concluding remarksaremadeinSect.1.4. 1.2 Circuits and saturated designs As described in [3], the key ingredient to characterize the saturated fractions of a factorial design is its circuit basis. We recall here only the basic notions about circuitsinordertointroduceourtheory.Forasurveyoncircuitsanditsconnections withStatistics,thereadercanreferto[6]. GivenamodelmatrixX ofafullfactorialdesignD,anintegervector f isinthe kernelofXt ifandonlyifXtf =0.WedenotebyAthetransposeofX.Moreover, we denote by supp(f) the support of the integer vector f, i.e., the set of indices j such that f 6=0. Finally,the indicatorvectorof f is the binaryvector(f 6=0), j j 1 D-optimalsaturateddesigns:asimulationstudy 3 where(·)istheindicatorfunction.Anintegervector f isacircuitofAifandonly if: 1. f ∈ker(A); 2. there is no other integer vector g ∈ ker(A) such that supp(g) ⊂ supp(f) and supp(g)6=supp(f). ThesetofallcircuitsofAisdenotedbyC ,andisnamedasthecircuitbasisof A A.ItisknownthatC isalwaysfinite.ThesetC canbecomputedthroughspecific A A software.Inourexamples,wehaveused4ti2[10]. Given a modelmatrix X on a fullfactorial design D with K design points and p degrees of freedom, we recall that a fraction F ⊂D with p design points is saturatedifdet(XF)6=0,whereXF istherestrictionofX tothedesignpointsinF. With a slight abuse of notation,F denotesboth a fractionand its support.Under these assumptions, the relations between saturated fractions and the circuit basis C ={f ,...,f }associatedtoAisillustratedinthetheorembelow,provedin[3]. A 1 L Theorem1.F is asaturatedfractionifandonlyif itdoesnotcontainany ofthe supports{supp(f ),...,supp(f )}ofthecircuitsofA=Xt. 1 L 1.3 Simulationstudy The theory described in Sect. 1.2 allows us to identify saturated designs with the feasible solutions of an integer linear programming problem. Let C = (c ,i = A ij 1,...,L,j =1,...,K) be the matrix, whose rows contain the values of the indica- torfunctionsofthecircuits f ,...,f ,c =(f 6=0),i=1,...,L,j=1,...,K and 1 L ij ij Y=(y ,...,y )betheK-dimensionalcolumnvectorthatcontainstheunknownval- 1 K uesoftheindicatorfunctionofthepointsofF.InourproblemthevectorY must satisfythefollowingconditions: 1. thenumberofpointsinthefractionsmustbeequalto p; 2. thesupportofthefractionmustnotcontainanyofthesupportsofthecircuits. Informulae,thisfacttranslatesintothefollowingconstraints: 1t Y =p, (1.1) K C Y <b, (1.2) A y ∈{0,1} (1.3) i whereb=(b ,...,b )isthecolumnvectordefinedbyb =#supp(f),i=1,...,L, 1 L i i and1 isthecolumnvectoroflengthK andwhoseentriesareallequalto1. K Since DY =det(V(Y))=det(XFt XF) is an objective function,it followsthat a D-optimaldesignisthesolutionoftheoptimizationproblem maximize det(V(Y)) subjectto(1.1),(1.2)and(1.3). 4 RobertoFontana,FabioRapalloandMariaPieraRogantin Ingeneraltheobjectivefunctiontobemaximizeddet(V(Y))hasseverallocalop- timaandtheproblemoffindingtheglobaloptimumispartofcurrentresearch,[2]. Insteadoftryingtosolvethisoptimizationprobleminthisworkweprefertostudy different objective functions that are simpler than the original one but that could generatethesame optimalsolutions.ByanalogyofTheorem1,ournewobjective functionsaredefinedusingthecircuitsofthemodelmatrix. ForanyY, we definethevectorb =C Y.Thisvectorb containsthenumber Y A Y of points that are in the intersection between the fraction F identified by Y and the supportof each circuit f ∈C ,i=1,...,L. From (1.2) we know that each of i A theseintersectionsmustbestrictlycontainedinthesupportofeachcircuit.Foreach circuit f,i=1,...,Litseemsnaturaltominimizethecardinality(b ) oftheinter- i Y i sectionbetweenitssupportsupp(f)andYwithrespecttothesizeofitssupport,b. i i Therefore,weconsideredthefollowingtwoobjectivefunctions: • g (Y)=(cid:229) L (b−b ); 1 i=1 Y i • g (Y)=(cid:229) L (b−b )2. 2 i=1 Y i From the examples analyzed in Sect. 1.3.1, we observe that the D-optimality is reached with fractions that contain part of the largest supports of the circuits, althoughthisfactseemstodisagreewithThm.1.Infact,Thm.1statesthatfractions containingthesupportofacircuitarenotsaturated,andthereforeonewouldexpect thatoptimalfractionswillhaveintersectionsassmallaspossiblewiththesupports ofthecircuits.Ontheotherhand,ourexperimentsshowthatoptimalityisreached withfractionshavingintersectionsaslargeaspossiblewithsuchsupports.Forthis reasonweconsideralsothefollowingobjectivefunction: • g (Y)=max(b ). 3 Y AsameasureofD-optimalityweusetheD-efficiency,[5].TheD-efficiencyofa fractionF withindicatorvectorY isdefinedas 1 1 E = D#F ×100 Y (cid:18)#F Y (cid:19) where #F is the number of points of F that is equal to p in our case, since we consideronlysaturateddesigns. 1.3.1 Firstcase. 24 withmaineffects and 2-way interactions Letusconsiderthe 24 designandthe modelwith mainfactorsand2-wayinterac- tions.ThedesignmatrixX ofthefulldesignhas16rowsand11columns,thenum- ber ofestimable parameters.As the matrixX hasrank 11,we search forfractions with 11 points. A direct computation shows that there are 16 =4,368 fractions 11 with11points:amongthem3,008aresaturated,andtherem(cid:0)ai(cid:1)ning1,360arenot. Notice that equivalencesup to permutationsof factor or levels are not considered here. 1 D-optimalsaturateddesigns:asimulationstudy 5 Thecircuitsare140andthecardinalitiesoftheirsupportsare8in20cases,10in 40cases,12in80cases.Formoredetailsreferto[3].Thisexampleissmallenough foracompleteenumerationofallsaturatedfractions.Moreover,thestructureofthat fractionsreducestofewcases,duetothesymmetryoftheproblem. For each saturated fraction F with indicator vectorY we compute the vector b ,whosecomponentsarethesizeoftheintersectionbetweenthefractionandthe Y support of all the circuits, F ∩supp(f),i=1,...,140, and we consider b−b . i Y Recallthatbisthevectorofthecardinalitiesofthecircuits.Thefrequencytableof b−b describeshowmanypointsneedtobeaddedtoafractioninordertocomplete Y eachcircuit.AllthefrequencytablesaredisplayedintheleftsideofTable1.1,while ontherightsidewereportthecorrespondingvaluesofD-efficiency. Table1.1 Frequencytablesofb−bY forthe24designwithmaineffectsand2-wayinteractions. table(b−bY) EY 1 2 3 4 5 68.29 77.46 83.38 5 15 50 60 10 192 0 0 5 18 48 55 14 1,040 0 0 5 21 46 50 18 960 0 0 5 24 44 45 22 480 0 0 5 27 42 40 26 0 320 0 5 30 40 35 30 0 0 16 Total 2,672 320 16 Forinstance, consideroneof the 192fractionsin the first row.Amongthe 140 circuits, 5 ofthem are completedbyadding1 pointto the fraction,15 ofthem by adding2points,andsoon.Weobservethatthereisaperfectdependencebetween theD-efficiencyandthefrequencytableofb−b . Y However, analyzing the objective functions g (Y), g (Y) and g (Y), we argue 1 2 3 thatthepreviousfindinghasnotrivialexplanation.Thevaluesofallourobjective functionsaredisplayedinTable1.2. From Table 1.2 we observe that both g (Y) and g (Y) are increasing as D- 2 3 efficiency increases. Notice also that g (Y) is constant over all the saturated frac- 1 tions.Thisisageneralfactforallno-m-wayinteractionmodels. Proposition1.Forano-m-wayinteractionmodel,g (Y)isconstantoverallsatu- 1 ratedfractions. Proof. WerecallthatC =(c ,i=1,...,L,j=1,...,K)istheL×Kmatrix,whose A ij rows contain the values of the indicator functions of the supports of the circuits f ,...,f ,c =(f 6=0),i=1,...,L,j=1,...,K.Wehave 1 L ij ij L L L (cid:229) (cid:229) (cid:229) g (Y)= (b−b ) = (b) − (b ) . 1 Y i i Y i i=1 i=1 i=1 6 RobertoFontana,FabioRapalloandMariaPieraRogantin Table1.2 Classificationofallsaturatedfractionsforthe24 designwithmaineffectsand2-way interactions. g1(Y) g2(Y) g3(Y) EY n 475 1,725 9 68.29 192 475 1,739 10 68.29 960 475 1,753 10 68.29 960 475 1,739 11 68.29 80 475 1,767 11 68.29 480 475 1,781 11 77.46 320 475 1,795 11 83.38 16 Total 3,008 ThefirstaddendumdoesnotdependonY,andforthesecondoneweget L L K K L (cid:229) (cid:229) (cid:229) (cid:229) (cid:229) (b ) = c Y = Y c . Y i ij j j ij i=1 i=1j=1 j=1 i=1 Nowobservethata no-m-wayinteractionmodeldoesnotchangewhenpermuting the factors or the levels of the factors. Therefore, by a symmetry argument, each designpointmustbelongtothesamenumberqofcircuits,andthus(cid:229) L c =q.It i=1 ij followsthat L K (cid:229) (cid:229) (b ) =q Y =pq. Y i j i=1 j=1 ⊓⊔ InviewofProp.1,intheremainingexampleswewillconsideronlythefunctions g andg . 2 3 1.3.2 Second case.3×3×4 withmaineffects and 2-way interactions Let us consider the 3×3×4 design and the model with main factors and 2-way interactions. The model has p =24 degrees of freedom. The number of circuits is 17,994.In this case the numberof possible subsets of the full design is 36 = 24 1,251,677,700.Itwouldbecomputationallyunfeasibletoanalyzeallthefra(cid:0)ctio(cid:1)ns. Weusethemethodologydescribedin[2]toobtainasampleofsaturatedD-optimal designs. It is worth noting that this methodologyfinds D-optimaldesigns and not simply saturated designs.This is particularlyusefulin ourcase becauseallows us tostudyfractionsforwhichtheD-efficiencyisveryhigh.Thesamplecontains500 designs,380different. 1 D-optimalsaturateddesigns:asimulationstudy 7 TheresultsaresummarizedinTable1.3,wherethefractionswithminimumD- efficiencyE havebeencollapsedinauniquerowinordertosavespace.Weobserve Y that for 138 different designs the maximum value of D-efficiency, E =24.41 is Y obtainedforbothg (Y)andg (Y)attheirmaximumvaluesg (Y)=970,896and 2 3 2 g (Y)=24. 3 Table 1.3 Classification of 380 random saturated fractions for the 3×3×4 design with main effectsand2-wayinteractions. g2(Y) g3(Y) EY n ≤963,008 ≤21 22.27 37 962,816 21 23.6 7 962,816 22 23.6 12 963,700 22 23.6 34 965,308 22 23.6 46 966,760 22 23.6 9 967,676 22 23.6 6 970,860 24 23.6 91 970,896 24 24.41 138 Total 380 1.3.3 Thirdcase. 25 withmaineffects Letusconsiderthe25 designandthemodelwithmaineffectsonly.Themodelhas p=6 degreesof freedom. The numberof circuits is 353,616.As in the previous casewe usethemethodologydescribedin [2]togetasampleof500designs,414 different. The results are summarizedin Table 1.4. We observethat for194 differentde- signs, the maximumvalue of D-efficiency,E =90.48is obtainedfor both g (Y) Y 2 andg (Y)attheirmaximumvaluesg (Y)=11,375,490andg (Y)=6. 3 2 3 Table1.4 Classificationof414randomsaturatedfractionsforthe25designwithmaineffects. g2(Y) g3(Y) EY n 11,360,866 6 76.31 31 11,342,586 6 83.99 9 11,371,834 6 83.99 126 11,375,490 5 83.99 54 11,375,490 6 90.48 194 Total 414 8 RobertoFontana,FabioRapalloandMariaPieraRogantin 1.4 Concluding remarks The examples discussed in the previous section show that the D-efficiency of the saturatedfractionsandthenewobjectivefunctionsbasedoncombinatorialobjects arestronglydependent.Thethreeexamplessuggesttoinvestigatesuchconnection inamoregeneralframework,inordertocharacterizesaturatedD-optimalfractions intermsoftheirgeometricstructure.Noticethatourpresentationislimitedtosat- uratedfractions,butitwouldbeinterestingtoextendtheanalysistootherkindsof fractions.Moreover,weneedtoinvestigatetheconnectionsbetweenthenewobjec- tivefunctionsandothercriteriathanD-efficiency. Since the numberof circuits dramatically increases with the dimensionsof the factorialdesign,boththeoreticaltoolsandsimulationwillbeessentialforthestudy oflargedesigns. References 1. Atkinson,A.C.,Donev,A.N.,Tobias,R.D.:Optimumexperimentaldesigns,withSAS.Oxford UniversityPress,NewYork(2007) 2. Fontana, R.:Random generation ofoptimalsaturated designs (2013). Preprint availableat arXiv:1303.6529 3. Fontana,R.,Rapallo,F.,Rogantin,M.P.:Acharacterizationofsaturateddesignsforfactorial experiments. J.Stat.Plann.Inference (2013). DOIhttp://dx.doi.org/10.1016/j.jspi.2013.10. 011. OnlineFirst 4. Goos,P.,Jones,B.:Optimaldesignofexperiments:acasestudyapproach. Wiley,Chichester, UK(2011) 5. Institute,S.:SAS/QC9.2User’sGuide,SecondEdition. Cary,NC(2010) 6. Ohsugi,H.:AdictionaryofGro¨bnerbasesoftoricideals.In:T.Hibi(ed.)HarmonyofGro¨bner basesandthemodernindustrialsociety,pp.253–281.WorldScientific,Hackensack,NJ(2012) 7. Pukelsheim, F.: Optimal design of experiments, Classics in Applied Mathematics, vol. 50. SocietyforIndustrialandAppliedMathematics,Philadelphia,PA(2006) 8. Rasch,D.,Pilz,J.,Verdooren,L.,Gebhardt,A.:OptimalexperimentaldesignwithR. CRC Press,BocaRaton,FL(2011) 9. Shah, K.R., Sinha, B.K.: Theory of optimal designs, Lecture Notes in Statistics, vol. 54. Springer-Verlag,Berlin(1989) 10. 4ti2team:4ti2—asoftwarepackageforalgebraic,geometricandcombinatorialproblemson linearspaces. Availableatwww.4ti2.de(2008) 11. Wynn, H.P.: The sequential generation of D-optimum experimental designs. Ann. Math. Statist.41(5),1655–1664(1970)

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.