DiscreteMathematicsandTheoreticalComputerScience DMTCSvol. (subm.),bytheauthors,1–1 Forbidden Configurations: Finding the number predicted by the Anstee-Sali Conjecture is NP-hard 2 1 0 Miguel Raggi † 2 t CentrodeCienciasMatema´ticas c UniversidadNacionalAuto´nomadeMe´xico, O Morelia,Michoaca´n,Me´xico 0 3 ] LetF beahypergraphandletforb(m,F)denotethemaximumnumberofedgesahypergraphwithmverticescan M haveifitdoesn’tcontainF asasubhypergraph. AconjectureofAnsteeandSalipredictstheasymptoticbehaviour D offorb(m,F)forfixedF. InthispaperweprovethatevenfindingthispredictedasymptoticbehaviourisanNP- hardproblem,meaningthatiftheAnstee-Saliconjectureweretrue,findingtheasymptoticsofforb(m,F)wouldbe . s NP-hard. c [ MathematicsSubjectClassification:05D05 1 v Keywords:ForbiddenConfiguration,Hypergraph,Trace,NP-hard,NP-complete,Anstee-Saliconjecture 9 8 1 8 1 Introduction . 0 Thepaperconsidersanextremalproblem.SomeofthemostcelebratedextremalresultsarethoseofErdo˝s 1 andStone[ES46]andErdo˝sandSimonovits[ES66].Theyconsiderthefollowingproblem:Givenm∈N 2 1 andagraphF,findthemaximumnumberofedgesinagraphonmverticesthatavoidshavingasubgraph : isomorphictoF. v i Thereareanumberofwaystogeneralizethistohypergraphs. Ak-uniformhypergraphisoneinwhich X each edge has size k. Some view k-uniform hypergraphs as the most natural generalization of a graph r (agraphisa2-uniformhypergraph)andonemightalsogeneralizetheforbiddensubgraphproblemtoa a forbidden k-uniform subhypergraph problem. There are both asymptotic results and exact bounds (e.g. [dCF00],[Pik08],[Fu¨r91]). Forbidden Configurations is a different (but also natural) generalization that is studied mainly by Richard Anstee and his colaborators. We consider the following problem: Given m ∈ N and a hy- pergraph F, find the maximum number of edges in a simple hypergraph H on m vertices that avoids havingasubhypergraphisomorphictoF. †Email:[email protected],UNAM subm.toDMTCS (cid:13)c bytheauthorsDiscreteMathematicsandTheoreticalComputerScience(DMTCS),Nancy,France 2 MiguelRaggi Wefinditconvenienttousethelanguageofmatricestodescribehypergraphs:Eachcolumnofa{0,1}- matrixcanbeviewedasanincidencevectoronthesetofrows. Definition 1.1. Define a matrix to be simple if it is a {0,1}-matrix with no repeated columns. Then an m × n simple matrix corresponds to a simple hypergraph (or set system) on m vertices with n distinctedgeswhereweallowtheemptyedge. Let(cid:107)A(cid:107)denotethenumberofcolumnsinA(whichisthe cardinalityoftheassociatedsetsystem). Definition1.2. LetAandBbe{0,1}-matriceswiththesamenumberofrows.Definetheconcatenation [A|B] to be the configuration that results from taking all columns of A together with all columns of B. Fort∈N,wedefinetheproduct t·A:=[A|A| ··· |A]. (cid:124) (cid:123)(cid:122) (cid:125) ttimes Ourobjectsofstudyare{0,1}-matriceswithrowandcolumnorderinformationstrippedfromthem: Definition1.3. Two{0,1}-matricesaresaidtobeequivalentifoneisarowandcolumnpermutationof theother. Thisdefinesanequivalencerelation. Anequivalenceclassiscalledaconfiguration. Abusingnotation,wewillcommonlyusematrices(representatives)andtheircorrespondingconfigura- tionsinterchangeably. Definition 1.4. For a configuration F and a {0,1}-matrix A (or a configuration A), we say that F is a subconfigurationofA,andwriteF ≺AifthereisarepresentativeofF whichisasubmatrixofA. We sayAhasnoconfigurationF (ordoesn’tcontainF asaconfiguration)ifF isnotasubconfiguration ofA. LetAvoid(m,F)denotethesetofallm-rowedsimplematriceswithnosubconfigurationF. Ourmainextremalproblemistocompute forb(m,F)=max{(cid:107)A(cid:107) : A∈Avoid(m,F)}. A Perhapssomeexamplesareuseful: (cid:20) (cid:21) 1 • forb(m, )=m+1,sincewecantakeallcolumnswithatmostone1. 1 (cid:20) (cid:21) 1 • forb(m, )=2,sincewemayonlytakethecolumnof1’sandthecolumnof0’s(i.e. theempty 0 setandthecompleteset). (cid:20) (cid:21) • forb(m, 1 1 )=(cid:0)m(cid:1)+(cid:0)m(cid:1)+(cid:0)m(cid:1),bytakingallcolumnswithatmosttwo1’s. Theproofthat 1 1 2 1 0 thisisindeedthemaximumiseasyandcanbefoundin[Rag11]. Surveysonthetopiccanbefoundin[Ans]and[Rag11]. LetAc denotethe{0,1}-complementofA (replaceevery0inAbya1andevery1bya0). Notethatforb(m,F)=forb(m,Fc). Remark1.5. LetF andGbeconfigurationssuchthatF ≺G. Thenforb(m,F)≤forb(m,G). FindingthenumberpredictedbytheAnstee-SaliConjectureisNP-hard 3 Wesayacolumnαhascolumnsumtifithasexactlytones. Define0 tobeacolumnwithm0’s m and1 tobeacolumnofm1’s. m Forasetofrows S, weletA| denotethesubmatrixof A givenbyrestrictingtherows ofA toonly S thoseinS. AnimportantgeneralresultduetoFu¨rediappliestosimpleortonon-simpleconfigurations. Theorem1.6(Z.Fu¨redi). LetF beagivenk-rowed{0,1}−matrix. Thenforb(m,F)isO(mk). We desire more accurate asymptotic bounds. Anstee and Sali conjectured that the best asymptotic boundscanbeachievedwithcertainproductconstructions. Definition1.7. LetAandBbe{0,1}-matrices. WedefinetheproductA×Bbytakingeachcolumnof AandputtingitontopofeverycolumnofB. Hereisanexampleofaproduct: 0 0 1 1 1 1 (cid:20) (cid:21) (cid:20) (cid:21) A 0 1 1 1 0 0 0 0 0 1 1 A= , B = =⇒ ×= . 0 0 1 0 1 1 0 1 0 1 0 B 0 1 0 1 0 1 Notethatthisisawelldefinedoperationinconfigurations. We are interested in asymptotic bounds for forb(m,F). Let I be the m×m identity matrix, Ic m m be the {0,1}-complement of I (all ones except for the diagonal) and let T be the tower matrix: a m m matrixcorrespondingtoamaximumchaininthepartiallyorderedsetofthepowersetofthevertices. For example, 0 1 1 1 1 0 0 1 1 1 T4 =0 0 0 1 1 0 0 0 0 1 AnsteeandSaliconjecturedthattheasymptotically“best”constructionsavoidingasingleconfiguration wouldbeproductsofI,IcandT. Conjecture1.8. [AS05]LetF beaconfiguration. Let P (a,b,c):=I ×...×I ×Ic×...×Ic×T ×...×T , r r r r r r r (cid:124) (cid:123)(cid:122) (cid:125) (cid:124) (cid:123)(cid:122) (cid:125) (cid:124) (cid:123)(cid:122) (cid:125) atimes btimes ctimes DefineX(F)tobethelargestnumbersuchthatthereexistnumbersa,b,c∈Nwitha+b+c=X(F) suchthatforallr ∈N, F ⊀P (a,b,c). r Thenforb(m,F)isΘ(mX(F)). Observe that X(F) is always an integer. Also note that (cid:107)P (a,b,c)(cid:107) = ra+b · (r + 1)c which is r Θ(rX(F)),sobytakingr = (cid:100)m/X(F)(cid:101)(andperhapsdeletingsomerowsincaseX(F) (cid:45) m),wehave that(cid:107)P (a,b,c)(cid:107)isΩ(mX(F)),sothefactthatforb(m,F)isΩ(mX(F))isbuiltintotheconjecture. In r ordertoprovetheconjecture,allthatwouldberequiredwouldbetoprovethatforb(m,F)isO(mX(F)) foreveryF. Adisproofcouldbepotentiallyeasier,asonlyacounterexamplewouldberequired. 4 MiguelRaggi Theconjecturehasbeenprovenforallk×(cid:96)configurationsF withk ∈{1,2,3}andmanyotherscases invariouspapers. Theproofsfork = 2arein[AGS97], fork = 3in[AGS97], [AFS01], [AS05]. For (cid:96) = 2,theconjecturewasverifiedin[AK06]. Fork = 4,allcaseseitherwhentheconjecturepredictsa cubicboundforF orwhenF issimplewerecompletedin[AF10]. Fork = 4andF non-simple,there arethreeboundarycaseswithquadraticbounds,oneofwhichisestablishedin[ARS12]. Fork ∈ {5,6} someresultscanbefoundin[ARS11]. AnsteehaslongconjecturedthatevenfindingX(F)givenF wasnotatrivialtask,andthequestionof itsNP-hardnesswaslongconjecturedbutnotproven([Ans],[Rag11]). Inthispaperwesettlethatfinding X isindeedNP-hardandwealsonotethatoneofthedecisionversionsassociatedwiththisoptimization problemisNP-complete,addingthisfunctiontothelonglistoffunctionsknowntobeNP-complete,with theinterestingplusthatthisfunctionisconjecturedtogivetheexponentoftheasymptoticgrowthofforb. For relatively small configurations F we have a computer program that yields the answer (relatively) quickly. Thesourcecode(inC++)canbefreelydownloadedfrom: http://matmor.unam.mx/˜mraggi/ This program can compute X(F) for F having less than ∼10 rows in just a few minutes. This task takesmerelyexponentialtime,notdoublyexponential(asitisoftenthecasewithforbiddenconfiguration problems).ThisprogramwaswrittentoperformmanyothertasksotherthanfindingX(F).Adescription ofthealgorithmusedforthistaskisintheappendix4. 2 Results TherearetwonaturaldecisionproblemsassociatedwithX(F): GivenF andkasinputs, 1. IsittruethatX(F)<k? 2. IsittruethatX(F)≥k? WeprovethatthefirstofthetwodecisionproblemsisinNPbyexhibitingacertificatewhichcanbe checkedinpolynomialtime. Themainresultofthispaperisthefollowing: Theorem 2.1. Finding X(F) is an NP-hard problem. In other words, should a polynomial-time algo- rithm exist for finding X(F) given F, then every problem in NP could be solved in polynomial time. Furthermore,theproblemof“givenF andk,isX(F)<k?” isinNP. Beforeprovingthistheorem,weneedafewlemmas. Observation2.2. LetF beaconfigurationwithnrows. ThenX(F) ≤ n. Indeed,assumeforthesake ofcontradictionthata,b,andcaresuchthata+b+c=n+1andF ⊀P (a,b,c). Wemayplaceeach r rowofF eachintoadifferentfactoroftheproduct. TheextramatrixensurestherepeatedcolumnsofF getrepeatedasmanytimesasneeded. This observation in particular implies that if a polynomial time algorithm existed for any of the two decisionversionsoftheproblem,thenwe’dhaveapolynomialtimealgorithmforfindingX(F),which togetherwithTheorem2.1wouldmakethedecisionversionoffindingX(F)anNP-completeproblem. Asimple(butsurprising)corollaryofconjecture1.8,ifitweretrue,wouldbethatrepeatingcolumns morethantwiceinF hasnoeffectontheasymptoticbehaviorofforb(m,F). Inotherwords,assuming FindingthenumberpredictedbytheAnstee-SaliConjectureisNP-hard 5 theconjectureweretrue,themultiplicityofacolumninaconfigurationwouldnotaffecttheasymptotic boundand,asymptotically,itwouldonlymatterifacolumnisnotthere(hasmultiplicity0),appearsonce (hasmultiplicity1),orappears“multipletimes”(hasmultiplicity2ormore). Formally, Proposition 2.3. Let F = [G|t · H] with G and H simple {0,1}-matrices that have no columns in t common. ThenX(F )=X(F )forallt≥2. Inparticular,iftheconjectureweretrue,thenforb(m,F ) 2 t t andforb(m,F )wouldhavethesameasymptoticbehavior. 2 Proof: Itsufficestoshowthatgivent,G,H,a,bandc,thereexistsanRsuchthatforeveryr ≥ R,we have F =[G|2·H]≺P (a,b,c) ⇐⇒ F =[G|t·H]≺P (a,b,c). 2 r t r Since F ≺ F , we only need to prove that if F ≺ P (a,b,c) for some r, then F ≺ P (a,b,c) 2 t 2 r t R for some R. Suppose then F is contained in the product P (a,b,c) for some r. The idea is to find a 2 r subconfigurationofP (a,b,c)inwhichtherearesomecolumnswithmultiplicity1,andforthecolumns r withmultiplicity2ormore,themultiplicitydependsonr. Weneedrlargeenoughsothatthemultiplicity ofanyonecolumn(withmultiplicityof2ormore)islargerthant. LetxbethenumberofrowsofF . t Noticethefollowingthreefacts,whichincludedefinitionsforE ,E andE . I Ic T E (x,r):=[(r−x)·0 |I ]≺I I x x r E (x,r):=[(r−x)·1 |Ic]≺Ic Ic x x r (cid:106)r(cid:107) E (x,r):= ·T ≺T . T x x r The first and second facts are easy to see; just take any subset of x rows from I or Ic. The third r r statementistruebytakingthe(cid:98)r/x(cid:99)-throwofT ,the2(cid:98)r/x(cid:99)-throwofT ,etcetera,uptothex(cid:98)r/x(cid:99)-th r r row. Forexample,ifr =5andx=2,wemaytakethesecondandfourthrowfromT : 5 0 1 1 1 1 1 0 0 1 1 1 1 (cid:20) (cid:21) 0 0 1 1 1 1 T5 =0 0 0 1 1 1 =⇒ T5|{2,4} = 0 0 0 0 1 1 =ET(2,5) 0 0 0 0 1 1 0 0 0 0 0 1 Note that in the three configurations E (x,r),E (x,r) and E (x,r), we have that there are some I Ic T columnsofmultiplicity1andtherearesomecolumnsforwhichtheirmultiplicitycanbemadeaslarge as we wish by making r large. Formally, let E(x,r) be one of E (x,r) or E (x,r) or E (x,r). We I Ic T havethatforeveryx-rowedcolumnαtherearethreepossibilities: eitherλ(α,E(x,r)) = 0forallr,or λ(α,E(x,r))=1forallr,or lim λ(α,E(x,r))=∞. r→∞ If α is a column for which lim λ(α,E(x,r)) = ∞, we may conclude that there is an R for which r→∞ λ(α,E(x,r))≥tforeveryr ≥R. Since F is contained in P (a,b,c) for some r, the columns in H will have multiplicity at least 2 in 2 r somesubsetoftherowsofP (a,b,c). WeseethatF isalsoasubconfigurationofP (a,b,c). r t R 6 MiguelRaggi 3 Proof of the Main Theorem Wenowprovethemaintheorem. Proof: First we prove that the decision problem has a certificate which can be checked in polynomial time. AcertificatethatindeedX(F) < k wouldhavetobeaproofthatF ≺ P (a,b,c)foreachtriple r a,b,c for which a+b+c = k. Note that there are at most a quadratic (with respect to the number of rows)numberofa,b,c’swhichsatisfytheequation,sincethequestionhasatrivial“yes”answerwhenk morethanthenumberofrows(Observation2.2). GivenF andAconfigurations,onecaneasilyconstructacertificatethataconfigurationF isindeeda subconfigurationofaconfigurationA: explicitlystatewhichpermutationofF appearsinexactlywhich rowsandcolumnsofA. ForthecaseA=P (a,b,c),acertificateonlyneedstospecifywhichrowsofF r go inside which factors, so at most a quadratic number of these certificates-of-being-a-subconfiguration suffice. WenowprovethatfindingX(F)isNP-hard. Supposethereexistedsomepolynomial-timealgorithm that finds X(F) given F. We shall prove that there would then exist a polynomial time algorithm for GRAPH COLORING. Suppose we are given a graph G and we wish to find the minimum number of colorsforwhichthereexistsagoodcoloringofthegraph. Wemayassumenoisolatedvertices. Theideaistoconstructa3-partmatrixF(G)inwhichthefirsttwopartsensurethereisnoT orIcina maximumproductoftheformP (a,b,c)withnosubconfigurationF(G),andthelastpartisconstructed r so that a partition into I(cid:48)s produces a partition of the vertices of the graph into independent sets and vice-versa. Suppose G has n vertices and e edges. Let M be a large number with M ≥ n+2 and let S be the incidence matrix of G (i.e., the edges of G are encoded as columns with two 1’s corresponding to the verticesthatbelongtotheedge). Constructthefollowingsimplematrix: 1 0 Ic M M F(G):= 1 TM S 0 ClearlywecanconstructF(G)inpolynomialtime(withrespecttothenumberofverticesofG). We provenowthatwehaveχ(G)=X(F)−2M+1,whichinturnwouldyieldapolynomialtimealgorithm forGRAPHCOLORING,providedwehadapolynomialtimealgorithmforX(F). Now,letusstudythepossibilitiesforaproductoftypeP (a,b,c)thatdoesnothaveF(G)asasubcon- r figurationforanyr. Ifb(cid:54)=0,thenwecouldplaceallof[1|Ic ]intheIc partofP (a,b,c),soa+b+c M r wouldbeatmost1+M +n(usingLemma2.2). Thesameistruewhenc(cid:54)=0. Butifweletb=c=0, P (a,0,0)isjustaproductofI’s,soletuscalculatehowmanyI(cid:48)swecanmultiplytogetherandstillnot r createasubconfigurationF(G). InorderforF(G)tobeapartofaproductofI(cid:48)s,everyrowof[1|Ic ]and[1|T ]mustbeinaseparate M M (cid:20) (cid:21) 1 factorI, sincethereisno inI (andalsoseparatefromtherowsof[S|0], sinceweareassumingG 1 hasnoisolatedvertices). (cid:20) (cid:21) 1 Then two rows of the [S|0] part can be in the same I if and only if there is no in those two 1 rows, which, intermsofthegraph, meansthereisnoedgebetweenthosetwovertices. Inotherwords, partitioning [S|0] into I’s is equivalent to partitioning the vertices of G into independent sets. So if the FindingthenumberpredictedbytheAnstee-SaliConjectureisNP-hard 7 graphGcannotbecoloredwithχ(G)−1colorsandthisisthemaximum,thismeansthatX(F)=a= 2M +χ(G)−1≥n+M +1. Thenχ(G)=X(F(G))−2M +1. References [AF10] R.P. Anstee and B. Fleming, Two refinements of the bound of Sauer, Perles and Shelah and VapnikandChervonenkis,DiscreteMathematics310(2010),3318–3323. [AFS01] R.P.Anstee,R.Ferguson,andA.Sali,SmallforbiddenconfigurationsII,ElectronicJournalof Combinatorics8(2001),R425pp. [AGS97] R.P. Anstee, J.R. Griggs, and A. Sali, Small forbidden configurations, Graphs and Combina- torics13(1997),97–118. [AK06] R.P. Anstee and P. Keevash, Pairwise intersections and forbidden configurations, European JournalofCombinatorics27(2006),1235–1248. [Ans] R.P.Anstee,Asurveyofforbiddenconfigurationresults,http://www.math.ubc.ca/∼anstee/. [ARS11] R.P. Anstee, Miguel Raggi, and Attila Sali, Forbidden configurations: Quadratic bounds, preprint.(2011). [ARS12] R.P. Anstee, M. Raggi, and A. Sali, Evidence for a forbidden configuration conjecture: One morecasesolved,DiscreteMathematics312(2012),no.17,2720–2729. [AS05] R.P.AnsteeandA.Sali,SmallforbiddenconfigurationsIV,Combinatorica25(2005),503–518. [dCF00] DominiquedeCaenandZ.Fu¨redi,Themaximumsizeof3-uniformhypergraphsnotcontaining aFanoplane,JournalofCombinatorialTheory,SeriesB78(2000),274–276. [ES46] P. Erdo˝s and A.H. Stone, On the structure of linear graphs, Bulletin of the American Mathe- maticalSociety52(1946),1089–1091. [ES66] PaulErdo˝sandMiklo´sSimonovits,Alimittheoremingraphtheory.,StudiaScientiarumMath- ematicarumHungarica1(1966),51–57. [Fu¨r91] Z. Fu¨redi, Tura´n type problems, Surveys in Combinatorics (Proc. of the 13th British Combi- natorialConference),ed.A.D.Keedwell,CambridgeUniv.Press.LondonMath.Soc.Lecture NoteSeries166(1991),253–300. [Pik08] O. Pikhurko, An exact bound for Tura´n result for the generalized triangle, Combinatorica 28 (2008),187–208. [Rag11] MiguelRaggi,Forbiddenconfigurations,Ph.D.thesis,UniversityofBritishColumbia,2011. 8 MiguelRaggi 4 Appendix: Algorithm to find X(F) In this section we describe the algorithm used by the software described in the introduction. It runs in exponential time, of course, but it has been helpful for finding boundary configurations with given asymptotic bounds (see [ARS11], [Rag11] and [Ans]). Perhaps this program might be used to find a counterexampletotheAnstee-Saliconjecture,providedoneexists,anditisn’tverylarge. 4.1 Representation of a Configuration Weareinterestedinanefficientrepresentationforconfigurationsinordertoperformthetasksdescribed above. Intheprogressofourinvestigations,wehavehadvariousversionsoftheprogram. Mostofwhatwewanttheprogramtodoinvolvesperformingahugenumberofconfigurationcompar- isonoperations,whichistestingwhetherornotaconfigurationF isasubconfigurationofaconfiguration A. Asafirstapproachitwouldseemasif,forthistask,wewouldberequiredtotesteachrowandcolumn permutationofF againsteachsubmatrixofA.Thisisofcourseaveryslowwaytodothis.Asimpletrick tospeedupthecomputationsistokeepthecolumnsofaconfigurationalwaysinsomecanonicalorder. Then,totestwhetherornotaconfigurationF iscontainedinanotherA,wejustneedtopermuterowsof F andtakesubsetsS ofrowsofAandplacethecolumnsofA| incanonicalorder. S Mostofthetaskswewishthisprogramtoperforminvolvecheckingwhetherornotagiven(fixed)con- figurationF isasubconfigurationofavastnumberofconfigurationsA. Inparticular,anypre-processing wedoonF canbeconsideredasalmostfree. Forexample,findingallrowpermutationsofF andstoring themwouldneedtobedoneonceforeachconfigurationF,andnotatallforconfigurationsA. Aftermanyattemptsandexperiments,itseemedthatthebest(fastest)waytostoreaconfigurationthat mademanyoftheothertasksreasonablyfastisthis:Maintainanarrayofintegerswheretheindicesofthe array,writteninbinary,arethecolumnsoftheconfiguration,andtheactualnumbersofthearrayrepresent thenumberoftimesacolumnappears. Thatistosay,aconfigurationF inmrowsisrepresentedbyan array(C++vector) Fofsize2m. Foranumberα, considerthebinaryrepresentationof α andconsider itasacolumnwithmrows. Ifnecessary,putenough0’satthebeginningofthebinaryrepresentationin ordertohavetherequiredmbits. ThenumberF[α](theα-thnumberofthearray)representsthenumber oftimesthatcolumnαappearsinconfigurationF. Intheimplementation, weuseanarrayofunsigned charactersinsteadofintegers,sinceweneverneedaconfigurationwiththesamecolumnrepeatedmore than255times. Anunsignedcharacterconsistsof1byte(8bits). Forexample,thearrayF = [1,0,0,2,0,1,0,1]representsthefollowingconfiguration(noticeithas3 rows,sincethearrayhassize8=23): 0 0 0 1 1 F =0 1 1 0 1. 0 1 1 1 1 To see this, remember we start from 0. There is a one in position 0 = 000 , meaning the colum b (0,0,0)T getsrepeatedonetime. Atwoinposition3 = 011 , aoneinposition5 = 101 andaonein b b position7=111 . Thecolumnsofthismatrixaretherepresentationsofthesenumbersinbinaryform. b Anobservantreadermightcomplainthatthishasthedisadvantagethatitrequiresstoring2mbytes,and ifF doesn’thavemanycolumns,mostofthosewillbe0’s. Butit’saminordisadvantage,becauseeven at10rowswewouldonlyneed1024bytes,andweusuallyhaveconfigurationsforwhichthenumberof rowsis5orless(32bytes). Perhapsthiswouldbecomemoreofanissuewithconfigurationswithahigh FindingthenumberpredictedbytheAnstee-SaliConjectureisNP-hard 9 numberofrows,butforthoseconfigurations,mostofourtaskswouldrequiretoomuchtimetobeofany practicaluseanyway. Wecametothisrepresentationafteranimplementationwhichrepresentedcolumnsasanarrayofbits (C++bitset)andstoringthemintoanorderedtree-likestructure(C++multisetfromtheSTL).Thismight beamorenaturalimplementation,butprofilingthecodemadeclearthattheprogramwasspendingmostof itstimecountinghowmanycolumnsofacertaintypeappearedinaconfiguration,andwasalsospending a considerable amount of time navigating the tree. Explicitly storing the number of times each column appears,andmakingthatnumberinstantlyaccessiblebystoringitinanarray(forrandomaccess)givesa verynoticeablespeedupandallowsustoconsiderlargerproblems. Byrepresentingcolumnsasnumbers, wecandoalotofpreprocessingandcomputelargetablesinwhichwehavealmostinstantaccesstime. Forexample, considerthefollowingproblem, whichhastobedonemanytimesforourtasks: Given acolumnαandasubsetS (representedbyanintegeraswritteninbinary),whatcolumnisα ? Thisis S relativelyslowtocompute,butwecanfilloutatablebypreprocessingtospeedupanyfurtheraccessto it. Sincewedothisafewmilliontimes,theinvestmentissound. Theotheradvantageisthatitbecomesimmediatelyclearhowtocomparetwoconfigurationswiththe samenumberofrowstoseeifoneisacolumn-permutationsubmatrixoftheother;checkifforanycolumn (index)theintegeratpositioncofthefirstarrayisbiggerthanthatofthesecondarray. TocheckifF is asubconfigurationofA,wewouldneedtofindallpermutationsoftherowsofF (whichweneedtodo justonceperconfiguration). 4.2 Subconfigurations SupposeF andAareconfigurationsandwewanttodecideifF ≺ A. Thenforeverys-tupleofrowsS (wheresisthenumberofrowsofF),wecanextractfromAtheconfigurationA| easilywithourpre- S storedtableofcolumnsandsubsets. Oncewe’vedonethisforeachcolumnofAandfoundA| ,thenfor S eachpermutationofrowsofF,wecheckifeverycolumnαinthearraycorrespondingtoconfigurationF appearslessthanorequaltothecorrespondingnumberforcolumnαinthearrayofA| . Wecancheck S everysubsetS likethis. Ifatanypointthisisso,wecanreturntrue. There are a few speedups. Sometimes it’s immediately obvious a configuration can’t be contained in another. Forexample,iftherearemore1’sinF thaninA,orifF hasmorecolumnsorrowsthanA,then F ⊀A. 4.3 Determining X(F) GivenaconfigurationF,wewishtofindX(F). Inotherwords,wewishtofindtheconjecturedasymp- toticboundforforb(m,F). WemaymakeasimplificationusingProposition2.3andassumethemultiplicityofanycolumnofF is atmost2. First,supposewewantedtotestwhetherornotthereexistsrsuchthatconfigurationF iscontainedin theproduct P (a,b,c)=I ×...×I ×Ic×...×Ic×T ×...×T . r r r r r r r (cid:124) (cid:123)(cid:122) (cid:125) (cid:124) (cid:123)(cid:122) (cid:125) (cid:124) (cid:123)(cid:122) (cid:125) atimes btimes ctimes Buildingthisobjectwithr = RascalculatedinProposition2.3wouldbeprohibitivelyslow. Instead, we build a set X of subconfigurations from P (a,b,c) such that if F ≺ P (a,b,c), then F ≺ X for R R someX ∈X. 10 MiguelRaggi Recall that if F ≺ P (a,b,c), then the rows of F get partitioned into a+b+c parts (a part can be R empty), where each part belongs to a factor of the product P (a,b,c). Because of Proposition 2.3, we R canassumeeachcolumnappearsatmosttwice. Givens∈N,considerthefollowingmatrices: A (s)=(cid:2)t·0 I (cid:3) , A (s)=(cid:2)t·1 Ic(cid:3) , A (s)=t·(cid:2)T (cid:3). I s s Ic s s T s Weseethatans-rowedconfigurationF (witheachcolumnrepeatedatmostttimes)iscontainedinI m forsomelargem,ifandonlyifF ≺ A (s). WecanthenconsiderallpartitionsofrowsofF andseeif I eachpartiscontainedinthecorrespondingA ,A orA . I Ic T Forexample,totestwhether 1 0 1 1 0 1 0 1 F =1 1 0 0 0 0 1 1 1 0 0 1 iscontainedinI ×T ×T,wewouldpartitiontherowsofF inthreeparts. Inthiscase,F hasfiverows, soconsider,forexample,thefollowingpartitionof5: (2,2,1). Considerthefollowingrepresentatives: (cid:20) (cid:20) (cid:21) (cid:12)(cid:20) (cid:21)(cid:21) (cid:20) (cid:21) 0 (cid:12) 1 0 0 1 1 (cid:2) (cid:3) AI(2)= s· 0 (cid:12)(cid:12) 0 1 , AT(2)=s· 0 0 1 , AT(1)=s· 0 1 , andbuilda5-rowedmatrixA := A (2)×A (2)×A (1). IfF ≺ A,thenF ≺ I ×T ×T. Ifwedo I T T thisforeverypossiblepartitionof5,wegetthedesiredresult. To find X(F) we can build a tree of possibilities of products and prune whenever we hit a node that alreadyhasconfigurationF: We return the product with the largest number of factors for which F is not a subconfiguration of a product of this form, observing that if F is a subconfiguration of a product P (a,b,c), then it will be a r subconfigurationinanyproductP (a(cid:48),b(cid:48),c(cid:48))witha(cid:48) ≥a,b(cid:48) ≥bandc(cid:48) ≥c. r 4.4 Finding Boundary Cases and Classifying Configurations Givennumberofrowssandanumberk,wewishtofindallboundarycases,thatismaximalandminimal F suchthatX(F) = k. WemakeuseoftheprogramdescribedintheprevioussectiontofindX(F)for manydifferentF. Theboundarycasesforsmallsandkcanbefoundin[Ans]and[Rag11].