Table Of Content

The Massey’s method for sport rating: a network science perspective EnricoBozzo 7 1 DepartmentMathematics,ComputerScience,andPhysics 0 UniversityofUdine 2 [email protected] n a MassimoFranceschet J DepartmentMathematics,ComputerScience,andPhysics 2 UniversityofUdine 1 [email protected] ] I S January13,2017 . s c [ Abstract 1 WerevisittheMassey’smethodforratingandrankinginsportsandcontextu- v alizeitasageneralcentralitymeasureinnetworkscience. 3 6 3 1 Introduction 3 0 . Rating and ranking in sport have a flourishing tradition. Each sport competition has 1 its own official rating, from which a ranking of players and teams can be compiled. 0 The challenge of many sports’ fans and bettors is to beat the official rating method: 7 1 todevelopanalternativeratingalgorithmthatisbetterthantheofficialoneinthetask : ofpredictingfutureresults. Asaconsequence, manysportratingmethodshavebeen v i developed.AmyN.LangvilleandCarlD.Meyerevenwrotea(compelling)bookabout X (general)ratingandrankingmethodsentitledWho’s#1? [11]. r In 1997, Kenneth Massey, then an undergraduate, created a method for ranking a college football teams. He wrote about this method, which uses the mathematical theoryofleastsquares,ashishonorsthesis[12]. Informally,atanygivendaykofthe season,Massey’smethodratesateamiaccordingtothefollowingtwofactors: (a)the differencebetweenpointsforandpointsagainsti,orpointspreadofi,uptodayk,and (b)theratingsoftheteamsthatimatcheduptodayk. Hence,highlyratedteamshave a large point differential and matched strong teams so far. Below in the ranking are teamsthatdidwellbuthadaneasyscheduleaswellasteamsthatdidnotsowellbut hadatoughschedule. In this paper, we review the original Massey’s method and explicitly provide the algebraic background of the method. Moreover, we interpret Massey’s technique in 1 thecontextofnetworkscience. Thepaperisorganizedasfollows. Section2reviews theoriginalMassey’smethod. Inparticular,Section2.1providesafullexampleofthe method,Section2.2embedsthemethodinthecontextofnetworkscience,andSection 2.3givesthealgebraicbackgroundoftheMassey’smethodforthecuriousreader. We review related methods for sport rating in Section 3. Finally, we discuss alternative (outsidethesportcontext)usesofMassey’smethodsinSection4. 2 The Massey’s method for sports ranking The main idea of Massey’s method, as proposed in [12], is enclosed in the following equation: r −r =y i j k wherer andr aretheratingsofteamsiand jandy istheabsolutemarginofvictory i j k forgamekbetweenteamsiand j. Iftherearenteamswhoplayedmgames,wehave alinearsystem: Xr=y (1) where X is a m×n matrix such the k-th row of X contains all 0s with the exception of a 1 in location i and a −1 in location j, meaning that team i beat team j in match k (ifmatchk endswithadraw, eitherior j locationcanbeassigned1, andtheother −1). Observethat, ifedenotesthevectorofall1’s, thenXe=0. LetM=XTX and p=XTy. Noticethat (cid:26) thenegationofthe#ofmatchesbetweeniand j ifi(cid:54)= j, M = i,j #ofgamesplayedbyi ifi= j. and p isthesignedsumofpointspreadsofeverygameplayedbyi. Clearlytheentries i ofpsumto0,infacteTp=eTXTy=(Xe)Ty=0.TheMassey’smethodisthendefined bythefollowinglinearsystem: Mr=p (2) whichcorrespondstotheleastsquaressolutionofsystem(1). We observe how the Massey’s team ratings are in fact interdependent. Indeed, Massey’smatrixMcanbedecomposedas M=D−A, whereDisadiagonalmatrixwithD equaltothenumberofgamesplayedbyteam i,i i,andAisamatrixwithA equaltothenumberofmatchesplayedbyteamiagainst i,j team j. Hence,linearsystem(2)isequivalentto Dr−Ar=p, (3) 2 or,equivalently r=D−1(Ar+p)=D−1Ar+D−1p. (4) Thatis,foranyteami 1 p r = ∑A r + i . i i,j j D D i,i j i,i (1) (2) Thismeansthattheratingr ofteamiisthesumr +r oftwomeaningfulcompo- i i i nents: 1. themeanratingofteamsthatihasmatched 1 r(1)= ∑A r ; i D i,j j i,i j 2. themeanpointspreadofteami p (2) i r = . i D i,i How do we solve the Massey’s system Mr= p? Unfortunately, M =D−A is a singular matrix, hence it is not possible to solve this system computing the inverse of M. Notice that the symmetric matrix A can be interpreted as the adjacency matrix of an undirected weighted graph that we denote with G . It is well known that A is A irreducible if and only if G is connected. Assuming that matrix A is irreducible, or A equivalentlythatthegraphG isconnected, asolutionofthesystemcanbeobtained A asfollows: letMˆ bethematrixobtainedbyreplacinganyrow,saythelast,ofMwitha rowofall1’s,andlet pˆbethevectorobtainedbyreplacingthelastelementof pwith a0. Then,solvethesystem: Mˆr=pˆ (5) Notice that the added constraint forces the ranking vector r to sum to 0. See Section 2.3forthemathematicaldetails. WhatabouttheconnectivityhypothesisofthegraphofmatchesG ? Withoutsuch A assumption, the ranking cannot be computed. Indeed, this is not a strong constraint. Assume, realistically, a competition in which there are n teams matching. At each seasondayeachteammatchesanotherteamnotmatchedbefore. Hence,forallteams to be matched at any day, n must be even and we have n/2 matches each day for a maximum of n−1 days before teams match twice. This is known as round-robin competition. AtseasondaykthegraphG isobtainedasthedisjointunionofkperfect A matchings, hence, in particular, it is regular with degree k. Moreover, we have the followingresult: Lemma1. Givenn>2teams,withneven,competinginaround-robincompetition, then: 3 1. theminimumnumberofdaysforthegraphofmatchestobeconnectedis2; 2. themaximumnumberofdaysforthegraphofmatchestobeconnectedisn/2. Proof. Weproveitem1. Afteroneseasondaythegraphofmatchesisnotconnected beingjustaperfectmatchingoftheteams. Aftertwodaythegraphcanbecomeacycle andhencecanbeconnected. Wenowproveitem2. Firstofall,letusobservethataftern/2−1seasondaysthe graphcanbenotconnected,buttheonlypossibilityisthatitisformedbytwocomplete graphs with nodes inV andV having each n/2 nodes. At day n/2, all matches are 1 2 betweenpairsofteamsoneofwhichisinV andtheotherinV ,resultinginaconnected 1 2 graph. Typically,theactualnumberofdaysforthematchgraphtobeconnectediscloseto theminimum. Forinstance,weexperimentedthatinthelast11editionsoftheItalian soccerleague(SerieA),after2or3daysthematchgraphisconnected,withameanof 2.6. Masseyalsoproposedanoffensiverating(o)andadefensiverating(d)character- izingtheoffensiveanddefensivestrengthsofteams. Masseyassumesthatr=o+d, thatis,theoverallstrengthofateamisthesumofitsoffensiveanddefensivepowers. Let’sdecomposethepointspreadvector p= f−a,where f holdsthetotalnumberof pointsscoredbyeachteamandaholdsthetotalnumberofpointsscoredagainsteach team. Then,theequationsdefiningoandd are: (cid:40) Do−Ad= f Ao−Dd=a. Thatisforeachteami: (cid:40)oi= D1i,i(∑jAi,jdj+fi) di= D1i,i(∑jAi,joj−ai) Thismeansthattheoffensiveratingofteamimultipliedbythenumberofgamesplayed byiisequaltothedefensiveratingsofopponentsofiplusthenumberofpointsscored by i. On the other hand, the defensive rating of team i multiplied by the number of gamesplayedbyiisequaltotheoffensiveratingsofopponentsofiminusthenumber ofpointsscoredagainsti. Sinceweknowthatr=o+d,wehavethat,knowingr,thedefensiveratingd can becomputedas: (D+A)d=Dr−f (6) and,knowingd andr,theoffensiveratingo=r−d. Howdowesolvesystem(6)? Infact,itmighthappenthatmatrixD+Aissingular, hence it cannot be inverted. Assuming that the graph G is connected, it holds that A D+AissingularpreciselywhenG isbipartite. LetN=D+Aandq=Dr−f. IfG A A isconnectedandbipartitewithnodesetsU andV,wecanapplytothesystemNd=q aperturbationtricksimilartotheoneexploitedfortheMassey’ssystemMr= p. Let 4 Nˆ bethematrixobtainedbyreplacinganyrow,saythelast,ofN withvectorvdefined as v =1 if i∈U and v =−1 if i∈V and let qˆ be the vector obtained by replacing i i the last element of q with a 0. Then, a solution of system (6) is obtained solving the followingperturbedsystem: Nˆd=qˆ (7) Again, seeSection2.3forthemathematicaldetails. Assumingaround-robincompetition, the hypothesis that the match graph G is non-bipartite is not a strong one, as A provedinthefollowingresult. Lemma2. Givenn>2teams,withneven,competinginaround-robincompetition, then: 1. theminimumnumberofdaysforthegraphofmatchestobenon-bipartiteis3; 2. the maximum number of days for the graph of matches to be non-bipartite is n/2+1. Proof. Weproveitem1. Aftertwodaysthegraphofthematchescannothavecycles oflength3. After3daysitispossibletohavecyclesoflength3sothatthegraphcan benon-bipartite. Wenowproveitem2. Aftern/2daysthegraphofthematchescanbebipartitebut theonlypossibilityisthatitisacompletebipartitegraphwithnodesetsV andV such 1 2 that|V |=|V |=n/2. Atdayn/2+1allmatchesarebetweenpairsofnodeseitherin 1 2 V orinV . Hence, somecyclesoflength3areformedandthereforethegraphisno 1 2 morebipartite. Typically,theactualnumberofdaysforthematchgraphtobenon-bipartiteisclose totheminimum.Forinstance,weexperimentedthatinthelast11editionsoftheItalian soccer league (Serie A), within 5 days the match graph is non-bipartite, with a mean of3.7. Hence,weexpectthattheMassey’smethodisfullyapplicableafterfewseason daysinthecompetition. 2.1 ExampleofMassey’smethod The table below shows the results of 4 matches (numbered 1, 2, 3, 4) involving 4 fictitiousteams(labelledA,B,C,D): match team1 team2 score1 score2 1 A C 2 0 2 A D 3 0 3 B C 1 1 4 B D 2 1 Thematch-teammatrixX isgivenbelow: 5 A B C D 1 1 0 -1 0 2 1 0 0 -1 3 0 1 -1 0 4 0 1 0 -1 Thematchspreadvectoryis: team y 1 2 2 3 3 0 4 1 TheMassey’smatrixM=XTX is: A B C D A 2 0 -1 -1 B 0 2 -1 -1 C -1 -1 2 0 D -1 -1 0 2 andtheteamspreadvector p=XTyis: team p A 5 B 1 C -2 D -4 TheresultingMassey’sratingisthefollowing: team r A 1.75 B -0.25 C -0.25 D -1.25 NoticethatteamsBandChavethesamerating(andhenceranking),despitethepoint spreadofB(1)ishigherthanthepointspreadofC(-2). ThisisbecauseBagainstC endedinadraw,Bwonthesecondmatch,butagainsttheweakestteam(D),whileC lost the second match, but against the strongest team (A). Indeed, the rating r can be (1) (2) decomposedinthepartialratingsr andr thatfollows: i i 6 (1) (2) team r r r i i A 1.75 -0.75 2.5 B -0.25 -0.75 0.5 C -0.25 0.75 -1.0 D -1.25 0.75 -2.0 Hence,BhasabetterpointspreadbutaneasierschedulewithrespecttoC. Summing thetworatingcomponenteswegetthesameratingforteamsBandC. Moreover, the rating r can be decomposed in the offensive and defensive ratings thatfollows: team r o d A 1.75 1.875 -0.125 B -0.25 0.875 -1.125 C -0.25 -0.125 -0.125 D -1.25 -0.125 -1.125 wherethepointsforandagainstvectorsareasfollows: team p f a A 5 5 0 B 1 3 2 C -2 1 3 D -4 1 5 Notice that team B has a better offensive rating thanC, but a worse defensive rating, andthesumofoffensiveanddefensivepowersisthesameforbothteams. Also,Aand Chavethesamedefensiverating,althoughAhas0pointsagainstandChas3points again;again,thisbecausethescheduleofCisharderthanthescheduleofA. 2.2 Massey’smethodandnetworkscience WeobservethatEquation(4)isclosetotheequationdefiningKatz’scentrality[7]: r=αAr+β (8) whereα isagivenconstantandβ isagiven(exogenous)vector. TheKatz’sequation can be solved by inverting matrix I−αA as soon as α does not coincide with the reciprocal of an eigenvalue of A [13]. In particular, making the assumption that all teamsplayedthesamenumberkofmatches,thenEquation(3)becomes 1 r= (Ar+p), (9) k which corresponds to Katz’s centrality with parameters α =1/k and β = p/k. Nev- ertheless, notice that in this setting k is an eigenvalue of A. It follows that Massey’s 7 equationisaninstanceofKatz’scentrality,butcannotbecomputedbyinvertingmatrix I−αA. Furthermore,inthefollowing,weproposeanelectricalinterpretationoftheMassey’s rating. If we view the symmetric matrix A as the adjacency matrix of an undirected graphG ,thenMistheLaplacianmatrixofthegraphG . Theratingvectorrdefined A A insystem(2)isthenequivalenttothepotentialvectoroveraresistornetworkdefined byAwithsupplyvector p[5]. Following this metaphor, the resistor network has a resistor between nodes i and j as soon as teams i and j has matched with conductance (reciprocal of resistance) equal to the number A of matches between i and j. Sources, that are nodes i with i,j positive supply p through which current enters the network, are teams with positive i point spread, while targets, that are nodes i with negative supply p through which i current leaves the network, are teams with negative point spread. Notice that current enteringandleavingthenetworkmustbeequal,indeedthepointspreadvector psums to0. Thepotentialr fornodeicorrespondstotheratingofteami: teamswithlarge i potentialareteamshighintheranking. Moreover,thecurrentflowthroughedge(i,j) is,byOhm’slaw,thequantityA (r −r ),whichcorrespondstotheratingdifference i,j i j betweenteamsiand j(whichisalsoanestimateofthepointspreadinamatchbetween iand j)multipliedbythenumberoftimestheymatched: currentflowsmoreintensely betweenteamsofdifferentstrengths(asmeasuredbyMassey’smethod)thatmatched manytimes. Finally,itisinterestingtoanalysewhathappenstoMassey’ssystemattheendof the season, assuming that all n teams matched all other teams once. In this case, the opponentsratingcomponent r (1) i r =− , i n−1 wherewehaveusedthefactthat∑iri=0,andthepointspreadcomponent p (2) i r = , i n−1 hence r p (1) (2) i i r =r +r =− + , i i i n−1 n−1 andthus p i r = . i n Thesameresultisobtainednoticingthat Mp=(D−A)p=(n−1)p−Ap=np−(p+Ap)=np where we have used the fact that p sums to 0. Hence p is an eigenvector of M with eigenvalue n and hence, once again, r = p/n is a solution of the Massey’s system. Hence,thefinalratingofateamissimplythemeanpointspreadoftheteam. It is possible to be a bit more precise about this property of Massey’s method by exploitingthepropertiesofthesetofeigenvalues,orspectrum,oftheLaplacianmatrix M =D−A. The spectrum reflects various aspects of the structure of the graph G A associated with A, in particular those related to connectedness. It is well known that 8 theLaplacianissingularandpositivesemidefinite(recallthatM=XTX andXe=0) sothatitseigenvaluesarenonnegativeandcanbeorderedasfollows: λ =0≤λ ≤λ ≤...≤λ . 1 2 3 n It can be shown that λ ≤n, see for example [2]. The multiplicity of λ =0 as an n 1 eigenvalueoftheLaplaciancanbeshowntobeequaltothenumberoftheconnected components of the graph, see again [2]. If the graph of the matches is connected or, equivalently, M is irreducible, as we assume in the following, λ (cid:54)=0 is known as 2 algebraicconnectivityofthegraphandisanindicatoroftheefforttobeemployedin ordertodisconnectthegraph. WecanwritethespectraldecompositionofMasM=UDUT whereUisorthogonal √ anditsfirstcolumnisequaltoe/ n,andD=diag(0,λ ,...,λ ). FromMr= pwe 2 n obtainr=UD+UTpwhereD+=diag(0, 1 ,..., 1 ). Now λ2 λn p p (cid:104) I(cid:105) r− =UD+UTp− =U D+− UTp, n n n where I is the identity matrix. Observe that the first component of the vectorUTp is equaltozerosothat p (cid:104) I(cid:105) (cid:104) I˜(cid:105) r− =U D+− UTp=U D+− UTp, n n n whereI˜=diag(0,1,...,1). Ifwedenotewith(cid:107)·(cid:107)theEuclideannormweobtain (cid:107)r− p(cid:107)=(cid:107)U(cid:2)D+− I˜(cid:3)UTp(cid:107)≤(cid:107)p(cid:107) max (cid:12)(cid:12) 1 −1(cid:12)(cid:12)≤(cid:107)p(cid:107)n−λ2, n n k=2,...,n(cid:12)λk n(cid:12) nλ2 whereweusedthefactthattheEuclideannormofanorthogonalmatrixisequaltoone. Hence, asthealgebraicconnectivityλ , aswellastheothereigenvalues, approachn, 2 that is, as more and more matches are played, the vector r approaches p/n and the equalityisreachedwhenthegraphofthematchesbecomescomplete. 2.3 Theunderlyinglinearalgebra Let n≥2. An n×n matrix A is defined to be strictly diagonally dominant if |A |> i,i ∑j(cid:54)=i|Ai,j|fori=1,...,n. ByusingGers˘gorintheoremitiseasytoprovethatifAis strictly diagonally dominant then it is nonsingular, see [6]. A matrix A is defined to bediagonallydominantif|Ai,i|≥∑j(cid:54)=i|Ai,j|fori=1,...,n. AmatrixAisdefinedto be irreducibly diagonally dominant if it is irreducible, it is diagonally dominant, and foratleastonevalueofithestrictinequality|Ai,i|>∑j(cid:54)=i|Ai,j|holds. Anirreducibly diagonallydominantmatrixisnonsingular, seeagain[6]. Withanonstandardtermi- nology,wedefinethematrixAtobeexactlydiagonallydominantif|Ai,i|=∑j(cid:54)=i|Ai,j| fori=1,...,n. WeintroducethisconceptsincetheLaplacianandthesignlessLapla- cianofagraphareexactlydiagonallydominant. ThesignlessLaplacianofagraphis definedasthematrixwhoseentriesaretheabsolutevaluesoftheentriesoftheLapla- cian [3]. For example, the matrix M =D−A of system (2) is the Laplacian of the 9 graphG , andthematrixN =D+Aofsystem(6)isitssignlessLaplacian. Thefol- A lowinglemmapointsoutaninterestingpropertyofasymmetricirreducibleandexactly diagonallydominantmatrix. Lemma3. Letn≥2,Abeasymmetricirreducibleexactlydiagonallydominantn×n matrix and k∈{1,2,...,n}. The (n−1)×(n−1) matrix A(k) obtained by deleting fromAthek-throwandthek-thcolumnisnonsingular. Proof. First of all notice that A cannot have diagonal elements equal to zero, since, beingexactlydiagonallydominant,thiswouldimplythatAhasarowcompletelyequal tozero,andthisconflictswiththeassumptionthatAisirreducible. The matrix A(k) is the adjacency matrix of the subgraph G obtained by the A(k) elimination of node k and of the edges incident in that node from G . The matrix A A(k) can be transformed by a simultaneous permutation of rows and columns into a block diagonal matrix where each diagonal block is relative to one of the connected componentsofthesubgraph.Wewanttoshowthateachoftheseblocksisnonsingular. If the block has dimension 1, then it contains one of the diagonal elements of A, henceitsonlyentryisdifferentfromzero. Ifthedimensionoftheblockisbiggerthan 1,thentheblockisirreduciblebecauseitsgraphisconnectedbyconstruction. More- over,sinceAisexactlydiagonallydominant,eachoftheblocksofA(k)isdiagonally dominantandthestrictinequalityholdsforoneofthevaluesoftheindexineachblock becauseatleastoneofthedeletededgeswasincidentinoneofthenodesofthecon- nected component. Hence the block is irreducibly diagonally dominant so that it is nonsingular. Finally, the matrix A(k) is nonsingular since the diagonal blocks are nonsingular. From Lemma 3 we deduce that a symmetric irreducible and exactly diagonally dominantn×nmatrixAhasrankatleastn−1andmorespecifically,ifwedeletean arbitrary row from A, the remaining rows are linearly independent. Hence, either A is nonsingular, or its nullspace has dimension one. For example the Laplacian of a connectedgraphisalwayssingular,whileitssignlessLaplacianissingularifandonly ifthegraphisbipartite[3]. IfAissingularandv(cid:54)=0issuchthatAv=0thenvcannot haveentriesequaltozerosinceallthesubsetsofn−1rows,orequivalentlycolumns, of A are linearly independent. For example for the Laplacian of a connected graph v=e while for its signless Laplacian, in the case where the graph is bipartite, v has componentsequalto1or−1incorrespondencewiththetwopartsofthegraph. Theorem1. LetAbeasymmetricirreducibleandexactlydiagonallydominantmatrix andletv(cid:54)=0besuchthatAv=0. Thefollowingassertionsholdtrue. 1. ThelinearsystemAx= f issolvableifandonlyif f isavectorsuchthatvTf =0. 2. Bysubstitutingthek-throwofAwithvT weobtainanonsingularmatrix. 3. LetAˆ thematrixobtainedfromAbysubstitutingthek-throwofAwithvT and let fˆthevectorobtainedfrom f bysubstitutingthek-thentryof f withzero. The solutionx∗ofthenonsigularsystemAˆx= fîssuchthatAx∗= f. 10

The Massey's method for sport rating: a network science perspective PDF

0.17 MB·

by Massimo Franceschet

#journals #arxiv

Checking for file health...

Save to my drive

Quick download

Download

Download The Massey's method for sport rating: a network science perspective PDF Free - Full Version

by Massimo Franceschet| 0.17

Download The Massey's method for sport rating: a network science perspective by Massimo Franceschet in PDF format completely FREE. No registration required, no payment needed. Get instant access to this valuable resource on PDFdrive.to!

Free Download PDF

About The Massey's method for sport rating: a network science perspective

No description available for this book.

Detailed Information

Author:	Massimo Franceschet
File Size:	0.17
Format:	PDF
Price:	FREE

Download Free PDF

Safe & Secure Download - No registration required

Why Choose PDFdrive for Your Free The Massey's method for sport rating: a network science perspective Download?

100% Free: No hidden fees or subscriptions required for one book every day.
No Registration: Immediate access is available without creating accounts for one book every day.
Safe and Secure: Clean downloads without malware or viruses
Multiple Formats: PDF, MOBI, Mpub,... optimized for all devices
Educational Resource: Supporting knowledge sharing and learning

Frequently Asked Questions

Is it really free to download The Massey's method for sport rating: a network science perspective PDF?

Yes, on https://PDFdrive.to you can download The Massey's method for sport rating: a network science perspective by Massimo Franceschet completely free. We don't require any payment, subscription, or registration to access this PDF file. For 3 books every day.

How can I read The Massey's method for sport rating: a network science perspective on my mobile device?

After downloading The Massey's method for sport rating: a network science perspective PDF, you can open it with any PDF reader app on your phone or tablet. We recommend using Adobe Acrobat Reader, Apple Books, or Google Play Books for the best reading experience.

Is this the full version of The Massey's method for sport rating: a network science perspective?

Yes, this is the complete PDF version of The Massey's method for sport rating: a network science perspective by Massimo Franceschet. You will be able to read the entire content as in the printed version without missing any pages.

Is it legal to download The Massey's method for sport rating: a network science perspective PDF for free?

https://PDFdrive.to provides links to free educational resources available online. We do not store any files on our servers. Please be aware of copyright laws in your country before downloading.

The materials shared are intended for research, educational, and personal use in accordance with fair use principles.