Modeling Interactions Across Skills: A Method to Construct and Compare Models Predicting the Existence of Skill Relationships Anthony F. Botelho Seth A. Adjei Neil T. Heffernan WorcesterPolytechnicInstitute WorcesterPolytechnicInstitute WorcesterPolytechnicInstitute 100InstituteRd. 100InstituteRd. 100InstituteRd. Worcester,MA01609-2280 Worcester,MA01609-2280 Worcester,MA01609-2280 [email protected] [email protected] [email protected] ABSTRACT structuresareoftendevelopedbydomainexpertsandteach- Theincorporationofprerequisiteskillstructuresintoeduca- ersinthefieldofstudy,andarelikelytoholdground-truth. tionalsystemshelpstoidentifytheorderinwhichconcepts It is clear, for example, that relationships can be identi- shouldbepresentedtostudentstooptimizestudentachieve- fiedbyobservingskillsattheproblem-level;byviewingthe ment. Many skills have a causal relationship in which one stepsrequiredforstudentstocompleteeachitem,itcanbe skill must be presented before another, indicating a strong known that any skills required to complete such problems skillrelationship. Knowingthisrelationshipcanhelptopre- can be considered prerequisites. For example, Multiplying dict student performance and identify prerequisite arches. WholeNumbersmayactasaprerequisitetoGreatestCom- Skillrelationships,however,arenotdirectlymeasurable;in- mon Factors, as is used in our analysis. While causality stead,therelationshipcanbeestimatedbyobservingdiffer- suggestsastrongrelationship,itispossiblefortwoskillsto encesofstudentperformanceacrossskills. Suchmethodsof relate to each other in other ways. Such relationships are estimation,however,seemtolackabaselinemodeltocom- lessintuitive,perhapsrequiringasimilarthoughtprocessor pare their effectiveness. If two methods of estimating the sequence of steps tosolve, even if the content of such tasks existenceofarelationshipyieldtwodifferentvalues,whichis differ. Many causal skill arches are identifiable by domain themoreaccurateresult? Inthiswork,weproposeamethod experts by observing content, but as described, other such of comparing models that attempt to measure the strength relationshipsmaybemissedduetotheirnon-intuitivestruc- of skill relationships. With this method, we begin to iden- tures. By observing strong skill relationships identified by tifythosestudent-levelcovariatesthatprovidethemostac- domain experts, we construct a method of measuring the curatemodelspredictingtheexistenceofskillrelationships. factorsthataremostpredictiveoftheirexistence. Focusingoninteractionsofperformanceacrossskills,weuse our method to construct models to predict the existence of We also argue that identifying strong relationships is not five strongly-related and five simulated poorly-related skill enough for a method of prediction to be considered ade- pairs. Our method is able to evaluate several models that quate. Such a method should also be able to identify weak distinguishthese differences withsignificantaccuracy gains or non-existent skill relationships. It is likely that while over a null model, and provides the means to identify that much attention and research is placed on structuring pre- interactionsofstudentmasteryprovidethemostsignificant requisite links, some of these are false-positives. In other contributionstothesegainsinouranalysis. words, a skill may be listed as a prerequisite, but has no truerelationshiptoitssupposedpost-requisiteskill. Insuch acasethereislittleornointeractionsofperformance. Such Keywords links must also be identified and removed or reordered in prerequisitestructures,skillrelationships,featureselection, learningplatformstobenefitthestudents. modelcomparison A significant amount of research has looked at measuring 1. INTRODUCTION the strength of skill relationships [1],[4], and even the ef- fects such relationships have on measuring student perfor- ManyeducationalsystemslikeASSISTmentsandKhanAcademy mance [3],[10], but without understood ground truths, it alreadyimplementaprerequisitestructureasasuggestedor- is difficult to compare across these methods. Furthermore, deringinwhichskillsshouldbepresentedtostudents. These manyofthesemethodsrepresentsimilarconceptualizations of performance inherently, or through variations of repre- sentation such as aggregation or centering. For example, “student achievement”is likely a predictor of skill relation- ships (achievement on a prerequisite skill will likely influ- enceachievementonapost-requisiteskill), butcanbe rep- resentedasthepercentofproblemsansweredcorrectly,mas- tery speed (the number of items needed to complete an as- signment as is commonly used in intelligent tutoring sys- tems),orcountlessothercombinationsoffeatures. Itwillbe Proceedings of the 9th International Conference on Educational Data Mining 292 important to distinguish between these generalized compo- eachpair. Dependingontheinteractionsandasetofinteraction- nentstoavoidincorporatingfeaturesthatcapturethesame relatedcriteria,theydeterminewhetherthetwoitemshave typesofconceptualizationsintopredictivemodels. a prerequisite relationship between them. This approach was applied by Pavlick, et al. to analyze item-type covari- Thisworkprovidesamethodtoevaluatemodelsthatmea- ancesandtoproposeahierarchicalagglomerativeclustering surethestrengthofskillrelationships,andwiththismodel methodtorefinethetaggingofitemstoskills[7]. Brunskel weattempttoidentifywhichfeaturesbestindicateastrong conducted a preliminary study in which they use students’ relationship between two skills. This analysis will incorpo- noisy data to infer prerequisite structures [4]. Further re- rate a method of generalizing and distinguishing features search by Scheines, et al. extended a causal structure dis- thatmeasuredifferentaspectsoflearningandperformance. coveryalgorithminwhichanassumptionregardingthepu- Withthismethodology,weseektoanswerthefollowingtwo rity of items is relaxed to reflect real data and to use that researchquestions: toinferprerequisiteskillstructurefromdata[8]. 1. What link-level features, expressed in this paper as in- 3. DATASET teractions of performance between skills, are significant in predicting the existence or non-existence of skill relation- The dataset1 used for this study consists of real-world stu- ships? dent data from the ASSISTments online learning platform. The raw data contains student problem logs pertaining to 2. Which features are the strongest predictors of skill rela- ten math skills from the 2014-2015 school year. These ten tionships, and does combining them make for a more accu- skills represent five skill pairs, listed in Table 1, for which ratepredictivemodel? domain experts identified as having a strong prerequisite relationship. While we are not limiting the usage of our Thenextsectionofthispaperwilldiscusssomeofthepre- proposed baseline model to just prerequisite relationships, vious research performed on skill relationships and prereq- these are the most reliable to identify due to the causal ef- uisite structures. Then, we will discuss our theory and fectofcontent(ifproblemsinskillBrequiretheuseofskill methodologytoprovideabaselinemodelofcomparingmeth- Atocomplete,astrongrelationshipcanbeidentified). ods of measuring skill relationships. Using this model, we thencompareseveralcommonly-usedstudent-levelfeatures, and of the most accurate, compare several different repre- Table 1: The strong skill pairs as determined by sentations of those features. Finally, we will discuss our domain experts findingsandsuggestedfutureworks. Prerequisite Post-requisite Multiplicationof Greatest WholeNumbers GreatestCommonFactor 2. PREVIOUSWORKS Subtracting Orderof Thediscoveryandrefinementofprerequisiteskillstructures Integers Operations has been an important research question in recent years. Divisionof DividingMulti- The impact of this research on educational systems cannot WholeNumbers DigitNumbers beoveremphasized. Domainexpertswhodesignthesestruc- VolumeofRectangular Volumeof tures need data centered methods to support the decisions PrismsWithoutFormula RectangularPrisms theymake;itisvitaltohaveempiricaldatatosupporthy- Netsof SurfaceAreaof pothesisregardingtheorderinwhichskillsarepresentedas 3DFigures RectangularPrisms itcanhavealargeimpactonstudentachievementandeither aid or impede the learning process. Additionally, identify- ingthebestprerequisiteskillstructurewillenhancestudent Inordertoidentifybelievableground-truthskillpairs,asur- modeling;knowingastudent’spriorperformanceonprereq- veycontaining24skillpairsforwhichwehadsufficientstu- uisiteskillscanhelpestimatethatstudent’sperformanceon dent data (greater than 50 student rows) was administered thepost-requisites. Thiscanleadtoearlierinterventionsfor to 45 teachers and domain experts who use ASSISTments. struggling students, or even help redefine mastery perhaps Each was asked to rate on a scale of 1 to 7, indicating the students who perform very well on a prerequisite requires perceivedqualitativestrengthoftherelationshipofeachskill less practice on a post-requisite, or can be given more ad- pair. From the survey results, five skill pairs were selected vancedexamples. to be the strongest related links with the smallest variance in opinion scores. As we are treating these links as truth, Tatsuoka,definedadatastructurecalledtheQ-Matrix,that wewantedtobehighlyselectiveofthesepairs. representsthemappingofproblemstoskills: therowsofthis matrix represent the problems, and the columns represent The resulting dataset consists of 1838 total student rows the skills [9]. Though the goal of the research was to diag- from896uniquestudents. Thisincludestworowsofdataper nose the misconceptions of students, they set in motion a studentforeachofthefiveskillpairsincluded. Thefirstrow numberofstudiesthathaveusedthisdatastructureasthe contains information of that student’s performance on the firststeptofindprerequisitestructures[2],[5],[8]. pre-andpost-requisiteskills,whilethesecondrowcontains student performance on the prerequisite and a simulated Desmarais and his colleagues developed an algorithm that post-requisitedescribedfurtherinthenextsection. findstheprerequisiterelationshipbetweenquestions,oritems, instudents’responsedata[6]. Theycomparepairsofitems 1The full raw and filtered datasets are available at the fol- in a test and determine any interactions existing between lowinglink: http://tiny.cc/veqg5x Proceedings of the 9th International Conference on Educational Data Mining 293 For each student, a feature vector was selected using com- 4. METHODOLOGY monperformancemetricstocomparewithinourmodel. This The ultimate goal of this work is to provide the means of feature vector contained eight link-level features represent- comparingmodelspredictingtheexistence,ornon-existence ing the interactions between student-level prerequisite and of skill relationships. Our approach to this is through the post-requisiteperformancemetrics. Thegeneratedlink-level comparison and identification of features that most accu- featuresobservedareasdescribedbelow: rately predict these relationships. Using principal compo- nent analysis, we group similar features into more general- izedconceptualizationstobothcomparewhichtypesoffea- Percent Correct turesmatterwhenpredictingrelationships,butalsotoavoid Themean-centered2percentageofcorrectresponsesin problems of multicollinearity that may bias our estimates. the prerequisite skill multiplied by the mean-centered Once this baseline model is established, we can construct percentage of correct responses in the post-requisite newpredictivemodelsfromthesignificantfeaturesandob- skill. servetheiraccuracyinpredictingtheexistenceofskillrela- tionshipswhencomparedtoasimplenull,orunconditional First Problem Correctness (FPC) model. Thebinarycorrectnessofthefirstresponseinthepre- requisite skill multiplied by the binary correctness of In order to compare the usage of features against a weak thefirstresponseonthepost-requisiteskill. ornon-existentrelationship,wesimulatedanewskillusing students from the existing prerequisite skill by generating Mastery Speed random sequences of responses. For each existing student, The mean-centered mastery speed of the prerequisite we randomly assign him/her a probability between 0.5 and skill, defined as the number of problems required for 0.9inordertocreatearandomsequenceofanswers. Forex- each student to achieve three consecutive correct re- ample,astudentgivenaprobabilityof0.5hasa50%chance sponses,multipliedbythemean-centeredmasteryspeed ofansweringeachgivenproblemcorrectly. Wesimulatestu- of the post-requisite skill. In addition to centering, dent answers until either mastery is achieved, defined as these values were also winsorized to make the largest three sequentially correct responses, or the student reaches possible value 10, chosen as this is often the maxi- 10problemswithoutmastering;avalueof10ischosenhere, mum number of daily attempts allowed within AS- asmanyassignmentsinASSISTmentsaregivenadailylimit SISTments. All centering and winsorizing occurred of 10 problem attempts before asking the student to seek beforemultiplyingthetwovalues. help or try again on another day. While we acknowledge therearemanywaystoaccomplishthissimulationstep,we Z-Scored Percent Correct feelthissimplemethodsufficientlycreatesaskillthathasno The z-scored3 value of mean-centered percentage of relationshiptotheoriginalprerequisiteasintended. Asour correct responses in the prerequisite skill multiplied proposedmethodisintendedtobeusedinthefuturetohelp by the z-scored value of mean-centered percentage of identify undiscovered pre- or post-requisite links, we chose correctresponsesinthepost-requisiteskill. to use a simulated skill rather than a random existing skill to avoid the possibility of randomly selecting an undiscov- Binned Mastery Speed (Bin) ered related skill. Again, we wanted to be highly selective Thenumberedbinofmasteryspeedasdescribedin[3] andconsiderseveralsuchscenariosasweareattemptingto of the prerequisite skill multiplied by the bin of mas- create ground-truth values to which we can make our com- tery speed in the second skill. Students were placed parisons. intooneoffivebinsbasedonmasteryspeediftheas- signmentwascompletedandbasedonpercentcorrect Using these two skill-pairs, one link representing a strong iftheassignmentwasnotcompleted. relationship while the other representing a non-existent re- lationship,wecancalculateafeaturevectorforeachstudent intheprerequisiteskillwithvaluesfromeachskill-pair. We Z-Scored Mastery Speed use a binary logistic regression with the existence of a re- Thez-scoredvalueofmean-centered,winsorizedmas- lationship as the dependent variable and several link-level tery speed in the prerequisite skill, multiplied by the covariates to predict whether a skill relationship exists for z-scored value of mean-centered, winsorized mastery each student row. The existence of a relationship can be speedinthepost-requisiteskill. determined then simply by majority ruling, but such cal- culation is not included in this work and instead observes Bin X FPC accuracy at the student-level for a more accurate compari- The binned mastery speed value in the prerequisite son. skill multiplied by the binary correctness of the first responseinthepost-requisiteskill. Webegintocomparecommonlyusedstudent-levelfeatures in this study through two levels analysis. The first step Percent Correct X FPC attemptstocomparegroupsoffeatures,generalizingdiffer- The mean-centered percentage of correct responses in entrepresentationsofsimilarfeaturesintoconceptualgroup- theprerequisiteskillmultipliedbythebinarycorrect- ings. As such, we are able to view the predictive power of nessofthefirstresponseinthepost-requisiteskill. whatwedenoteasinitialperformance,mastery,andcorrect- 2Allcenteringoffeatureswasperformedattheskill-level. ness. Thesecondexperimentlooksattheindividualfeatures 3Allz-scoringwasperformedattheclass-level. asdifferentrepresentationsoftheoverallgrouptocompare Proceedings of the 9th International Conference on Educational Data Mining 294 studentperformanceontheinitialitemsofeachskill. Inad- ditiontothesethreecategories,wearealsoleftwithstudent mastery speed z-scored within student classes as a variable that did not fall under either of the three aforementioned categories; while a derivation of mastery speed, we believe thatthisdidnotcorrelatetothe“mastery”categorydueto themethodofstandardizationasitiscapturingthismetric inrelationtostudents’peers. Wewillreaddressthiscasein oursectionofdiscussion. Once these predictors are identified and created, we con- structabinarylogisticregressionmodeltopredict,foreach student row, whether a relationship exists or not. This model will give us a significance value and coefficient for eachpredictorinthemodel,aswellasanoverallpredictive accuracyofthemodelwhichwillbeusedmoreforthenext analysis. 4.2 ComparingFeatureModels After being able to compare which generalized groups of featuresaresignificantpredictorsoftheexistenceofskillre- Figure 1: The results of the PCA analysis. All fea- lationships, we are able to compare the individual student- tures except Z-Scored Mastery Speed mapped to level features that fall into each category by incorporating one of three generalized components. them into separate models to observe predictive accuracy. The analysis of the first experiment is used to determine which categories are significant in predicting the existence these predictors at a closer level. We can take each factor ofskillrelationships. Usingthatinformation,weareableto ofmastery,forexample,andcomparetheirusageinseveral focusonthosegroupingswithsignificancetoconstructmod- modelstodeterminewhichisthemostaccuratepredictorof elsthatutilizefactorsfromeachgrouping. Thegroupingof theexistenceofskillrelationships. “mastery,”forexamplecontainsthefactorsofmasteryspeed andbinnedmasteryspeed,sowecanconstructmodelsusing 4.1 ComparingLink-LevelFeatures each to compare differences in predictive power. To avoid Inordertocomparerepresentationsofstudent-levelfeatures, problemsofcollinearity,nosinglemodelcontainsmorethan wemustfirstbeabletocomparegeneralconceptualizations onefactorfromasinglegrouping. Thissignificantlyreduces offeaturestodeterminewhichprovidemoreaccuratepredic- thenumberofcombinationsoffeaturestotestcomparedto tionsoftheexistenceofskillrelationships. Wewanttocap- runningthisexperimentwithoutfirstgroupinglikefeatures turethetruerepresentationsofeachmetricandattemptto and identifying those that are significant as we did in the interpretthesegeneralizationsastypesoffeatures. Inorder firstexperiment. to accomplish this grouping of predictors, we use principal component analysis (PCA) to identify which student-level Using the significant groupings, we are able to create 17 featurescorrelatetoandarerepresentativeofmoregeneral- models consisting of single, pairs, and triplets of features. izedcomponents. PCAisprimarilyusedfordimensionality A logistic regression is run on each of these models to pre- reduction as we are doing here and gives us the ability to dict the existence of a skill relationship. Of the 17 mod- create new variables from the component mappings. The els,10ofthemproduceastatisticallysignificantprediction resulting feature alignment can be seen in Figure 1. As is when compared to a null model. Ideally, our null model the case in our study, and was mentioned in the previous shouldproducea50%accuracyasthereisanequalnumber section, we have multiple metrics of mastery speed as well ofgoodandbadlinkrowsinourdataset. Thisisnotalways as several other features. As we can represent“mastery”in thecase,however,asdependingonthefeatureobserved,in- severalways,wewanttoknowiftheoverallconceptofmas- formationmaybemissingforaparticularstudent;mastery tery, as captured by the metrics used, is reliably predictive speed,forexample,asthenumberofitemsattemptedbya oftheexistenceofskillrelationships. studentbeforereaching3consecutivecorrectanswers,would bemissingforanystudentthatdidnotcompletetheassign- Creating a new set of predictors of these groupings, we are ment. Forthisreason,thepredictivepowerofeachmodelis able to incorporate these into a binary logistic regression describedasgainsinpredictiveaccuracy,orrather,theaccu- model to view the predictive power of each. While PCA racyofeachmodelminustheaccuracyofthecorresponding groupssimilarfeaturestogetherbasedontheircorrelations, nullmodel. byviewingwhichfeaturesaregroupedweareabletointer- pretandlabeleach. Fromthisprocess,wefoundthatmost 5. RESULTS of our features fell into three categories for which we have The results of the first analysis are expressed in Table 2. giventhenames“mastery,”asthisconsistsofrepresentations EachofthethreefeaturegroupingsofMastery,Correctness, ofmasteryspeed,“correctness,”asthisconsistsofrepresen- and Initial Performance created using PCA in addition to tations of the percentage of correct student responses, and theZ-ScoredMasteryarecomparedwithinthesamemodel, “initial performance,”as this consists of representations of predicting the existence of a skill relationship. As these Proceedings of the 9th International Conference on Educational Data Mining 295 thatwehaveinterpretedasmastery,correctness,andinitial Table 2: The coefficients and significance values of performance. Itwasfoundthez-scoredmasteryspeed,con- the generalized components analyzed. From this we trary to our intuition, did not map well to the grouping of canfocusonmodelsthatexcludefeaturescontained mastery. We can speculate the reason for this occurrence in the components with no significance. byalteringourinterpretationofthefeature. Masteryspeed Component CoefficientValue Significance itself is an interesting metric as it attempts to capture two (log-oddsunits) dimensions of performance: a level of understanding and a rateoflearning. Also,toreiterateapriordistinction,these Mastery -.251 <.001*** metrics are interactions of performance across skills. By z- scoringthemetric,itiscapturingacontextualeffectofeach Correctness .015 .802 student in comparison with other students in the class, a Initial distinctionthatappearstohaveasignificanteffect. Performance .129 .037* Z-Scored Observing the resulting model components from the prin- MasterySpeed -.129 <.001*** cipal component analysis in Table 2, we were able to focus our attention to those components with significant values. Correctnesswastheonlycomponentofthatmodelthatwas againarelink-levelfeaturesdescribinginteractionsbetween found to have no statistical significance on the dependent student-levelperformanceonprerequisiteandpost-requisite variable. Thisiscertainlyinteresting,aspercentcorrectness skills, it is difficult to draw tangible interpretations from andothersuchmeasuresareamongthemostcommonmet- the coefficient value, expressed in log-odds units. This co- rics of performance. Perhaps the interaction between pre- efficient, used in the logistic regression to make the predic- and post-requisite percent correct is losing some predictive tions, describes each component’s effect on the dependent powerfromwhenthemetricisusedforotherpredictionsof variable. For example, for each unit increase in“Mastery,” performance. theprobabilitythatthelinkexistsdecreases. Again,asthis componentisanaggregationofinteractionfeatures,itisre- Thisaspectillustratesoneotherimportantfindingthatthe ally describing an aggregation of differences of differences distinct representations of one metric or another each con- between student-level features making it difficult to make tribute differently to the predictive accuracy of the models definitive claims regarding these values alone and were in- studied. Models incorporating mastery speed, for example, cludedpurelytodisplayageneraltrendofthesecomponents had no significant accuracy gains over a null model, while ontheprediction. masteryspeedbinningshowedconsiderablegainsasseenin Table3. Thebaselinemodelofcomparisonproposedinthis Fromthetable,weareabletodeterminethesignificanceof study provides the means to make that distinction regard- eachcomponentontheoverallpredictionbyviewingthecor- ing features contained within the same generalized compo- responding p-values in the third column. Looking at these nent grouping. As is seen in that figure, combinations of values, we can claim that the overall grouping of“Correct- features outperform any single feature, illustrating a more ness”seems to have less of an impact on the predictive ac- robust model by capturing multiple representations of per- curacyofthemodel. Asthistermisnotsignificant,wecan formance. focus the remainder of our study on the remaining three components. 7. FUTUREWORK While we have shown that our model is able to compare Table 3 illustrates the results of our second analysis com- and identify features that contribute to higher accuracy in paringthemodelsthatweareabletoconstructwiththere- predicting the existence of skill relationships, we also need maining features once the“Correctness”grouping has been to stress the importance of the usage of this information. disregarded. This figure shows the comparative predictive Theabilitytocomparefeaturesisonlythefirststepofour accuracy of the 10 models that give statistically significant model’s goal. By identifying strong predictors of skill rela- predictions as seen in Table 3. Again, these values are ex- tionshipsthatweknowexist,wecanapplyittootherskills pressed as accuracy gains, or rather the percent accuracy within ASSISTments and other systems to identify poten- increaseoverthenullmodelrunforeachpredictivemodel. tially new prerequisite arches, and also to better measure and predict long-term student performance, learning, and 6. DISCUSSION retention. Havinganaccurateestimateofskillrelationships Thisworkprovidesabaselinemodelofcomparingstudent- can help restructure prerequisite structures to provide skill levelperformanceacrossskillstomeasurethestrengthofa sequences in an order that optimizes student learning and skillrelationshipandcomparetheaccuracyofbothfeatures achievement. and models that estimate this value. Such a model, in our experience,hasnotexistedpriortothisstudy. Ourmethod Theworkinthispaperincorporatedseveralskillsintoasin- attempts to identify not only the individual features that gle dataset to make predictions. In this case, we wanted to contribute to better predictions of these relationships, but createamethodthatisgeneralizabletosomedegree. While alsomovestogeneralizesimilarfeaturesintoconceptualiza- ourselectiveskillsetallowsustomakesomeclaimsinterms tionsforcomparisoninordertominimizemulticollinearity. oftheaccuracythesemodelsoverallskills,itmaylikelybe the case that skill relationships are measurable in different The principal component analysis step of our model found ways for different skills. Further analysis could repeat the thatallbutonefeaturemappedtooneofthreecomponents stepshereoneachoneoftheacquiredskillsinthedataset. Proceedings of the 9th International Conference on Educational Data Mining 296 Table 3: The models constructed from features in the significant generalized components. No one model contains more than a single feature from each generalized component. Model NullAccuracy ModelAccuracy AccuracyGain Significance MasterySpeed(MS) 0.63 0.62 0.00 1.000 Z-ScoredMasterySpeed 0.63 0.63 0.00 0.888 FirstProblemCorrectness(FPC) 0.50 0.56 0.06 <0.001*** BinnedMS 0.50 0.69 0.19 <0.001*** BinXFPC 0.50 0.56 0.06 <0.001*** Bin,Z-ScoredMS 0.50 0.71 0.21 <0.001*** MS,FPC 0.63 0.62 0.00 1.000 MS,BinXFPC 0.63 0.62 0.00 1.000 Bin,FPC 0.50 0.69 0.19 <0.001*** Bin,BinXFPC 0.50 0.69 0.19 <0.001*** MS,FPC,Z-ScoredMS 0.63 0.63 0.00 0.754 MS,BinXFPC,Z-ScoredMS 0.63 0.63 0.00 0.979 Bin,FPC,Z-ScoredMS 0.50 0.71 0.20 <0.001*** Bin,BinXFPC,Z-ScoredMS 0.50 0.71 0.21 <0.001*** MS,Z-ScoredMS 0.63 0.63 0.00 0.843 FPC,Z-ScoredMS 0.50 0.64 0.14 <0.001*** BinXFPC,Z-ScoredMS 0.50 0.61 0.11 <0.001*** While correctness was not significant in these results, per- 9. REFERENCES hapsitissignificantwhenpredictingcertaintypesofskills. [1] S.Adjei,D.Selent,N.Heffernan,Z.Pardos, Perhaps, similar to our features, skills themselves could be A.Broaddus,andN.Kingston.Refininglearning generalizedintoconceptualtypesfordifferentkindsofanal- mapswithdatafittingtechniques: Searchingfor ysis pertaining to interactions of performance and their re- betterfittinglearningmaps.InEducational Data lationships. Mining,2014. [2] T.Barnes.Theq-matrixmethod: Miningstudent Thefeaturevectorsgeneratedforeachstudentinourdataset responsedataforknowledge.InAmerican Association captured many of the most common student-level metrics, for Artificial Intelligence 2005 Educational Data but certainly not all of them. There are many other as- Mining Workshop,2005. pects that could be added including completion, measures [3] A.Botelho,H.Wan,andN.Heffernan.Theprediction oflearningrate,timespentontheassignments,hintusage, ofstudentfirstresponseusingprerequisiteskills.In and countless other variables. In addition, this study only Learning at Scale,2015. observed interactions expressed as multiplications of these [4] E.Brunskill.Estimatingprerequisitestructurefrom terms to describe them as link-level features. There are noisydata.InEducational Data Mining,pages various other ways to represent interactions or other such 217–222.Citeseer,2011. transformations including differences of values, division of [5] Y.Chen,P.-H.Wuillemin,andJ.-M.Labat. values, or just simply cross-feature interactions as was par- Discoveringprerequisitestructureofskillsthrough tially explored here by looking at Bin X FPC and Percent probabilisticassociationrulesmining.InThe 8th Correct X FPC. Such interactions model various other as- International Conference on Educational Data Mining, pectsofstudentperformanceandbehaviorthatcanbevery pages117–124,2015. usefulinthistypeofrelationshipprediction. [6] M.C.Desmarais,A.Maluf,andJ.Liu.User-expertise modelingwithempiricallyderivedprobabilistic The methodology presented observes models that predict implicationnetworks.User modeling and user-adapted the existence of skills as a binary outcome, while it can be interaction,5(3-4):283–315,1995. modified to make comparisons on estimates of relationship [7] P.I.PavlikJr,H.Cen,L.Wu,andK.R.Koedinger. strengthsasacontinuousoutcomeaswell. Themethodob- Usingitem-typeperformancecovariancetoimprove served model accuracy at the student level for better mea- theskillmodelofanexistingtutor.Online surements, but it is a skill-level relationship that is being Submission,2008. tested. One simple addition of future work could explore [8] R.Scheines,E.Silver,andI.Goldin.Discovering how to best combine the predictions at a student level to prerequisiterelationshipsamongknowledge make a skill-level prediction. The methodology can then components.InEducational Data Mining,2014. testrelationshipsontheentiresystemskillstructure. [9] K.K.Tatsuoka.Rulespace: Anapproachfordealing withmisconceptionsbasedonitemresponsetheory. 8. ACKNOWLEDGMENTS Journal of educational measurement,20(4),1983. We acknowledge funding from the NSF (1316736, 1252297, [10] H.WanandJ.B.Beck.Consideringtheinfluenceof 1109483,1031398,0742503,1440753),theU.S.Dept. ofEd. prerequisiteperformanceonwheelspinning.In GAAN(P200A120238),ONR’s“STEMGrandChallenges,” Educational Data Mining,2015. andIES(R305A120125,R305C100024). Proceedings of the 9th International Conference on Educational Data Mining 297