Modeling Interactions Across Skills: A Method to Construct and Compare Models Predicting the Existence of Skill Relationships

Anthony F. Botelho, Seth A. Adjei, Neil T. Heffernan
Worcester Polytechnic Institute
100 Institute Rd., Worcester, MA 01609-2280
abotelho@wpi.edu, saadjei@wpi.edu, nth@wpi.edu

Proceedings of the 9th International Conference on Educational Data Mining 292

ABSTRACT
The incorporation of prerequisite skill structures into educational systems helps to identify the order in which concepts should be presented to students to optimize student achievement. Many skills have a causal relationship in which one skill must be presented before another, indicating a strong skill relationship. Knowing this relationship can help to predict student performance and identify prerequisite arcs. Skill relationships, however, are not directly measurable; instead, the relationship can be estimated by observing differences of student performance across skills. Such methods of estimation, however, seem to lack a baseline model to compare their effectiveness. If two methods of estimating the existence of a relationship yield two different values, which is the more accurate result? In this work, we propose a method of comparing models that attempt to measure the strength of skill relationships. With this method, we begin to identify those student-level covariates that provide the most accurate models predicting the existence of skill relationships. Focusing on interactions of performance across skills, we use our method to construct models to predict the existence of five strongly-related and five simulated poorly-related skill pairs. Our method is able to evaluate several models that distinguish these differences with significant accuracy gains over a null model, and provides the means to identify that interactions of student mastery provide the most significant contributions to these gains in our analysis.

Keywords
prerequisite structures, skill relationships, feature selection, model comparison

1. INTRODUCTION
Many educational systems like ASSISTments and Khan Academy already implement a prerequisite structure as a suggested ordering in which skills should be presented to students. These structures are often developed by domain experts and teachers in the field of study, and are likely to hold ground-truth. It is clear, for example, that relationships can be identified by observing skills at the problem-level; by viewing the steps required for students to complete each item, it can be known that any skills required to complete such problems can be considered prerequisites. For example, Multiplying Whole Numbers may act as a prerequisite to Greatest Common Factors, as is used in our analysis. While causality suggests a strong relationship, it is possible for two skills to relate to each other in other ways. Such relationships are less intuitive, perhaps requiring a similar thought process or sequence of steps to solve, even if the content of such tasks differs. Many causal skill arcs are identifiable by domain experts by observing content, but as described, other such relationships may be missed due to their non-intuitive structures. By observing strong skill relationships identified by domain experts, we construct a method of measuring the factors that are most predictive of their existence.

We also argue that identifying strong relationships is not enough for a method of prediction to be considered adequate. Such a method should also be able to identify weak or non-existent skill relationships. It is likely that while much attention and research are placed on structuring prerequisite links, some of these are false-positives. In other words, a skill may be listed as a prerequisite, but have no true relationship to its supposed post-requisite skill. In such a case there is little or no interaction of performance. Such links must also be identified and removed or reordered in learning platforms to benefit the students.

A significant amount of research has looked at measuring the strength of skill relationships [1],[4], and even the effects such relationships have on measuring student performance [3],[10], but without understood ground truths, it is difficult to compare across these methods. Furthermore, many of these methods represent similar conceptualizations of performance inherently, or through variations of representation such as aggregation or centering. For example, “student achievement” is likely a predictor of skill relationships (achievement on a prerequisite skill will likely influence achievement on a post-requisite skill), but can be represented as the percent of problems answered correctly, mastery speed (the number of items needed to complete an assignment as is commonly used in intelligent tutoring systems), or countless other combinations of features. It will be important to distinguish between these generalized components to avoid incorporating features that capture the same types of conceptualizations into predictive models.

This work provides a method to evaluate models that measure the strength of skill relationships, and with this model we attempt to identify which features best indicate a strong relationship between two skills. This analysis will incorporate a method of generalizing and distinguishing features that measure different aspects of learning and performance. With this methodology, we seek to answer the following two research questions:

1. What link-level features, expressed in this paper as interactions of performance between skills, are significant in predicting the existence or non-existence of skill relationships?

2. Which features are the strongest predictors of skill relationships, and does combining them make for a more accurate predictive model?

The next section of this paper will discuss some of the previous research performed on skill relationships and prerequisite structures. Then, we will discuss our theory and methodology to provide a baseline model of comparing methods of measuring skill relationships. Using this model, we then compare several commonly-used student-level features, and of the most accurate, compare several different representations of those features. Finally, we will discuss our findings and suggested future works.

2. PREVIOUS WORKS
The discovery and refinement of prerequisite skill structures has been an important research question in recent years. The impact of this research on educational systems cannot be overemphasized. Domain experts who design these structures need data-centered methods to support the decisions they make; it is vital to have empirical data to support hypotheses regarding the order in which skills are presented, as it can have a large impact on student achievement and either aid or impede the learning process. Additionally, identifying the best prerequisite skill structure will enhance student modeling; knowing a student’s prior performance on prerequisite skills can help estimate that student’s performance on the post-requisites. This can lead to earlier interventions for struggling students, or even help redefine mastery: perhaps students who perform very well on a prerequisite require less practice on a post-requisite, or can be given more advanced examples.

Tatsuoka defined a data structure called the Q-Matrix that represents the mapping of problems to skills: the rows of this matrix represent the problems, and the columns represent the skills [9]. Though the goal of the research was to diagnose the misconceptions of students, it set in motion a number of studies that have used this data structure as the first step to find prerequisite structures [2],[5],[8].

Desmarais and his colleagues developed an algorithm that finds the prerequisite relationship between questions, or items, in students’ response data [6]. They compare pairs of items in a test and determine any interactions existing between each pair. Depending on the interactions and a set of interaction-related criteria, they determine whether the two items have a prerequisite relationship between them. This approach was applied by Pavlik, et al. to analyze item-type covariances and to propose a hierarchical agglomerative clustering method to refine the tagging of items to skills [7]. Brunskill conducted a preliminary study that uses students’ noisy data to infer prerequisite structures [4]. Further research by Scheines, et al. extended a causal structure discovery algorithm in which an assumption regarding the purity of items is relaxed to reflect real data, and used it to infer prerequisite skill structure from data [8].

3. DATASET
The dataset¹ used for this study consists of real-world student data from the ASSISTments online learning platform. The raw data contains student problem logs pertaining to ten math skills from the 2014-2015 school year. These ten skills represent five skill pairs, listed in Table 1, which domain experts identified as having a strong prerequisite relationship. While we are not limiting the usage of our proposed baseline model to just prerequisite relationships, these are the most reliable to identify due to the causal effect of content (if problems in skill B require the use of skill A to complete, a strong relationship can be identified).

¹ The full raw and filtered datasets are available at the following link: http://tiny.cc/veqg5x

Table 1: The strong skill pairs as determined by domain experts

Prerequisite                                   Post-requisite
Multiplication of Whole Numbers                Greatest Common Factor
Subtracting Integers                           Order of Operations
Division of Whole Numbers                      Dividing Multi-Digit Numbers
Volume of Rectangular Prisms Without Formula   Volume of Rectangular Prisms
Nets of 3D Figures                             Surface Area of Rectangular Prisms

In order to identify believable ground-truth skill pairs, a survey containing 24 skill pairs for which we had sufficient student data (greater than 50 student rows) was administered to 45 teachers and domain experts who use ASSISTments. Each was asked to rate each skill pair on a scale of 1 to 7, indicating the perceived qualitative strength of its relationship. From the survey results, five skill pairs were selected to be the strongest related links with the smallest variance in opinion scores. As we are treating these links as truth, we wanted to be highly selective of these pairs.

The resulting dataset consists of 1838 total student rows from 896 unique students. This includes two rows of data per student for each of the five skill pairs included. The first row contains information of that student’s performance on the pre- and post-requisite skills, while the second row contains student performance on the prerequisite and a simulated post-requisite described further in the next section.
For each student, a feature vector was selected using common performance metrics to compare within our model. This feature vector contained eight link-level features representing the interactions between student-level prerequisite and post-requisite performance metrics. The generated link-level features observed are as described below:

Percent Correct
    The mean-centered² percentage of correct responses in the prerequisite skill multiplied by the mean-centered percentage of correct responses in the post-requisite skill.

First Problem Correctness (FPC)
    The binary correctness of the first response in the prerequisite skill multiplied by the binary correctness of the first response on the post-requisite skill.

Mastery Speed
    The mean-centered mastery speed of the prerequisite skill, defined as the number of problems required for each student to achieve three consecutive correct responses, multiplied by the mean-centered mastery speed of the post-requisite skill. In addition to centering, these values were also winsorized to make the largest possible value 10, chosen as this is often the maximum number of daily attempts allowed within ASSISTments. All centering and winsorizing occurred before multiplying the two values.

Z-Scored Percent Correct
    The z-scored³ value of mean-centered percentage of correct responses in the prerequisite skill multiplied by the z-scored value of mean-centered percentage of correct responses in the post-requisite skill.

Binned Mastery Speed (Bin)
    The numbered bin of mastery speed as described in [3] of the prerequisite skill multiplied by the bin of mastery speed in the post-requisite skill. Students were placed into one of five bins based on mastery speed if the assignment was completed and based on percent correct if the assignment was not completed.

Z-Scored Mastery Speed
    The z-scored value of mean-centered, winsorized mastery speed in the prerequisite skill, multiplied by the z-scored value of mean-centered, winsorized mastery speed in the post-requisite skill.

Bin X FPC
    The binned mastery speed value in the prerequisite skill multiplied by the binary correctness of the first response in the post-requisite skill.

Percent Correct X FPC
    The mean-centered percentage of correct responses in the prerequisite skill multiplied by the binary correctness of the first response in the post-requisite skill.

² All centering of features was performed at the skill-level.
³ All z-scoring was performed at the class-level.

4. METHODOLOGY
The ultimate goal of this work is to provide the means of comparing models predicting the existence, or non-existence, of skill relationships. Our approach to this is through the comparison and identification of features that most accurately predict these relationships. Using principal component analysis, we group similar features into more generalized conceptualizations, both to compare which types of features matter when predicting relationships and to avoid problems of multicollinearity that may bias our estimates. Once this baseline model is established, we can construct new predictive models from the significant features and observe their accuracy in predicting the existence of skill relationships when compared to a simple null, or unconditional, model.

In order to compare the usage of features against a weak or non-existent relationship, we simulated a new skill using students from the existing prerequisite skill by generating random sequences of responses. For each existing student, we randomly assign a probability between 0.5 and 0.9 in order to create a random sequence of answers. For example, a student given a probability of 0.5 has a 50% chance of answering each given problem correctly. We simulate student answers until either mastery is achieved, defined as three sequentially correct responses, or the student reaches 10 problems without mastering; a value of 10 is chosen here, as many assignments in ASSISTments are given a daily limit of 10 problem attempts before asking the student to seek help or try again on another day. While we acknowledge there are many ways to accomplish this simulation step, we feel this simple method sufficiently creates a skill that has no relationship to the original prerequisite as intended. As our proposed method is intended to be used in the future to help identify undiscovered pre- or post-requisite links, we chose to use a simulated skill rather than a random existing skill to avoid the possibility of randomly selecting an undiscovered related skill. Again, we wanted to be highly selective and consider several such scenarios as we are attempting to create ground-truth values to which we can make our comparisons.

Using these two skill-pairs, one link representing a strong relationship and the other representing a non-existent relationship, we can calculate a feature vector for each student in the prerequisite skill with values from each skill-pair. We use a binary logistic regression with the existence of a relationship as the dependent variable and several link-level covariates to predict whether a skill relationship exists for each student row. The existence of a relationship can then be determined simply by majority ruling, but that calculation is not included in this work, which instead observes accuracy at the student-level for a more accurate comparison.

We begin to compare commonly used student-level features in this study through two levels of analysis. The first step attempts to compare groups of features, generalizing different representations of similar features into conceptual groupings. As such, we are able to view the predictive power of what we denote as initial performance, mastery, and correctness. The second experiment looks at the individual features as different representations of the overall group to compare
these predictors at a closer level. We can take each factor of mastery, for example, and compare their usage in several models to determine which is the most accurate predictor of the existence of skill relationships.

4.1 Comparing Link-Level Features
In order to compare representations of student-level features, we must first be able to compare general conceptualizations of features to determine which provide more accurate predictions of the existence of skill relationships. We want to capture the true representations of each metric and attempt to interpret these generalizations as types of features. In order to accomplish this grouping of predictors, we use principal component analysis (PCA) to identify which student-level features correlate to and are representative of more generalized components. PCA is primarily used for dimensionality reduction, as we are doing here, and gives us the ability to create new variables from the component mappings. The resulting feature alignment can be seen in Figure 1. As is the case in our study, and was mentioned in the previous section, we have multiple metrics of mastery speed as well as several other features. As we can represent “mastery” in several ways, we want to know if the overall concept of mastery, as captured by the metrics used, is reliably predictive of the existence of skill relationships.

Figure 1: The results of the PCA analysis. All features except Z-Scored Mastery Speed mapped to one of three generalized components.

Creating a new set of predictors of these groupings, we are able to incorporate these into a binary logistic regression model to view the predictive power of each. While PCA groups similar features together based on their correlations, by viewing which features are grouped we are able to interpret and label each. From this process, we found that most of our features fell into three categories for which we have given the names “mastery,” as this consists of representations of mastery speed, “correctness,” as this consists of representations of the percentage of correct student responses, and “initial performance,” as this consists of representations of student performance on the initial items of each skill. In addition to these three categories, we are also left with student mastery speed z-scored within student classes as a variable that did not fall under any of the three aforementioned categories; while a derivation of mastery speed, we believe that this did not correlate to the “mastery” category due to the method of standardization, as it is capturing this metric in relation to students’ peers. We will readdress this case in our section of discussion.

Once these predictors are identified and created, we construct a binary logistic regression model to predict, for each student row, whether a relationship exists or not. This model will give us a significance value and coefficient for each predictor in the model, as well as an overall predictive accuracy of the model which will be used more for the next analysis.

4.2 Comparing Feature Models
After being able to compare which generalized groups of features are significant predictors of the existence of skill relationships, we are able to compare the individual student-level features that fall into each category by incorporating them into separate models to observe predictive accuracy. The analysis of the first experiment is used to determine which categories are significant in predicting the existence of skill relationships. Using that information, we are able to focus on those groupings with significance to construct models that utilize factors from each grouping. The grouping of “mastery,” for example, contains the factors of mastery speed and binned mastery speed, so we can construct models using each to compare differences in predictive power. To avoid problems of collinearity, no single model contains more than one factor from a single grouping. This significantly reduces the number of combinations of features to test compared to running this experiment without first grouping like features and identifying those that are significant as we did in the first experiment.

Using the significant groupings, we are able to create 17 models consisting of single features, pairs, and triplets of features. A logistic regression is run on each of these models to predict the existence of a skill relationship. Of the 17 models, 10 produce a statistically significant prediction when compared to a null model. Ideally, our null model should produce a 50% accuracy as there is an equal number of good and bad link rows in our dataset. This is not always the case, however, as depending on the feature observed, information may be missing for a particular student; mastery speed, for example, as the number of items attempted by a student before reaching 3 consecutive correct answers, would be missing for any student that did not complete the assignment. For this reason, the predictive power of each model is described as gains in predictive accuracy, or rather, the accuracy of each model minus the accuracy of the corresponding null model.

5. RESULTS
The results of the first analysis are expressed in Table 2. Each of the three feature groupings of Mastery, Correctness, and Initial Performance created using PCA, in addition to the Z-Scored Mastery Speed, are compared within the same model, predicting the existence of a skill relationship. As these
again are link-level features describing interactions between student-level performance on prerequisite and post-requisite skills, it is difficult to draw tangible interpretations from the coefficient value, expressed in log-odds units. This coefficient, used in the logistic regression to make the predictions, describes each component’s effect on the dependent variable. For example, for each unit increase in “Mastery,” the probability that the link exists decreases. Again, as this component is an aggregation of interaction features, it is really describing an aggregation of differences of differences between student-level features, which makes it difficult to make definitive claims regarding these values alone; they were included purely to display a general trend of these components on the prediction.

Table 2: The coefficients and significance values of the generalized components analyzed. From this we can focus on models that exclude features contained in the components with no significance.

Component                Coefficient Value (log-odds units)   Significance
Mastery                  -.251                                <.001***
Correctness              .015                                 .802
Initial Performance      .129                                 .037*
Z-Scored Mastery Speed   -.129                                <.001***

From the table, we are able to determine the significance of each component on the overall prediction by viewing the corresponding p-values in the third column. Looking at these values, we can claim that the overall grouping of “Correctness” seems to have less of an impact on the predictive accuracy of the model. As this term is not significant, we can focus the remainder of our study on the remaining three components.

Table 3 illustrates the results of our second analysis comparing the models that we are able to construct with the remaining features once the “Correctness” grouping has been disregarded. Of the 17 models in Table 3, 10 give statistically significant predictions. Again, these values are expressed as accuracy gains, or rather the percent accuracy increase over the null model run for each predictive model.

6. DISCUSSION
This work provides a baseline model of comparing student-level performance across skills to measure the strength of a skill relationship and compare the accuracy of both features and models that estimate this value. Such a model, in our experience, has not existed prior to this study. Our method attempts to identify not only the individual features that contribute to better predictions of these relationships, but also moves to generalize similar features into conceptualizations for comparison in order to minimize multicollinearity.

The principal component analysis step of our model found that all but one feature mapped to one of three components that we have interpreted as mastery, correctness, and initial performance. It was found that the z-scored mastery speed, contrary to our intuition, did not map well to the grouping of mastery. We can speculate on the reason for this occurrence by altering our interpretation of the feature. Mastery speed itself is an interesting metric as it attempts to capture two dimensions of performance: a level of understanding and a rate of learning. Also, to reiterate a prior distinction, these metrics are interactions of performance across skills. By z-scoring the metric, it is capturing a contextual effect of each student in comparison with other students in the class, a distinction that appears to have a significant effect.

Observing the resulting model components from the principal component analysis in Table 2, we were able to focus our attention on those components with significant values. Correctness was the only component of that model that was found to have no statistical significance on the dependent variable. This is certainly interesting, as percent correctness and other such measures are among the most common metrics of performance. Perhaps the interaction between pre- and post-requisite percent correct is losing some predictive power from when the metric is used for other predictions of performance.

This aspect illustrates one other important finding: the distinct representations of one metric or another each contribute differently to the predictive accuracy of the models studied. Models incorporating mastery speed, for example, had no significant accuracy gains over a null model, while mastery speed binning showed considerable gains as seen in Table 3. The baseline model of comparison proposed in this study provides the means to make that distinction regarding features contained within the same generalized component grouping. As is seen in that table, combinations of features outperform any single feature, illustrating a more robust model by capturing multiple representations of performance.

7. FUTURE WORK
While we have shown that our model is able to compare and identify features that contribute to higher accuracy in predicting the existence of skill relationships, we also need to stress the importance of the usage of this information. The ability to compare features is only the first step of our model’s goal. By identifying strong predictors of skill relationships that we know exist, we can apply it to other skills within ASSISTments and other systems to identify potentially new prerequisite arcs, and also to better measure and predict long-term student performance, learning, and retention. Having an accurate estimate of skill relationships can help restructure prerequisite structures to provide skill sequences in an order that optimizes student learning and achievement.

The work in this paper incorporated several skills into a single dataset to make predictions. In this case, we wanted to create a method that is generalizable to some degree. While our selective skill set allows us to make some claims in terms of the accuracy of these models over all skills, it may likely be the case that skill relationships are measurable in different ways for different skills. Further analysis could repeat the steps here on each one of the acquired skills in the dataset.
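Repeating the analysis for other skills would require regenerating the simulated, unrelated post-requisite described in Section 4. A minimal sketch of that simulation step is given here. It assumes that each student's probability is drawn uniformly from the stated 0.5 to 0.9 range and uses a fixed seed for reproducibility; the function names, the seed, and the uniform draw are illustrative choices, not details taken from the study.

```python
import random

MASTERY_STREAK = 3   # three consecutive correct responses define mastery
MAX_PROBLEMS = 10    # daily attempt limit used as the cutoff


def simulate_responses(p_correct, rng):
    """Simulate one student's answers until mastery or the problem limit."""
    responses, streak = [], 0
    while len(responses) < MAX_PROBLEMS and streak < MASTERY_STREAK:
        correct = 1 if rng.random() < p_correct else 0
        responses.append(correct)
        streak = streak + 1 if correct else 0
    return responses


def mastery_speed(responses):
    """Problems needed to reach mastery, or None if it was never reached."""
    streak = 0
    for count, correct in enumerate(responses, start=1):
        streak = streak + 1 if correct else 0
        if streak == MASTERY_STREAK:
            return count
    return None


def simulate_skill(student_ids, seed=0):
    """Build a simulated skill for the given students.

    Assumption: each probability is drawn uniformly from [0.5, 0.9].
    """
    rng = random.Random(seed)
    return {
        sid: simulate_responses(rng.uniform(0.5, 0.9), rng)
        for sid in student_ids
    }


for sid, resp in simulate_skill(["s1", "s2", "s3"]).items():
    print(sid, resp, mastery_speed(resp))
```

Because the responses depend only on the random draw, the simulated skill carries no information about the student's real prerequisite performance, which is the property the method relies on.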
Table 3: The models constructed from features in the significant generalized components. No one model
contains more than a single feature from each generalized component.
Model                          Null Accuracy   Model Accuracy   Accuracy Gain   Significance
Mastery Speed (MS)             0.63            0.62             0.00            1.000
Z-Scored Mastery Speed         0.63            0.63             0.00            0.888
First Problem Correctness (FPC) 0.50           0.56             0.06            <0.001***
Binned MS                      0.50            0.69             0.19            <0.001***
Bin X FPC                      0.50            0.56             0.06            <0.001***
Bin, Z-Scored MS               0.50            0.71             0.21            <0.001***
MS, FPC                        0.63            0.62             0.00            1.000
MS, Bin X FPC                  0.63            0.62             0.00            1.000
Bin, FPC                       0.50            0.69             0.19            <0.001***
Bin, Bin X FPC                 0.50            0.69             0.19            <0.001***
MS, FPC, Z-Scored MS           0.63            0.63             0.00            0.754
MS, Bin X FPC, Z-Scored MS     0.63            0.63             0.00            0.979
Bin, FPC, Z-Scored MS          0.50            0.71             0.20            <0.001***
Bin, Bin X FPC, Z-Scored MS    0.50            0.71             0.21            <0.001***
MS, Z-Scored MS                0.63            0.63             0.00            0.843
FPC, Z-Scored MS               0.50            0.64             0.14            <0.001***
Bin X FPC, Z-Scored MS         0.50            0.61             0.11            <0.001***
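The 17 rows of Table 3 follow from choosing at most one feature from each significant grouping. The sketch below enumerates those combinations and computes an accuracy gain as model accuracy minus null accuracy. The assignment of features to groupings is an assumption inferred from which features never share a row in Table 3, and the function names are illustrative rather than taken from the study.

```python
from itertools import product

# Groupings inferred from Table 3 (an assumption): no row combines two
# features from the same grouping.
GROUPINGS = {
    "mastery": ["MS", "Bin"],
    "initial performance": ["FPC", "Bin X FPC"],
    "z-scored mastery speed": ["Z-Scored MS"],
}


def enumerate_models(groupings):
    """Yield every non-empty feature set with at most one feature per grouping."""
    # None stands for "take nothing from this grouping".
    options = [[None] + features for features in groupings.values()]
    for choice in product(*options):
        selected = tuple(f for f in choice if f is not None)
        if selected:
            yield selected


def accuracy_gain(model_accuracy, null_accuracy):
    """Gain in predictive accuracy over the null (unconditional) model."""
    return model_accuracy - null_accuracy


models = list(enumerate_models(GROUPINGS))
print(len(models))
print(round(accuracy_gain(0.69, 0.50), 2))
```

With these groupings the enumeration gives (2 + 1)(2 + 1)(1 + 1) - 1 = 17 feature sets, matching the number of rows in Table 3.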
While correctness was not significant in these results, perhaps it is significant when predicting certain types of skills. Perhaps, similar to our features, skills themselves could be generalized into conceptual types for different kinds of analysis pertaining to interactions of performance and their relationships.

The feature vectors generated for each student in our dataset captured many of the most common student-level metrics, but certainly not all of them. There are many other aspects that could be added including completion, measures of learning rate, time spent on the assignments, hint usage, and countless other variables. In addition, this study only observed interactions expressed as multiplications of these terms to describe them as link-level features. There are various other ways to represent interactions or other such transformations, including differences of values, division of values, or just simply cross-feature interactions as was partially explored here by looking at Bin X FPC and Percent Correct X FPC. Such interactions model various other aspects of student performance and behavior that can be very useful in this type of relationship prediction.

The methodology presented observes models that predict the existence of skill relationships as a binary outcome, while it can be modified to make comparisons on estimates of relationship strengths as a continuous outcome as well. The method observed model accuracy at the student level for better measurements, but it is a skill-level relationship that is being tested. One simple addition of future work could explore how to best combine the predictions at a student level to make a skill-level prediction. The methodology can then test relationships on the entire system skill structure.

8. ACKNOWLEDGMENTS
We acknowledge funding from the NSF (1316736, 1252297, 1109483, 1031398, 0742503, 1440753), the U.S. Dept. of Ed. GAAN (P200A120238), ONR’s “STEM Grand Challenges,” and IES (R305A120125, R305C100024).

9. REFERENCES
[1] S. Adjei, D. Selent, N. Heffernan, Z. Pardos, A. Broaddus, and N. Kingston. Refining learning maps with data fitting techniques: Searching for better fitting learning maps. In Educational Data Mining, 2014.
[2] T. Barnes. The q-matrix method: Mining student response data for knowledge. In American Association for Artificial Intelligence 2005 Educational Data Mining Workshop, 2005.
[3] A. Botelho, H. Wan, and N. Heffernan. The prediction of student first response using prerequisite skills. In Learning at Scale, 2015.
[4] E. Brunskill. Estimating prerequisite structure from noisy data. In Educational Data Mining, pages 217–222. Citeseer, 2011.
[5] Y. Chen, P.-H. Wuillemin, and J.-M. Labat. Discovering prerequisite structure of skills through probabilistic association rules mining. In The 8th International Conference on Educational Data Mining, pages 117–124, 2015.
[6] M. C. Desmarais, A. Maluf, and J. Liu. User-expertise modeling with empirically derived probabilistic implication networks. User modeling and user-adapted interaction, 5(3-4):283–315, 1995.
[7] P. I. Pavlik Jr, H. Cen, L. Wu, and K. R. Koedinger. Using item-type performance covariance to improve the skill model of an existing tutor. Online Submission, 2008.
[8] R. Scheines, E. Silver, and I. Goldin. Discovering prerequisite relationships among knowledge components. In Educational Data Mining, 2014.
[9] K. K. Tatsuoka. Rule space: An approach for dealing with misconceptions based on item response theory. Journal of educational measurement, 20(4), 1983.
[10] H. Wan and J. B. Beck. Considering the influence of prerequisite performance on wheel spinning. In Educational Data Mining, 2015.