Mathematical Language Processing: Automatic Grading and Feedback for Open Response Mathematical Questions AndrewS.Lan,DivyanshuVats,AndrewE.Waters,RichardG.Baraniuk RiceUniversity Houston,TX77005 mr.lan,dvats,waters,richb @sparfa.com 5 { } 1 ABSTRACT INTRODUCTION 0 While computer and communication technologies have pro- Large-scaleeducationalplatformshavethecapabilitytorev- 2 videdeffectivemeanstoscaleupmanyaspectsofeducation, olutionize education by providing inexpensive, high-quality n thesubmissionandgradingofassessmentssuchashomework learning opportunities for millions of learners worldwide. a assignmentsandtestsremainsaweaklink. Inthispaper,we Examples of such platforms include massive open online J studytheproblemofautomaticallygradingthekindsofopen courses (MOOCs) [6, 7, 9, 10, 16, 42], intelligent tutoring 8 response mathematical questions that figure prominently in systems[43],computer-basedhomeworkandtestingsystems 1 STEM (science, technology, engineering, and mathematics) [1,31,38,40],andpersonalizedlearningsystems[24].While courses. Our data-driven framework for mathematical lan- computerandcommunicationtechnologieshaveprovidedef- ] L guageprocessing(MLP)leveragessolutiondatafromalarge fective means to scale up the number of learners viewing M number of learners to evaluate the correctness of their solu- lectures (via streaming video), reading the textbook (via the tions, assign partial-credit scores, and provide feedback to web), interacting with simulations (via a graphical user in- t. eachlearneronthelikelylocationsofanyerrors. MLPtakes terface),andengagingindiscussions(viaonlineforums),the a inspiration from the success of natural language processing submissionandgradingofassessmentssuchashomeworkas- t s for text data and comprises three main steps. First, we con- signmentsandtestsremainsaweaklink. [ vert each solution to an open response mathematical ques- 1 tion into a series of numerical features. Second, we clus- There is a pressing need to find new ways and means to au- tomatetwocriticaltasksthataretypicallyhandledbythein- v ter the features from several solutions to uncover the struc- structororcourseassistantsinasmall-scalecourse: (i)grad- 6 turesofcorrect,partiallycorrect,andincorrectsolutions. We ing of assessments, including allotting partial credit for par- 4 develop two different clustering approaches, one that lever- tiallycorrectsolutions,and(ii)providingindividualizedfeed- 3 agesgenericclusteringalgorithmsandonebasedonBayesian 4 nonparametrics. Third, we automatically grade the remain- backtolearnersonthelocationsandtypesoftheirerrors. 0 ing(potentiallylargenumberof)solutionsbasedontheiras- Substantial progress has been made on automated grading . 1 signed cluster and one instructor-provided grade per cluster. andfeedbacksystemsinseveralrestricteddomains,including 0 Asabonus,wecantracktheclusterassignmentofeachstep essayevaluationusingnaturallanguageprocessing(NLP)[1, 5 ofamultistepsolutionanddeterminewhenitdepartsfroma 33], computer program evaluation [12, 15, 29, 32, 34], and 1 cluster of correct solutions, which enables us to indicate the mathematicalproofverification[8,19,21]. v: likely locations of errors to learners. We test and validate Inthispaper,westudytheproblemofautomaticallygrading i MLP on real-world MOOC data to demonstrate how it can X the kinds of open response mathematical questions that fig- substantially reduce the human effort required in large-scale ureprominentlyinSTEM(science,technology,engineering, r educationalplatforms. a and mathematics) education. To the best of our knowledge, thereexistnotoolstoautomaticallyevaluateandallotpartial- AuthorKeywords credit scores to the solutions of such questions. As a result, Automaticgrading,Machinelearning,Clustering,Bayesian large-scale education platforms have resorted either to over- nonparametrics,Assessment,Feedback,Mathematical simplifiedmultiplechoiceinputandbinarygradingschemes languageprocessing (correct/incorrect),whichareknowntoconveylessinforma- tionaboutthelearners’knowledgethanopenresponseques- tions[17],orpeer-gradingschemes[25,26],whichshiftthe Permissiontomakedigitalorhardcopiesofallorpartofthisworkforpersonalor burdenofgradingfromthecourseinstructortothelearners.1 classroomuseisgrantedwithoutfeeprovidedthatcopiesarenotmadeordistributed forprofitorcommercialadvantageandthatcopiesbearthisnoticeandthefullcitation onthefirstpage. Copyrightsforcomponentsofthisworkownedbyothersthan ACMmustbehonored. Abstractingwithcreditispermitted. Tocopyotherwise, 1While peer grading appears to have some pedagogical value for or republish, to post on servers or to redistribute to lists, requires prior specific learners[30],eachlearnertypicallyneedstogradeseveralsolutions permissionand/[email protected]. fromotherlearnersforeachquestiontheysolve,inordertoobtain L@S’15,March14–March15,2015,Vancouver,Canada. anaccurategradeestimate. Copyright©2015ACMISBN/15/03...$15.00. First, we convert each solution to an open response mathe- maticalquestionintoaseriesofnumericalfeatures. Inderiv- ingthesefeatures, wemakeuseofsymbolicmathematicsto transformmathematicalexpressionsintoacanonicalform. Second,weclusterthefeaturesfromseveralsolutionstoun- coverthestructuresofcorrect,partiallycorrect,andincorrect solutions. We develop two different clustering approaches. (a)Acorrectsolutionthatreceives3/3 (b)Anincorrectsolutionthatreceives MLP-Susesthenumericalfeaturestodefineasimilarityscore credits 2/3creditsduetoanerrorinthelast betweenpairsofsolutionsandthenappliesagenericcluster- expression ingalgorithm,suchasspectralclustering(SC)[22]oraffinity propagation (AP) [11]. We show that MLP-S is also useful forvisualizingmathematicalsolutions. Thiscanhelpinstruc- tors identify groups of learners that make similar errors so thatinstructorscandeliverpersonalizedremediation.MLP-B (c)Anincorrectsolutionthatreceives (d)Anincorrectsolutionthatreceives definesanonparametricBayesianmodelforthesolutionsand 1/3creditsduetoanerrorinthesecond 0/3credits appliesaGibbssamplingalgorithmtoclusterthesolutions. expression Third, onceahumanassignsagradetoatleastonesolution Figure1: Examplesolutionstothequestion“Findthederiva- in each cluster, we automatically grade the remaining (po- tiveof(x3 +sinx)/ex”thatwereassignedscoresof3,2,1 tentially large number of) solutions based on their assigned and0outof3,respectively,byourMLP-Balgorithm. cluster. As a bonus, in MLP-B, we can track the cluster as- signment of each step in a multistep solution and determine whenitdepartsfromaclusterofcorrectsolutions,whichen- ablesustoindicatethelikelylocationsoferrorstolearners. (x2+x+sin2x+cos2x)(2x 3) � IndevelopingMLP,wetacklethreemainchallengesofana- =2x3+2x2+2xsin2x+2xcos2x lyzingopenresponsemathematicalsolutions. First,solutions 3x2 3x 3sin2x 3cos2x � � � � mightcontaindifferentnotationsthatrefertothesamemath- =2x3 x2 3x+2x(sin2x+cos2x) � � ematicalquantity. Forinstance,inFigure1, thelearnersuse �3(sin2x+cos2x) bothe−x and 1 torefertothesamequantity. Second,some =2x3 x2 3x+2x(1) 3(1) ex � � � questions admit more than one path to the correct/incorrect =2x3 x2 x 3 � � � solution. For instance, in Figure 2 we see two different yet (a)Acorrectsolutionthatmakesthe (b)Acorrectsolutionthatmakesthe correct solutions to the same question. It is typically infea- simplification sin2x+cos2x=1in thesimplification sin2x+cos2x=1 sible for an instructor to enumerate all of these possibilities thefirstexpression inthethirdexpression toautomatethegradingandfeedbackprocess. Third,numer- ically verifying the correctness of the solutions does not al- Figure2:Examplesoftwodifferentyetcorrectpathstosolve waysapplytomathematicalquestions,especiallywhensim- the question “Simplify the expression (x2 + x + sin2x + plificationsarerequired. Forexample,aquestionthatasksto cos2x)(2x 3).” simplifytheexpressionsin2x+cos2x+xcanhaveboth1+x − andsin2x+cos2x+xasnumericallycorrectanswers,since boththeseexpressionsoutputthesamevalueforallvaluesof x.However,thecorrectansweris1+x,sincethequestionex- MainContributions pectsthelearnerstorecognizethatsin2x+cos2x=1.Thus, Inthispaper,wedevelopadata-drivenframeworkformath- methodsdevelopedtocheckthecorrectnessofcomputerpro- ematicallanguageprocessing(MLP)thatleveragessolution gramsandformulaebyspecifyingarangeofdifferentinputs datafromalargenumberoflearnerstoevaluatethecorrect- andcheckingforthecorrectoutputs,e.g.,[32],cannotalways ness of solutions to open response mathematical questions, be applied to accurately grade open response mathematical assign partial-credit scores, and provide feedback to each questions. learneronthelikelylocationsofanyerrors. Thescopeofour frameworkis broadand coversquestions whosesolutionin- RelatedWork volvesoneormoremathematicalexpressions. Thisincludes Prior work has led to a number of methods for grading and notjustformalproofsbutalsothekindsofmathematicalcal- providing feedback to the solutions of certain kinds of open culations that figure prominently in science and engineering response questions. A linear regression-based approach has courses. Examples of solutions to two algebra questions of beendevelopedtogradeessaysusingfeaturesextractedfrom variouslevelsofcorrectnessaregiveninFigures1and2. In a training corpus using Natural Language Processing (NLP) this regard, our work differs significantly from that of [8], [1,33]. Unfortunately,suchasimpleregression-basedmodel whichfocusesexclusivelyonevaluatinglogicalproofs. doesnotperformwellwhenappliedtothefeaturesextracted Our MLP framework, which is inspired by the success of frommathematicalsolutions. Severalmethodshavebeende- NLP methods for the analysis of textual solutions (e.g., es- veloped for automated analysis of computer programs [15, saysandshortanswer),comprisesthreemainsteps. 32]. However, these methods do not apply to the solutions toopenresponsemathematicalquestions,sincetheylackthe MLP identifies the unique mathematical expressions con- structure and compilability of computer programs. Several tained in the learners’ solutions and uses them as features, methodshavealsobeendevelopedtocheckthecorrectnessof effectively extending the bag-of-words model to use mathe- thelogicinmathematicalproofs[8,19,21]. However,these maticalexpressionsasfeaturesratherthanwords. Tocoina methodsapplyonlytomathematicalproofsinvolvinglogical phrase,MLPusesanovelbag-of-expressionsmodel. operationsandnotthekindsofopen-endedmathematicalcal- Oncethemathematicalexpressionshavebeenextractedfrom culations that are often involved in science and engineering a solution, we parse them using SymPy, the open source courses. Python library for symbolic mathematics [36].2 SymPy has The idea of clustering solutions to open response questions powerful capability for simplifying expressions. For exam- into groups of similar solutions has been used in a number ple, x2 +x2 can be simplified to 2x2, and exx2/e2x can be of previous endeavors: [2, 5] uses clustering to grade short, simplifedtoe−xx2. Inthisway,wecanidentifytheequiva- textual answers to simple questions; [23] uses clustering to lenttermsinexpressionsthatrefertothesamemathematical visualize a large collection of computer programs; and [28] quantity, resulting in more accurate features. In practice for uses clustering to grade and provide feedback on computer somequestions,however,itmightbenecessarytotonedown programs. Althoughthehigh-levelconceptunderlyingthese thelevelofSymPy’ssimplification. Forinstance, thekeyto worksisresonantwiththeMLPframework,thefeaturebuild- solvingthequestioninFigure2istosimplifytheexpression ingtechniquesusedinMLPareverydifferent,sincethestruc- usingthePythagoreanidentitysin2x+cos2x=1.IfSymPy tureofmathematicalsolutionsdifferssignificantlyfromshort is called on to perform such a simplification automatically, textualanswersandcomputerprograms. thenitwillnotbepossibletoverifywhetheralearnerhascor- rectlynavigatedthesimplificationintheirsolution. Forsuch This paper is organized as follows. In the next section, we problems,itisadvisabletoperformonlyarithmeticsimplifi- develop our approach to convert open response mathemati- cations. cal solutions to numerical features that can be processed by machine learning algorithms. We then develop MLP-S and Afterextractingtheexpressionsfromthesolutions,wetrans- MLP-B and use real-world MOOC data to showcase their form the expressions into numerical features. We assume abilitytoaccuratelygradealargenumberofsolutionsbased that N learners submit solutions to a particular mathemati- on the instructor’s grades for only a small number of solu- cal question. Extracting the expressions from each solution tions, thus substantially reducing the human effort required using SymPy yields a total of V unique expressions across inlarge-scaleeducationalplatforms. Weclosewithadiscus- theN solutions. sionandperspectivesonfutureresearchdirections. We encode the solutions in a integer-valued solution feature matrix Y NV×N whose rows correspond to different ex- ∈ pressions and whose columns correspond to different solu- MLPFEATUREEXTRACTION ThefirststepinourMLPframeworkistotransformacollec- tions;thatis,the(i,j)thentryofYisgivenby tion of solutions to an open response mathematical question Y =timesexpressioniappearsinsolutionj. into a set of numerical features. In later sections, we show i,j how the numerical features can be used to cluster and grade EachcolumnofYcorrespondstoanumericalrepresentation solutionsaswellasgenerateinformativelearnerfeedback. ofamathematicalsolution. Notethatwedonotconsiderthe orderingoftheexpressionsinthismodel; suchanextension Asolutiontoanopenresponsemathematicalquestionwillin is an interesting avenue for future work. In this paper, we generalcontainamixtureofexplanatorytextandcoremath- indicate in Y only the presence and not the frequency of an ematicalexpressions. Sincethecorrectnessofasolutionde- expression,i.e.,Y 0,1 V×N and pendsprimarilyonthemathematicalexpressions,wewillig- ∈{ } norethetextwhenderivingfeatures. However,werecognize (cid:26) 1 if expressioniappearsinsolutionj thatthetextispotentiallyveryusefulforautomaticallygener- Y = (1) i,j 0 otherwise. atingexplanationsforvariousmathematicalexpressions. We leavethisavenueforfuturework. Theextensiontoencodingfrequenciesisstraightforward. AworkhorseofNLPisthebag-of-wordsmodel;ithasfound To illustrate how the matrix Y is constructed, consider the tremendous success in text semantic analysis. This model solutionsinFigure2(a)and(b). Acrossbothsolutions,there treats a text document as a collection of words and uses the are 7 unique expressions. Thus, Y is a 7 2 matrix, with frequencies of the words as numerical features to perform × each row corresponding to a unique expression. Letting the tasksliketopicclassificationanddocumentclustering[4,5]. first four rows of Y correspond to the four expressions in A solution to an open response mathematical question con- Figure2(a)andtheremainingthreerowstoexpressions2–4 sistsofaseriesofmathematicalexpressionsthatarechained inFigure2(b),wehave togetherbytext, punctuation, ormathematicaldelimitersin- (cid:20) (cid:21)T cluding =, , >, , , etc. For example, the solution Y = 1 1 1 1 0 0 0 . in Figure 1(b≤) conta∝ins≈the expressions ((x3 + sinx)/ex)(cid:48), 1 0 0 1 1 1 1 ((3x2+cosx)ex (x3+sinx)ex))/e2x,and(2x2 x3+ cosx sinx)/ex−thatareallseparatedbythedelimit−er“=”. 2Inparticular,weusetheparse exprfunction. − We end this section with the crucial observation that, for a belongtothesamecluster. Foreachfigure,weshowasam- widerangeofmathematicalquestions,manyexpressionswill plesolutionfromsomeoftheseclusters,withtheboxedsolu- besharedacrosslearners’solutions.Thisistrue,forinstance, tionscorrespondingtocorrectsolutions. Wecanmakethree inFigure2. Thissuggeststhattherearealimitednumberof interestingobservationsfromFigure3: types of solutions to a question (both correct and incorrect) In the top left figure, we cluster a solution with the final andthatsolutionsofthesametypetendtendtobesimilarto • answer3x2+cosx (x3+sinx))/exwithasolutionwith each other. This leads us to the conclusion that the N solu- − tionstoaparticularquestioncanbeeffectivelyclusteredinto thefinalanswer3x2+cosx (x3+sinx))/ex. Although − K N clusters. In the next two sections, we will develop thelatersolutionisincorrect,itcontainedatypographical ML(cid:28)P-S and MLP-B, two algorithms to cluster solutions ac- error where 3 x 2 was typed as 3 x 2. MLP-S is ∗ ∧ ∧ ∧ cordingtotheirnumericalfeatures. able to identify this typographical error, since the expres- sion before the final solution is contained in several other correctsolutions. MLP-S:SIMILARITY-BASEDCLUSTERING In this section, we outline MLP-S, which clusters and then In the top right figure, the correct solution requires iden- gradessolutionsusingasolutionsimilarity-basedapproach. • tifying the trigonometric identify sin2x + cos2x = 1. Theclusteringalgorithmisabletoidentifyasubsetofthe TheMLP-SModel learnerswhowerenotabletoidentifythisrelationshipand WestartbyusingthesolutionfeaturesinYtodefineanotion hencecouldnotsimplifytheirfinalexpression. of similarity between pairs of solutions. Define the N N × similarity matrix S containing the pairwise similarities be- MLP-S is able to identify solutions that are strongly con- tween all solutions, with its (i,j)th entry the similarity be- • nectedtoeachother.Suchavisualizationcanbeextremely tweensolutionsiandj useful for course instructors. For example, an instructor yTy can easily identify a group of learners who lack mastery S = i j . (2) ofacertainskillthatresultsinacommonerrorandadjust i,j min yTy ,yTy { i i j j} theircourseplanaccordinglytohelptheselearners. Thecolumnvectory denotestheithcolumnofYandcorre- i spondstolearneri’ssolution. Informally,S isthenumber Auto-GradingviaMLP-S i,j ofcommonexpressionsbetweensolutioniandsolutionj di- HavingclusteredallsolutionsintoasmallnumberK ofclus- videdbytheminimumofthenumberofexpressionsinsolu- ters, we assign the same grade to all solutions in the same tionsiandj. Alarge/smallvalueofS correspondstothe cluster. Ifacourseinstructorassignsagradetoonesolution i,j twosolutionsbeingsimilar/dissimilar. Forexample,thesim- from each cluster, then MLP-S can automatically grade the ilarity between the solutions in Figure 1(a) and Figure 1(b) remainingN Ksolutions. Weconstructtheindexset S of − I is1/3andthesimilaritybetweenthesolutionsinFigure2(a) solutionsthatthecourseinstructorneedstogradeas and Figure 2(b) is 1/2. S is symmetric, and 0 Si,j 1. Equation (2) is just one of any possible solutio≤n simil≤arity (cid:88)N metrics. Wedeferthedevelopmentofothermetricstofuture S = argmax Si,j, k =1,2,...,K , work. I i∈Ck j=1 where represents the index set of the solutions in cluster ClusteringSolutionsinMLP-S Ck k. In words, in each cluster, we select the solution having Having defined the similarity S between two solutions i i,j the highest similarity to the other solutions (ties are broken andj,wenowclustertheN solutionsintoK N clusters randomly)toincludein . Wedemonstratetheperformance (cid:28) S suchthatthesolutionswithineachclusterhavehighsimilarity I ofauto-gradingviaMLP-Sintheexperimentalresultssection score between them and solutions in different clusters have below. lowsimilarityscorebetweenthem. Given the similarity matrix S, we can use any of the mul- MLP-B:BAYESIANNONPARAMETRICCLUSTERING titude of standard clustering algorithms to cluster solutions. In this section, we outline MLP-B, which clusters and then Two examples of clustering algorithms are spectral cluster- gradessolutionsusingaBayesiannonparameterics-basedap- ing (SC) [22] and affinity propagation (AP) [11]. The SC proach. TheMLP-Bmodelandalgorithmcanbeinterpreted algorithmrequiresspecifyingthenumberofclustersK asan asanextensionofthemodelin[44],whereasimilarapproach inputparameter,whiletheAPalgorithmdoesnot. isproposedtoclustershorttextdocuments. Figure3illustrateshowAPisabletoidentifyclustersofsim- ilar solutions from solutions to four different mathematical TheMLP-BModel questions. The figures on the top correspond to solutions to Following the key observation that the N solutions can be the questions in Figures 1 and 2, respectively. The bottom effectivelyclusteredintoK N clusters,letzbetheN 1 (cid:28) × two figures correspond to solutions to two signal processing clusterassignmentvector,withzj 1,...,K denotingthe ∈{ } questions. Eachnodeinthefigurecorrespondstoasolution, cluster assignment of the jth solution with j 1,...,N . ∈ { } and nodes with the same color correspond to solutions that Using this latent variable, we model the probability of the Figure3: IllustrationoftheclustersobtainedbyMLP-Sbyapplyingaffinitypropagation(AP)onthesimilaritymatrixScorre- spondingtolearners’solutionstofourdifferentmathematicalquestions(seeTable1formoredetailsaboutthedatasetsandthe Appendixforthequestionstatements). Eachnodecorrespondstoasolution. Nodeswiththesamecolorcorrespondtosolutions thatareestimatedtobeinthesamecluster. Thethicknessoftheedgebetweentwosolutionsisproportionaltotheirsimilarity score. Boxedsolutionsarecorrect;allothersareinvaryingdegreesofcorrectness. solutionofalllearners’solutionstothequestionas ofΦandcharcterizesthemultinomialdistributionoverallthe V featuresforclusterk. N (cid:32) K (cid:33) (cid:89) (cid:88) p(Y)= p(y z =k)p(z =k) , Inpractice,oneoftenhasnoinformationregardingthenum- j j j | j=1 k=1 berofclustersK. Therefore,weconsiderK asanunknown parameter and infer it from the solution data. In order to do whereyj,thejthcolumnofthedatamatrixY,correspondsto so, we impose a Chinese restaurant process (CRP) prior on learnerj’ssolutiontothequestion. Herewehaveimplicitly the cluster assignments z, parameterized by a parameter α. assumedthatthelearners’solutionsareindependentofeach TheCRPcharacterizestherandompartitionofdataintoclus- other. By analogy to topic models [4, 35], we assume that ters,inanalogytotheseatingprocessofcustomersinaChi- learnerj’ssolutiontothequestion,yj,isgeneratedaccording neserestaurant. ItiswidelyusedinBayesianmixturemodel- toamultinomialdistributiongiventheclusterassignmentsz ingliterature[3,14]. UndertheCRPprior,thecluster(table) as assignmentofthejth solution(customer),conditionedonthe clusterassignmentsofalltheothersolutions,followsthedis- p(y z =k)=Mult(y φ ) j j j k | | tribution (cid:80) ( Y )! = i i,j ΦY1,jΦY2,j...ΦYV,j, Y !Y !...Y ! 1,k 2,k V,k 1,j 2,j V,j (3) (cid:26) nk,¬j if clusterkisoccupied, p(z =k z ,α)= N−1+α whereΦ [0,1]V×K isaparametermatrixwithΦv,k denot- j | ¬j N−α1+α if clusterkisempty, ingits(v,∈k)th entry. φ [0,1]V×1 denotesthekth column (4) k ∈ α α whereΓ()istheGammafunction. β Φ Y z α · Ifaclusterbecomesemptyafterweremoveasolutionfrom K N α its current cluster, then we remove it from our sampling β process and erase its corresponding multinomial parame- Figure4: Graphicalmodelofthegenerationprocessofsolu- ter vector φk. If a new clusteris sampled for zj, thenwe sample its multinomial parameter vector φ immediately tionstomathematicalquestions. α , α andβ arehyperpa- k α β rameters,zandΦarelatentvariablestobeinferred,andYis according to Step 2 below. Otherwise, we do not change φ untilwehavefinishedsamplingzforallsolutions. theobserveddatadefinedin(1). k 2. Sample Φ: For each cluster k, sample φ from its pos- k terior Dir(φ n +β,...,n +β), where n is the wherenk,¬j representsthenumberofsolutionsthatbelongto k| 1,k V,k i,k clusterkexcludingthecurrentsolutionj,with(cid:80)K n = number of times feature i occurs in the solutions that be- k=1 k,¬j longtoclusterk. N 1. Thevectorz representstheclusterassignmentsof ¬j − theothersolutions. Theflexibilityofallowinganysolutionto 3. Sampleα:Sampleαusingtheapproachdescribedin[41]. startanewclusterofitsownenablesustoautomaticallyin- ferKfromdata.Itisknown[37]thattheexpectednumberof 4. Update β: Update β using the fixed-point procedure de- clustersundertheCRPpriorsatisfiesK O(αlogN) N, scribedin[20]. ∼ (cid:28) soourmethodscaleswellasthenumberoflearnersN grows The output of the Gibbs sampler is a series of samples that large. WealsoimposeaGammapriorα Gam(α ,α )on α β ∼ correspond to the approximate posterior distribution of the αtohelpusinferitsvalue. various parameters of interest. To make meaningful infer- SincethesolutionfeaturedataYisassumedtofollowamulti- encefortheseparameters(suchastheposteriormeanofapa- nomial distribution parameterized by Φ, we impose a sym- rameter), it is important to appropriately post-process these metricDirichletprioroverΦasφk Dir(φk β)becauseof samples. Forourestimateofthetruenumberofclusters,K(cid:98), ∼ | itsconjugacywiththemultinomialdistribution[13]. wesimplytakethemodeoftheposteriordistributiononthe Thegraphicalmodelrepresentationofourmodelisvisualized numberofclustersK. WeuseonlyiterationswithK =K(cid:98) to in Figure 4. Our goal next is to estimate the cluster assign- estimatetheposteriorstatistics[39]. mentszforthesolutionofeachlearner,theparametersφ of k Inmixturemodels,theissueof“label-switching”cancausea eachcluster,andthenumberofclustersK,fromthebinary- modeltobeunidentifiable, becausetheclusterlabelscanbe valuedsolutionfeaturedatamatrixY. arbitrarilypermutedwithoutaffectingthedatalikelihood. In ordertoovercomethisissue,weuseanapproachreportedin ClusteringSolutionsinMLP-B [39]. First, we compute the likelihood of the observed data We use a Gibbs sampling algorithm for posterior inference ineachiterationasp(Y Φ(cid:96),z(cid:96)), whereΦ(cid:96) andz(cid:96) represent under the MLP-B model, which automatically groups solu- the samples of these var|iables at the (cid:96)th iteration. After the tionsintoclusters. Westartbyapplyingagenericclustering algorithmterminates,wesearchfortheiteration(cid:96) withthe algorithm (e.g., K-means, with K = N/10) to initialize z, largest data likelihood and then permute the labemlsaxz(cid:96) in the and then initialize Φ accordingly. Then, in each iteration of otheriterationstobestmatchΦ(cid:96) withΦ(cid:96)max. WeuseΦ(cid:98) (with MLP-B,weperformthefollowingsteps: columnsφˆ )todenotetheestimateofΦ,whichissimplythe k 1. Samplez: Foreachsolutionj,weremoveitfromitscur- posteriormeanofΦ.Eachsolutionjisassignedtothecluster rent cluster and sample its cluster assignment zj from the indexedbythemodeofthesamplesfromtheposteriorofzj, posteriorp(zj =k|z¬j,α,Y). UsingBayesrule,wehave denotedbyzˆj. p(z =k z ,Φ,α,Y)=p(z =k z ,φ ,α,y ) j ¬j j ¬j k j | | Auto-GradingviaMLP-B p(z =k z ,α)p(y z =k,φ ). ∝ j | ¬j j| j k We now detail how to use MLP-B to automatically grade a Thepriorprobabilityp(z =k z ,α)isgivenby(4). For largenumberN oflearners’solutionstoamathematicalques- j ¬j | non-emptyclusters,theobserveddatalikelihoodp(yj zj = tion,usingasmallnumberK(cid:98) ofinstructorgradedsolutions. | k,φk)isgivenby(3).However,thisdoesnotapplytonew First,asinMLP-S,weselecttheset B of“typicalsolutions” clusters that are previously empty. For a new cluster, we fortheinstructortograde. WeconstIruct byselectingone B marginalizeoutφ ,resultingin I k solutionfromeachoftheK(cid:98) clustersthatismostrepresenta- (cid:90) tiveofthesolutionsinthatcluster: p(y z =k,β)= p(y z =k,φ )p(φ β) j j j j k k | (cid:90)φk | | IB ={argmjaxp(yj|φˆk),k =1,2,...,K(cid:98)}. = Mult(y z =k,φ )Dir(φ β) j| j k k| In words, for each cluster, we select the solution with the φk largestlikelihoodofbeinginthatcluster. V Γ(Vβ) (cid:89)Γ(Yi,j +β) = Γ((cid:80)V Y +Vβ) Γ(β) , The instructor grades the K(cid:98) solutions in IB to form the set i=1 i,j i=1 ofinstructorgrades gk fork B. Usingthesegrades,we { } ∈I assigngradestotheothersolutionsj / accordingto Table1: Datasetsconsistingofthesolutionsof116learners B ∈I to4mathematicalquestionsonalgebraandsignalprocessing. (cid:80)K(cid:98) p(y φˆ )g SeetheAppendixforthequestionstatements. gˆ = k=1 j| k k. (5) j (cid:80)K(cid:98) p(y φˆ ) k=1 j| k No.ofsolutionsN No.offeatures(uniqueexpressions)V Thatis,wegradeeachsolutionnotin Bastheaverageofthe Question1 108 78 instructorgradesweightedbythelikeIlihoodthatthesolution Question2 113 53 Question3 90 100 belongstocluster. Wedemonstratetheperformanceofauto- Question4 110 45 gradingviaMLP-Bintheexperimentalresultssectionbelow. FeedbackGenerationviaMLP-B ofquestionsincludes2high-schoollevelmathematicalques- In addition to grading solutions, MLP-B can automatically tionsand2college-levelsignalprocessingquestions(details provideusefulfeedbacktolearnersonwheretheymadeerrors aboutthequestionscanbefoundinTable1,andthequestion intheirsolutions. statementsaregivenintheAppendix). Foreachquestion,we pre-processthesolutionstofilterouttheblanksolutionsand For a particular solution j denoted by its column feature extract features. Using the features, we represent the solu- value vector y with V total expressions, let y(v) denote j j j tionsbythematrixYin(1).Everysolutionwasgradedbythe the feature value vector that corresponds to the first v ex- courseinstructorwithoneofthescoresintheset 0,1,2,3 , pressions of this solution, with v = {1,2,...,Vj}. Un- withafullcreditof3. { } der this notation, we evaluate the probability that the first v expressions of solution j belong to each of the K(cid:98) clusters: Baseline: Randomsub-sampling p(yj(v)|φˆk),k = {1,2,...,K(cid:98)},forallv. Usingtheseproba- WMeLPc-oBmapgaarienstthaebaausteol-ignreamdientghopderthfoartmdaonecsenootfgMroLupP-tSheasnod- bilities, wecanalsocomputetheexpectedcreditofsolution lutionsintoclusters.Inthismethod,werandomlysub-sample j afterthefirstvexpressionsvia allsolutionstoformasmallsetofsolutionsfortheinstructor (cid:80)K(cid:98) p(y(v) φˆ )g tograde.Then,eachungradedsolutionissimplyassignedthe gˆ(v) = k=1 j | k k, (6) gradeofthesolutioninthesetofinstructor-gradedsolutions j (cid:80)K(cid:98) p(y(v) φˆ ) that is most similar to it as defined by S in (2). Since this k=1 j | k smallsetispickedrandomly,werunthebaselinemethod10 where gk isthesetofinstructorgradesasdefinedabove. timesandreportthebestperformance.3 { } Usingthesequantities,itispossibletoidentifythatthelearner Experimentalsetup has likely made an error at the vth expression if it is most Foreachquestion,weapplyfourdifferentmethodsforauto- likely to belong to a cluster with credit g less than the full k grading: creditor,alternatively,iftheexpectedcreditgˆ(v) islessthan j thefullcredit. Random sub-sampling (RS) with the number of clusters • K 5,6,...,40 . The ability to automatically locate where an error has been ∈{ } made in a particular incorrect solution provides many bene- MLP-S with spectral clustering (SC) with K • ∈ fits. Forinstance,MLP-Bcaninforminstructorsofthemost 5,6,...,40 . { } commonlocationsoflearnererrorstohelpguidetheirinstruc- MLP-Swithaffinitypropagation(AP)clustering. Thisal- tion. Itcanalsoenableanautomatedtutoringsystemtogen- • gorithmdoesnotrequireK asaninput. eratefeedbacktoalearnerastheymakeanerrorintheearly steps of a solution, before it propagates to later steps. We MLP-B with hyperparameters set to the non-informative demonstrate the efficacy of MLP-B to automatically locate • valuesα = α = 1andrunningtheGibbssamplingal- α β learnererrorsusingreal-worldeducationaldataintheexper- gorithmfor10,000iterationswith2,000burn-initerations. imentssectionbelow. MLP-SwithAPandMLP-Bbothautomaticallyestimatethe numberofclustersK. Oncetheclustersareselected,weas- EXPERIMENTS sign one solution from each cluster to be graded by the in- Inthissection,wedemonstratehowMLP-SandMLP-Bcan structorusingthemethodsdescribedinearliersections. beusedtoaccuratelyestimatethegradesofroughly100open responsesolutionstomathematicalquestionsbyonlyasking Performancemetric thecourseinstructortogradeapproximately10solutions.We Weusemeanabsoluteerror(MAE),whichmeasuresthe“av- also demonstrate how MLP-B can be used to automatically erageabsoluteerrorperauto-gradedsolution” providefeedbacktolearnersonthelocationsoferrorsintheir solutions. (cid:80)N−K gˆ g MAE= j=1 | j − j|, N K Auto-GradingviaMLP-SandMLP-B − 3Otherbaselinemethods,suchasthelinearregression-basedmethod Datasets usedintheedXessaygradingsystem[33], arenotlisted, because Our dataset that consists of 116 learners solving 4 open re- theydidnotperformaswellasrandomsub-samplinginourexperi- sponse mathematical questions in an edX course. The set ments. as our performance metric. Here, N K equals the num- 0.7 RS 0.7 ber of solutions that are auto-graded,−and gˆ and g repre- 0.6 MLP−S−SC 0.6 sent the estimated grade (for MLP-B, the esjtimatedjgrades 0.5 MMLLPP−−SB−AP 0.5 are rounded to integers) and the actual instructor grades for E0.4 E0.4 A A M0.3 M0.3 theauto-gradedsolutions,respectively. 0.2 0.2 Resultsanddiscussion 0.1 0.1 In Figure 5, weplot the MAE versus the number ofclusters 0 10 20 30 40 0 10 20 30 40 No. of clusters K No. of clusters K K for Questions 1–4. MLP-S with SC consistently outper- formstherandomsamplingbaselinealgorithmforalmostall (a)Question1 (b)Question2 valuesofK. Thisperformancegainislikelyduetothefact 0.7 0.7 that the baseline method does not cluster the solutions and 0.6 0.6 thusdoesnotselectagoodsubsetofsolutionsfortheinstruc- 0.5 0.5 tortograde. MLP-BismoreaccuratethanMLP-Swithboth E0.4 E0.4 A A SCandAPandcanautomaticallyestimatethevalueofK,al- M0.3 M0.3 thoughatthepriceofsignificantlyhighercomputationalcom- 0.2 0.2 plexity (e.g., clustering and auto-grading one question takes 0.1 0.1 2minutesforMLP-Bcomparedtoonly5secondsforMLP-S 0 10 20 30 40 0 10 20 30 40 No. of clusters K No. of clusters K withAPonastandardlaptopcomputerwitha2.8GHzCPU (c)Question3 (d)Question4 and8GBmemory). BothMLP-SandMLP-Bgradethelearners’solutionsaccu- Figure 5: Mean absolute error (MAE) versus the number of rately(e.g.,anMAEof0.04outofthefullgrade3usingonly instructor graded solutions (clusters) K, for Questions 1– K = 13 instructor grades to auto-grade all N = 113 solu- 4, respectively. For example, on Question 1, MLP-S and tions to Question 2). Moreover, as we see in Figure 5, the MLP-Bestimatethetruegradeofeachsolutionwithanaver- MAE for MLP-S decreases as K increases, and eventually ageerrorofaround0.1outofafullcreditof3. “RS”repre- reaches0whenK islargeenoughthatonlysolutionsthatare sentstherandomsub-samplingbaseline. BothMLP-Smeth- exactlythesameaseachotherbelongtothesamecluster. In odsandMLP-Boutperformsthebaselinemethod. practice,onecantunethevalueofKtoachieveabalancebe- tween maximizing grading accuracy and minimizing human effort. Such a tuning process is not necessary for MLP-B, sinceitautomaticallyestimatesthevalueofK andachieves We have developed a framework for mathematical language suchabalance. processing(MLP)thatconsistsofthreemainsteps: (i)con- vertingeachsolutiontoanopenresponsemathematicalques- FeedbackGenerationviaMLP-B tionintoaseriesofnumericalfeatures;(ii)clusteringthefea- turesfromseveralsolutionstouncoverthestructuresofcor- Experimentalsetup rect, partiallycorrect, andincorrectsolutions; and(iii)auto- SinceQuestions3–4requiresomefamiliaritywithsignalpro- maticallygradingtheremaining(potentiallylargenumberof) cessing, we demonstrate the efficacy of MLP-B in provid- solutions based on their assigned cluster and one instructor- ing feedback on mathematical solutions on Questions 1–2. provided grade per cluster. As our experiments have indi- Among the solutions to each question, there are a few types cated,ourframeworkcansubstantiallyreducethehumanef- ofcommonerrorsthatmorethanonelearnermakes. Wetake fortrequiredforgradinginlarge-scalecourses. Asabonus, oneincorrectsolutionoutofeachtypeandrunMLP-Bonthe MLP-S enables instructors to visualize the clusters of solu- other solutions to estimate the parameter φˆ for each clus- k tionstohelpthemidentifycommonerrorsandthusgroupsof ter. Using this information and the instructor grades g , { k} learnershavingthesamemisconceptions.Asafurtherbonus, aftereachexpressionv inasolution,wecomputetheproba- MLP-Bcantracktheclusterassignmentofeachstepofamul- bilitythatitbelongstoaclusterp(yj(v)|φˆk)thatdoesnothave tistep solution and determine when it departs from a cluster full credit (gk < 3), together with the expected credit using ofcorrectsolutions,whichenablesustoindicatethelocations (6). Oncetheexpectedgradeiscalculatedtobelessthanfull oferrorstolearnersinrealtime.Improvedlearningoutcomes credit,weconsiderthatanerrorhasoccurred. shouldresultfromtheseinnovations. Resultsanddiscussion Thereareseveralavenuesforcontinuedresearch.Wearecur- Two sample feedback generation process are shown in Fig- rentlyplanningmoreextensiveexperimentsontheedXplat- ure6. InFigure6(a),wecanprovidefeedbacktothelearner form involving tens of thousands of learners. We are also ontheirerrorasearlyasLine2,beforeitcarriesovertolater planningtoextendthefeatureextractionsteptotakeintoac- lines. Thus, MLP-B can potentially become a powerful tool count both the ordering of expressions and ancillary text in to generate timely feedback to learners as they are solving asolution. Clusteringalgorithmsthatallowasolutiontobe- mathematicalquestions,byanalyzingthesolutionsitgathers longtomorethanoneclustercouldmakeMLPmorerobust fromotherlearners. tooutliersolutionsandfurtherreducethenumberofsolutions thattheinstructorsneedtograde. Finally, itwouldbeinter- CONCLUSIONS estingtoexplorehowthefeaturesofsolutionscouldbeused ((x3+sinx)/ex)(cid:48) REFERENCES 1. Attali,Y.Constructvalidityofe-raterinscoringTOEFL =(ex(x3+sinx)(cid:48) (x3+sinx)(ex)(cid:48))/e2x − essays.Tech.rep.,EducationalTestingService,May prob.incorrect=0.11, exp.grade=3 2000. =(ex(2x2+cosx) (x3+sinx)ex)/e2x − 2. Basu,S.,Jacobs,C.,andVanderwende,L. prob.incorrect=0.66, exp.grade=2 Powergrading: Aclusteringapproachtoamplifyhuman =(2x2+cosx x3 sinx)/ex effortforshortanswergrading.Trans.Associationfor − − prob.incorrect=0.93, exp.grade=2 ComputationalLinguistics1(Oct.2013),391–402. =(x2(2 x)+cosx sinx)/ex 3. Blei,D.,Griffiths,T.,andJordan,M.Thenestedchinese − − prob.incorrect=0.99, exp.grade=2 restaurantprocessandBayesiannonparametricinference oftopichierarchies.J.ACM57,2(Jan.2010),7:1–7:30. (a)Asamplefeedbackgenerationprocesswherethelearnermakesan errorintheexpressioninLine2whileattemptingtosolveQuestion1. 4. Blei,D.M.,Ng,A.Y.,andJordan,M.I.LatentDrichlet allocation.J.MachineLearningResearch3(Jan.2003), (x2+x+sin2x+cos2x)(2x 3) − 993–1022. =(x2+x+1)(2x 3) − 5. Brooks,M.,Basu,S.,Jacobs,C.,andVanderwende,L. prob.incorrect=0.09, exp.grade=3 Divideandcorrect: Usingclusterstogradeshort =4x3+2x2+2x 3x2 3x 3 answersatscale.InProc.1stACMConf.onLearningat − − − prob.incorrect=0.82, exp.grade=2 Scale(Mar.2014),89–98. =4x3 x2 x 3 6. Champaign,J.,Colvin,K.,Liu,A.,Fredericks,C., − − − prob.incorrect=0.99, exp.grade=2 Seaton,D.,andPritchard,D.Correlatingskilland improvementin2MOOCswithastudent’stimeon (b)Asamplefeedbackgenerationprocesswherethelearnermakesan tasks.InProc.1stACMConf.onLearningatScale errorintheexpressioninLine3whileattemptingtosolveQuestion2. (Mar.2014),11–20. Figure6: Demonstrationofreal-timefeedbackgenerationby 7. Coursera.https://www.coursera.org/,2014. MLP-B while learners enter their solutions. After each ex- pression, we compute both the probability that the learner’s 8. Cramer,M.,Fisseni,B.,Koepke,P.,Ku¨hlwein,D., solutionbelongstoaclusterthatdoesnothavefullcreditand Schro¨der,B.,andVeldman,J.TheNaprocheproject– thelearner’sexpectedgrade. Analertisgeneratedwhenthe Controllednaturallanguageproofcheckingof expectedcreditislessthanfullcredit. mathematicaltexts,June2010. 9. Dijksman,J.A.,andKhan,S.KhanAcademy: The tobuildpredictivemodels,asintheRaschmodel[27]oritem world’sfreevirtualschool.InAPSMeetingAbstracts responsetheory[18]. (Mar.2011). 10. edX.https://www.edx.org/,2014. APPENDIX:QUESTIONSTATEMENTS Question1: Multiply 11. Frey,B.J.,andDueck,D.Clusteringbypassing (x2+x+sin2x+cos2x)(2x 3) messagesbetweendatapoints.Science315,5814 − (2007),972–976. andsimplifyyouranswerasmuchaspossible. 12. Galenson,J.,Reames,P.,Bodik,R.,Hartmann,B.,and x3+sin(x) Sen,K.CodeHint: Dynamicandinteractivesynthesisof Question2: Findthederivativeof andsimplify ex codesnippets.InProc.36thIntl.Conf.onSoftware youranswerasmuchaspossible. Engineering(June2014),653–663. Question3: Adiscrete-timelineartime-invariantsystemhas 13. Gelman,A.,Carlin,J.,Stern,H.,Dunson,D.,Vehtari, theimpulseresponseshowninthefigure(omitted).Calculate A.,andRubin,D.BayesianDataAnalysis.CRCPress, H(ejω),thediscrete-timeFouriertransformofh[n].Simplify 2013. youranswerasmuchaspossibleuntilithasnosummations. 14. Griffiths,T.,andTenenbaum,J.Hierarchicaltopic Question4: Evaluatethefollowingsummation modelsandthenestedchineserestaurantprocess. AdvancesinNeuralInformationProcessingSystems16 ∞ (cid:88) δ[n k]x[k n]. (Dec.2004),17–24. − − k=−∞ 15. Gulwani,S.,Radicˇek,I.,andZuleger,F.Feedback Acknowledgments generationforperformanceproblemsinintroductory ThankstoHeatherSeebaforadministeringthedatacollection programmingassignments.InProc.22ndACM process and Christoph Studer for discussions and insights. SIGSOFTIntl.SymposiumontheFoundationsof Visitourwebsitewww.sparfa.com,whereyoucanlearnmore SoftwareEngineering(Nov.2014,toappear). about our project and purchase t-shirts and other merchan- dise. 16. Guo,P.,andReinecke,K.Demographicdifferencesin 31. SaplingLearning.http://www.saplinglearning.com/, howstudentsnavigatethroughMOOCs.InProc.1st 2014. ACMConf.onLearningatScale(Mar.2014),21–30. 32. Singh,R.,Gulwani,S.,andSolar-Lezama,A. 17. Kang,S.,McDermott,K.,andRoedigerIII,H.Test Automatedfeedbackgenerationforintroductory formatandcorrectivefeedbackmodifytheeffectof programmingassignments.InProc.34thACM testingonlong-termretention.EuropeanJ.Cognitive SIGPLANConf.onProgrammingLanguageDesignand Psychology19,4-5(July2007),528–558. Implementation,vol.48(June2013),15–26. 18. Lord,F.ApplicationsofItemResponseTheoryto 33. Southavilay,V.,Yacef,K.,Reimann,P.,andCalvo,R. PracticalTestingProblems.ErlbaumAssociates,1980. Analysisofcollaborativewritingprocessesusing 19. Megill,N.Metamath: Acomputerlanguageforpure revisionmapsandprobabilistictopicmodels.InProc. mathematics.Citeseer,1997. 3rdIntl.Conf.onLearningAnalyticsandKnowledge (Apr.2013),38–47. 20. Minka,T.EstimatingaDrichletdistribution.Tech.rep., MIT,Nov.2000. 34. Srikant,S.,andAggarwal,V.Asystemtograde computerprogrammingskillsusingmachinelearning.In 21. Naumowicz,A.,andKorniłowicz,A.Abriefoverview Proc.20thACMSIGKDDIntl.Conf.onKnowledge ofMIZAR.InTheoremProvinginHigherOrderLogics, DiscoveryandDataMining(Aug.2014),1887–1896. vol.5674ofLectureNotesinComputerScience.Aug. 2009,67–72. 35. Steyvers,M.,andGriffiths,T.Probabilistictopic models.HandbookofLatentSemanticAnalysis427,7 22. Ng,A.,Jordan,M.,andWeiss,Y.Onspectralclustering: (2007),424–440. Analysisandanalgorithm.AdvancesinNeural InformationProcessingSystems2(Dec.2002), 36. SymPyDevelopmentTeam.Sympy: Pythonlibraryfor 849–856. symbolicmathematics,2014.http://www.sympy.org. 23. Nguyen,A.,Piech,C.,Huang,J.,andGuibas,L. 37. Teh,Y.Drichletprocess.InEncyclopediaofMachine Codewebs: Scalablehomeworksearchformassiveopen Learning.Springer,2010,280–287. onlineprogrammingcourses.InProc.23rdIntl.World WideWebConference(Seoul,Korea,Apr.2014), 38. Vats,D.,Studer,C.,Lan,A.S.,Carin,L.,andBaraniuk, 491–502. R.G.Testsizereductionforconceptestimation.In Proc.6thIntl.Conf.onEducationalDataMining(July 24. OpenStaxTutor.https://openstaxtutor.org/,2013. 2013),292–295. 25. Piech,C.,Huang,J.,Chen,Z.,Do,C.,Ng,A.,and 39. Waters,A.,Fronczyk,K.,Guindani,M.,Baraniuk,R., Koller,D.TunedmodelsofpeerassessmentinMOOCs. andVannucci,M.ABayesiannonparametricapproach InProc.6thIntl.Conf.onEducationalDataMining fortheanalysisofmultiplecategoricalitemresponses.J. (July2013),153–160. StatisticalPlanningandInference(2014,Inpress). 26. Raman,K.,andJoachims,T.Methodsforordinalpeer 40. WebAssign.https://webassign.com/,2014. grading.InProc.20thACMSIGKDDIntl.Conf.on KnowledgeDiscoveryandDataMining(Aug.2014), 41. West,M.HyperparameterestimationinDrichletprocess 1037–1046. mixturemodels.Tech.rep.,DukeUniversity,1992. 27. Rasch,G.ProbabilisticModelsforSomeIntelligence 42. Wilkowski,J.,Deutsch,A.,andRussell,D.Studentskill andAttainmentTests.MESAPress,1993. andgoalachievementinthemappingwithGoogle 28. Rivers,K.,andKoedinger,K.Acanonicalizingmodel MOOC.InProc.1stACMConf.onLearningatScale forbuildingprogrammingtutors.InProc.11thIntl. (Mar.2014),3–10. Conf.onIntelligentTutoringSystems(June2012), 43. Woolf,B.P.BuildingIntelligentInteractiveTutors: 591–593. Student-centeredStrategiesforRevolutionizing 29. Rivers,K.,andKoedinger,K.Automatinghint E-learning.MorganKaufmanPublishers,2008. generationwithsolutionspacepathconstruction.In 44. Yin,J.,andWang,J.ADrichletmultinomialmixture Proc.12thIntl.Conf.onIntelligentTutoringSystems model-basedapproachforshorttextclustering.InProc. (June2014),329–339. 20thACMSIGKDDIntl.Conf.onKnowledgeDiscovery 30. Sadler,P.,andGood,E.Theimpactofself-and andDataMining(Aug.2014),233–242. peer-gradingonstudentlearning.Educational Assessment11,1(June2006),1–31.