Neural Sequence Modelling for Automated Essay Scoring

Ahmed Hasan Zaidi
St. Edmund's College

A dissertation submitted to the University of Cambridge in partial fulfilment of the requirements for the degree of Master of Philosophy in Advanced Computer Science.

University of Cambridge
Computer Laboratory
William Gates Building
15 JJ Thomson Avenue
Cambridge CB3 0FD
United Kingdom
Email: [email protected]

June 10, 2016

Declaration

I, Ahmed Hasan Zaidi of St. Edmund's College, being a candidate for the M.Phil in Advanced Computer Science, hereby declare that this report and the work described in it are my own work, unaided except as may be specified below, and that the report does not contain material that has already been used to any substantial extent for a comparable purpose.

Total word count: 11,903

Signed:
Date:

This dissertation is copyright © 2016 Ahmed Hasan Zaidi. All trademarks used in this dissertation are hereby acknowledged.

Abstract

Automated Essay Scoring (AES) is the use of specialised computer software to assign scores to essays written in an academic environment. Interest in AES has grown, motivated by several factors including the rising cost of education, the need for grading standards, and major technological breakthroughs. Despite positive results in the literature, many critical challenges remain that must be addressed to ensure the widespread adoption of AES systems. These challenges can be divided into three main categories: meaningfulness, transparency, and robustness.

This investigation aims to address these challenges while also attempting to improve human-machine inter-rater agreement. Motivated by the recent success of neural networks, we conduct a systematic investigation of deep representation learning, initially using a basic recurrent neural network (RNN) and extending to long-short term memory (LSTM) cells and deep bi-directional architectures. In order to evaluate the AES system, an adapted visualisation technique was implemented that identifies portions of the text that are discriminative of writing quality.

Overall, it was found that deep bi-directional models (DBLSTMs) are more effective at capturing features discriminative of writing quality than shallower uni-directional models. Although the results did not surpass the existing state of the art, our methodology lays the foundations for a potentially rewarding avenue for future AES systems.

Contents

1 Introduction
  1.1 Motivation
  1.2 Purpose of Investigation
  1.3 Contributions
  1.4 Report Structure
2 Literature Review
  2.1 Automated Essay Scoring
  2.2 Discussion
3 Background
  3.1 Neural Word Embeddings
  3.2 Recurrent Neural Network (RNN)
  3.3 Long-short term memory (LSTM)
  3.4 Optimising the Model
    3.4.1 RMSProp
4 Approach
  4.1 Data Processing
    4.1.1 FCE Dataset
    4.1.2 Data Extraction
    4.1.3 Tokenising
    4.1.4 Neural Word Embeddings
  4.2 Model 1: Preliminary RNN
  4.3 Model 2: RNN-MLP
  4.4 Model 3: LSTM
  4.5 Model 4: Bi-directional LSTM
  4.6 Model 5: Deep Bi-Directional LSTM
5 Evaluation
  5.1 Experimental Setup
  5.2 Results
    5.2.1 Hidden Layer Analysis
    5.2.2 Discussion: Hidden Layer Analysis
    5.2.3 Answer Level versus Script Level
    5.2.4 Discussion: Answer Level versus Script Level
    5.2.5 Learning Rate Selection
    5.2.6 Learning Rate Analysis
    5.2.7 Model Evaluation
6 Visual Evaluation and Error Analysis
  6.1 Visualisation Approach
  6.2 Visualisation Examples
7 Summary and Conclusions

List of Figures

3.1 Figure showing Word2Vec architecture with both CBOW and Skip-gram methods of training
3.2 Figure showing a traditional RNN with one hidden layer (Source: Nature)
3.3 Figure showing long-short term memory cell with forget gate (Source: deeplearning.net)
4.1 Figure showing the RNN with many-to-one output method
4.2 Figure showing an LSTM with "Peephole Connections" (Source: Herta (2015))
4.3 Figure showing Bi-directional RNN
4.4 Figure showing deep bi-directional LSTM (Source: Graves & Schmidhuber (2005))
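
The model progression described in the abstract, from a plain RNN up to a deep bi-directional LSTM regressor trained with RMSProp (Section 3.4.1), can be pictured with a minimal sketch. The following is an illustration only, written against the modern Keras API rather than whatever toolkit the dissertation actually used; the vocabulary size, embedding width, layer widths, sequence length, and mean-squared-error objective are all assumptions, not the reported configuration.

    # Illustrative sketch only -- not the author's implementation. Assumes the
    # Keras API; vocabulary size, embedding width, layer sizes, sequence length
    # and the MSE regression head are placeholder choices.
    import numpy as np
    import tensorflow as tf
    from tensorflow.keras import layers, models

    VOCAB_SIZE = 10000   # assumed vocabulary size
    EMBED_DIM = 100      # assumed word-embedding dimensionality
    MAX_LEN = 400        # assumed maximum essay length in tokens

    model = models.Sequential([
        layers.Input(shape=(MAX_LEN,)),            # one token-id sequence per essay
        layers.Embedding(VOCAB_SIZE, EMBED_DIM),   # neural word embeddings (Section 3.1)
        # Two stacked bi-directional LSTM layers make the model "deep".
        layers.Bidirectional(layers.LSTM(64, return_sequences=True)),
        layers.Bidirectional(layers.LSTM(64)),     # many-to-one: keep final states only
        layers.Dense(1),                           # scalar essay score (regression)
    ])

    # RMSProp is the optimiser covered in the background chapter (Section 3.4.1).
    model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-3),
                  loss="mse")

    # Shape check on random token ids: two essays in, two scalar scores out.
    dummy = np.random.randint(0, VOCAB_SIZE, size=(2, MAX_LEN))
    print(model.predict(dummy).shape)              # (2, 1)

Reading each sequence in both directions lets the final representation condition on the whole essay, which is the property the abstract credits for the DBLSTM's advantage over shallower uni-directional models.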
