Optimized Database Management System for Agro Informatics
Submitted in partial fulfillment of the requirements
for the award of the degree of
DOCTOR OF PHILOSOPHY
IN
Computer Science
BY
Narendra Kumar Gupta
(ID No. : 06PHCOMP001)
FACULTY OF ENGINEERING AND TECHNOLOGY
SAM HIGGINBOTTOM UNIVERSITY OF AGRICULTURE,
TECHNOLOGY AND SCIENCES
(FORMERLY ALLAHABAD AGRICULTURAL INSTITUTE)
NAINI ALLAHABAD-211007
2016
DECLARATION
I, Narendra Kumar Gupta, declare that the work presented in this thesis/dissertation
entitled 'Optimized Database Management System for Agro Informatics' submitted to the
Department of Computer Science and Information Technology in Shepherd Institute of
Engineering and Technology, Sam Higginbottom University of Agriculture, Technology
and Sciences, Naini, Allahabad, for the award of the Doctor of Philosophy in Computer
Science, is an original work. I have neither plagiarized nor submitted the same work for the
award of any other degree. In case this undertaking is found incorrect, my degree may be
withdrawn unconditionally by the University.
Place: Allahabad
Date:
Narendra Kumar Gupta
(ID. No. 06PHCOMP001)
Prof. (Dr.) R. K. Isaac, Ph.D., M.Tech (IT), MS, Engg (Ag)
Professor, Faculty of Engineering & Technology
CERTIFICATE OF ORIGINAL WORK
This is to certify that Mr. Narendra Kumar Gupta (I.D. 06PHCOMP001) has
conducted the studies reported in the present thesis during 2006-2016 under my
guidance and supervision. The results reported by him are genuine, and the candidate
himself has written the script of this thesis. The thesis entitled “Optimized Database
Management System for Agro Informatics” is therefore being forwarded in fulfillment of
the requirements for the award of the degree of Doctor of Philosophy in Computer
Science under the Faculty of Engineering and Technology, Sam Higginbottom
University of Agriculture, Technology & Sciences (Formerly Allahabad Agricultural
Institute), Allahabad.
Prof. (Dr.) R. K. Isaac
Advisor
ABSTRACT
The agriculture domain requires a huge amount of information to be stored and manipulated
according to varying requirements, so a data model well suited to this scenario is needed.
Storing and retrieving data is time-consuming when the volume of data is large, and the task
becomes more challenging when information is shared across multiple platforms and domains.
Many databases are available for storing information, yet the information must also be
accessible across platforms and in an optimized way. This study found that a native XML
database (NXD) is better suited to this problem. XML is a widely accepted W3C standard and
is platform independent, so it can be used in any technical domain for information storage,
retrieval and traversal. Different XML storage approaches have been studied, and other
databases were also used for a comparative study. Several technologies are used for
information storage and retrieval through XML, such as DOM (Document Object Model),
XQuery, XPath and LINQ (Language Integrated Query, used here to query data stored in a
native XML database on the Microsoft .NET platform). Using these approaches and
technologies, an agro information system (AIS) was developed for optimized information
storage and retrieval using an NXD. The gathered information is further optimized using an
Artificial Neural Network tool to obtain the best result.
Keywords: XML, Optimization, Agro informatics, AIS
ACKNOWLEDGEMENT
I would like to express my deep and sincere gratitude to my supervisor, Prof. (Dr.) R. K.
Isaac, Faculty of Engineering & Technology, Sam Higginbottom University of
Agriculture, Technology & Sciences, Allahabad, India. His wide knowledge and logical
way of thinking have been of great value to me. His knowledge, encouragement and
personal guidance have provided a good basis for me to continue my research. I would also
like to thank my former SAC members: Late Prof. (Dr.) Wilson Kispotta, Dept. of
Agricultural Economics & ABM, for his valuable guidance and motivation; Prof. (Dr.)
Ajit Paul, Dept. of Mathematics & Statistics, for his continuous support and motivation;
and Dr. Hari Mohan Singh, Dept. of C. S. & I. T., for his technical help and support.
I wish to express my warm and sincere thanks to Prof. (Dr.) R. B. Lal, Hon'ble Vice
Chancellor, Sam Higginbottom University of Agriculture, Technology & Sciences,
Allahabad, for providing the opportunity and environment to accomplish the research. I
also wish to express my warm and sincere thanks to Prof. (Dr.) A. K. A. Lawrence,
Faculty Dean, and Er. Deepak Lal, Dean, Faculty of Engineering & Technology, Sam
Higginbottom University of Agriculture, Technology & Sciences, Allahabad, for their kind
support and guidance.
I am deeply grateful to Prof. (Dr.) W. Jeberson, Professor, Dept. of Computer Science
& IT, Sam Higginbottom University of Agriculture, Technology & Sciences, Allahabad,
for his spiritual, moral and technical support throughout my work.
I am deeply grateful to Er. R. Dileep Kumar, Assistant Professor, Dept. of Computer
Science & IT, Sam Higginbottom University of Agriculture, Technology & Sciences,
Allahabad, for his sincere support throughout this work.
This thesis would not have been possible without the support of all the teaching,
non-teaching and administrative staff of the Dept. of Computer Science & IT, SHUATS. I
wish to extend my warmest thanks to all those who supported me directly or
indirectly in accomplishing this task. It is an honor for me to express my sincere thanks to
the Director, Indian Institute of Sugarcane Research, Lucknow, who permitted me to use
important data for this research. I would also like to thank Dr. Syed Sarfaraj Hassan,
Principal Scientist, ISCRI, for his support and guidance, which have been of great value to my
work.
Last but not least, I owe my loving thanks to my parents, Sri Dharmendra Nath Gupta
and Smt. Shanti Gupta; my brothers, Late Sri Kaushalendra Nath Gupta and Sri Rajiv
Kumar Gupta; my sister, Smt. Sapna Gupta; my wife, Smt. Ruby Gupta; and my loving children,
Shivani Gupta and Shivam Gupta, for their continuous support and encouragement throughout
this work.
Narendra Kumar Gupta
TABLE OF CONTENTS
TITLE PAGE
UNDERTAKING
CERTIFICATE OF ORIGINAL WORK
ABSTRACT
ACKNOWLEDGEMENT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
CHAPTER I INTRODUCTION
1.1 Overview
1.1.1 Structured, Semi structured and Unstructured Data
1.2 Motivation
1.3 Existing System
1.4 Issues and Challenges
1.5 Scope of Work and contribution
1.6 Objectives
CHAPTER II REVIEW OF LITERATURE
2.1 Analysis of Agricultural data
2.2 Management of unstructured, structured and semi structured data
2.3 Semi Structured Data And Its Management
2.3.1 Nature of XML Data
2.3.2 XML Data Storage
2.3.3 Database management and data retrieval schemes
2.4 Native XML Databases
2.4.1 Data versus Documents
2.5 Database and Query optimization
2.6 A Query Processing Approach for XML Database Systems
2.7 Optimization by using Artificial Neural Network (ANN)
CHAPTER III MATERIALS AND METHODS
3.1 Work Contribution
3.2 Technology used
3.2.1 Hardware Platform used
3.2.2 Software Platform used
3.2.3 Software Tools used
3.3 Methodology
3.3.1 Data Acquisition
3.3.2 Data Analysis
3.3.3 Designing Attribute Hierarchy Schemes
3.3.4 Data Framework in RDBMS
3.3.5 Data Framework in XML
3.3.6 Query Complexity parameters
3.3.7 Method for optimization of agro data repository
3.3.7.1 Module wise structure representation developed
3.3.7.2 Representation of Aggregated Module structure in XML
3.3.7.3 Tree structure of agro data repository
3.3.7.4 Instance of aggregate module
3.3.8 Interface Design
3.3.9 Refinement of Query processing
3.3.10 Artificial Neural Network Optimization by Levenberg-Marquardt Method
CHAPTER IV RESULT AND DISCUSSIONS
4.1 Approaches for maintaining Agro Data
4.2 Developing optimized Database Management System
4.2.1 Response time of all types of queries obtained by different approaches (with dataset size 780 records)
4.2.2 Response time of all types of queries obtained by different approaches excluding SQL (with dataset size 780 records)
4.2.3 Response time of all types of queries obtained by different approaches (with dataset size 1560 records)
4.2.4 Response time of all types of queries obtained by different approaches (with dataset size 3120 records)
4.2.5 Response time of all types of queries obtained by different approaches (with dataset size 6240 records)
4.2.6 Consolidated Response Time Of Varying Data Set Size With All The Approaches
4.3 Artificial Neural Network Modeling and performance optimization
4.3.1 Simple Query time optimization and validation
4.3.2 Moderate Query time optimization and validation
4.3.3 Complex Query time optimization and validation
CHAPTER V SUMMARY AND CONCLUSIONS
5.1 Summary
5.2 Conclusion
REFERENCES
APPENDIX-A SCREEN SHOTS OF AIS & TESTING REPORT
APPENDIX-B RDBMS TABLE SCHEMAS
PAPERS PUBLISHED
LIST OF TABLES
Table 2.1. Summarizing the comparisons LOB, OR, and the native XML storage approaches
Table 2.2. Query processing abstraction levels
Table 2.3. XSL Important characteristics of various XML query languages
Table 3.1. Hardware technology used
Table 3.2. Software Technology used
Table 3.3. Advantages of XML
Table 3.4. Data Acquisition Sources
Table 3.5. Attribute arrangement scheme 1
Table 3.6. Attribute arrangement scheme 2
Table 3.7. SQL Simple Queries (Query performed on single table)
Table 3.8. SQL Moderate Queries (Query performed on two tables)
Table 3.9. SQL Complex Queries (Query performed on more than two tables)
Table 3.10. Simple Queries executed using XPath