
Advancement of Deep Learning and its Applications in Object Detection and Recognition PDF

319 Pages·2023·26.051 MB·English


RIVER PUBLISHERS SERIES IN COMPUTING AND INFORMATION SCIENCE AND TECHNOLOGY

Series Editors:
K. C. CHEN, National Taiwan University, Taipei, Taiwan and University of South Florida, USA
SANDEEP SHUKLA, Virginia Tech, USA and Indian Institute of Technology Kanpur, India

The “River Publishers Series in Computing and Information Science and Technology” covers research which ushers the 21st Century into an Internet and multimedia era. Networking suggests transportation of such multimedia contents among nodes in communication and/or computer networks, to facilitate the ultimate Internet. Theory, technologies, protocols and standards, applications/services, practice and implementation of wired/wireless networking are all within the scope of this series. Based on network and communication science, we further extend the scope for 21st Century life through the knowledge in machine learning, embedded systems, cognitive science, pattern recognition, quantum/biological/molecular computation and information processing, user behaviors and interface, and applications across healthcare and society.

Books published in the series include research monographs, edited volumes, handbooks and textbooks. The books provide professionals, researchers, educators, and advanced students in the field with an invaluable insight into the latest research and developments.

Topics included in the series are as follows:
• Artificial intelligence
• Cognitive Science and Brain Science
• Communication/Computer Networking Technologies and Applications
• Computation and Information Processing
• Computer Architectures
• Computer networks
• Computer Science
• Embedded Systems
• Evolutionary computation
• Information Modelling
• Information Theory
• Machine Intelligence
• Neural computing and machine learning
• Parallel and Distributed Systems
• Programming Languages
• Reconfigurable Computing
• Research Informatics
• Soft computing techniques
• Software Development
• Software Engineering
• Software Maintenance

For a list of other books in this series, visit www.riverpublishers.com

Advancement of Deep Learning and its Applications in Object Detection and Recognition

Editors
Roohie Naaz Mir, National Institute of Technology, India
Vipul Kumar Sharma, Jaypee University of Information Technology, India
Ranjeet Kumar Rout, National Institute of Technology, India
Saiyed Umer, Aliah University, India

Published 2023 by River Publishers
Alsbjergvej 10, 9260 Gistrup, Denmark
www.riverpublishers.com

Distributed exclusively by Routledge
605 Third Avenue, New York, NY 10017, USA
4 Park Square, Milton Park, Abingdon, Oxon OX14 4RN

Advancement of Deep Learning and its Applications in Object Detection and Recognition / by Roohie Naaz Mir, Vipul Kumar Sharma, Ranjeet Kumar Rout, Saiyed Umer.

© 2023 River Publishers. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, mechanical, photocopying, recording or otherwise, without prior written permission of the publishers.
Routledge is an imprint of the Taylor & Francis Group, an informa business

ISBN 978-87-7022-702-5 (print)
ISBN 978-10-0088-041-0 (online)
ISBN 978-1-003-39365-8 (ebook master)

While every effort is made to provide dependable information, the publisher, authors, and editors cannot be held responsible for any errors or omissions.

Contents

Preface
List of Figures
List of Tables
List of Contributors
List of Abbreviations

1 Recent Advances in Video Captioning with Object Detection
Nasib Ullah and Partha Pratim Mohanta
  1.1 Introduction
  1.2 Post-deep Learning Object Detection
    1.2.1 Region proposal-based methods (Two stages)
      1.2.1.1 RCNN
      1.2.1.2 FastRCNN
      1.2.1.3 FasterRCNN
      1.2.1.4 MaskRCNN
    1.2.2 Single-stage regression-based methods
      1.2.2.1 You only look once (YOLO)
      1.2.2.2 Single-shot detection (SSD)
    1.2.3 Comparison of different object detectors
  1.3 Video Captioning with Object Detection
    1.3.1 Encoder–decoder-based framework
    1.3.2 Methods with object detection
      1.3.2.1 Object-aware aggregation with bidirectional temporal graph for video captioning (OA-BTG)
      1.3.2.2 Object relational graph with teacher recommended learning for video captioning (ORG-TRL)
      1.3.2.3 Spatio-temporal graph for video captioning with knowledge distillation (STG-KD)
    1.3.3 Comparison of results
  1.4 Conclusion
  References

2 A Deep Learning-based Framework for COVID-19 Identification using Chest X-Ray Images
Asifuzzaman Lasker, Mridul Ghosh, Sk Md Obaidullah, Chandan Chakraborty, and Kaushik Roy
  2.1 Introduction
  2.2 Related Works
  2.3 Proposed Method
  2.4 Experiment
    2.4.1 Experimental setup
    2.4.2 Ablation study
    2.4.3 Dataset
    2.4.4 Evaluation protocol
    2.4.5 Result and analysis
  2.5 Comparison
  2.6 Conclusion
  References

3 Faster Region-based Convolutional Neural Networks for the Detection of Surface Defects in Aluminium Tubes
Vipul Sharma, Roohie Naaz Mir, and Mohammad Nayeem Teli
  3.1 Introduction
  3.2 Detection of Surface Flaws in Aluminum Tubes
  3.3 RCNN-based Defect Recognition Method
  3.4 Faster RCNN-based Defect Recognition Method
    3.4.1 Improvements
    3.4.2 Network training
  3.5 Faster RCNN and Image Enhancement-based Defect Detection Method
  3.6 Results
  3.7 Conclusion
  References

4 Real Time Face Detection-based Automobile Safety System using Computer Vision and Supervised Machine Learning
Navpreet Singh Kapoor, Mansimar Anand, Priyanshu, Shailendra Tiwari, Shivendra Shivani, and Raman Singh
  4.1 Introduction
  4.2 Literature Review
    4.2.1 Motivation
    4.2.2 Contribution
    4.2.3 Organization
  4.3 Proposed Methodology
    4.3.1 Landmark detection
    4.3.2 Contrast stretching
    4.3.3 Data storage
    4.3.4 Distraction detection
    4.3.5 Yawning detection
    4.3.6 Dozing-off detection
  4.4 Experimental Results
    4.4.1 Confusion matrix
    4.4.2 Accuracy
    4.4.3 Precision
    4.4.4 Recall
    4.4.5 Specificity
    4.4.6 F1 Score
    4.4.7 Comparison with some state-of-the-art methods
  4.5 Conclusion
  References

5 Texture Feature Descriptors for Analyzing Facial Patterns in Facial Expression Recognition System
Sanoar Hossain and Vijayan Asari
  5.1 Introduction
  5.2 Proposal Methods
    5.2.1 Preprocessing
    5.2.2 Feature extraction
  5.3 Classification
  5.4 Experiment and Results
    5.4.1 Database used
    5.4.2 Result and discussion
  5.5 Conclusion
  References

6 A Texture Features-based Method to Detect the Face Spoofing
Somenath Dhibar and Bibhas Chandra Dhara
  6.1 Introduction
  6.2 Literature Review
  6.3 Proposed Methodology
    6.3.1 Preprocessing
    6.3.2 Feature extraction
    6.3.3 Classification
  6.4 Experimental Results
    6.4.1 Results and discussion
  6.5 Conclusion
  References

7 Enhanced Tal Hassner and Gil Levi Approach for Prediction of Age and Gender with Mask and Mask less
Srikanth Busa, Jayaprada Somala, T. Santhi Sri, and Padmaja Grandhe
  7.1 Introduction
  7.2 Literature Review
  7.3 Proposed Methodology
    7.3.1 Caffe model
  7.4 Experimental Results
    7.4.1 Materials used
    7.4.2 Results and discussion
  7.5 Conclusion
  References

8 A Brief Overview of Recent Techniques in Crowd Counting and Density Estimation
Chiara Pero
  8.1 Introduction
  8.2 Traditional Crowd Counting Approaches
    8.2.1 Detection-based approaches
    8.2.2 Regression-based approaches
    8.2.3 Density estimation approaches
    8.2.4 Clustering approaches
  8.3 CNN-based Approaches
    8.3.1 Basic CNN
    8.3.2 Multiple-column CNN
    8.3.3 Single-column CNN
    8.3.4 Patch-based and image-based inference
  8.4 Benchmark Datasets and Evaluation Metrics
    8.4.1 Databases
    8.4.2 Metrics
  8.5 Open Challenges and Future Directions
  8.6 Conclusion
  References

9 Recent Trends in 2D Object Detection and Applications in Video Event Recognition
Prithwish Jana and Partha Pratim Mohanta
  9.1 Introduction
  9.2 Related Work on 2D Object Detection
    9.2.1 Early object detection: Geometrical and shape-based approaches
    9.2.2 Modern object detection techniques: Use of deep learning
      9.2.2.1 Monolithic architectures for 2D object detection
      9.2.2.2 Region proposal-based two-stage architectures for 2D object detection
  9.3 Video Activity Recognition Assisted by Object Detection
  9.4 Recent Datasets on 2D Object Detection
  9.5 Performance of Object Detection Techniques
  9.6 Conclusion and Way Forward
  References

10 Survey on Vehicle Detection, Identification and Count using CNN-based YOLO Architecture and Related Applications
Gaurish Garg, Shailendra Tiwari, Shivendra Shivani, and Avleen Malhi
  10.1 Introduction
  10.2 Literature Review
  10.3 Methodology
  10.4 Experimental Results and Applications
  10.5 Conclusion
  References
