Applications of Machine Learning and Data Analytics Models in Maritime Transportation

Ran Yan and Shuaian Wang

Machine learning and data analytics can be used to inform technical, commercial and financial decisions in the maritime industry. Applications of Machine Learning and Data Analytics Models in Maritime Transportation explores the fundamental principles of analysing maritime transportation related practical problems using data-driven models, with a particular focus on machine learning and operations research models. Data-enabled methodologies, technologies, and applications in maritime transportation are clearly and concisely explained, and case studies of typical maritime challenges and solutions are also included.

The authors begin with an introduction to maritime transportation, followed by chapters providing an overview of ship inspection by port state control, and the principles of data-driven models. Further chapters cover linear regression models, Bayesian networks, support vector machines, artificial neural networks, tree-based models, association rule learning, cluster analysis, classic and emerging approaches to solving practical problems in maritime transport, incorporating shipping domain knowledge into data-driven models, explanation of black-box machine learning models in maritime transport, linear optimization, advanced linear optimization, and integer optimization. A concluding chapter provides an overview of coverage and explores future possibilities in the field.

The book will be especially useful to researchers and professionals with expertise in maritime research who wish to learn how to apply data analytics and machine learning to their fields.

About the Authors

Ran Yan is a research assistant professor in the Department of Logistics and Maritime Studies at The Hong Kong Polytechnic University, China. Shuaian Wang is a professor in the Department of Logistics and Maritime Studies at The Hong Kong Polytechnic University, China.

The Institution of Engineering and Technology
theiet.org
978-1-83953-559-8

IET TRANSPORTATION SERIES 38

Applications of Machine Learning and Data Analytics Models in Maritime Transportation

Other related titles:

Volume 1 Clean Mobility and Intelligent Transport Systems M. Fiorini and J-C. Lin (Editors)
Volume 2 Energy Systems for Electric and Hybrid Vehicles K.T. Chau (Editor)
Volume 5 Sliding Mode Control of Vehicle Dynamics A. Ferrara (Editor)
Volume 6 Low Carbon Mobility for Future Cities: Principles and Applications H. Dia (Editor)
Volume 7 Evaluation of Intelligent Road Transportation Systems: Methods and Results M. Lu (Editor)
Volume 8 Road Pricing: Technologies, economics and acceptability J. Walker (Editor)
Volume 9 Autonomous Decentralized Systems and their Applications in Transport and Infrastructure K. Mori (Editor)
Volume 11 Navigation and Control of Autonomous Marine Vehicles S. Sharma and B. Subudhi (Editors)
Volume 12 EMC and Functional Safety of Automotive Electronics K. Borgeest
Volume 15 Cybersecurity in Transport Systems M. Hawley
Volume 16 ICT for Electric Vehicle Integration with the Smart Grid N. Kishor and J. Fraile-Ardanuy (Editors)
Volume 17 Smart Sensing for Traffic Monitoring Nobuyuki Ozaki (Editor)
Volume 18 Collection and Delivery of Traffic and Travel Information P. Burton and A. Stevens (Editors)
Volume 20 Shared Mobility and Automated Vehicles: Responding to socio-technical changes and pandemics Ata Khan and Susan Shaheen
Volume 23 Behavioural Modelling and Simulation of Bicycle Traffic L. Huang
Volume 24 Driving Simulators for the Evaluation of Human-Machine Interfaces in Assisted and Automated Vehicles T. Ito and T. Hirose (Editors)
Volume 25 Cooperative Intelligent Transport Systems: Towards high-level automated driving M. Lu (Editor)
Volume 26 Traffic Information and Control Ruimin Li and Zhengbing He (Editors)
Volume 30 ICT Solutions and Digitalisation in Ports and Shipping M. Fiorini and N. Gupta
Volume 32 Cable Based and Wireless Charging Systems for Electric Vehicles: Technology and control, management and grid integration R. Singh, S. Padmanaban, S. Dwivedi, M. Molinas and F. Blaabjerg (Editors)
Volume 34 ITS for Freight Logistics H. Kawashima (Editor)
Volume 36 Vehicular ad hoc Networks and Emerging Technologies for Road Vehicle Automation A. K. Tyagi and S. Malik
Volume 38 The Electric Car M.H. Westbrook
Volume 45 Propulsion Systems for Hybrid Vehicles J. Miller
Volume 79 Vehicle-to-Grid: Linking Electric Vehicles to the Smart Grid J. Lu and J. Hossain (Editors)

Applications of Machine Learning and Data Analytics Models in Maritime Transportation

Ran Yan and Shuaian Wang

The Institution of Engineering and Technology

Published by The Institution of Engineering and Technology, London, United Kingdom

The Institution of Engineering and Technology is registered as a Charity in England & Wales (no. 211014) and Scotland (no. SC038698).

© The Institution of Engineering and Technology 2022

First published 2022

This publication is copyright under the Berne Convention and the Universal Copyright Convention. All rights reserved. Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may be reproduced, stored or transmitted, in any form or by any means, only with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency.
Enquiries concerning reproduction outside those terms should be sent to the publisher at the undermentioned address:

The Institution of Engineering and Technology
Futures Place
Kings Way, Stevenage
Herts, SG1 2UA, United Kingdom
www.theiet.org

While the authors and publisher believe that the information and guidance given in this work are correct, all parties must rely upon their own skill and judgement when making use of them. Neither the author nor publisher assumes any liability to anyone for any loss or damage caused by any error or omission in the work, whether such an error or omission is the result of negligence or any other cause. Any and all such liability is disclaimed.

The moral rights of the author to be identified as author of this work have been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.

British Library Cataloguing in Publication Data
A catalogue record for this product is available from the British Library

ISBN 978-1-83953-559-8 (hardback)
ISBN 978-1-83953-560-4 (PDF)

Typeset in India by Exeter Premedia Services Private Limited
Printed in the UK by CPI Group (UK) Ltd, Croydon
Cover Image: Longhua Liao/ Moment via Getty Images

Contents

About the Authors

1 Introduction of maritime transportation
  1.1 Overview of maritime transport
  1.2 World fleet structure
    1.2.1 Bulk carrier
    1.2.2 Oil tanker
    1.2.3 Container ship
  1.3 Key roles in the shipping industry
    1.3.1 Ship owner
    1.3.2 Ship operator
    1.3.3 Ship management company
    1.3.4 Flag state
    1.3.5 Classification society
    1.3.6 Charterer
    1.3.7 Freight forwarder
    1.3.8 Ship broker
  1.4 Container liner shipping

2 Ship inspection by port state control
  2.1 Key issues in maritime transport
    2.1.1 Maritime safety management
    2.1.2 Marine pollution control
    2.1.3 Seafarers’ management
  2.2 Port state control
    2.2.1 The background and development of PSC
    2.2.2 Ship selection in PSC
    2.2.3 Onboard inspection procedure
    2.2.4 Inspection results
  2.3 Data set used in this book

3 Introduction to data-driven models
  3.1 Predictive problem and its application in maritime transport
    3.1.1 Introduction of predictive problem
    3.1.2 Examples of predictive problem in maritime transport
    3.1.3 Comparison of theory-based modeling and data-driven modeling
    3.1.4 Popular data-driven models

4 Key elements of data-driven models
  4.1 Comparison of three popular data-driven models
  4.2 Procedure of developing ML models to address maritime transport problems
    4.2.1 Problem specification
    4.2.2 Feasibility assessment
    4.2.3 Data collection
    4.2.4 Feature engineering
    4.2.5 Model construction
    4.2.6 Model refinement
    4.2.7 Model assessment, interpretation/explanation, and conclusion

5 Linear regression models
  5.1 Simple linear regression and the least squares
  5.2 Multiple linear regression
  5.3 Extensions of multiple linear regression
    5.3.1 Polynomial regression
    5.3.2 Logistic regression
  5.4 Shrinkage linear regression models
    5.4.1 Ridge regression
    5.4.2 LASSO regression

6 Bayesian networks
  6.1 Naive Bayes classifier
  6.2 Semi-naive Bayes classifiers
  6.3 BN classifiers

7 Support vector machine
  7.1 Hard margin SVM
  7.2 Soft margin SVM
  7.3 Kernel trick
  7.4 Support vector regression

8 Artificial neural network
  8.1 The structure and basic concepts of an ANN
    8.1.1 Training of an ANN model
    8.1.2 Hyperparameters in an ANN model
  8.2 Brief introduction of deep learning models

9 Tree-based models
  9.1 Basic concepts of a decision tree
  9.2 Node splitting in classification trees
    9.2.1 Iterative dichotomizer 3 (ID3)
    9.2.2 C4.5
    9.2.3 Classification and regression tree (CART)
    9.2.4 Node splitting in regression trees
  9.3 Ensemble learning on tree-based models
    9.3.1 Bagging
    9.3.2 Boosting

10 Association rule learning
  10.1 Large item sets
  10.2 Apriori algorithm
  10.3 FP-growth algorithm

11 Cluster analysis
  11.1 Distance measure in clustering
    11.1.1 Distance measure of examples
    11.1.2 Distance measure of clusters
  11.2 Metrics for clustering algorithm performance evaluation
    11.2.1 Clustering algorithms
    11.2.2 K-means (partition-based method)
    11.2.3 DBSCAN (density-based method)
    11.2.4 Agglomerative algorithm (hierarchy-based methods)

12 Classic and emerging approaches to solving practical problems in maritime transport
  12.1 Topics in maritime transport research
  12.2 Research methods and their specific applications to maritime transport research
  12.3 Issues of adopting data-driven models to address problems in maritime transportation
    12.3.1 Data
    12.3.2 Model
    12.3.3 User
    12.3.4 Target

13 Incorporating shipping domain knowledge into data-driven models
  13.1 Considering feature monotonicity in ship risk prediction
    13.1.1 Introduction of monotonicity in the ship risk prediction problem
    13.1.2 Integration of monotonic constraint into XGBoost
  13.2 Integration of convex and monotonic constraints into ANN (artificial neural network)

14 Explanation of black-box ML models in maritime transport
  14.1 Necessity of black-box ML model explanation in the maritime industry
    14.1.1 What is the explanation for ML models
    14.1.2 Propose and evaluate explanations for black-box ML models
  14.2 Popular methods for black-box ML model explanation
    14.2.1 Forms and types of explanations
    14.2.2 Introduction of intrinsic explanation model using DT as an example
    14.2.3 SHAP method

15 Linear optimization
  15.1 Basics
  15.2 Classification of linear optimization models according to solutions
  15.3 Equivalence between different formulations
  15.4 Graphical method for models with two variables
  15.5 Using software to solve linear optimization models
  15.6 An in-depth understanding of linear optimization
  15.7 Useful applications of linear optimization solvers

16 Advanced linear optimization
  16.1 Network flow optimization
  16.2 Dummy nodes and links
  16.3 Using linear formulations for nonlinear problems
  16.4 Practice

17 Integer optimization
  17.1 Formulation I: natural integer decision variables
  17.2 Formulation II: 0–1 decision variables
  17.3 Formulation III: complex logical constraints
  17.4 Solving mixed-integer optimization models
  17.5 Formulation IV: challenging problems
  17.6 Formulation V: linearizing binary variables multiplied by another variable
  17.7 Practice

18 Conclusion
  18.1 Summary of this book
  18.2 Future research agenda

Index

About the Authors

Ran Yan is a research assistant professor in the Department of Logistics and Maritime Studies at The Hong Kong Polytechnic University (PolyU), China. Dr. Yan received her Bachelor of Science degree from Hohai University in China in 2018 and her Master of Philosophy and Doctor of Philosophy degrees from The Hong Kong Polytechnic University in 2020 and 2022, respectively. Dr. Yan’s research interests include applying data analytics methods and technologies to improve shipping efficiency and green shipping management. Dr. Yan has published more than 30 papers in international journals and conference proceedings, such as Transportation Research Part B/C/E, Transport Policy, Journal of Computational Science, Maritime Policy & Management, Ocean Engineering, Engineering, Sustainability, and Electronic Research Archive, and has won several best paper and best student paper awards at international conferences. Dr. Yan is an editorial assistant of Cleaner Logistics and Supply Chain.

Shuaian Wang is currently a professor at The Hong Kong Polytechnic University (PolyU), China.
Prior to joining PolyU, he worked as a faculty member at Old Dominion University, USA, and the University of Wollongong, Australia. Dr. Wang’s research interests include big data in shipping, green shipping, shipping operations management, port planning and operations, urban transport network modeling, and logistics and supply chain management. Dr. Wang has published over 200 papers in journals such as Transportation Research Part B, Transportation Science, and Operations Research. Dr. Wang is an editor-in-chief of Cleaner Logistics and Supply Chain and Communications in Transportation Research; an associate editor of Transportation Research Part E, Flexible Services and Manufacturing Journal, Transportmetrica A, and Transportation Letters; a handling editor of Transportation Research Record; an editorial board editor of Transportation Research Part B; and an editorial board member of Maritime Transport Research. Dr. Wang is dedicated to rethinking and proposing innovative solutions that improve the efficiency of maritime and urban transportation systems, promote environmentally friendly and sustainable practices, and transform business and engineering education.