
Mathematical Gnostics: Advanced Data Analysis for Research and Engineering Practice PDF

343 Pages·2023·14.586 MB·English


Mathematical Gnostics

The book describes the theoretical principles of nonstatistical methods of data analysis without going deep into complex mathematics. The emphasis is on presenting solved examples of real data, either from the author’s laboratories or from the open literature. The examples cover a wide range of applications such as quality assurance and quality control, critical analysis of experimental data, comparison of data samples from various sources, robust linear regression, and various tasks from financial analysis. The examples are useful primarily for chemical engineers, including analytical/quality laboratories in industry and designers of chemical and biological processes.

Features:
• Exclusive title on Mathematical Gnostics with multidisciplinary applications and a specific focus on chemical engineering.
• Clarifies the role of data space metrics, including the right way to aggregate uncertain data.
• Brings a new look at data probability, information, entropy, and the thermodynamics of data uncertainty.
• Enables design of probability distributions for all real data samples, including smaller ones.
• Includes data for examples with solutions, with exercises in R or Python.

The book is aimed at senior undergraduate students, researchers, and professionals in Chemical/Process Engineering, Engineering Physics, Statistics, Mathematics, Materials, Geotechnical, Civil Engineering, Mining, Sales, Marketing and Service, and Finance.
Mathematical Gnostics
Advanced Data Analysis for Research and Engineering Practice

Pavel Kovanic

First edition published 2023 by CRC Press, 6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742
and by CRC Press, 4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

CRC Press is an imprint of Taylor & Francis Group, LLC

© 2023 Taylor & Francis Group, LLC

Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, access www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact mpkbook- [email protected]

Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used only for identification and explanation without intent to infringe.

ISBN: 978-1-138-33923-1 (hbk)
ISBN: 978-1-032-42351-7 (pbk)
ISBN: 978-0-429-44119-6 (ebk)
DOI: 10.1201/9780429441196

Typeset in Latin Modern by KnowledgeWorks Global Ltd.
Contents

Preface
Introduction
Author Biography

1 Introductory Kindergarten
  1.1 Elemental Notions
    1.1.1 Abelian Group
    1.1.2 Variability
    1.1.3 Morphism and Invariant
    1.1.4 Vector Space
    1.1.5 Matrices
    1.1.6 Probability Distribution
  1.2 Sources of Inspiration for Mathematical Gnostics
    1.2.1 Theory of General Systems
    1.2.2 Theory of Measurement
    1.2.3 Geometries
    1.2.4 Maxwell’s Contributions
    1.2.5 Relativistic Physics
    1.2.6 Thermodynamics
    1.2.7 Matrix Algebra
  1.3 Conclusions

2 Axioms
  2.1 Axioms of the Data Model
  2.2 Applications of Axiom 1
  2.3 Data Aggregation as the Second Gnostic Axiom
  2.4 Conclusions

3 Introduction to Non-Standard Thought
  3.1 Paradigm
  3.2 Statistical Paradigms
  3.3 Statistical Data Weighing
  3.4 Non-Statistical Paradigms of Uncertainty
  3.5 On the Need of an Alternative to Statistics
  3.6 Principles of Advanced Data Analysis
  3.7 The Gnostic Concept
  3.8 Conclusions

4 Quantification
  4.1 Ideal Quantification
  4.2 Real Quantification
  4.3 Conclusions

5 Estimation and Ideal Gnostic Cycle
  5.1 A Game with Nature
  5.2 Double Numbers
  5.3 Gnostic Data Characteristics
  5.4 The Ideal Gnostic Cycle
  5.5 Information Perpetuum Mobile?
  5.6 Existence and Uniqueness of the Ideal Gnostic Cycle
  5.7 Conclusions

6 Geometry
  6.1 A Historical Dispute on Robustness of Statistics
  6.2 Distance as a Problem
  6.3 Additivity in Data Aggregation
    6.3.1 Statistical Mean Value and Data Weighting
  6.4 Double Robustness
  6.5 The Curvature of the Space of Uncertain Data
  6.6 Three Geometries
  6.7 Conclusions

7 Aggregation
  7.1 Why the Least Squares Method (Frequently) Works
  7.2 Aggregation of Uncertain Data
  7.3 The Second Axiom
  7.4 Conclusions

8 Thermodynamics of Uncertain Data
  8.1 Thermodynamic Interpretation of Gnostic Data Characteristics
  8.2 Maxwell’s Demon
  8.3 Entropy ↔ Information Conversion
  8.4 Albert Perez’s Information
  8.5 Statistical Interpretation of Gnostic Data Characteristics
  8.6 Between Mediocristan and Extremistan
  8.7 Conclusions

9 Kernel Estimation
  9.1 Parzen’s Estimating Kernel
  9.2 Gnostic Kernel
  9.3 Scale Parameters
  9.4 Conclusions

10 Probability Distribution Functions
  10.1 Probabilities
  10.2 Data Domains
  10.3 Tasks Solvable by Distribution Functions
  10.4 The Estimating Local Distribution
  10.5 Quantifying Distributions
  10.6 Empirical Distribution Function and the Fit
  10.7 Some Applications of Distribution Functions
    10.7.1 Revealing Historical Information
    10.7.2 Hypotheses Testing
    10.7.3 A Large Survey of Chemical Pollutants
  10.8 The Homogeneity Problem
  10.9 Conclusions

11 Applications of Local Distributions
  11.1 Enrichment of the EGDF-Analysis
  11.2 Revealing Inner Structure of a Data Sample
  11.3 Marginal Analysis
  11.4 Information Capability of Data
  11.5 Interval Analysis
  11.6 Diversity of Samples
  11.7 Conclusions

12 On the Notion of Normality
  12.1 Normality of Data
    12.1.1 Statistical Approach
    12.1.2 Empirical Way in Clinical Practice
    12.1.3 Similarity-Based Reference Values in Economy
    12.1.4 Fuzzy-Set Approach
    12.1.5 Automatic Warning and Emergency Systems
  12.2 Requirements to Ideal Estimation of Bounds of Normality
  12.3 Elements of Gnostic Solution of the Normality Problem in a One-Dimensional Analysis
  12.4 Critics on the Identity Gaussian ≡ Normal
    12.4.1 Re-definition of Normality
    12.4.2 On a Still Daydreamed Research Project BONUS
  12.5 Conclusions

13 Applications of Global Distribution Functions
  13.1 Global Distribution Function
  13.2 Comparison of Global with Local Distribution
  13.3 Two Didactic Stories
  13.4 Conclusions

14 Data Censoring
  14.1 Uncensored Data
  14.2 Left-Censored Data
  14.3 Right-Censored Data
  14.4 Interval Data
  14.5 On an Unknown Limit of Detection
  14.6 Examples of Surviving
  14.7 Non-Standard Application of Data Censoring
    14.7.1 Data and Psychology
    14.7.2 Three Aspects of Data Interpretation
  14.8 Conclusions

15 Gnostic Thermodynamic Analysis of Data Uncertainty
  15.1 Gnostic Data Calibration
    15.1.1 Real Data for Examples
  15.2 Data Calibration
    15.2.1 LS-Optimal Numerical Operators
  15.3 Calibration of the NIST12 Data
  15.4 Calibration of the NIST37 Data
  15.5 Conclusions

16 Robust Estimation of a Constant
  16.1 Gnostic Data Aggregation Principle Used in Estimation
  16.2 Scale Parameter
  16.3 More on the Gnostic Data Aggregation
    16.3.1 Example
    16.3.2 Example of Robust Estimation of the Mean of Multiplicative Data
    16.3.3 Robust Estimation of the Mean of Simulated Data
  16.4 Conclusions

17 Measuring the Data Uncertainty
  17.1 Shortly on the Standard Approach
  17.2 The Need of Objective Measuring the Variability
  17.3 The Triplication of the Mean Values
  17.4 The Need of a Unit of Uncertainty
  17.5 The Error of a Mean
  17.6 Examples
    17.6.1 Swiss Fertility and Socioeconomic Indicators (1888) Data
    17.6.2 Financial Statement Analysis
    17.6.3 Weather Parameters
    17.6.4 An Important Medical Parameter
    17.6.5 Non-homogeneous Data
    17.6.6 Parameters of Uncertainty
  17.7 Discussion on Different Means
    17.7.1 Re-definition of Variance
  17.8 Conclusions

18 Homo- or Heteroscedastic Data
  18.1 Decision Making
  18.2 Examples
  18.3 Conclusion

19 Gnostic Multidimensional Regression Models
  19.1 Formulation of the Robust Regression Problem
  19.2 Additive and Multiplicative Regression Models
  19.3 Comparison of Robust Regression Models
    19.3.1 Statistical Methods for Comparison
    19.3.2 Robust Regression in Mathematical Gnostics
    19.3.3 Data for Comparison
    19.3.4 Criteria for Evaluation of Methods
    19.3.5 Results of Comparison
    19.3.6 Discussion of the Results
  19.4 The Explicit and Implicit Regression Models
  19.5 Examples
  19.6 Homogeneity of an MD-Model
  19.7 An Important Multidimensional Model
  19.8 Applications of the Robust Regression Models
  19.9 Conclusions

20 Data Filtering
  20.1 Filtering
  20.2 Total Data Variability and Its Components
  20.3 Filtering by Regression
  20.4 Filtering Effect of Proper Data Aggregation
  20.5 Improving the Matrix Quality
  20.6 Cleaning of Matrices
  20.7 Conclusions

21 Decision Making in Mathematical Gnostics
  21.1 Datacratic Decision Making in Mathematical Gnostics
  21.2 Conclusions
