Classification and Data Analysis: Theory and Applications

334 Pages·2020·8.875 MB·English

Preview Classification and Data Analysis: Theory and Applications

Studies in Classification, Data Analysis, and Knowledge Organization

Krzysztof Jajuga, Jacek Batóg, Marek Walesiak (Editors)

Classification and Data Analysis: Theory and Applications

Studies in Classification, Data Analysis, and Knowledge Organization

Managing Editors
Wolfgang Gaul, Karlsruhe, Germany
Maurizio Vichi, Rome, Italy
Claus Weihs, Dortmund, Germany

Editorial Board
Daniel Baier, Bayreuth, Germany
Frank Critchley, Milton Keynes, UK
Reinhold Decker, Bielefeld, Germany
Edwin Diday, Paris, France
Michael Greenacre, Barcelona, Spain
Carlo Natale Lauro, Naples, Italy
Jacqueline Meulman, Leiden, The Netherlands
Paola Monari, Bologna, Italy
Shizuhiko Nishisato, Toronto, Canada
Noboru Ohsumi, Tokyo, Japan
Otto Opitz, Augsburg, Germany
Gunter Ritter, Fakultät für Mathematik u. Informatik, Universität Passau, Passau, Germany
Martin Schader, Mannheim, Germany

More information about this series at http://www.springer.com/series/1564

Editors
Krzysztof Jajuga, Department of Financial Investments and Risk Management, Wroclaw University of Economics and Business, Wroclaw, Poland
Jacek Batóg, Institute of Econometrics and Statistics, University of Szczecin, Szczecin, Poland
Marek Walesiak, Department of Econometrics and Computer Science, Wroclaw University of Economics and Business, Wroclaw, Poland

ISSN 1431-8814
ISSN 2198-3321 (electronic)
Studies in Classification, Data Analysis, and Knowledge Organization
ISBN 978-3-030-52347-3
ISBN 978-3-030-52348-0 (eBook)
https://doi.org/10.1007/978-3-030-52348-0

Mathematics Subject Classification: 62Hxx, 62H25, 62H30, 62H86, 62-07, 62-09, 68Uxx, 68U20, 62Pxx, 62P12, 62P20, 62P25

© Springer Nature Switzerland AG 2020

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in
any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

This volume presents the papers from the 28th Conference of Section of Classification and Data Analysis of Polish Statistical Society held at University of Szczecin on September 18–20, 2019. The papers presented referred to a set of studies addressing a wide range of recent methodological aspects and applications of classification and data analysis tools in micro and macroeconomic problems. In the final selection, we accepted 20 of the papers that were presented at the conference. Each of the submissions has been reviewed by two anonymous referees and the Authors have subsequently revised their original manuscripts and incorporated the comments and suggestions of the referees. The selection criteria were based on the contribution of the papers to the theory and applications of modern classification and data analysis.

The chapters have been organized along the major fields and themes in classification and data analysis: Methodology, Application in Finance, Application in Economics and Application in Social Issues.

The part on Methodology contains five papers. The paper by Batóg and Wawrzyniak focuses on modifications of selected formulas that make it possible to obtain a transformation of a nominant into a stimulant which ensures that the order of objects before and after the transformation is consistent with the real values of the nominant. Dudek in his paper presents recommendations on the principles of correct application of the Silhouette index, indicating that the “mechanical” use of the index leads to results that do not correspond to the actual structure of the classes. The paper by Grzenda discusses how discretization of continuous variables can improve the classification accuracy of machine learning models, with an application in demography of supervised discretization of continuous variables based on the entropy criterion and the Gini criterion. Jefmański in his paper proposes an intuitionistic fuzzy synthetic measure for ordinal data based on Hellwig’s linear ordering method that allows for a comparative analysis of objects with respect to a complex phenomenon described by ordinal measurement scales and takes into account the uncertainty in comparing objects expressed in the form of neutral points on the ordinal measurement scales. Pełka in his paper conducts research on the usefulness and predictive power of extracting variables from neural networks (multilayer perceptron for symbolic data) as a method of variable selection for the purposes of ensemble learning for symbolic data.

The part on Application in Finance also contains five papers.
The paper by Doszyń, using the so-called Szczecin algorithm of real estate mass appraisal, aims to analyze whether an econometric model with restrictions may support the process of real estate mass appraisal, providing a more precise determination of the impact of real property attributes on prices than an analogous model without restrictions. Krężołek in his paper addresses the issue of estimating the tail index of a probability distribution using the Hill estimator and its modification, comparing selected non-parametric and parametric models. The paper by Mikulec and Misztal, using prediction error curves based on bootstrap cross-validation estimates of the prediction error, provides evidence that survival functions estimated with the Kaplan-Meier method in each of the obtained subsets of objects enable a more precise estimate of a firm’s duration than the Kaplan-Meier function for the total data. Pawełek and Pociecha in their paper compare the predictive effectiveness of the logit leaf model, a hybrid classification algorithm that enhances logistic regression and decision trees, with that of individual classifiers. The paper by Trzpiot examines the relation between economic, financial and demographic variables and longevity in terms of long-term investment portfolios that are sensitive to risk factors according to the APT portfolio factor model, using Principal Component Regression.

The part on Application in Economics contains four papers. The paper by Markowicz and Baran investigates the issue of mirror data concerning intra-Community supplies of goods, with the use of their original indicators of data asymmetry and an empirical example based on data from the Eurostat COMEXT database. Cheba and Bąk in their paper explore the relationships between sustainable development and green economy and assess the results obtained for the EU countries in four particular areas using a taxonomic development measure based on the Weber median.
Misztal and Kupis-Fijałkowska in their paper analyze the ICT development level in Poland against other European Union countries from the perspective of individual users and households, using exploratory data analysis methods and Hellwig’s method of linear ordering. Sagan and Grabowski in their paper identify cause-effect relationships as the impact of unknown disturbing variables affecting both the mediation and focal dependent variables, by simulating the effect of correlated disturbances of dependent variables in technology acceptance models on the degree of bias in the average causal mediation effect.

The part on Application in Social Issues contains six papers. Bieszk-Stolorz in her paper verifies whether the risk of subsequent registrations in the labour office depends on the characteristics of the unemployed persons, using Prentice-Williams-Peterson’s conditional models, which consider the time until the event occurs from the beginning of observation, and the time from the previous event. The paper by Głowicka-Wołoszyn and Wysocki answers the question whether correction of ideal values occurring in Hellwig’s and TOPSIS methods by the quartile criterion contributed to the improvement of consistency between the identified levels of the Polish communes’ financial autonomy and the synthetic measure values assigned to them. Konarzewska in her paper conducts research on the problem of statistical independence of chosen properties of objects and especially the choice of adequate weights in multi-criteria rankings, applying the values of Variance Inflation Factors, Principal Component Analysis and Multi-Criteria Principal Components. Landmesser in her paper presents the comparison of personal income distributions taking into account the gender income gap for 28 European countries and using the Oaxaca-Blinder decomposition procedure, the decomposition procedure applied to different quantile points along the whole income distribution, and finally the counterfactual distribution based on the Recentered Influence Function-Regression approach.
The paper by Majewska and Trzpiot evaluates different approaches to identification of the existence of the common mortality trends and derives the mortality time-varying indicator from the Lee-Carter model to obtain the similarities of different countries via a semi-parametric comparison approach to prove that multi-population mortality models are superior to individual mortality forecasting models. Matuszewska-Janica in her paper verifies whether selected attributes of employees affect the level of their wages, considering the impact of outliers on changes in relative importance of analysed features.

We wish to thank the Authors for making their studies available for our volume. Their scholarly efforts and research inquiries made this volume possible. We are also indebted to the anonymous referees for providing insightful reviews with many useful comments and suggestions.

In spite of our intention to address a wide range of problems pertaining to classification and data analysis theory there are issues that still need to be researched. We hope that the studies included in our volume will encourage further research and analyses in modern data science.

Wroclaw, Poland: Krzysztof Jajuga
Szczecin, Poland: Jacek Batóg
Wroclaw, Poland: Marek Walesiak
January, 2020

Contents

Methods

Comparison of Proposals of Transformation of Nominants into Stimulants on the Example of Financial Ratios of Companies Listed on the Warsaw Stock Exchange
Barbara Batóg and Katarzyna Wawrzyniak

Silhouette Index as Clustering Evaluation Tool
Andrzej Dudek

The Role of Discretization of Continuous Variables in Socioeconomic Classification Models on the Example of Logistic Regression Models and Artificial Neural Networks
Wioletta Grzenda

Intuitionistic Fuzzy Synthetic Measure for Ordinal Data
Bartłomiej Jefmański

Improving Classification Accuracy of Ensemble Learning for Symbolic Data Through Neural Networks’ Feature Extraction
Marcin Pełka

Applications in Finance

Inequality Restricted Least Squares (IRLS) Model of Real Estate Prices
Mariusz Doszyń

Application of Hill Estimator to Assess Extreme Risks in the Metals Market
Dominik Krężołek

Segmentation of Enterprises on the Basis of Their Duration Using Survival Trees: Results of an Analysis for Legal Persons and Organizational Entities Without Legal Personality in the Łódzkie Voivodship
Artur Mikulec and Małgorzata Misztal

Corporate Bankruptcy Prediction with the Use of the Logit Leaf Model
Barbara Pawełek and Józef Pociecha

The Impact of Longevity on a Valuation of Long-Term Investments Returns: The Case of Selected European Countries
Grażyna Trzpiot

Applications in Economics

Sustainable Development and Green Economy in the European Union Countries: Statistical Analysis
Katarzyna Cheba and Iwona Bąk

The Review of Indicators of Data Quality in Intra-Community Trade in Goods. The Choice of an Indicator and Its Effect on the Ranking of Countries
Iwona Markowicz and Paweł Baran

Development of ICT in Poland in Comparison with the European Union Countries: Multivariate Statistical Analysis
Małgorzata Misztal and Aleksandra Kupis-Fijałkowska

Sensitivity Analysis in Causal Mediation Effects for TAM Model
Adam Sagan and Mariusz Grabowski

Applications in Social Problems

Prentice–Williams–Peterson Models in the Assessment of the Influence of the Characteristics of the Unemployed on the Intensity of Subsequent Registrations in the Labour Office
Beata Bieszk-Stolorz

Right-Skewed Distribution of Features and the Identification Problem of the Financial Autonomy of Local Administrative Units
Romana Głowicka-Wołoszyn and Feliks Wysocki

Multi-criteria Rankings with Interdependent Criteria: Case of EU Countries on Their Way to Healthy Lives and Well-Being
Iwona Konarzewska

The Comparison of Income Distributions for Women and Men in the European Union Countries
Joanna Landmesser
