Federated Learning PDF

207 Pages·2021·4.412 MB·English
by  Rehman

Preview Federated Learning

Studies in Computational Intelligence 965

Muhammad Habib ur Rehman
Mohamed Medhat Gaber
Editors

Federated Learning Systems
Towards Next-Generation AI

Studies in Computational Intelligence
Volume 965

Series Editor
Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland

The series “Studies in Computational Intelligence” (SCI) publishes new developments and advances in the various areas of computational intelligence, quickly and with a high quality. The intent is to cover the theory, applications, and design methods of computational intelligence, as embedded in the fields of engineering, computer science, physics and life sciences, as well as the methodologies behind them. The series contains monographs, lecture notes and edited volumes in computational intelligence spanning the areas of neural networks, connectionist systems, genetic algorithms, evolutionary computation, artificial intelligence, cellular automata, self-organizing systems, soft computing, fuzzy systems, and hybrid intelligent systems. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution, which enable both wide and rapid dissemination of research output.

Indexed by SCOPUS, DBLP, WTI Frankfurt eG, zbMATH, SCImago.

All books published in the series are submitted for consideration in Web of Science.
More information about this series at http://www.springer.com/series/7092

Editors

Muhammad Habib ur Rehman
Center for Cyber-Physical Systems
Khalifa University of Science and Technology
Abu Dhabi, United Arab Emirates

Mohamed Medhat Gaber
School of Computing and Digital Technology
Birmingham City University
Birmingham, UK

ISSN 1860-949X
ISSN 1860-9503 (electronic)
Studies in Computational Intelligence
ISBN 978-3-030-70603-6
ISBN 978-3-030-70604-3 (eBook)
https://doi.org/10.1007/978-3-030-70604-3

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

“A learned individual can benefit rest of the world.”

Preface

Businesses and governments are collecting massive amounts of data about their customers and citizens. These organizations store the data in centralized cloud storage infrastructures to perform large-scale training and make complex and timely decisions. This data collection is continuous; however, customers and citizens consistently voice concerns about how the data is managed and processed. In response, governments have adopted several regulatory measures and laws, such as the GDPR in the European Union, HIPAA and CCPA in the United States of America, and PIPEDA in Canada. Nevertheless, customers and citizens continue to raise concerns about privacy protection and about false decisions made by machine learning systems. Therefore, the inability to collect fresh data poses a serious threat to well-informed and realistic decisions.

Google introduced the term federated learning (FL) for an approach in which machine learning models are first trained on the customers' or citizens' devices and systems, and the model updates are later aggregated at centralized cloud servers. Building on this notion of FL, researchers and practitioners in academia and industry have carried out a large body of research. Numerous publications address active research issues in privacy, security, data and model synchronization, model development and deployment, personalization, incentivization, and heterogeneity across FL systems. This book studies the FL ecosystem from a broad perspective, covering both the theoretical and the applied aspects of FL systems. It is structured into eight chapters in total.

In the first chapter, Ali et al.
performed a thorough bibliometric analysis of the field of FL. The authors searched the Scopus database to uncover publication trends. They found 476 scholarly documents in total and analyzed this dataset to find the growth trends in FL research. They also studied subject areas and ranked them by number of publications, and they outlined the top-10 cited papers, top-10 authors, top-10 institutions, and top-10 countries. Moreover, they categorized the documents into various types and uncovered the top-10 sources of these documents. Finally, the authors performed domain profiling of the FL research area and identified five hot domains where most FL research has been creating impact: the internet of things (IoT), wireless communication, privacy and security, data analytics, and learning and optimization.

Christopher et al., in Chap. 2, review FL as an approach for performing machine learning on distributed data to protect the privacy of user-generated data. They highlight pertinent challenges in an IoT context, such as reducing the communication costs associated with data transmission, learning from data under heterogeneous conditions, and applying additional privacy protections to FL. Throughout this review, they identify the strengths and weaknesses of different methods applied to FL, and finally, they outline future directions for privacy-preserving FL research, particularly focusing on IoT applications.

To effectively prevent information leakage, K. Wei et al. (in Chap. 3) investigate a differential privacy mechanism in which, on the clients' side, artificial noise is added to parameters before uploading. Moreover, they propose a K-client random scheduling policy, in which K clients are randomly selected from a total of N clients to participate in each communication round. Furthermore, a theoretical convergence bound is derived for the loss function of the trained FL model.
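
The client-side noise and K-client random scheduling described above can be illustrated with a minimal Python sketch. It is not the chapter's algorithm: the function names, the toy parameter vectors, and the noise scale `sigma` are assumptions made for illustration, and calibrating `sigma` to a formal privacy level is deliberately left out.

```python
import random


def add_gaussian_noise(params, sigma, rng):
    """Client side: perturb each parameter with zero-mean Gaussian noise before upload."""
    return [p + rng.gauss(0.0, sigma) for p in params]


def select_clients(num_clients, k, rng):
    """K-client random scheduling: draw K distinct client indices out of N."""
    return rng.sample(range(num_clients), k)


def aggregate(uploads):
    """Server side: coordinate-wise mean of the uploaded parameter vectors."""
    count = len(uploads)
    return [sum(column) / count for column in zip(*uploads)]


def run_round(client_params, k, sigma, rng):
    """One communication round: schedule K clients, collect noisy uploads, average them."""
    chosen = select_clients(len(client_params), k, rng)
    uploads = [add_gaussian_noise(client_params[i], sigma, rng) for i in chosen]
    return chosen, aggregate(uploads)


# Toy setup (hypothetical): N = 10 clients, each holding a 2-parameter local model.
rng = random.Random(0)
client_params = [[float(i), float(2 * i)] for i in range(10)]
chosen, global_params = run_round(client_params, k=4, sigma=0.1, rng=rng)
```

Averaging more uploads reduces the variance of the aggregated noise, while each round still touches only K of the N clients; this is the kind of tradeoff the convergence bound is concerned with.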
In detail, for a fixed privacy level, the theoretical bound reveals that there exists an optimal number of clients K that achieves the best convergence performance, owing to the tradeoff between the volume of user data and the variance of the aggregated artificial noise. To optimize this tradeoff, they further provide a differentially private FL-based client selection (DP-FedCS) algorithm, which can dynamically select the number of training clients. Their experimental results validate their theoretical conclusions and also show that the proposed algorithm can effectively improve both the FL training efficiency and the FL model quality for a given privacy protection level.

FL provides privacy by design. It trains a machine learning model collaboratively over several distributed clients (ranging from two to millions), such as mobile phones, without sharing their raw data with any other participant. In practical scenarios, not all clients have sufficient computing resources (e.g., Internet of Things devices), the machine learning model may have millions of parameters, and the privacy of the model between the server and the clients during training and testing is a prime concern (e.g., between rival parties). In this regard, FL is not sufficient, so split learning (SL) is introduced in Chap. 4 by C. Thapa et al. SL suits these scenarios because it splits a model into multiple portions, distributes them among the clients and the server, and has each party train and test its own model portion to accomplish the full model training and testing. In SL, the participants share neither their data nor their model portions with any other party, and usually a smaller network portion is assigned to the clients where the data resides. Recently, a hybrid of FL and SL, called SplitFed learning, has been introduced to combine the benefits of both FL (faster training/testing time) and SL (model split and training).
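
The split-learning idea described above can be made concrete with a toy Python sketch: a two-weight model is cut into a client portion and a server portion, and only activations and their gradients cross the boundary. The class names, the scalar "layers", the learning rate, and the assumption that the server holds the labels are all illustrative choices, not the chapter's implementation.

```python
class ClientPart:
    """Holds the raw input data and the first model portion (a single weight)."""

    def __init__(self, weight):
        self.weight = weight
        self._last_input = None

    def forward(self, x):
        # Only this activation leaves the client; the raw input x never does.
        self._last_input = x
        return self.weight * x

    def backward(self, grad_activation, lr):
        # Chain rule: dL/dw_client = dL/dactivation * x.
        self.weight -= lr * grad_activation * self._last_input


class ServerPart:
    """Holds the second model portion and, in this sketch, the labels."""

    def __init__(self, weight):
        self.weight = weight

    def train_step(self, activation, target, lr):
        prediction = self.weight * activation
        error = prediction - target
        loss = 0.5 * error ** 2
        # Gradient sent back to the client, computed before the server update.
        grad_activation = error * self.weight
        self.weight -= lr * error * activation
        return loss, grad_activation


def train(client, server, samples, epochs, lr):
    """Run split training and return the summed loss of each epoch."""
    history = []
    for _ in range(epochs):
        epoch_loss = 0.0
        for x, target in samples:
            activation = client.forward(x)
            loss, grad = server.train_step(activation, target, lr)
            client.backward(grad, lr)
            epoch_loss += loss
        history.append(epoch_loss)
    return history


# Toy regression task (hypothetical): learn target = 2 * x.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
client, server = ClientPart(0.5), ServerPart(0.5)
losses = train(client, server, samples, epochs=50, lr=0.01)
```

Neither party ever sees the other's weight: the server receives activations, and the client receives a single gradient value per sample.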
Following the developments from FL to SL, and considering the importance of SL, this chapter is designed to provide extensive coverage of SL and its variants. The coverage includes fundamentals, existing findings, integration with privacy measures such as differential privacy, open problems, and code implementation.

Chapter 5 presents the practitioner view on FL research, whereby a group of researchers from the PySyft Community elaborate on the key features of their FL tool. PySyft is an open-source multi-language library enabling secure and private machine learning by wrapping and extending popular deep learning frameworks such as PyTorch in a transparent, lightweight, and user-friendly manner. Its aim is both to help popularize privacy-preserving techniques in machine learning, by making them as accessible as possible via Python bindings and common tools familiar to researchers and data scientists, and to be extensible, such that new Federated Learning, Multi-Party Computation, or Differential Privacy methods can be flexibly and simply implemented and integrated. This chapter introduces the methods available within the PySyft library and describes their implementations. The authors also provide a proof-of-concept demonstration of an FL workflow, using the example of training a convolutional neural network. Next, they review the use of PySyft in academic literature to date and discuss future use cases and development plans. Most importantly, they introduce Duet, their tool for easier federated learning for scientists and data owners.

In the medical or healthcare industry, where the already available information or data is never sufficient, decisions can be made with the help of FL by empowering AI models to learn on private data without conceding privacy.
The primary objective of Chap. 6 is to highlight the adaptability and working of FL techniques in the healthcare system, especially in drug development, clinical diagnosis, digital health monitoring, and various disease prediction and detection systems. The first section of the chapter comprises the motivation, FL for healthcare, the FL working model in healthcare, and various benefits of FL. The next section describes the reported work of different researchers who used the FL model. The final section presents a comparative analysis of different FL algorithms for different health sectors, using parameters such as accuracy, area under the curve, precision, recall, and F-score.

Ahmed et al., in Chap. 7, envision a fully decentralized FL system. They emphasize the use of blockchain-empowered smart contract technologies to enable fairness and trust among the FL participants over underlying peer-to-peer networks.

Finally, David and Zaid, in Chap. 8, analyze existing vulnerabilities of FL and subsequently perform a literature review of the possible attack methods targeting FL privacy protection capabilities. These attack methods are then categorized by a basic taxonomy. Additionally, they provide a literature study of the most recent defensive strategies and algorithms for FL that aim to overcome these attacks. These defensive strategies are categorized by their respective underlying defense principle. The chapter argues that applying a single defensive strategy is not enough to provide adequate protection against all available attack methods.

Although this book primarily targets the computer science, information technology, and data science communities, its generalized content makes it equally helpful for students, researchers, and practitioners from all walks of life. Since this is the first contributed research book on the topic of FL, we aim to include more important and active research topics in future editions of this book.
Finally, we would like to thank and acknowledge all contributors, including authors, reviewers, and editorial staff, for their untiring efforts in making this project a success.

Abu Dhabi, United Arab Emirates    Muhammad Habib ur Rehman
Birmingham, UK    Mohamed Medhat Gaber
December 2020
