
Sequential Learning and Decision-Making in Wireless Resource Management PDF

121 Pages·2016·2.393 MB·English

Preview Sequential Learning and Decision-Making in Wireless Resource Management

Wireless Networks

Rong Zheng, Cunqing Hua
Sequential Learning and Decision-Making in Wireless Resource Management

Wireless Networks
Series editor: Xuemin (Sherman) Shen, University of Waterloo, Waterloo, Ontario, Canada

The purpose of Springer’s new Wireless Networks book series is to establish the state of the art and set the course for future research and development in wireless communication networks. The scope of this series includes not only all aspects of wireless networks (including cellular networks, WiFi, sensor networks, and vehicular networks), but related areas such as cloud computing and big data. The series serves as a central source of references for wireless networks research and development. It aims to publish thorough and cohesive overviews on specific topics in wireless networks, as well as works that are larger in scope than survey articles and that contain more detailed background information. The series also provides coverage of advanced and timely topics worthy of monographs, contributed volumes, textbooks and handbooks.
More information about this series at http://www.springer.com/series/14180

Rong Zheng, Department of Computing, McMaster University, Hamilton, ON, Canada
Cunqing Hua, School of Information Security Engineering, Shanghai Jiao Tong University, Shanghai, China

ISSN 2366-1186; ISSN 2366-1445 (electronic)
Wireless Networks
ISBN 978-3-319-50501-5; ISBN 978-3-319-50502-2 (eBook)
DOI 10.1007/978-3-319-50502-2
Library of Congress Control Number: 2016959801

© Springer International Publishing AG 2016

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.
Printed on acid-free paper

This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

Resource management has been a perpetual theme in wireless network design, deployment, and operations. In today’s wireless systems, data demands continue to grow with a diverse range of applications, from bandwidth-hungry multimedia streaming, delay-sensitive instant messaging, and online gaming to bulk data transfer. The ever-increasing need for high-speed ubiquitous network access by mobile users is further aggravated by emerging machine-to-machine communication for home and industrial automation, wide-area sensing and monitoring, autonomous vehicles, etc. Delivery of these rich sets of applications is fundamentally limited by resource scarcity in wireless networks, which manifests at various levels. For instance, spectrum scarcity has emerged as a primary problem when trying to launch new wireless services. Vendors and operators are increasingly looking into millimeter-wave radio bands for the 5G cellular standard, though this spectrum was previously considered unsuitable for wide-area applications. Interference among wireless transceivers in close proximity continues to pose challenges to the delivery of reliable and timely services. Mobile devices are inherently power-constrained, demanding efficient communication schemes and protocols.

Many resource management solutions in wireless networks operate on the assumption that the decision makers have complete knowledge of system states and parameters (e.g., channel states, network topology, and user density). When such information is unavailable or incomplete, probing or learning has to be conducted prior to the making of resource management decisions.
As an example, in orthogonal frequency-division multiplexing (OFDM) systems, pilot signals are transmitted either along a dedicated set of subcarriers or during a specific period across all subcarriers for channel estimation. This allows the adaptation of subsequent transmissions to current channel conditions. Sequential learning, in contrast, is a paradigm where learning and decision-making are performed concurrently. The framework is applicable in a variety of scenarios where the utility of resources follows single-parameter independent and identically distributed (IID), Markovian, or unknown adversarial processes. It is a powerful tool in wireless resource management. Sequential learning and decision-making have been successfully applied to resource management in cognitive radio networks, wireless LANs, and mesh networks.

However, there are significant barriers to the wider adoption of the framework in addressing resource management problems in the wireless community. We believe that this can be attributed to two reasons. First, sequential learning theory originates from complex stochastic concepts, which poses a steep learning curve. Effective algorithms often rely on underlying assumptions such as stationary and independent stochastic processes. Identifying a suitable solution to a specific problem can be a tall order for beginners. Second, there is a disconnect between theory and practical constraints in real-world settings. For example, in practice, the timeliness of decision-making often trumps optimality. In contrast, the primary concerns of the sequential learning literature are the long-run convergence rate and the optimality of the sequence of actions.

This book is the first attempt to bridge the aforementioned gaps. Though the literature on sequential learning is abundant, a comprehensive treatment of its applications in wireless networks is lacking.
In this book, we aim to lay out the theoretical foundation of the so-called multi-armed bandit (MAB) problems and put it in the context of resource management in wireless networks. Part I of this book presents the formulations, algorithms, and performance of three forms of MAB problems, namely stochastic, Markov, and adversarial. To the best of our knowledge, this is the first work that covers all three forms of MAB problems. Part II of this book provides detailed discussions of representative applications of the sequential learning framework in wireless resource management. They serve as case studies both to illustrate how the theoretical framework and tools in Part I can be applied and to demonstrate how existing algorithms can be extended to address practical concerns in operational wireless networks. We believe both the industry and the wireless research community can benefit from a comprehensive and timely treatment of these topics.

Rong Zheng, Hamilton, ON, Canada
Cunqing Hua, Shanghai, China
September 2016

Acknowledgements

We would like to thank our families and funding agencies for their continuing support of our work. Some of the research work included would not have been possible without the help of our students, Arun Chhetri, Pallavi Arora, Thanh Le, Mohamed Hammouda, and Lingzhi Wang, and our collaborators Dr. Zhu Han, Dr. Csaba Szepesvári, and Rui Ni.

Contents

Part I Theory

1 Introduction
  1.1 The Gambler’s Dilemma
  1.2 A Taxonomy of Multi-armed Bandit Problems
  1.3 Organization
  References

2 Stochastic Multi-armed Bandit
  2.1 Problem Formulation
  2.2 Theoretical Lower Bound
  2.3 Algorithms
    2.3.1 Upper Confidence Bound (UCB) Strategies
    2.3.2 ε-Greedy Policy
    2.3.3 Thompson Sampling Policy
  2.4 Variants of Stochastic Multi-armed Bandit
    2.4.1 Multiplay MAB
    2.4.2 MAB with Switching Costs
    2.4.3 Pure Exploration MAB
  2.5 Summary
  References

3 Markov Multi-armed Bandit
  3.1 Problem Formulation
    3.1.1 Markov MAB and Markov Decision Process
    3.1.2 Optimal Policies for Restless Markov MABs with Complete Information
  3.2 Algorithms
    3.2.1 Rested Markov MAB
    3.2.2 Restless Markov MAB
  3.3 Summary
  References

4 Adversarial Multi-armed Bandit
  4.1 Problem Formulation
  4.2 Algorithms
    4.2.1 Weighted Average Prediction Algorithm
    4.2.2 Following-the-Perturbed-Leader (FPL) Algorithm
    4.2.3 Implicitly Normalized Forecaster (INF) Algorithm
    4.2.4 Internal-Regret Minimizing Algorithm
  4.3 Game Theoretical Results for Multiplayer Adversarial Multi-armed Bandit
  4.4 Summary
  References

Part II Applications

5 Spectrum Sensing and Access in Cognitive Radio Networks
  5.1 Introduction
  5.2 Problem Formulation
    5.2.1 Single SU with IID Rewards
    5.2.2 Single SU with Markov Reward Processes
    5.2.3 Multiple SUs
  5.3 Solution Approaches
    5.3.1 Cooperative Spectrum Access
    5.3.2 Distributed Learning and Allocation
  5.4 Summary
  References

6 Sniffer-Channel Assignment in Multichannel Wireless Networks
  6.1 Introduction
  6.2 Problem Formulation
    6.2.1 Optimal Channel Assignment in the Nominal Form
    6.2.2 Linear Bandit for Optimal Channel Assignment with Uncertainty
    6.2.3 Extensions
  6.3 Solution Approaches
    6.3.1 Spanners
    6.3.2 An Upper Confidence Bound (UCB)-Based Policy
    6.3.3 An ε-Greedy Algorithm with Spanner
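
The preface describes sequential learning as a paradigm in which learning and decision-making happen concurrently, and the contents list an ε-greedy policy among the stochastic MAB algorithms. The following is a minimal illustrative sketch of that idea, not the book's own pseudocode. The function name `epsilon_greedy`, the callback `pull`, the fixed exploration rate, and the channel idle probabilities in the usage part are all assumptions made for illustration.

```python
import random


def epsilon_greedy(pull, n_arms, horizon, epsilon=0.1, seed=0):
    """Run a fixed-rate epsilon-greedy policy for `horizon` rounds.

    pull(arm) returns the reward of the chosen arm only; rewards of the
    other arms are never observed (bandit feedback). All names here are
    hypothetical and chosen for this sketch.
    """
    rng = random.Random(seed)
    counts = [0] * n_arms    # number of times each arm was played
    means = [0.0] * n_arms   # running empirical mean reward per arm
    total = 0.0
    for t in range(horizon):
        if t < n_arms:
            arm = t                          # play every arm once first
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)      # explore a random arm
        else:
            # exploit the arm with the best empirical mean so far
            arm = max(range(n_arms), key=lambda a: means[a])
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
        total += reward
    return counts, means, total


if __name__ == "__main__":
    # Hypothetical example: three channels that are idle (reward 1) with
    # the assumed probabilities below, and busy (reward 0) otherwise.
    idle_prob = [0.2, 0.5, 0.8]
    env = random.Random(1)
    counts, means, total = epsilon_greedy(
        lambda a: 1.0 if env.random() < idle_prob[a] else 0.0,
        n_arms=3,
        horizon=5000,
    )
    print("plays per channel:", counts)
    print("estimated idle probabilities:", [round(m, 2) for m in means])
```

Each round both refines the estimate of the chosen arm and collects its reward, so there is no separate probing phase. A constant `epsilon` keeps exploring forever; a decaying exploration rate is a common alternative, which this sketch does not implement.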
