Bayesian Tensor Decomposition for Signal Processing and Machine Learning: Modeling, Tuning-Free Algorithms, and Applications

189 Pages·2023·3.564 MB·English

Lei Cheng · Zhongtao Chen · Yik-Chung Wu

Bayesian Tensor Decomposition for Signal Processing and Machine Learning: Modeling, Tuning-Free Algorithms, and Applications

Lei Cheng, College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China
Zhongtao Chen, Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, China
Yik-Chung Wu, Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, China

ISBN 978-3-031-22437-9
ISBN 978-3-031-22438-6 (eBook)
https://doi.org/10.1007/978-3-031-22438-6

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Preface

Our world is full of data, and these data often appear in high-dimensional structures, with each dimension describing a unique attribute. Examples include data in the social sciences, medicine, pharmacology, and environmental monitoring, to name a few. To make sense of multi-dimensional data, advanced computational tools that work directly with tensors, rather than first converting a tensor to a matrix, are needed to unveil the hidden patterns of the data. This is where tensor decomposition models come into play. Owing to their remarkable representation capability, tensor decomposition models have led to state-of-the-art performance in many domains, including social network mining, image processing, array signal processing, and wireless communications.

Previous research on tensor decompositions has mainly approached the problem from an optimization perspective, which unfortunately does not come with the capability of tensor rank learning and requires heavy hyper-parameter tuning. While these two tasks are important for complexity control and for avoiding overfitting, they are often overlooked or downplayed in current research, assumed to be achievable by trivial operations, or somehow obtained from other methods. In reality, estimating the tensor rank and a proper set of hyper-parameters usually involves exhaustive search. This requires running the same algorithm many times, effectively increasing the computational complexity in actual model deployment.

Another path for model learning is Bayesian methods. They provide a natural recipe for integrating tensor rank learning, automatic hyper-parameter determination, and tensor decomposition. Due to this unique capability, Bayesian models and inference have triggered recent interest in tensor decompositions for signal processing and machine learning. These recent works show that Bayesian models achieve comparable or even better performance than their optimization-based counterparts.
However, Bayesian methods are very different from optimization methods: the former learn distributions of the unknown parameters, while the latter learn a point estimate. The processes of building the models and deriving the inference algorithms are fundamentally different as well. This creates a barrier between two groups of researchers working on similar problems but starting from different perspectives. This book aims to distill the essentials of Bayesian modeling and inference in tensor research, and to present a unified view of various models. The book addresses the needs of postgraduate students, researchers, and practicing engineers whose interests lie in tensor signal processing and machine learning. It can be used as a textbook for short courses on specific topics, e.g., tensor learning methods, Bayesian learning, and multi-dimensional data analytics. Demo codes can be downloaded from https://github.com/leicheng-tensor/Reproducible-Bayesian-Tensor-Matrix-Machine-Learning-SOTA. It is our hope that, by lowering the barrier to understanding and entering the Bayesian landscape, more ideas and novel algorithms will be stimulated and facilitated in the research community.

This book starts by reviewing the basics and classical algorithms for tensor decompositions, and then introduces their common challenge of rank determination (Chap. 1). To overcome this challenge, the book develops models and algorithms under the Bayesian sparsity-aware learning framework, with the philosophy and key results elaborated in Chap. 2. In Chaps. 3 and 4, we use the most basic tensor decomposition format, Canonical Polyadic Decomposition (CPD), as an example to elucidate the fundamental Bayesian modeling and inference that can achieve automatic rank determination and hyper-parameter learning. Both parametric and non-parametric modeling and inference are introduced and analyzed. In Chap. 5, we demonstrate how Bayesian CPD is connected with stochastic optimization in order to fit large-scale data. In Chap. 6, we show how the basic model can incorporate additional nonnegative structures to achieve enhanced performance in various signal processing and machine learning tasks. Chapter 7 discusses the extension of Bayesian methods to complex-valued data, handling orthogonal constraints and outliers. Chapter 8 uses direction-of-arrival estimation, which has been one of the focuses of array signal processing for decades, as a case study to introduce Bayesian tensor decomposition under missing data. Finally, Chap. 9 extends the modeling ideas presented in previous chapters to other tensor decomposition formats, including tensor Tucker decomposition, tensor-train decomposition, PARAFAC2 decomposition, and tensor SVD.

The authors sincerely thank the group members Le Xu, Xueke Tong, and Yangge Chen at The University of Hong Kong for working on this topic together over the years. This project is supported in part by the NSFC under Grant 62001309, and in part by the General Research Fund from the Hong Kong Research Grant Council under Grant 17207018.

Lei Cheng (Hangzhou, China)
Zhongtao Chen (Hong Kong, China)
Yik-Chung Wu (Hong Kong, China)
August 2022

Contents

1 Tensor Decomposition: Basics, Algorithms, and Recent Advances
   1.1 Terminologies and Notations
      1.1.1 Scalar, Vector, Matrix, and Tensor
      1.1.2 Tensor Unfolding/Matricization
      1.1.3 Tensor Products and Norms
   1.2 Representation Learning via Tensors
      1.2.1 Canonical Polyadic Decomposition (CPD)
      1.2.2 Tucker Decomposition (TuckerD)
      1.2.3 Tensor Train Decomposition (TTD)
   1.3 Model Fitting and Challenges Ahead
      1.3.1 Example: Tensor CPD
      1.3.2 Challenges in Rank Determination
   References

2 Bayesian Learning for Sparsity-Aware Modeling
   2.1 Bayes' Theorem
   2.2 Bayesian Learning and Sparsity-Aware Learning
   2.3 Prior Design for Sparsity-Aware Modeling
   2.4 Inference Algorithm Development
   2.5 Mean-Field Variational Inference
      2.5.1 General Solution
      2.5.2 Tractability of MF-VI
      2.5.3 Definition of MPCEF Model
      2.5.4 Optimal Variational Pdfs for MPCEF Model
   References

3 Bayesian Tensor CPD: Modeling and Inference
   3.1 A Unified Probabilistic Modeling Using GSM Prior
   3.2 PCPD-GG: Probabilistic Modeling
   3.3 PCPD-GH: Probabilistic Modeling
   3.4 PCPD-GH, PCPD-GG: Inference Algorithm
      3.4.1 Optimal Variational Pdfs
      3.4.2 Setting the Hyper-parameters
   3.5 Algorithm Summary and Insights
      3.5.1 Convergence Property
      3.5.2 Automatic Tensor Rank Learning
      3.5.3 Computational Complexity
      3.5.4 Reducing to PCPD-GG
   3.6 Non-parametric Modeling: PCPD-MGP
   3.7 PCPD-MGP: Inference Algorithm
   References

4 Bayesian Tensor CPD: Performance and Real-World Applications
   4.1 Numerical Results on Synthetic Data
      4.1.1 Simulation Setup
      4.1.2 PCPD-GH Versus PCPD-GG
      4.1.3 Comparisons with Non-parametric PCPD-MGP
   4.2 Real-World Applications
      4.2.1 Fluorescence Data Analytics
      4.2.2 Hyperspectral Images Denoising
   References

5 When Stochastic Optimization Meets VI: Scaling Bayesian CPD to Massive Data
   5.1 CPD Problem Reformulation
      5.1.1 Probabilistic Model and Inference for the Reformulated Problem
   5.2 Interpreting VI Update from Natural Gradient Descent Perspective
      5.2.1 Optimal Variational Pdfs in Exponential Family Form
      5.2.2 VI Updates as Natural Gradient Descent
   5.3 Scalable VI Algorithm for Tensor CPD
      5.3.1 Summary of Iterative Algorithm
      5.3.2 Further Discussions
   5.4 Numerical Examples
      5.4.1 Convergence Performance on Synthetic Data
      5.4.2 Tensor Rank Estimation on Synthetic Data
      5.4.3 Video Background Modeling
      5.4.4 Image Feature Extraction
   References

6 Bayesian Tensor CPD with Nonnegative Factors
   6.1 Tensor CPD with Nonnegative Factors
      6.1.1 Motivating Example: Social Group Clustering
      6.1.2 General Problem and Challenges
   6.2 Probabilistic Modeling for CPD with Nonnegative Factors
      6.2.1 Properties of Nonnegative Gaussian-Gamma Prior
      6.2.2 Probabilistic Modeling of CPD with Nonnegative Factors
   6.3 Inference Algorithm for Tensor CPD with Nonnegative Factors
      6.3.1 Derivation for Variational Pdfs
      6.3.2 Summary of the Inference Algorithm
      6.3.3 Discussions and Insights
   6.4 Algorithm Accelerations
   6.5 Numerical Results
      6.5.1 Validation on Synthetic Data
      6.5.2 Fluorescence Data Analysis
      6.5.3 ENRON E-mail Data Mining
   References

7 Complex-Valued CPD, Orthogonality Constraint, and Beyond Gaussian Noises
   7.1 Problem Formulation
   7.2 Probabilistic Modeling
   7.3 Inference Algorithm Development
      7.3.1 Derivation for Q(A^(k)), 1 ≤ k ≤ P
      7.3.2 Derivation for Q(A^(k)), P + 1 ≤ k ≤ N
      7.3.3 Derivation for Q(E)
      7.3.4 Derivations for Q(γ_l), Q(ζ_{i1,...,iN}), and Q(β)
      7.3.5 Summary of the Iterative Algorithm
      7.3.6 Further Discussions
   7.4 Simulation Results and Discussions
      7.4.1 Validation on Synthetic Data
      7.4.2 Blind Data Detection for DS-CDMA Systems
      7.4.3 Linear Image Coding for a Collection of Images
   References

8 Handling Missing Value: A Case Study in Direction-of-Arrival Estimation
   8.1 Linking DOA Subspace Estimation to Tensor Completion
   8.2 Probabilistic Modeling
   8.3 MPCEF Model Checking and Optimal Variational Pdfs Derivations
      8.3.1 MPCEF Model Checking
      8.3.2 Optimal Variational Pdfs Derivations
   8.4 Algorithm Summary and Remarks
   8.5 Simulation Results and Discussions
   References

9 From CPD to Other Tensor Decompositions
   9.1 Tucker Decomposition (TuckerD)
   9.2 Tensor Train Decomposition (TTD)
   9.3 PARAFAC2
   9.4 Tensor-SVD (T-SVD)
   References
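As a concrete companion to the CPD format that anchors much of the book (introduced in Sect. 1.2.1 and used throughout Chaps. 3-7), here is a minimal sketch of what a rank-R CPD represents: the tensor is a sum of R rank-1 outer products of factor-matrix columns. The names `A`, `B`, `C`, and `cpd_reconstruct` are illustrative only, not the book's notation, and the rank here is fixed by hand, whereas the book's Bayesian methods learn it from data.

```python
import numpy as np

def cpd_reconstruct(A, B, C):
    """Rebuild a third-order tensor from CPD factor matrices.

    Each column triple (A[:, r], B[:, r], C[:, r]) contributes one
    rank-1 term; the tensor is the sum of R such outer products.
    """
    # einsum computes sum_r A[i, r] * B[j, r] * C[k, r] for all (i, j, k)
    return np.einsum('ir,jr,kr->ijk', A, B, C)

rng = np.random.default_rng(0)
R = 3  # tensor rank, assumed known here (the quantity Bayesian CPD learns)
A, B, C = (rng.standard_normal((n, R)) for n in (4, 5, 6))

X = cpd_reconstruct(A, B, C)
print(X.shape)  # (4, 5, 6)
```

Rank determination is exactly the difficulty this toy view hides: given only a noisy `X`, choosing R by exhaustive search means refitting for every candidate rank, which is the cost the book's sparsity-aware Bayesian models avoid.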
