Tensors for Data Processing
Theory, Methods, and Applications

Edited by
Yipeng Liu
School of Information and Communication Engineering
University of Electronic Science and Technology of China (UESTC)
Chengdu, China

Academic Press is an imprint of Elsevier
125 London Wall, London EC2Y 5AS, United Kingdom
525 B Street, Suite 1650, San Diego, CA 92101, United States
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom

Copyright © 2022 Elsevier Inc. All rights reserved.

MATLAB® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks does not warrant the accuracy of the text or exercises in this book. This book’s use or discussion of MATLAB® software or related products does not constitute endorsement or sponsorship by The MathWorks of a particular pedagogical approach or particular use of the MATLAB® software.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library

ISBN: 978-0-12-824447-0

For information on all Academic Press publications visit our website at https://www.elsevier.com/books-and-journals

Publisher: Mara Conner
Acquisitions Editor: Tim Pitts
Editorial Project Manager: Charlotte Rowley
Production Project Manager: Prem Kumar Kaliamoorthi
Designer: Miles Hitchen
Typeset by VTeX

Contents

List of contributors
Preface

CHAPTER 1 Tensor decompositions: computations, applications, and challenges
Yingyue Bi, Yingcong Lu, Zhen Long, Ce Zhu, and Yipeng Liu
  1.1 Introduction
    1.1.1 What is a tensor?
    1.1.2 Why do we need tensors?
  1.2 Tensor operations
    1.2.1 Tensor notations
    1.2.2 Matrix operators
    1.2.3 Tensor transformations
    1.2.4 Tensor products
    1.2.5 Structural tensors
    1.2.6 Summary
  1.3 Tensor decompositions
    1.3.1 Tucker decomposition
    1.3.2 Canonical polyadic decomposition
    1.3.3 Block term decomposition
    1.3.4 Tensor singular value decomposition
    1.3.5 Tensor network
  1.4 Tensor processing techniques
  1.5 Challenges
  References

CHAPTER 2 Transform-based tensor singular value decomposition in multidimensional image recovery
Tai-Xiang Jiang, Michael K. Ng, and Xi-Le Zhao
  2.1 Introduction
  2.2 Recent advances of the tensor singular value decomposition
    2.2.1 Preliminaries and basic tensor notations
    2.2.2 The t-SVD framework
    2.2.3 Tensor nuclear norm and tensor recovery
    2.2.4 Extensions
    2.2.5 Summary
  2.3 Transform-based t-SVD
    2.3.1 Linear invertible transform-based t-SVD
    2.3.2 Beyond invertibility and data adaptivity
  2.4 Numerical experiments
    2.4.1 Examples within the t-SVD framework
    2.4.2 Examples of the transform-based t-SVD
  2.5 Conclusions and new guidelines
  References

CHAPTER 3 Partensor
Paris A. Karakasis, Christos Kolomvakis, George Lourakis, George Lykoudis, Ioannis Marios Papagiannakos, Ioanna Siaminou, Christos Tsalidis, and Athanasios P. Liavas
  3.1 Introduction
    3.1.1 Related work
    3.1.2 Notation
  3.2 Tensor decomposition
    3.2.1 Matrix least-squares problems
    3.2.2 Alternating optimization for tensor decomposition
  3.3 Tensor decomposition with missing elements
    3.3.1 Matrix least-squares with missing elements
    3.3.2 Tensor decomposition with missing elements: the unconstrained case
    3.3.3 Tensor decomposition with missing elements: the nonnegative case
    3.3.4 Alternating optimization for tensor decomposition with missing elements
  3.4 Distributed memory implementations
    3.4.1 Some MPI preliminaries
    3.4.2 Variable partitioning and data allocation
    3.4.3 Tensor decomposition
    3.4.4 Tensor decomposition with missing elements
    3.4.5 Some implementation details
  3.5 Numerical experiments
    3.5.1 Tensor decomposition
    3.5.2 Tensor decomposition with missing elements
  3.6 Conclusion
  Acknowledgment
  References

CHAPTER 4 A Riemannian approach to low-rank tensor learning
Hiroyuki Kasai, Pratik Jawanpuria, and Bamdev Mishra
  4.1 Introduction
  4.2 A brief introduction to Riemannian optimization
    4.2.1 Riemannian manifolds
    4.2.2 Riemannian quotient manifolds
  4.3 Riemannian Tucker manifold geometry
    4.3.1 Riemannian metric and quotient manifold structure
    4.3.2 Characterization of the induced spaces
    4.3.3 Linear projectors
    4.3.4 Retraction
    4.3.5 Vector transport
    4.3.6 Computational cost
  4.4 Algorithms for tensor learning problems
    4.4.1 Tensor completion
    4.4.2 General tensor learning
  4.5 Experiments
    4.5.1 Choice of metric
    4.5.2 Low-rank tensor completion
    4.5.3 Low-rank tensor regression
    4.5.4 Multilinear multitask learning
  4.6 Conclusion
  References

CHAPTER 5 Generalized thresholding for low-rank tensor recovery: approaches based on model and learning
Fei Wen, Zhonghao Zhang, and Yipeng Liu
  5.1 Introduction
  5.2 Tensor singular value thresholding
    5.2.1 Proximity operator and generalized thresholding
    5.2.2 Tensor singular value decomposition
    5.2.3 Generalized matrix singular value thresholding
    5.2.4 Generalized tensor singular value thresholding
  5.3 Thresholding based low-rank tensor recovery
    5.3.1 Thresholding algorithms for low-rank tensor recovery
    5.3.2 Generalized thresholding algorithms for low-rank tensor recovery
  5.4 Generalized thresholding algorithms with learning
    5.4.1 Deep unrolling
    5.4.2 Deep plug-and-play
  5.5 Numerical examples
  5.6 Conclusion
  References

CHAPTER 6 Tensor principal component analysis
Pan Zhou, Canyi Lu, and Zhouchen Lin
  6.1 Introduction
  6.2 Notations and preliminaries
    6.2.1 Notations
    6.2.2 Discrete Fourier transform
    6.2.3 T-product
    6.2.4 Summary
  6.3 Tensor PCA for Gaussian-noisy data
    6.3.1 Tensor rank and tensor nuclear norm
    6.3.2 Analysis of tensor PCA on Gaussian-noisy data
    6.3.3 Summary
  6.4 Tensor PCA for sparsely corrupted data
    6.4.1 Robust tensor PCA
    6.4.2 Tensor low-rank representation
    6.4.3 Applications
    6.4.4 Summary
  6.5 Tensor PCA for outlier-corrupted data
    6.5.1 Outlier robust tensor PCA
    6.5.2 The fast OR-TPCA algorithm
    6.5.3 Applications
    6.5.4 Summary
  6.6 Other tensor PCA methods
  6.7 Future work
  6.8 Summary
  References

CHAPTER 7 Tensors for deep learning theory
Yoav Levine, Noam Wies, Or Sharir, Nadav Cohen, and Amnon Shashua
  7.1 Introduction
  7.2 Bounding a function’s expressivity via tensorization
    7.2.1 A measure of capacity for modeling input dependencies
    7.2.2 Bounding correlations with tensor matricization ranks
  7.3 A case study: self-attention networks
    7.3.1 The self-attention mechanism
    7.3.2 Self-attention architecture expressivity questions
    7.3.3 Results on the operation of self-attention
    7.3.4 Bounding the separation rank of self-attention
  7.4 Convolutional and recurrent networks
    7.4.1 The operation of convolutional and recurrent networks
    7.4.2 Addressed architecture expressivity questions
  7.5 Conclusion
  References

CHAPTER 8 Tensor network algorithms for image classification
Cong Chen, Kim Batselier, and Ngai Wong
  8.1 Introduction
  8.2 Background
    8.2.1 Tensor basics
    8.2.2 Tensor decompositions
    8.2.3 Support vector machines
    8.2.4 Logistic regression
  8.3 Tensorial extensions of support vector machine
    8.3.1 Supervised tensor learning
    8.3.2 Support tensor machines
    8.3.3 Higher-rank support tensor machines
    8.3.4 Support Tucker machines
    8.3.5 Support tensor train machines
    8.3.6 Kernelized support tensor train machines
  8.4 Tensorial extension of logistic regression
    8.4.1 Rank-1 logistic regression
    8.4.2 Logistic tensor regression
  8.5 Conclusion
  References

CHAPTER 9 High-performance tensor decompositions for compressing and accelerating deep neural networks
Xiao-Yang Liu, Yiming Fang, Liuqing Yang, Zechu Li, and Anwar Walid
  9.1 Introduction and motivation
  9.2 Deep neural networks
    9.2.1 Notations
    9.2.2 Linear layer
    9.2.3 Fully connected neural networks
    9.2.4 Convolutional neural networks
    9.2.5 Backpropagation
  9.3 Tensor networks and their decompositions
    9.3.1 Tensor networks
    9.3.2 CP tensor decomposition
    9.3.3 Tucker decomposition
    9.3.4 Hierarchical Tucker decomposition
    9.3.5 Tensor train and tensor ring decomposition
    9.3.6 Transform-based tensor decomposition
  9.4 Compressing deep neural networks
    9.4.1 Compressing fully connected layers
    9.4.2 Compressing the convolutional layer via CP decomposition
    9.4.3 Compressing the convolutional layer via Tucker decomposition