
The Nonnegative Matrix Factorization: a tutorial. - Amy N. Langville PDF

64 Pages·2007·2.26 MB·English
by Langville, Amy

Preview The Nonnegative Matrix Factorization: a tutorial. - Amy N. Langville

The Nonnegative Matrix Factorization: a tutorial

Barbara Ball, C. of Charleston, Mathematics Dept.
Atina Brooks, N.C. State U., Statistics Dept.
Amy Langville, C. of Charleston, Mathematics Dept.

NISS NMF Workshop, February 23–24, 2007

Outline
• Two Factorizations:
  - Singular Value Decomposition
  - Nonnegative Matrix Factorization
• Why factor anyway?
• Computing the NMF
  - Early Algorithms
  - Recent Algorithms
• Extensions of NMF

Data Matrix
$A_{m \times n}$ with rank $r$

Examples
• term-by-document matrix
• feature-by-item matrix
• pixel intensity-by-image matrix
• user-by-purchase matrix
• gene-by-DNA microarray matrix
• terrorist-by-action matrix

SVD
$A = U \Sigma V^T = \sum_{i=1}^{r} \sigma_i u_i v_i^T$

What is the SVD?
[Figure: The SVD; annotation: decreasing importance]

Low Rank Approximation
use $A_k = \sum_{i=1}^{k} \sigma_i u_i v_i^T$ in place of $A$

[Figure: SVD Rank Reduction]

Why use Low Rank Approximation?
• Data Compression and Storage when $k \ll r$
• Remove noise and uncertainty
  ⇒ improved performance on data mining task of retrieval (e.g., find similar items)
  ⇒ improved performance on data mining task of clustering

Properties of SVD
• basis vectors $u_i$ and $v_i$ are orthogonal
• $u_{ij}$, $v_{ij}$ are mixed in sign
  $A_k = U_k \Sigma_k V_k^T$, labeled on the slide as: $A_k$ nonneg, $U_k$ mixed, $\Sigma_k$ nonneg, $V_k^T$ mixed
• $U$, $V$ are dense
• uniqueness: while there are many SVD algorithms, they all create the same (truncated) factorization
• optimality: of all rank-$k$ approximations, $A_k$ is optimal
  $\|A - A_k\|_F = \min_{\operatorname{rank}(B) \le k} \|A - B\|_F$

Summary of Truncated SVD

Strengths
• using $A_k$ in place of $A$ gives improved performance
• noise reduction isolates essential components of matrix
• best rank-$k$ approximation
• $A_k$ is unique

Weaknesses
• storage: $U_k$ and $V_k$ are usually completely dense
• interpretation of basis vectors is difficult due to mixed signs
• good truncation point $k$ is hard to determine
• orthogonality restriction

[Figure: plot of sigma (vertical axis, 0 to 8) over the range 0 to 120 (horizontal axis), marked k=28]
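The truncated SVD properties described above can be checked numerically. The following is a minimal sketch, not code from the tutorial: it assumes NumPy, a small random nonnegative matrix as a stand-in for a real data matrix, and a hypothetical helper named `truncated_svd`. It builds $A_k$ from the first $k$ singular triplets, then checks that the basis vectors are orthonormal, that they contain mixed signs, and that the Frobenius error equals the square root of the sum of the squared discarded singular values (which is what the optimality property implies).

```python
import numpy as np


def truncated_svd(A, k):
    """Hypothetical helper: return U_k, s_k, Vt_k for the rank-k truncated SVD of A."""
    # full_matrices=False gives the "thin" SVD; singular values come back
    # sorted in decreasing order, so the first k are the most important.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


# Assumed example data: a 20 x 12 matrix with entries in [0, 1),
# standing in for e.g. a term-by-document matrix.
rng = np.random.default_rng(0)
A = rng.random((20, 12))
k = 3

U_k, s_k, Vt_k = truncated_svd(A, k)
A_k = U_k @ np.diag(s_k) @ Vt_k  # rank-k approximation of A

# Optimality: the Frobenius error of A_k equals the norm of the
# discarded singular values sigma_{k+1}, ..., sigma_r.
s_all = np.linalg.svd(A, compute_uv=False)
err = np.linalg.norm(A - A_k, "fro")
tail = np.sqrt(np.sum(s_all[k:] ** 2))

# Any other rank-k matrix B should do no better than A_k.
B = rng.random((20, k)) @ rng.random((k, 12))
err_B = np.linalg.norm(A - B, "fro")

# Orthogonality and mixed signs of the basis vectors.
is_orthonormal = np.allclose(U_k.T @ U_k, np.eye(k))
has_mixed_signs = bool((U_k < 0).any() and (U_k > 0).any())

print(f"||A - A_k||_F = {err:.6f}, tail norm = {tail:.6f}")
print(f"||A - B||_F   = {err_B:.6f}")
print(f"orthonormal columns: {is_orthonormal}, mixed signs: {has_mixed_signs}")
```

Even though `A` is entrywise nonnegative here, `U_k` contains both positive and negative entries, which illustrates the interpretation weakness listed above and motivates a factorization with nonnegative factors.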

Description:
Nonnegative Matrix Factorization: a tutorial. Barbara Ball, Atina Brooks, Amy Langville.
