
Coresets for k-Means and k-Median Clustering and their Applications PDF

50 Pages·2012·0.62 MB·English
by Sariel Har-Peled and Soham Mazumdar

Preview Coresets for k-Means and k-Median Clustering and their Applications

Coresets for k-Means and k-Median Clustering and their Applications
Sariel Har-Peled, Soham Mazumdar (UIUC)

Problem: k-median clustering
Input: P - set of n points in $\mathbb{R}^d$. k - number of clusters.
Compute: $C = \{c_1, \ldots, c_k\} \subseteq \mathbb{R}^d$ - centers.
Minimize price: $\nu_{\mathrm{opt}}(P, k) = \min_{C \subseteq \mathbb{R}^d,\ |C| = k} \sum_{p \in P} \mathrm{dist}(p, C)$, where $\mathrm{dist}(p, C) = \min_i \|p - c_i\|$.
Advantages: Less sensitive to noise. Theoretically nice.

(1 + ε)-approx k-Median
Motivated by [Arora (1998)] - Approx. TSP.
[Arora et al. (1998)]: $O\left(n^{O(1/\varepsilon)+1}\right)$
[Kolliopoulos and Rao (1999)]: $O(\varrho \cdot n \log n \log k)$ (Discrete), where $\varrho = \exp\left[O\left((1 + \log 1/\varepsilon)/\varepsilon\right)^{d-1}\right]$
Our result: $O\left(n + \varrho\, k^{O(1)} \log^{O(1)} n\right)$

(1 + ε)-approx k-Median, High Dimension
[Bădoiu et al. (2002)]: $2^{(k/\varepsilon)^{O(1)}}\, d^{O(1)}\, n \log^{O(k)} n$

k-means clustering
Input: P - set of n points in $\mathbb{R}^d$. k - number of clusters.
Compute: $C = \{c_1, \ldots, c_k\} \subseteq \mathbb{R}^d$ - centers.
Price: $\nu_{\mathrm{opt}}(P, k) = \min_{C \subseteq \mathbb{R}^d,\ |C| = k} \sum_{p \in P} \left(\mathrm{dist}(p, C)\right)^2$, where $\mathrm{dist}(p, C) = \min_i \|p - c_i\|$.
Advantages: Less sensitive to noise. Efficient heuristic: Lloyd's method.
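To make the two price definitions concrete, here is a minimal Python sketch that evaluates the k-median and k-means price of a point set P for one fixed set of centers C. It does not compute the optimum over all center sets, which is what the slides' price function minimizes. The function names (dist_to_centers, kmedian_price, kmeans_price) and the toy data are assumptions made for illustration, not part of the slides.

```python
import math
from typing import Sequence

Point = Sequence[float]


def dist_to_centers(p: Point, centers: Sequence[Point]) -> float:
    """dist(p, C): Euclidean distance from p to its nearest center."""
    return min(math.dist(p, c) for c in centers)


def kmedian_price(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """Sum of distances to the nearest center (k-median price for this C)."""
    return sum(dist_to_centers(p, centers) for p in points)


def kmeans_price(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """Sum of squared distances to the nearest center (k-means price for this C)."""
    return sum(dist_to_centers(p, centers) ** 2 for p in points)


# Hypothetical toy input: four points in the plane, k = 2 centers.
P = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 4.0)]
C = [(0.0, 1.0), (10.0, 2.0)]

print(kmedian_price(P, C))  # distances 1 + 1 + 2 + 2
print(kmeans_price(P, C))   # squared distances 1 + 1 + 4 + 4
```

The only difference between the two objectives is the square on the distance, so the k-means price weights far-away points more heavily than the k-median price does.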

Description:
Motivated by [Arora (1998)] - Approx. TSP. [Arora et al. (1998)]: $O\left(n^{O(1/\varepsilon)+1}\right)$.