Algorithms & Randomness Center (ARC) and Indo-US Virtual Center Seminar
Pravesh Kothari (CMU)
Monday, June 8, 2020
Virtual via Bluejeans - 11:30 am
Title: Outlier-robust Clustering of Gaussian Mixtures
Abstract: We give efficient algorithms for robustly clustering of mixtures of "reasonable" distributions, including the well-known open problem of robustly clustering a mixture of arbitrary Gaussians. Specifically, we give an outlier-robust efficient algorithm for clustering a mixture of k Gaussians with pairwise TV distance 1-exp(k^k/\eta). The running time of our algorithm is d^{(k/\eta)^{O(k)}}. More generally, our algorithm succeeds for mixtures of distributions that satisfy two well-studied analytic assumptions - certifiable hypercontractivity and anti-concentration. Thus, it extends to clustering mixtures of arbitrary affine transforms of the uniform distribution on the d-dimensional unit sphere. Even the information-theoretic clusterability of distributions satisfying our analytic assumptions was not known and is likely to be of independent interest. Our techniques expand the sum-of-squares toolkit to show robust certifiability of TV-separated Gaussian clusters in data. This involves a low-degree sum-of-squares proof of statements that relate parameter distance to total variation distance simply relying on hypercontractivity and anti-concentration.
It remains open to improve the running time of the algorithms and to give a robust parameter estimation algorithm for Gaussian mixtures with no separation assumptions.
Based on joint work with Ainesh Bakshi (CMU).
----------------------------------
Videos of recent talks are available at: https://smartech.gatech.edu/handle/1853/46836
Click here to subscribe to the seminar email list: arc-colloq@Klauscc.gatech.edu