ARC Colloquium: Ravi Kannan (MSR)

Algorithms & Randomness Center (ARC)

Ravi Kannan (MSR)

Monday, March 25, 2019

Klaus 1116E - 11:00 am

 

Title:  A General Algorithm for Unsupervised Learning problems

Abstract:  The following simply-stated geometric problem includes as special cases the core problems of a  number  of  areas  in  Unsupervised  Learning,  including,  Topic  Modeling,  Non-negative  Matrix Factorization, Clustering, Stochastic Block Models and Overalapping Communities Detection:

There is an unknown polytope K  ∈ Rd with k vertices.  We are given n data points, each aperturbation of some point in K.  The problem is to find K, i.e., its vertices (approximately).  [The perturbations are large; indeed, many data points lie outside K.]

Our main result is an algorithm which solves this general problem under two natural assumptions.  Our assumptions are technically different,  but similar in spirit to existing models for the special cases.  We assume separation between the vertices of K  and the existence of “pure” data points whose unperturbed versions are close to the vertices of K.  Notably we do not assume any stochastic  model  of  data.   Our  algorithm  has  better  complexity  than  known  algorithms  for  the special cases when the input matrix A is sparse and k is relatively small compared to n, d.

The algorithm is simply stated, but the proof of correctness is involved.  It draws on tools in Numerical Analysis, especially perturbation of singular spaces of matrices.  Here is a description of our algorithm:  It has k stages; in each stage, it picks a certain random vector u, finds the m largest u · x over data points x and outputs the average of these data points as an approximation to a new vertex of K.

Joint Work with C. Bhattacharyya

 

----------------------------------

Speaker's Webpage

Videos of recent talks are available at: https://smartech.gatech.edu/handle/1853/46836

Click here to subscribe to the seminar email list: arc-colloq@cc.gatech.edu

Event Details

Date/Time:

  • Monday, March 25, 2019
    12:00 pm - 1:00 pm