In this project, we study a new technique for distance learning on a graph. Traditional distance learning minimizes the distance between samples of same label and at the same time keeps distance between samples of different label as big as possible. This lead to complex, usually non-convex algorithm. We reverse the problem by looking at it in a new point of view. We will optimize the distance such that it will produce a good graph. To do so, we assume that the labels are smooth on final graph in the Dirichlet sense. Mathematically we express it at y^tLy has to be small with respect of y the labels and L the graph Laplacian.

The problem is convex, however, it is not trivial to write an efficient implementation of it. Moreover the student will have to try different kernels than exponential. Finally, we will test the algorithm on datasets like USPS, MNIST,…

Depending on the quality of the work and the results, this could lead to a conference paper.

On my blog