# Research

## Audio inpainting with similarity graphs

I’m very proud to announce the release of a new kind of audio inpainting algorithm. It is able to reconstruct long missing parts of a song by searching through the rest of the content for a suitable replacement.

You can try the algorithm online or download the code here and run it on your machine. A technical report associated with this algorithm is available on arXiv.

Enjoy!

Abstract
In this contribution, we present a method to compensate for long duration data gaps in audio signals, in particular music. To achieve this task, a similarity graph is constructed, based on a short-time Fourier analysis of reliable signal segments, e.g. the uncorrupted remainder of the music piece, and the temporal regions adjacent to the unreliable section of the signal. A suitable candidate segment is then selected through an optimization scheme and smoothly inserted into the gap.
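To give a feel for the idea, here is a toy sketch in Python. It is not the released code and not the exact method of the report: the function names (`stft_magnitudes`, `find_replacement`), the cosine similarity measure, the frame sizes and the exhaustive search are all my assumptions for illustration, and the graph and the optimization scheme are reduced to a simple "best matching context" score. It uses a naive DFT from the standard library to stay self-contained; in practice you would use an FFT library.

```python
import cmath
import math


def stft_magnitudes(signal, frame_len=64, hop=32):
    """Magnitude spectrum of each Hann-windowed frame (naive DFT, for clarity)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = [
            abs(sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(frame_len // 2 + 1)
        ]
        frames.append(spectrum)
    return frames


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if one is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def find_replacement(features, gap_start, gap_len, context=4):
    """Return (start frame, score) of the best replacement for the gap.

    A candidate starting at frame c is scored by how well the frames just
    before c match the frames just before the gap, and how well the frames
    just after the candidate match the frames just after the gap.
    Assumes at least `context` frames exist on both sides of the gap.
    """
    gap_end = gap_start + gap_len
    best_start, best_score = None, -math.inf
    for c in range(context, len(features) - gap_len - context + 1):
        # Skip candidates whose segment or context overlaps the gap itself.
        if c - context < gap_end and c + gap_len + context > gap_start:
            continue
        score = 0.0
        for i in range(context):
            score += cosine_similarity(features[gap_start - context + i],
                                       features[c - context + i])
            score += cosine_similarity(features[gap_end + i],
                                       features[c + gap_len + i])
        if score > best_score:
            best_start, best_score = c, score
    return best_start, best_score


# Toy "song": four tones repeated twice, so the gap has an exact earlier copy.
notes = [4, 8, 12, 16] * 2
signal = [math.sin(2 * math.pi * k * n / 64) for k in notes for n in range(256)]
features = stft_magnitudes(signal)

# Pretend frames 40..47 (the second occurrence of the second tone) are lost.
best_start, best_score = find_replacement(features, gap_start=40, gap_len=8)
print(best_start, best_score)
```

On this toy signal, the search should land on the first occurrence of the same tone, since its neighbours match the neighbours of the gap. A real implementation would also need the smooth insertion step, which this sketch leaves out.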

## Should I use MATLAB or Python during my thesis?

At the beginning of my thesis, I chose to use MATLAB as the main programming language for my simulations. Today I believe it was the wrong choice.

Why did I opt for MATLAB?

Mainly because I was very used to it. Since it’s a quick and easy prototyping language, I was able to test my ideas in no time. The user interface is also really intuitive and makes debugging simple. And, since I was maintaining two MATLAB toolboxes, I had a lot of code that I didn’t want to rewrite in Python. Finally, Python frightened me because its user interface was very bad. So, to master this “almost” new language, I would have had to go through a slow learning process and invest a lot of time.

What changed my mind today?

The IPython notebook is an interface that connects the Python console to a web browser, allowing the user to easily make plots, run cells, add comments, etc. Because of its success, it was extended to other programming languages and developed into a project called Jupyter. You can even use MATLAB with it.

The notebook gave a fresh new start to Python in the scientific community and new toolboxes were ported from MATLAB to Python. For my personal use, the gap in scientific tools between the two languages has been hugely reduced in the last two years.

On the other hand, MATLAB hasn’t been able to deal with its main flaws. It’s still expensive, closed-source, inefficient and complicated to interface with other programming languages. And remember: no one cares if you know MATLAB, whereas mastering Python is a great asset for your CV.

Conclusion

While I believe MATLAB is still a great tool to experiment and play with, I suspect Python may be even better for that. When it comes to seriously implementing something, I believe Python is better. At the beginning of a thesis, PhD students often believe that they need to be productive right away. This is wrong: the first year should be used to understand the fundamentals of the field and to find appropriate work tools. So, if you’re at the beginning of your thesis, I can only recommend you learn Python. I’m quite sure you won’t regret it.

## Where to find datasets?

I do not know of any website or repository gathering all datasets. In this post, I’m just listing a few links pointing to datasets or dataset websites. This list will grow with time.

## A starter kit for Deep Learning

Courses

1. A MOOC from Geoffrey Hinton, one of the fathers of deep learning
https://www.coursera.org/course/neuralnets
2. https://cs231n.github.io/

Blog posts

Selected software
The three main tools are:

1. Theano, the classic guy in Python
http://deeplearning.net/software/theano
Tutorials can be found at
http://deeplearning.net/tutorial/
2. Torch, the other guy in the competition (I started with that one)
http://torch.ch/
Torch has the advantage of being interfaced with Lua. It offers a simple way to create neural nets. Recently, PyTorch has gained a lot of attention:
http://pytorch.org/
3. TensorFlow, the newcomer from Google
http://www.tensorflow.org/get_started/index.html

As a recommendation, I would advise PyTorch or TensorFlow.

MATLAB is not a very appropriate language for deep learning. However, it is interesting to use for learning purposes.

Publications (To be done)
http://research.microsoft.com/pubs/192769/tricks-2012.pdf