Supervision: Benjamin Ricaud

Project type: Master thesis

Finished

This project may be carried out as a company internship for a master's student in summer 2021, instead of as a master project.

With the current data deluge, companies face difficult challenges in finding the right information among terabytes of data. For example, distributed sensors delivering measured values at a high temporal sampling rate require fast and efficient processing. The data has to be analyzed in a streaming fashion, with low latency. Current methods are quickly overwhelmed by the amount of data, and new methods have to be designed.

In many applications, only a small part of the signal is of interest. This is the case when the purpose of the sensors is to detect anomalies or deviations from normal activity. We can take advantage of the sparsity and rarity of the anomalous patterns to dramatically reduce the processing load. A three-step approach can be adopted, comprising 1) preprocessing with data reduction and filtering, 2) feature extraction and 3) machine learning classification. The first step is crucial, as it reduces the volume of data enough to allow more elaborate algorithms in step 3) and hence potentially better classification of threats and alarms.
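The three steps above can be illustrated with a minimal sketch in Python. It is only an illustration under stated assumptions, not the method used in the project: the signal is synthetic, the data reduction is a simple energy gate on fixed windows, the features are three basic statistics, and a nearest-centroid rule stands in for the machine learning classifier. All names (`split_windows`, `reduce_windows`, `extract_features`, `nearest_centroid`), the threshold and the centroid values are hypothetical.

```python
import math
import random
import statistics

WINDOW = 50             # samples per window (assumed value)
ENERGY_THRESHOLD = 0.5  # hypothetical gate, well above the noise energy

# Hypothetical class centroids in feature space (mean, std, peak-to-peak).
# In practice these would be learned from labelled data.
CENTROIDS = {
    "burst": (0.0, 3.0, 6.0),
    "offset": (2.0, 0.1, 0.5),
}


def split_windows(signal, size):
    """Cut the signal into consecutive non-overlapping windows of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


def reduce_windows(windows, threshold):
    """Step 1: keep only (index, window) pairs whose mean energy exceeds the threshold."""
    kept = []
    for idx, window in enumerate(windows):
        energy = sum(x * x for x in window) / len(window)
        if energy > threshold:
            kept.append((idx, window))
    return kept


def extract_features(window):
    """Step 2: summarise a window by its mean, standard deviation and peak-to-peak range."""
    return (
        statistics.fmean(window),
        statistics.pstdev(window),
        max(window) - min(window),
    )


def nearest_centroid(features, centroids):
    """Step 3: return the label of the closest centroid (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))


# Synthetic signal: low-amplitude noise with two injected anomalies.
rng = random.Random(0)
signal = [rng.gauss(0.0, 0.1) for _ in range(10 * WINDOW)]
for i in range(3 * WINDOW, 4 * WINDOW):   # oscillating burst in window 3
    signal[i] += 3.0 if i % 2 == 0 else -3.0
for i in range(7 * WINDOW, 8 * WINDOW):   # constant offset in window 7
    signal[i] += 2.0

windows = split_windows(signal, WINDOW)
kept = reduce_windows(windows, ENERGY_THRESHOLD)
labels = {idx: nearest_centroid(extract_features(w), CENTROIDS) for idx, w in kept}
print(labels)
```

On this synthetic signal, only the two windows containing the injected anomalies pass the energy gate, so feature extraction and classification run on 2 of the 10 windows. This is the point of the first step: the later, more expensive stages see only a small fraction of the data.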

In this master project, co-supervised by the LTS2 and a Swiss company, the student will explore signal processing methods and machine learning algorithms that would enable the detection of abnormal activity among the terabytes of data already collected. The student will combine them in a pipeline of data reduction, feature extraction and supervised or unsupervised learning steps to capture key information.

Main objectives:

  • Understand the measurement principle and the application in order to design efficient data reduction methods.

  • Review and test different data processing and feature extraction methods.

  • Classify anomalies using machine learning and identify the most promising combinations of signal processing and machine learning models.

The student must have a good knowledge of signal processing. Previous projects or experience in the analysis and filtering of time series and/or sensor signals, and in machine learning, would be a plus. A basic knowledge of machine learning is required, and the student is expected to learn advanced ML techniques during the project.