Vacancies

PhD student | Video action recognition

Cees Snoek

The goal of this work package is to recognize actions in video content. The current state-of-the-art in video recognition exploits ever-larger video training sets to train supervised convolutional neural networks. Recently, several machine learning methods have been explored to minimize the number of examples needed for robust recognition of activities in video, most notably self-supervised learning, weakly-supervised learning, few-shot learning, and zero-shot learning. However, existing data-efficient video representations are pre-trained; they therefore suffer from domain specialization and are unable to adapt dynamically, leading to limited generalization. In this work package we study, develop and benchmark new data-efficient learning algorithms and architectures for spatio-temporal video action recognition that exploit the sensory, semantic and streaming abilities of the video medium in both an offline and an online fashion.
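
To make the few-shot direction concrete, here is a minimal NumPy sketch of prototypical-network classification, one popular few-shot approach: each action class is summarized by the mean of a handful of labeled clip embeddings, and a new clip is assigned to the nearest class mean. All names and the toy embeddings are illustrative, not part of the project.

```python
import numpy as np

def prototypical_classify(support, support_labels, query, n_classes):
    """Few-shot classification in embedding space: each class is
    represented by the mean (prototype) of its support embeddings,
    and a query clip is assigned to the nearest prototype."""
    prototypes = np.stack([
        support[support_labels == c].mean(axis=0) for c in range(n_classes)
    ])
    # Euclidean distance from the query to every class prototype.
    dists = np.linalg.norm(prototypes - query, axis=1)
    return int(np.argmin(dists))

# Toy 2-way 2-shot episode with 3-dimensional clip embeddings.
support = np.array([[1.0, 0.0, 0.0], [0.9, 0.1, 0.0],
                    [0.0, 1.0, 0.0], [0.1, 0.9, 0.0]])
labels = np.array([0, 0, 1, 1])
query = np.array([0.95, 0.05, 0.0])
print(prototypical_classify(support, labels, query, n_classes=2))  # → 0
```

Only a few labeled clips per class are needed at test time; the data-efficiency question is how to learn the embedding that makes such prototypes discriminative.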

Details on the application process will appear here soon.

PhD student | Multi-task multi-modal learning

Cees Snoek

The goal of this work package is to contribute novel algorithms for multi-task and multi-modal learning. Rather than specializing the learning for a single purpose, multi-task networks exploit multiple purposes simultaneously for mutual benefit. Examples are learning depth regression together with semantic and instance segmentation from a single image, or two-stream networks, where one branch learns what the object is while the other branch learns the object's colors. Many multi-task and multi-modal networks are data-hungry, as they need labels per task and per modality. Recently, several machine learning methods have been explored to minimize the number of examples needed for robust multi-task and multi-modal learning, most notably meta-learning, self-supervised learning, and transfer learning. In this work package we study, develop and benchmark new data-efficient algorithms and architectures for multi-task learning that exploit the commonalities and differences between modalities at the sensory, representation and semantic levels.
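
The core mechanism behind such networks, a shared representation feeding several task heads whose losses are combined, can be sketched in a few lines. This is a toy NumPy illustration with linear heads and hand-picked weights, not the project's actual architecture.

```python
import numpy as np

def multi_task_loss(shared_features, heads, targets, weights):
    """Weighted sum of per-task losses computed from one shared
    representation: each task has its own linear head, but all
    heads consume the same trunk features."""
    total = 0.0
    for name, W in heads.items():
        pred = shared_features @ W           # task-specific linear head
        err = pred - targets[name]           # per-task regression error
        total += weights[name] * float(np.mean(err ** 2))
    return total

features = np.array([1.0, 2.0])              # shared trunk output
heads = {"depth": np.array([[0.5], [0.5]]),
         "segmentation": np.array([[1.0], [0.0]])}
targets = {"depth": np.array([1.5]), "segmentation": np.array([1.0])}
weights = {"depth": 1.0, "segmentation": 0.5}
print(multi_task_loss(features, heads, targets, weights))  # → 0.0
```

How to set (or learn) the per-task weights, and which layers to share across tasks and modalities, are exactly the open questions this work package addresses.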

Details on the application process will appear here soon.

PhD student | Video representation and efficiency

Cees Snoek, Efstratios Gavves

The goal of this work package is efficient deep neural networks for spatiotemporal video inputs of different lengths and complexities. The current state-of-the-art relies on heavy computational blocks and cumbersome transfer-learning methodologies that pre-train on ever-larger image datasets, thus limiting applicability to short and not too complex video sequences. In recent years several approaches, mostly for images, have been proposed to improve data efficiency as well as computational efficiency, including depthwise separable convolutions in MobileNet, convolution decomposition, automated neural architecture search and multi-scale feature pyramids. Notably, for most of these approaches it is not straightforward to extend them to video inputs due to the inherent spatiotemporal complexity. In this work package we study, develop and benchmark new data-efficient and computationally efficient neural network models and architectures that are optimal for videos of varying lengths and complexities.
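
The efficiency gain from depthwise separable convolutions is easy to quantify: a standard k×k convolution couples all input and output channels, while the separable variant splits it into a per-channel spatial filter plus a 1×1 channel mixer. A quick parameter-count comparison (layer sizes chosen for illustration):

```python
def conv_params(k, c_in, c_out):
    """Parameter count of a standard k x k convolution."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Depthwise k x k filter per input channel, followed by a
    1 x 1 pointwise convolution that mixes channels."""
    return k * k * c_in + c_in * c_out

# A 3x3 layer with 256 input and 256 output channels.
standard = conv_params(3, 256, 256)             # 589,824 parameters
separable = separable_conv_params(3, 256, 256)  # 67,840 parameters
print(standard, separable, round(standard / separable, 1))  # ~8.7x fewer
```

For video, the analogous factorization of 3-D spatiotemporal kernels is precisely where the straightforward image recipe stops applying and new design work is needed.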

Details on the application process will appear here soon.

PhD student | Hardware-aware learning

Max Welling

The goal of this work package is to automatically optimize neural networks for different edge devices in mobile and autonomous driving platforms. This entails research towards a unified framework for hardware-aware learning: combining quantization schemes for inference and training, hardware-aware neural architecture search, network compression and conditional computing for image, video, language and speech processing. The current state-of-the-art typically treats these topics as completely separate subjects, where most recent work focuses on image classification only, and is either hardware-agnostic, targets GPU processing only, or relies on human-designed heuristics. In this work package we study and develop novel approaches for hardware-aware learning that respect actual hardware constraints, and work towards a unified framework for hardware-aware learning.
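
As a small, concrete instance of one ingredient, quantization, here is a simulated uniform quantize-dequantize pass as commonly used to estimate the accuracy cost of running a model at low integer precision on edge hardware. The function and values are a generic sketch, not the project's scheme.

```python
import numpy as np

def quantize_dequantize(x, n_bits=8):
    """Simulated uniform quantization: map floats onto 2**n_bits
    integer levels spanning [min, max], then map back, so the
    rounding error of integer hardware can be measured in float."""
    levels = 2 ** n_bits - 1
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / levels            # width of one integer step
    q = np.round((x - lo) / scale)        # integer code in [0, levels]
    return q * scale + lo                 # dequantized approximation

weights = np.array([-1.0, -0.3, 0.2, 1.0])
print(quantize_dequantize(weights, n_bits=8))  # close to the originals
```

At 8 bits the round-trip error is at most half a step (here about 0.004); hardware-aware learning asks how far n_bits can drop, per layer and per device, before accuracy breaks.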

Details on the application process will appear here soon.

PhD student | Federated learning

Max Welling

Federated learning deals with distributed learning of models from data across many user devices, while keeping the user data private by only communicating the parameters of the model between the server and the user devices. There are many challenges, both in learning in a distributed setting and in keeping the learning protocol privacy-preserving. For instance: can we learn robustly when the data are not identically distributed over the shards, can we learn in a continuous online manner, can we learn with limited communication bandwidth, can we learn on heterogeneous devices, and can we maintain privacy while learning? While federated systems offer a certain degree of privacy, more sophisticated techniques, such as differential privacy and secure aggregation, are necessary for formal guarantees. Furthermore, such a system must also be robust in the presence of malicious adversaries during training, e.g. via data and model-update poisoning, as well as during inference, e.g. evading the prediction of the model. In this work package we will study and develop novel robust distributed algorithms and techniques that advance the state of the art in federated learning, while focusing on the privacy and safety aspects.
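
The baseline aggregation step that all of these questions build on is federated averaging: the server combines locally trained parameter vectors, weighting each client by its data size, and only parameters ever leave the device. A minimal NumPy sketch (toy clients and sizes, illustrative only):

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """One server-side FedAvg step: aggregate locally trained model
    parameters, weighting each client by its number of examples.
    Raw data never leaves the devices; only parameters are sent."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)
    return (stacked * (sizes / sizes.sum())[:, None]).sum(axis=0)

# Three clients with unequal data: the large client dominates the average.
clients = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
sizes = [10, 10, 80]
print(federated_average(clients, sizes))  # → [0.9 0.9]
```

Note that this plain average already exposes the research questions above: it assumes honest clients (no poisoned updates), full participation, and i.i.d.-enough shards, and it offers no formal privacy guarantee by itself.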

Details on the application process will appear here soon.

PhD student | Combinatorial optimization

Max Welling

Discrete optimization is everywhere: from chip layout design, to neural architecture search, to compiler optimization and optimal quantization. The usual gradient-descent techniques we use for deep learning do not work when discrete variables are involved. Some solutions have been proposed in the past to handle gradients of discrete variables, such as the Gumbel-Softmax method, straight-through estimators, or REINFORCE estimators with variance reduction. However, they suffer from either bias or high variance and are often insufficient at scale or in deep models. Another line of research applies Bayesian optimization or reinforcement learning to combinatorial optimization. In this work package we will study new ways to incorporate and combine discrete combinatorial optimization with deep neural networks. We will study new Bayesian optimization methods, improve RL-based methods, and incorporate classical combinatorial solvers into deep neural architectures.
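
The Gumbel-Softmax trick mentioned above replaces a hard categorical sample with a differentiable relaxation: perturb the logits with Gumbel noise and push them through a temperature-controlled softmax. A short NumPy sketch (logits and temperature chosen for illustration):

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=np.random.default_rng(0)):
    """Differentiable approximation of a one-hot categorical sample:
    add Gumbel(0, 1) noise to the logits, then apply a softmax with
    temperature tau. As tau -> 0 samples approach one-hot (low bias,
    high gradient variance); larger tau smooths them (and vice versa)."""
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / tau
    y = np.exp(y - y.max())               # numerically stable softmax
    return y / y.sum()

sample = gumbel_softmax(np.array([2.0, 0.5, 0.1]), tau=0.5)
print(sample, sample.sum())               # probabilities summing to 1
```

The temperature makes the bias-variance trade-off described in the text explicit, which is one reason such estimators struggle at scale.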

Details on the application process will appear here soon.

PhD student | Unsupervised learning for source compression

Max Welling

Audio, images, and video data can take up a lot of space and bandwidth, so compression of these data is an important problem. The problem of source compression has deep roots in information and probability theory, and has connections to generative modelling, variational Bayesian inference, VAEs, GANs, and representation learning. Recently, deep-learning-based compression methods have started to significantly outperform classical methods on both lossless and lossy compression. The goal of this project is to further improve deep-learning-based lossy and lossless compression methods in terms of their rate/distortion performance and visual quality. This will involve developing new theoretical insights and fundamental innovations as well as serious engineering and experimentation. Specifically, better and faster methods are required for modelling the distribution of latent codes, e.g. using hierarchical latent variable models. Perceptual quality is currently measured by MS-SSIM, but better metrics (learned or fixed) are needed. Bits-back coding currently allows for optimal lossless compression with stochastic encoders, but there is no practical equivalent for lossy compression. Finally, practical deployment of deep-learning-based compression requires bit-exact reproducibility, which requires investigating quantization techniques.
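
The information-theoretic root of all of this is the Shannon entropy of the model: the average number of bits per symbol that any lossless code must pay under a given probability model. A learned latent-variable model improves compression exactly by assigning higher probability (hence shorter codes) to real data. A minimal sketch with toy distributions:

```python
import numpy as np

def ideal_code_length_bits(probs):
    """Shannon entropy H(p) = -sum p log2 p: the lower bound, in bits
    per symbol, on the average length of any lossless code that is
    optimal for the model `probs`."""
    probs = np.asarray(probs, dtype=float)
    return float(-(probs * np.log2(probs)).sum())

uniform = ideal_code_length_bits([0.25, 0.25, 0.25, 0.25])  # 2.0 bits
skewed = ideal_code_length_bits([0.7, 0.1, 0.1, 0.1])
print(uniform, skewed)  # the skewed source codes in fewer bits
```

Better latent-code models, hierarchical or otherwise, lower this bound on real data, which an entropy coder (e.g. the arithmetic coding behind bits-back schemes) then converts into actual file-size savings.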

Details on the application process will appear here soon.