The mission of the QUVA-lab is to perform world-class research on deep vision. Such vision strives to automatically interpret with the aid of deep learning what happens where, when and why in images and video. Deep learning is a form of machine learning with neural networks, loosely inspired by how neurons process information in the brain. Research projects in the lab focus on learning to recognize objects in images from a single example, personalized event detection and summarization in video, and privacy preserving deep learning. The research is published in the best academic venues and secured in patents.
This project aims to learn representations that preserve the sequential structure of video, for use in temporal video prediction tasks.
The goal is to automatically recognize fine-grained categories with interactive accuracy, using very deep convolutional representations computed from automatically segmented objects and automatically selected features.
Automatically detect events in a set of videos with interactive accuracy for the purpose of personal video retrieval and summarization. We strive for a generic representation that covers detection, segmentation, and recounting simultaneously, learned from few examples.
The goal of this project is to accurately count the number of arbitrary objects in images and video, independent of their apparent size, their partial presence, and other practical distractors, for use cases such as the Internet of Things or robotics.
In an experimental view of tracking, the objective is to track the target’s position over time given either a starting box in frame 1 or its typed category, with emphasis on long-term, robust tracking.
Often when searching for something, a user will have available just one or very few images of the search instance, with varying degrees of background knowledge.
The objective of this work package is to automatically generate grammatical descriptions of images that represent the meaning of a single image, based on the annotations resulting from the above projects.
Often when telling a story one is not interested in what happens in general in the video, but in what happens to this instance (a person, a pursued car, a boat participating in a race). The goal is to infer what the target encounters and describe the events that occur to it.
Future applications of deep learning will run on mobile devices and use data from distributed sources. In this project we will develop new distributed deep learning algorithms that improve the efficiency of learning and exploit distributed data sources.
Deep neural networks have a very large number of hyperparameters. In this project we develop new methods to automatically and efficiently determine these hyperparameters from data for deep neural networks.
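To illustrate what data-driven hyperparameter determination involves, here is a minimal sketch of random search over a learning rate and a layer width. The `validation_score` function is a hypothetical stand-in for training a network and evaluating it; the lab's actual methods are not specified here.

```python
import random

def validation_score(lr, width):
    # Toy stand-in for training a network with these hyperparameters
    # and measuring validation accuracy; a real project would train
    # and evaluate a model here.
    return -(lr - 0.01) ** 2 - 1e-6 * (width - 64) ** 2

def random_search(n_trials=50, seed=0):
    """Sample hyperparameter settings at random, evaluate each one,
    and keep the best-scoring trial."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        lr = 10 ** rng.uniform(-4, -1)   # log-uniform learning rate
        width = rng.randrange(16, 257)   # hidden-layer width
        score = validation_score(lr, width)
        if best is None or score > best[0]:
            best = (score, lr, width)
    return best

best_score, best_lr, best_width = random_search()
```

Sampling the learning rate log-uniformly is a common choice because its useful values span several orders of magnitude.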
The training process of deep neural networks requires huge datasets, which are expensive to collect. In this project we aim to improve the networks' data efficiency by encoding domain-adapted prior knowledge, such as symmetry properties of the data, into the network architectures.
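As a toy illustration of building a symmetry prior into an architecture, the sketch below makes a linear feature layer invariant to horizontal image flips by construction, by averaging the responses on an image and its mirror image. This is a simplified assumption for illustration; practical architectures encode such priors with more sophisticated equivariant layers.

```python
import numpy as np

def flip_invariant_features(x, w):
    """Apply a linear filter to an image and to its horizontally
    mirrored copy, then average the two responses. The result is
    invariant to horizontal flips by construction, so the network
    never has to learn that symmetry from augmented data."""
    resp = x @ w                    # response on the original image
    resp_mirror = x[:, ::-1] @ w    # response on the mirrored image
    return 0.5 * (resp + resp_mirror)
```

Because the layer averages over the two flip states, feeding it `x` or its mirror produces identical features, which is exactly the invariance the prior encodes.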
Successful deep learning on massive datasets distributed over many CPUs and GPUs requires dedicated algorithms. In this project we will develop novel deep learning algorithms that process observations online and make effective use of memory and computational resources.