The mission of the QUVA lab is to perform world-class research on deep vision. Deep vision strives to automatically interpret, with the aid of deep learning, what happens where, when, and why in images and video. Deep learning is a form of machine learning with neural networks, loosely inspired by how neurons process information in the brain (see side bar). Research projects in the lab will focus on learning to recognize objects in images from a single example, on personalized event detection and summarization in video, and on privacy-preserving deep learning. The research will be published in the best academic venues and secured in patents.
Project 1 CS: Temporal modeling in videos.
This project aims to learn representations that maintain the sequential structure of video, for use in temporal video prediction tasks.
Project 2 CS: Fine-grained object recognition.
Automatically recognize fine-grained categories with interactive accuracy by using very deep convolutional representations computed from automatically segmented objects and automatically selected features.
Project 3 CS: Personal event detection and recounting.
Automatically detect events in a set of videos with interactive accuracy for the purpose of personal video retrieval and summarization. We strive for a generic representation that covers detection, segmentation, and recounting simultaneously, learned from few examples.
Project 4 CS: Counting.
The goal of this project is to accurately count the number of arbitrary objects in an image or video, independent of their apparent size, partial presence, and other practical distractors, for use cases such as the Internet of Things or robotics.
Project 5 AS: Robust Mobile Tracking.
In an experimental view of tracking, the objective is to track the target’s position over time, given either a starting box in the first frame or its typed category, with emphasis on long-term, robust tracking.
Project 6 AS: One shot visual instance search.
Often when searching for something, a user will have available just one or very few images of the instance being searched for, with varying degrees of background knowledge.
Project 7 AS: Statistical machine translation.
The objective of this project is to automatically generate grammatical descriptions that represent the meaning of a single image, based on the annotations resulting from the above projects.
Project 8 AS: The story of this.
Often when telling a story one is not interested in what happens in the video in general, but in what happens to this instance (a person, a car under pursuit, a boat participating in a race). The goal is to infer what the target encounters and to describe the events that occur to it.
Project 9 MW: Distributed deep learning.
Future applications of deep learning will run on mobile devices and use data from distributed sources. In this project we will develop new distributed deep learning algorithms to improve the efficiency of learning and to exploit distributed data sources.
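To make the idea of distributed learning concrete, here is a minimal sketch (an illustration only, not the project's actual algorithm) of synchronous data-parallel learning: two simulated workers each compute a gradient on their own data shard, and the averaged gradient drives a shared parameter update.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth linear model y = x @ w_true, with the data split
# across two simulated "devices" (the shards).
w_true = np.array([2.0, -1.0])
X = rng.normal(size=(200, 2))
y = X @ w_true
shards = [(X[:100], y[:100]), (X[100:], y[100:])]

def local_gradient(Xs, ys, w):
    """Mean-squared-error gradient on one worker's shard."""
    residual = Xs @ w - ys
    return 2.0 * Xs.T @ residual / len(ys)

w = np.zeros(2)   # shared parameters
lr = 0.1

for step in range(100):
    # Each worker computes a gradient locally; the average plays the
    # role of an all-reduce, followed by a shared update.
    grads = [local_gradient(Xs, ys, w) for Xs, ys in shards]
    w -= lr * np.mean(grads, axis=0)

print(np.round(w, 3))  # converges toward w_true = [2, -1]
```

Because the shards partition the data evenly, the averaged gradient here equals the full-batch gradient; asynchronous or communication-efficient variants relax exactly this equivalence.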
Project 10 MW: Automated Hyper-parameter Optimization.
Deep neural networks have a very large number of hyper-parameters. In this project we develop new methods to automatically and efficiently determine these hyper-parameters from data.
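As a concrete baseline for automated hyper-parameter optimization, the sketch below (an illustration, not necessarily the method developed in this project) uses random search: candidate learning rates are sampled log-uniformly, each is scored by a toy training run, and the best-scoring setting is kept.

```python
import numpy as np

rng = np.random.default_rng(1)

def train_and_evaluate(lr):
    """Run a few gradient steps on f(w) = (w - 3)^2 and return the final loss."""
    w = 0.0
    for _ in range(20):
        w -= lr * 2.0 * (w - 3.0)    # gradient of (w - 3)^2
    return (w - 3.0) ** 2

best_lr, best_loss = None, np.inf
for _ in range(30):
    lr = 10 ** rng.uniform(-3, 0)    # sample a learning rate log-uniformly in [1e-3, 1]
    loss = train_and_evaluate(lr)
    if loss < best_loss:
        best_lr, best_loss = lr, loss

print(best_lr, best_loss)
```

Sampling on a log scale matters: learning rates spanning several orders of magnitude are covered evenly, which a uniform grid over [0, 1] would not achieve. Bayesian optimization methods refine this by modeling the loss surface instead of sampling blindly.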
Project 11 MW: Symmetry adapted network architectures.
The training of deep neural networks requires huge datasets, which are expensive to collect. In this project we aim to improve the networks’ data efficiency by building domain-adapted prior knowledge, such as symmetry properties of the data, into the network architectures.
Project 12 MW: New learning rules for deep generative models.
Peter O’Connor
Deep learning on massive datasets, distributed over many CPUs and GPUs, requires dedicated algorithms. In this project we will develop novel deep learning algorithms that process observations online and make effective use of memory and computational resources.