The mission of the QUVA lab is to perform world-class research on deep vision: automatically interpreting, with the aid of deep learning, what happens where, when, and why in images and video. Deep learning is a form of machine learning with neural networks, loosely inspired by how neurons in the brain process information (see side bar). Research projects in the lab will focus on learning to recognize objects in images from a single example, personalized event detection and summarization in video, and privacy-preserving deep learning. The research will be published in the best academic venues and secured in patents.
Project 1 CS: Spatiotemporal representations for action recognition.
Automatically recognize actions in video, preferably identifying which action appears when and where, as captured by a mobile phone, learned both from example videos and without example videos.
Project 2 CS: Fine-grained object recognition.
Automatically recognize fine-grained categories with interactive accuracy, using very deep convolutional representations computed from automatically segmented objects and automatically selected features.
Project 3 CS: Personal event detection and recounting.
Automatically detect events in a set of videos with interactive accuracy for the purpose of personal video retrieval and summarization. We strive for a generic representation that covers detection, segmentation, and recounting simultaneously, learned from few examples.
Project 4 CS: Counting.
The goal of this project is to accurately count arbitrary objects in images and video, independent of their apparent size, partial presence, and other practical distractors. Use cases include the Internet of Things and robotics.
Project 5 AS: One shot visual instance search.
Often when searching for something, a user has only one or a few images of the instance being sought, with varying degrees of background knowledge. The goal is to retrieve that specific instance from such minimal examples.
Project 6 AS: Robust Mobile Tracking.
Taking an experimental view of tracking, the objective is to track the target’s position over time given either a starting box in frame 1 or its typed category, with emphasis on long-term, robust tracking.
Project 7 AS: The story of this.
Often when telling a story one is not interested in what happens in general in the video, but in what happens to this instance (a person, a pursued car, a boat participating in a race). The goal is to infer what the target encounters and describe the events that befall it.
Project 8 AS: Statistical machine translation.
The objective of this work package is to automatically generate grammatical descriptions of images that represent the meaning of a single image, based on the annotations resulting from the above projects.
Project 9 MW: Distributed deep learning.
Future applications of deep learning will run on mobile devices and use data from distributed sources. In this project we will develop new distributed deep learning algorithms that improve the efficiency of learning and exploit distributed data sources.
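One common starting point for such algorithms is synchronous data-parallel training, where each worker computes a gradient on its own data shard and a server averages the results into a single update. The sketch below illustrates this pattern on a toy linear least-squares model; the model, shard layout, and step size are illustrative assumptions, not the lab's actual method.

```python
import numpy as np

# Toy linear model y = X @ w with squared loss; the data is split over
# four hypothetical workers to mimic distributed data sources.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(40, 3))
y = X @ true_w

w = np.zeros(3)
shards = np.array_split(np.arange(40), 4)  # equal-size shards per worker

def local_gradient(w, idx):
    """Gradient of the mean squared error on one worker's shard."""
    Xi, yi = X[idx], y[idx]
    return 2 * Xi.T @ (Xi @ w - yi) / len(idx)

for step in range(300):
    grads = [local_gradient(w, idx) for idx in shards]  # computed in parallel
    w -= 0.05 * np.mean(grads, axis=0)                  # server averages and updates
```

With equal-size shards, the averaged gradient equals the full-batch gradient, so this converges exactly like centralized gradient descent; the interesting research questions begin when shards are unbalanced, communication is costly, or workers are asynchronous.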
Project 10 MW: Automated Hyper-parameter Optimization.
Deep neural networks have a very large number of hyper-parameters. In this project we develop new methods to automatically and efficiently determine these hyper-parameters from data.
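A simple baseline that such methods must improve upon is random search: sample hyper-parameter settings, evaluate each on held-out data, and keep the best. The sketch below uses a stand-in objective for validation loss; the search ranges and the quadratic surrogate are illustrative assumptions, not a real network.

```python
import numpy as np

rng = np.random.default_rng(42)

def validation_loss(learning_rate, weight_decay):
    """Stand-in for training a network and measuring held-out loss.
    Minimized near learning_rate=1e-2, weight_decay=1e-4 by construction."""
    return (np.log10(learning_rate) + 2) ** 2 + (np.log10(weight_decay) + 4) ** 2

best = None
for _ in range(100):
    lr = 10 ** rng.uniform(-5, 0)    # sample on a log scale
    wd = 10 ** rng.uniform(-6, -1)
    loss = validation_loss(lr, wd)
    if best is None or loss < best[0]:
        best = (loss, lr, wd)

print(best)  # best (loss, learning_rate, weight_decay) found
```

Sampling on a log scale matters because hyper-parameters like the learning rate vary over orders of magnitude; more sophisticated methods (e.g. Bayesian optimization) replace the blind sampling with a model of the loss surface.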
Project 11 MW: Privacy Preserving Deep Learning.
Training deep neural networks from distributed data sources must take privacy considerations into account. In this project we will develop new distributed and privacy preserving learning algorithms for deep neural networks.
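One common ingredient of privacy-preserving training, in the style of differentially private SGD, is to clip each example's gradient so no single data point can dominate the update, then add Gaussian noise before applying it. The sketch below shows this step in isolation; the clip norm and noise scale are illustrative values, not a calibrated privacy guarantee.

```python
import numpy as np

rng = np.random.default_rng(0)

def privatize(per_example_grads, clip_norm=1.0, noise_scale=0.1):
    """Clip each per-example gradient to clip_norm, average,
    and add Gaussian noise to mask any individual's contribution."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    mean = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_scale * clip_norm / len(clipped), size=mean.shape)
    return mean + noise

# A hypothetical batch of 32 per-example gradients for a 4-parameter model.
grads = [rng.normal(size=4) for _ in range(32)]
g_private = privatize(grads)
```

Clipping bounds each example's influence on the update, which is what lets the added noise translate into a formal privacy guarantee; choosing the noise scale to achieve a target privacy budget is one of the questions this project addresses.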
Project 12 MW: Novel deep learning algorithms.
Peter O’Connor
Successful deep learning on massive datasets, distributed over many CPUs and GPUs, requires dedicated algorithms. In this project we will develop novel deep learning algorithms that process observations online and make effective use of memory and computational resources.