action recognition

Feature-Supervised Action Modality Transfer

This paper strives for action recognition and detection in video modalities like RGB, depth maps or 3D-skeleton sequences when only limited modality-specific labeled examples are available. For the RGB, and derived optical-flow, modality many …

Interactivity Proposals for Surveillance Videos

This paper introduces spatio-temporal interactivity proposals for video surveillance. Rather than focusing solely on actions performed by subjects, we explicitly include the objects that the subjects interact with. To enable interactivity proposals, …

Heterogeneous Non-Local Fusion for Multimodal Activity Recognition

In this work, we investigate activity recognition using multimodal inputs from heterogeneous sensors. Activity recognition is commonly tackled from a single-modal perspective using videos. In case multiple signals are used, they come from the …

Shuffled ImageNet-Banks for Video Event Detection and Search

This article aims for the detection and search of events in videos, where video examples are either scarce or even absent during training. To enable such event detection and search, ImageNet concept banks have been shown to be effective. Rather than …

Few-Shot Ensemble Learning for Video Classification with SlowFast Memory Networks

In the era of big data, few-shot learning has recently received much attention in multimedia analysis and computer vision due to its appealing ability of learning from scarce labeled data. However, it has been largely underdeveloped in the video …

Actor-Transformers for Group Activity Recognition

This paper strives to recognize individual actions and group activities from videos. While existing solutions for this challenging problem explicitly model spatial and temporal relationships based on location of individual actors, we propose an …

Searching for Actions on the Hyperbole

In this paper, we introduce hierarchical action search. Starting from the observation that hierarchies are mostly ignored in the action literature, we retrieve not only individual actions but also relevant and related actions, given an action name or …

Dance with Flow: Two-in-One Stream Action Detection

The goal of this paper is to detect the spatio-temporal extent of an action. The two-stream detection network based on RGB and flow provides state-of-the-art accuracy at the expense of a large model-size and heavy computation. We propose to embed RGB …

Timeception for Complex Action Recognition

This paper focuses on the temporal aspect for recognizing human activities in videos; an important visual cue that has long been undervalued. We revisit the conventional definition of activity and restrict it to “Complex Action”: a set of one-actions …

Action Recognition with Dynamic Image Networks

We introduce the concept of "dynamic image", a novel compact representation of videos useful for video analysis, particularly in combination with convolutional neural networks (CNNs). A dynamic image encodes temporal data such as RGB or optical flow …