DVU 2022 : Deep Video Understanding Grand Challenge, ACM MM 2022


Link: https://sites.google.com/view/dvuchallenge2022

When	Oct 10, 2022 - Oct 14, 2022
Where	Lisbon
Submission Deadline	Jun 18, 2022
Notification Due	Jul 7, 2022
Final Version Due	Jul 20, 2022

Categories	multimedia understanding, video retrieval, multimodal interaction, movie analysis

Call For Papers

Deep video understanding is a difficult task that requires systems to develop a deep analysis and understanding of the relationships between different entities in a video, to use known information to reason about other, more hidden information, and to populate a knowledge graph (KG) representation with all acquired information. To work on this task, a system should take into consideration all available modalities (speech, image/video, and in some cases text). The aim of this challenge series is to push the limits of multimodal extraction, fusion, and analysis techniques to address the problem of analyzing long-duration videos holistically and extracting useful knowledge that can be used to answer different types of queries. The target knowledge includes both visual and non-visual elements. As video and multimedia data become more popular and more widely used across domains and contexts, the research, approaches, and techniques applied in this Grand Challenge will remain highly relevant in the coming years.

Challenge Overview:

Interested participants are invited to apply their approaches and methods to an extended, novel Deep Video Understanding (DVU) dataset made available by the challenge organizers. The dataset consists of 14 development movies from the 2020-2021 editions of this challenge, released under Creative Commons licenses, and a new set of 10 movies licensed from the KinoLorberEdu platform. Of the 10 new movies, 4 will be added to the 14 development movies, while 6 will serve as the testing data in 2022. The development data includes: original whole videos, segmented scene shots, image examples of main characters and locations, a movie-level KG representation of the relationships between main characters and between characters and key locations, a scene-level KG representation of each scene in a movie (location type, characters, interactions between them, order of interactions, sentiment of the scene, and a short textual summary), and a global shared ontology of locations, relationships (family, social, work), interactions, and sentiments.

The organizers will support evaluation and scoring for a mix of main query types, distributed with the dataset, at the overall movie level and at the individual scene level. Participants may submit results for the movie-level queries, the scene-level queries, or both. Within each category, queries are grouped to allow more flexible submission options (please refer to the dataset webpage for more details):

Example Question types at Overall Movie Level:

Multiple-choice question answering on part of the Knowledge Graph for selected movies.

Possible path analysis between persons / entities of interest in a Knowledge Graph extracted from selected movies.

Fill in the Graph Space - Given a partial graph, systems will be asked to fill in the graph space.
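As an illustration of the path analysis question type above, the following is a minimal sketch that finds a shortest path between two entities in a small knowledge graph using breadth-first search. The triples, the entity and relation names, and the `find_path` helper are all hypothetical; they are not taken from the DVU dataset or its ontology, and the official query and answer formats are defined on the dataset webpage.

```python
from collections import deque

# Hypothetical toy knowledge graph as (subject, relation, object) triples.
# These names are illustrative only and do not come from the DVU ontology.
TRIPLES = [
    ("Alice", "sister_of", "Bob"),
    ("Bob", "works_at", "Bakery"),
    ("Carol", "owns", "Bakery"),
    ("Carol", "friend_of", "Dave"),
]


def build_adjacency(triples):
    """Index triples so every edge can be walked in both directions."""
    adj = {}
    for subj, rel, obj in triples:
        adj.setdefault(subj, []).append((rel, obj))
        adj.setdefault(obj, []).append((f"inverse:{rel}", subj))
    return adj


def find_path(triples, source, target):
    """Return a shortest path as a list of (node, relation, node) hops.

    Returns an empty list if source equals target, and None if the two
    entities are not connected.
    """
    adj = build_adjacency(triples)
    parent = {source: None}  # node -> (previous node, relation used)
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for rel, nxt in adj.get(node, []):
            if nxt not in parent:
                parent[nxt] = (node, rel)
                queue.append(nxt)
    if target not in parent:
        return None
    # Walk back from the target to the source, then reverse.
    hops = []
    node = target
    while parent[node] is not None:
        prev, rel = parent[node]
        hops.append((prev, rel, node))
        node = prev
    hops.reverse()
    return hops


if __name__ == "__main__":
    for hop in find_path(TRIPLES, "Alice", "Dave"):
        print(hop)
```

Treating edges as bidirectional is a design choice of this sketch; a system that needs directed relationships would drop the inverse edges.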

Example Question types at Individual Scene Level:

Find next or previous interaction, given two people, a specific scene, and the interaction between them.

Find a unique scene given a set of interactions and a scene list.

Fill in the Graph Space - Given a partial graph for a scene, systems will be asked to fill in the graph space.

Match selected scenes to a set of scene descriptions written in natural language.

Scene sentiment classification.
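The next or previous interaction question type listed above can be sketched as a lookup over the ordered interactions of a scene. This is a minimal sketch under stated assumptions: the scene annotation format, the names, and the `adjacent_interaction` helper are hypothetical, and the sketch assumes the answer is the neighboring interaction between the same two people, which the challenge description does not confirm.

```python
from typing import List, Optional, Tuple

# Hypothetical scene annotation: interactions in the order they occur,
# each as (subject, interaction, object). Not the official DVU format.
Interaction = Tuple[str, str, str]

SCENE: List[Interaction] = [
    ("Alice", "greets", "Bob"),
    ("Bob", "hugs", "Alice"),
    ("Carol", "talks_to", "Dave"),
    ("Alice", "asks", "Bob"),
]


def adjacent_interaction(
    scene: List[Interaction],
    person_a: str,
    person_b: str,
    interaction: str,
    direction: str = "next",
) -> Optional[Interaction]:
    """Return the interaction right after (or before) the given one.

    Only interactions between the two named people are considered
    (an assumption of this sketch). Returns None if the given
    interaction is not found or has no neighbor in that direction.
    """
    pair = {person_a, person_b}
    # Keep only interactions between the two people, preserving order.
    between = [i for i in scene if {i[0], i[2]} == pair]
    for idx, (_, label, _) in enumerate(between):
        if label == interaction:
            j = idx + 1 if direction == "next" else idx - 1
            return between[j] if 0 <= j < len(between) else None
    return None


if __name__ == "__main__":
    print(adjacent_interaction(SCENE, "Alice", "Bob", "hugs", "next"))
    print(adjacent_interaction(SCENE, "Alice", "Bob", "hugs", "previous"))
```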

A new addition to the 2022 challenge is that, for some queries, systems will be asked to submit a temporal segment from the movie or scene (e.g., using starting/ending timestamps) along with their results, to act as evidence for their answers. This requirement will be evaluated independently of the main scoring method; its objective is to demonstrate whether systems can explain their results and whether they arrive at their answers for the correct reasons.
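The challenge description does not state how evidence segments are scored. As one possible way for participants to sanity-check their own segments during development, the sketch below computes temporal intersection-over-union (IoU) between a submitted segment and a reference segment. The use of IoU, the `temporal_iou` helper, and the representation of segments as `(start, end)` pairs in seconds are all assumptions of this sketch, not the organizers' evaluation method.

```python
from typing import Tuple

Segment = Tuple[float, float]  # (start, end) in seconds; assumed format


def temporal_iou(pred: Segment, ref: Segment) -> float:
    """Intersection-over-union of two time intervals.

    Returns a value in [0, 1]: 1.0 for identical segments and 0.0 for
    segments that do not overlap.
    """
    pred_start, pred_end = pred
    ref_start, ref_end = ref
    if pred_end < pred_start or ref_end < ref_start:
        raise ValueError("segment end must not precede its start")
    # Overlap is the distance between the later start and the earlier end.
    inter = max(0.0, min(pred_end, ref_end) - max(pred_start, ref_start))
    union = (pred_end - pred_start) + (ref_end - ref_start) - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    print(temporal_iou((10.0, 20.0), (15.0, 25.0)))
```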

IMPORTANT DATES
DVU development data release: Available from the link on the main website

Testing dataset release: Available now

Testing queries release: May 17th

Run submissions due to organizers: June 28, 2022

Paper submission deadline: June 18, 2022

Results released back to participants: July 5, 2022

Notification to authors: July 7, 2022

Camera-ready submission: July 20, 2022

ACM Multimedia dates: October 10 - 14, 2022
