REACT2024 Challenge, in conjunction with FG 2024, ITU Campus, Istanbul, Turkey, May 27-31, 2024
Conference website: https://sites.google.com/cam.ac.uk/react2024
Submission link: https://easychair.org/conferences/?conf=react2024
In dyadic interactions, humans communicate their intentions and state of mind using verbal and non-verbal cues, and multiple different facial reactions might be appropriate in response to a specific speaker behaviour. Developing a machine learning (ML) model that can automatically generate multiple appropriate, diverse, realistic and synchronised human facial reactions from a previously unseen speaker behaviour is therefore a challenging task. Following the successful organisation of the first REACT challenge (REACT2023), we propose the second REACT Challenge, focusing on developing generative models that can automatically output multiple appropriate, diverse, realistic and synchronised facial reactions under both online and offline settings. Different from the first edition, the second REACT Challenge encourages participants to generate realistic images and video clips as part of their submission.

Participants will develop and benchmark ML models that generate appropriate facial reactions given an input stimulus under various dyadic video conference settings, using two state-of-the-art datasets, namely NoXi and RECOLA. As part of the challenge, we will provide participants with the REACT Challenge Dataset, a compilation of NoXi and RECOLA recordings segmented into 30-second interaction video clips (pairs of videos), together with baseline PyTorch code (including a well-developed dataloader). We will then invite the participating groups to submit their developed / trained ML models for evaluation; these will be benchmarked in terms of the appropriateness, diversity, realism and synchrony of the generated facial reactions.
Important Dates
- Launching Challenge website and call for participation poster: November 2, 2023
- Registration open: November 5, 2023
- Training and validation sets released: November 14, 2023
- Baseline paper and code released: January 10, 2024 (updated from December 31, 2023)
- Test set released: March 1, 2024
- Final result and model submission: March 15, 2024
- Paper submission deadline: March 29, 2024
- Paper acceptance notification: April 5, 2024
- Camera-ready paper submission deadline: April 11, 2024
The First Edition (REACT23 @ ACM-MM)
The first edition of the REACT challenge was held in conjunction with ACM Multimedia (ACM-MM) 2023 in Ottawa, Canada.
As a result of the first edition, we released the baseline code in this GitHub repository and the corresponding paper. The call for participation attracted registrations from 11 teams from 6 countries, with 10 teams participating in each of the Offline and Online sub-challenges. The top 3 teams successfully submitted valid models, results and papers for the challenge, with each paper submission being assigned two reviewers.
Information about the previous edition can be found on this website.
Challenge Tasks
Given the spatio-temporal behaviours expressed by a speaker over a given time period, the REACT 2024 Challenge consists of the following two sub-challenges, whose theoretical underpinnings are defined and detailed in this paper.
Task 1 - Offline Appropriate Facial Reaction Generation
This task aims to develop a machine learning model that takes the entire speaker behaviour sequence as input and generates multiple appropriate and realistic / naturalistic spatio-temporal facial reactions, each consisting of action units (AUs), facial expressions, and valence and arousal states representing the predicted facial reaction. Multiple facial reactions must therefore be generated for each input speaker behaviour.
Task 2 - Online Appropriate Facial Reaction Generation
This task aims to develop a machine learning model that generates each facial reaction frame based only on the speaker behaviour frames observed so far, rather than taking the entire sequence into consideration. The model is expected to gradually generate all facial reaction frames, forming multiple appropriate and realistic / naturalistic spatio-temporal facial reactions, each consisting of action units (AUs), facial expressions, and valence and arousal states representing the predicted facial reaction. Multiple facial reactions must therefore be generated for each input speaker behaviour.
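The difference between the two sub-challenges can be illustrated with a minimal sketch. Note that the function names, feature dimension and frame rate below are illustrative assumptions for this sketch, not the official challenge API; the official baseline and dataloader are released separately in PyTorch.

```python
# Illustrative sketch of the two sub-challenge interfaces (not the official API).
from typing import List

FEATURE_DIM = 25  # assumed per-frame reaction vector: AUs + expressions + valence + arousal

def generate_offline(speaker_frames: List[List[float]], num_reactions: int = 2) -> List[List[List[float]]]:
    """Task 1 (offline): the whole speaker sequence is available before generation."""
    length = len(speaker_frames)
    # Placeholder model: emit zero-valued reaction frames of the right shape.
    return [[[0.0] * FEATURE_DIM for _ in range(length)] for _ in range(num_reactions)]

def generate_online(speaker_frames: List[List[float]], num_reactions: int = 2) -> List[List[List[float]]]:
    """Task 2 (online): reaction frame t may only depend on speaker frames 0..t."""
    reactions: List[List[List[float]]] = [[] for _ in range(num_reactions)]
    history: List[List[float]] = []
    for frame in speaker_frames:
        history.append(frame)  # only past and current speaker frames are visible
        for reaction in reactions:
            reaction.append([0.0] * FEATURE_DIM)  # placeholder reaction frame
    return reactions

speaker = [[0.0] * FEATURE_DIM for _ in range(750)]  # a 30 s clip at an assumed 25 fps
offline = generate_offline(speaker)
online = generate_online(speaker)
assert len(offline) == len(online) == 2      # multiple reactions per speaker behaviour
assert len(offline[0]) == len(online[0]) == 750  # one reaction frame per speaker frame
```

The key constraint captured here is causality: the online generator consumes speaker frames one at a time and may never look ahead, while the offline generator sees the full 30-second sequence at once.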
Challenge Datasets
The second REACT challenge relies on two video conference corpora: RECOLA [3] and NoXi [4]. Specifically, we first segmented each audio-video recording in the two datasets into 30-second clips. We then cleaned the data by selecting only the dyadic interactions with complete recordings of both conversational partners (where both faces were within the frame of the camera). This resulted in 5919 clips of 30 seconds each (71.8 hours of audio-video clips): 5870 clips (49 hours) from the NoXi dataset and 54 clips (0.4 hours) from the RECOLA dataset. We divided the data into training, validation and test sets using a subject-independent strategy (i.e., the same subject was never included in both the training and test sets).
- The REmote COLlaborative and Affective interactions (RECOLA) database contains 9.5 hours of audio, visual, and physiological recordings of online dyadic interactions between 46 French-speaking participants collaborating on a task.
- NoXi (NOvice eXpert Interaction) is a database of screen-mediated, face-to-face dyadic interactions recorded during an information retrieval task covering multiple languages, multiple topics, and the occurrence of unexpected situations.
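The subject-independent splitting strategy described above can be sketched as follows. The clip metadata format (a `(subject, clip)` pair) is a hypothetical assumption for illustration; the released REACT Challenge Dataset provides its own split files.

```python
# Illustrative sketch of a subject-independent train/test split:
# whole subjects, not individual clips, are assigned to one side of the split.
import random
from typing import List, Tuple

def subject_independent_split(
    clips: List[Tuple[str, str]], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
    subjects = sorted({subject for subject, _ in clips})
    rng = random.Random(seed)
    rng.shuffle(subjects)
    n_test = max(1, int(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train = [c for c in clips if c[0] not in test_subjects]
    test = [c for c in clips if c[0] in test_subjects]
    return train, test

# Toy metadata: 10 subjects with 10 clips each (hypothetical).
clips = [(f"subj{i % 10}", f"clip{i}") for i in range(100)]
train, test = subject_independent_split(clips)
# No subject appears in both sets, by construction:
assert {s for s, _ in train}.isdisjoint({s for s, _ in test})
```

Splitting by subject rather than by clip prevents the model from memorising a person's idiosyncratic reactions and inflating test scores.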
Organizers
- Dr Micol Spitale*, Politecnico di Milano, Italy & University of Cambridge, Cambridge, United Kingdom
- Dr Siyang Song*, University of Leicester & University of Cambridge, United Kingdom
- Cheng Luo, Monash University, Australia
- Cristina Palmero, Universitat de Barcelona, Barcelona, Spain
- German Barquero, Universitat de Barcelona, Barcelona, Spain
- Prof Sergio Escalera, Universitat de Barcelona, Barcelona, Spain
- Prof Michel Valstar, University of Nottingham, Nottingham, United Kingdom
- Dr Tobias Baur, University of Augsburg, Augsburg, Germany
- Dr Fabien Ringeval, Université Grenoble Alpes, Grenoble, France
- Prof Elisabeth André, University of Augsburg, Augsburg, Germany
- Prof Hatice Gunes, University of Cambridge, Cambridge, United Kingdom
Contact Us
Feel free to contact us at this email: reactmultimodalchallenge@gmail.com