EmotioNet Challenge

Timeline

Training Phase: Dec 2019 - Feb 2020
Validation Phase: Feb 3 - March 25 2020
Testing Phase: March 30 - April 2 2020

Introduction

The EmotioNet Challenge 2020 (ENC-2020) evaluates the ability of computer vision algorithms to automatically analyze a large number of "in the wild" images for facial expressions.

Research groups that have designed or are developing algorithms for the analysis of facial expressions are encouraged to participate in this challenge.

What's New?
Compared to the 2018 challenge, the ENC-2020 has:

1. 11 more AUs. AUs 10, 15, 18, 24, 28, 51, 52, 53, 54, 55, and 56 have been added to the challenge, for a total of 23 AUs. These AUs will be manually annotated in the optimization, validation, and testing sets.

2. A chance to submit and speak at a CVPR workshop. The top 3 participants will be invited to submit a CVPR workshop paper and present their method at the workshop.

3. A single task: recognition of AUs in the wild.

How to Submit

http://cbcsl.ece.ohio-state.edu/EmotionNetChallenge/form.html

Challenge registration only provides access to the evaluation server; it does not include access to the EmotioNet training or optimization sets. For data access, please submit a data access form at:
http://cbcsl.ece.ohio-state.edu/dbform_emotionet.html

AU in the Wild

This track requires the identification of 23 action units (AUs). The AUs included in the challenge are: 1, 2, 4, 5, 6, 9, 10, 12, 15, 17, 18, 20, 24, 25, 26, 28, 43, 51, 52, 53, 54, 55, 56.

Training data: The EmotioNet database includes 950,000 images with annotated AUs. These were annotated with the algorithm described in [1]. You can train your system using this set. You can also use any other annotated dataset you think appropriate. This dataset has been used to successfully train a variety of classifiers, including several deep networks.

Optimization data: We also include 25K (24,600, to be precise) images with manually annotated AUs. You may want to use this set to check how well your algorithm works or to optimize its parameters.

Verification phase: Participants will have access to a server where they can test their algorithm. Participants will receive a unique access code. Each participant will be able to test their algorithm twice. Comparative results against those of the 2017 EmotioNet Challenge [4] will be provided.

Challenge phase: Participants will be able to connect to the server one final time to complete the final test. The test dataset used in this phase will be different than the one in the verification phase. The results of this phase will be used to compute the final scores of the challenge.

Evaluation: Identification accuracy of AUs will be measured using two criteria – accuracy and F-scores. Algorithms will then be classified from first (best) to last based on an ordinal ranking of their performance on the mean of the recognition of all AUs. Formally, these criteria are defined as follows.

Accuracy is a measurement of closeness to the true value. We will compute recognition accuracy of each AU, i.e., $accuracy_i$ corresponds to the accuracy of your algorithm in identifying that an image has AU i active/present, with i= 1, 2, 4, 5, 6, 9, 10, 12, 15, 17, 18, 20, 24, 25, 26, 28, 43, 51, 52, 53, 54, 55, 56. Formally,

$accuracy_i=\frac{\sum \mbox{true positives}_i+\sum \mbox{true negatives}_i}{\sum \mbox{total population}}$

where $\mbox{true positives}_i$ are AU $i$ instances correctly identified in the test images, $\mbox{true negatives}_i$ are images correctly labeled as not having AU $i$ active/present, and total population is the total number of test images.

The mean accuracy is $accuracy= m^{-1}\sum accuracy_i$, and the variance is $\sigma^2=m^{-1} \sum(accuracy_i-accuracy)^2$, where $m$ is the number of AUs.
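As a sketch, the per-AU accuracies and their mean can be computed from binary label matrices as follows (the array names and toy labels are illustrative, not part of the challenge protocol):

```python
import numpy as np

def per_au_accuracy(y_true, y_pred):
    """Accuracy of each AU: (true positives + true negatives) / total population.

    y_true, y_pred: binary arrays of shape (n_images, n_aus); entry (j, i)
    is 1 if AU i is marked active/present in image j.
    """
    return (y_true == y_pred).mean(axis=0)

# Toy example: 4 images, 3 AUs (labels are made up for illustration).
y_true = np.array([[1, 0, 1], [0, 0, 1], [1, 1, 0], [0, 1, 0]])
y_pred = np.array([[1, 0, 0], [0, 1, 1], [1, 1, 0], [0, 1, 0]])

acc_i = per_au_accuracy(y_true, y_pred)    # one accuracy per AU
mean_acc = acc_i.mean()                    # accuracy = m^-1 * sum(accuracy_i)
sigma2 = ((acc_i - mean_acc) ** 2).mean()  # sigma^2 over the m AUs
```

Note that each AU's accuracy divides by the full test-set size, so an AU that is rarely active can score high accuracy even with weak detection; this is why F-scores are reported alongside.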

F-scores will also be reported for each AU before computing their mean and standard deviation. The F-score of AU $i$ is given by

$F_{\beta_i}=(1+\beta^2)\frac{precision_i \cdot recall_i}{\beta^2 precision_i+recall_i}$

where $precision_i$ is the fraction of images identified as having AU $i$ active that truly do, $recall_i$ is the number of correct detections of AU $i$ over the actual number of images with AU $i$ active, and $\beta$ defines the relative importance of recall over precision. We will use $\beta=.5, 1, 2$: $\beta=.5$ weighs precision more heavily (useful in applications where false positives are costly), $\beta=2$ emphasizes recall (important in applications where false negatives are unwanted), and $\beta=1$ weighs precision and recall equally. We will compute the mean $F_{\beta}=m^{-1}\sum F_{\beta_i}$.
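A minimal sketch of the $F_\beta$ computation for a single AU (the function name and the example precision/recall values are illustrative):

```python
def f_beta(precision, recall, beta):
    """F-score of a single AU; beta weighs recall relative to precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # guard against division by zero
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

# With precision = 0.8 and recall = 0.5, lowering beta pulls the score
# toward precision and raising beta pulls it toward recall:
scores = {b: f_beta(0.8, 0.5, b) for b in (0.5, 1, 2)}
```

With $\beta=1$ this reduces to the familiar harmonic mean of precision and recall.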

The final ranking of all participants will be given by the average of accuracy and $F_1$, i.e., $\mathbf{\mbox{final ranking}}=.5(accuracy+F_1 )$. We may also provide rankings for each of the individual measurements (i.e., $accuracy$ and $F_\beta$, with $\beta=.5,1,2$) as well as the number of times each algorithm wins in the classification of each AU using these evaluations. But only the final ranking will be used to order submissions from first to last.

A variety of additional experiments might be performed on the data to better understand the limitations of current algorithms, but these will have no effect on the final ranking.

2020 Challenge

Group                     Mean Accuracy   F1      Final Score
TAL                       .9147           .5465   .7306
University of Magdeburg   .9124           .5478   .7301
SIAT-NTU                  .9013           .4410   .6711
USTC-alibaba              .8609           .3497   .6053

2020 validation top-3

Group                     Mean Accuracy   F1      Final Score
TAL                       .9200           .5720   .7460
University of Magdeburg   .9198           .5706   .7452
SIAT-NTU                  .9195           .3531   .6363

2018

Group                     Mean Accuracy   F1      Final Score
PingAn-GammaLab           .9446           .5659   .7553
VisionLabs                .9207           .4229   .6718
MIT                       .9298           .4125   .6711
University of Washington  .8869           .3730   .6300
PingAn-Tech               .8694           .3747   .6221
University of Denver      .8576           .2296   .5436

2017

Group                     Mean Accuracy   F1      Final Score
I2R-CCNU-NTU-2            .8215           .6398   .7290
JHU                       .7714           .6405   .7101
I2R-CCNU-NTU-1            .7837           .6296   .7023
I2R-CCNU-NTU-3            .7766           .6203   .6961

Note: The final score takes a value between 0 and 1, with 1 the best possible score and 0 the worst. The final score is the average of the mean accuracy and the F1 score.

FAQ

Do I need to publish my results/algorithm?

The participants with the top 3 results will be asked to submit a 3-page methods paper to the workshop. Participating in this challenge does NOT mean you need to publish any paper describing your algorithm or results anywhere else. Of course, you are welcome to submit a full paper to the workshop or at another conference or journal.

Where can I publish my algorithm and results?

You can publish your algorithm and results wherever you feel is best suited, or you can decide not to publish them at all. Your results will, however, be posted on websites and may appear in articles and press releases.

Can I participate using an already published algorithm?

Yes. There are no restrictions on who or which algorithm can participate, but the algorithm must be yours. You cannot participate using an algorithm derived or implemented by someone else. Your implementation can, of course, use open-access code available on websites (e.g., OpenCV, GitHub).

Can I participate anonymously?

Participation in this challenge requires registration, which includes your name and the university, institution, or company where you work. Only the name of your institution will be made available online and in papers, and only after the release of the results. If there are multiple entries from the same institution, the institution's name will be followed by a dash and a number, e.g., OSU-1, OSU-2, etc.

Can companies participate?

Yes, but the results will be made publicly available on websites and in papers and press releases.

If I register but do not participate in the challenge phase, will I be listed in the website?

No.

References

[1] Benitez-Quiroz, C. F., Srinivasan, R., & Martinez, A. M. (2016). EmotioNet: An accurate, real-time algorithm for the automatic annotation of a million facial expressions in the wild. In Proceedings of IEEE International Conference on Computer Vision & Pattern Recognition (CVPR'16), Las Vegas, NV, USA.

[2] Du, S., Tao, Y., & Martinez, A. M. (2014). Compound facial expressions of emotion. Proceedings of the National Academy of Sciences, 111(15), E1454-E1462.

[3] Benitez-Quiroz, C. F., Liu, Y., & Martinez, A. M. (2017). Recognition of Action Units in the Wild with Deep Nets. In Proceedings of IEEE International Conference on Computer Vision (ICCV'17).

[4] Benitez-Quiroz, C. F., Srinivasan, R., Feng, Q., Wang, Y., & Martinez, A. M. (2017). EmotioNet Challenge: Recognition of facial expressions of emotion in the wild. arXiv preprint arXiv:1703.01210