# EmotioNet Challenge


The results of the 2017 EmotioNet Challenge are now available.

Research groups and companies that have designed or are developing algorithms for the analysis of facial expressions are encouraged to participate in the FIRST facial expression recognition in the wild challenge. The competition has two tracks. You may decide to participate in a single track or in both tracks.

## Track 1. Recognition of AUs in the wild

This track requires the identification of 11 action units (AUs). The AUs included in the challenge are: 1, 2, 4, 5, 6, 9, 12, 17, 20, 25 and 26.

Training data: The EmotioNet database (release 1.1) includes 950,000 images with annotated AUs. These were annotated with the algorithm described in [1]. You can train your system using this set. You can also use any other annotated dataset you wish to train your system. This set was recently used to train a variety of classifiers, including a deep network [3].

Verification data: We also include 25,000 images with manually annotated AUs. Use this set to see how well your algorithm works and to set the parameters of your approach.

Testing data: A week before the challenge deadline, participants will be provided with a large (previously unseen) test image set, which will be used to evaluate participants' algorithms. Participants will have to submit a file with their solutions before the deadline to be considered in the challenge.

Evaluation: Identification accuracy of AUs will be measured using two criteria - accuracy and F-scores. Algorithms will then be classified from first (best) to last based on an ordinal ranking of their performance on the recognition of each AU as well as the mean of each criterion. Formally, these criteria are defined as follows.

Accuracy is a measurement of closeness to the true value. We will compute recognition accuracy of each AU, i.e., $accuracy_i$ corresponds to the accuracy of your algorithm in identifying that an image has AU i active/present, with i= 1, 2, 4, 5, 6, 9, 12, 17, 20, 25, 26. Formally,

$accuracy_i=\frac{\sum \mbox{true positives}_i+\sum \mbox{true negatives}_i}{\mbox{total population}}$

where $\mbox{true positives}_i$ are AU $i$ instances correctly identified in the test images, $\mbox{true negatives}_i$ are images correctly labeled as not having AU $i$ active/present, and total population is the total number of test images.

The mean accuracy is $accuracy= m^{-1}\sum_i accuracy_i$, and the standard deviation is $\sigma=\sqrt{m^{-1} \sum_i(accuracy_i-accuracy)^2}$, where $m$ is the number of AUs.
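As a rough illustration of the accuracy criterion, the following Python sketch computes the per-AU accuracy, the mean, and the standard deviation (taken here as the square root of the mean squared deviation, dividing by $m$). The function names and the dictionary-of-lists label layout are assumptions made for this example; they are not the challenge's submission format.

```python
from math import sqrt


def per_au_accuracy(y_true, y_pred):
    """Return {au: accuracy} for binary AU labels.

    y_true and y_pred are hypothetical dicts mapping an AU number to a
    list of 0/1 labels, one entry per test image (assumed layout).
    """
    result = {}
    for au, truth in y_true.items():
        pred = y_pred[au]
        # True positives plus true negatives are exactly the matching labels.
        correct = sum(1 for t, p in zip(truth, pred) if t == p)
        result[au] = correct / len(truth)
    return result


def mean_and_std(values):
    """Mean and standard deviation, dividing by m (not m - 1)."""
    values = list(values)
    m = len(values)
    mean = sum(values) / m
    variance = sum((v - mean) ** 2 for v in values) / m
    return mean, sqrt(variance)


# Toy example with two AUs and four images (made-up labels).
y_true = {1: [1, 0, 1, 0], 2: [0, 0, 1, 1]}
y_pred = {1: [1, 0, 0, 0], 2: [0, 0, 1, 1]}

acc = per_au_accuracy(y_true, y_pred)
mean_acc, std_acc = mean_and_std(acc.values())
print(acc)                 # {1: 0.75, 2: 1.0}
print(mean_acc, std_acc)   # 0.875 0.125
```

Because accuracy counts true negatives, an AU that is rarely active can score high accuracy even when it is seldom detected, which is one reason the F-scores are reported alongside it.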

F-scores will also be computed for each AU before computing the mean and standard deviation. The F-score of AU $i$ is given by

$F_{\beta_i}=(1+\beta^2)\frac{precision_i \cdot recall_i}{\beta^2 precision_i+recall_i}$

where $precision_i$ is the fraction of the algorithm's detections of AU $i$ that are correct, $recall_i$ is the number of correct recognitions of AU $i$ over the actual number of images with AU $i$ active, and $\beta$ sets the relative weight of precision and recall. We will use $\beta=.5,1,2$. $\beta=.5$ gives more importance to precision (useful in applications where false positives are especially costly), $\beta=2$ emphasizes recall (which is important in applications where false negatives are unwanted), and $\beta=1$ provides a measure where recall and precision are equally relevant. We will compute the mean $F_{\beta}=m^{-1}\sum_i F_{\beta_i}$.
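The F-score formula can be sketched in Python as follows. The function name `f_beta` and the list-of-0/1 label layout are hypothetical, and returning 0 when precision or recall is undefined is an assumption of this sketch, since the text does not say how that case is handled.

```python
def f_beta(truth, pred, beta=1.0):
    """F-beta score for one AU from binary ground-truth and predicted labels."""
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)

    # Assumption: undefined precision/recall (no detections or no positives)
    # is treated as 0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    denom = beta ** 2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denom


# Made-up labels: one correct detection, no false positives, two misses,
# so precision = 1 and recall = 1/3.
truth = [1, 1, 1, 0, 0, 0]
pred = [1, 0, 0, 0, 0, 0]

for beta in (0.5, 1, 2):
    print(beta, round(f_beta(truth, pred, beta), 4))
# 0.5 0.7143
# 1 0.5
# 2 0.3846
```

In this toy case the detector is precise but misses most positives, so the score drops as $\beta$ grows and recall is weighted more heavily.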

The final ranking of all participants will be given by the average of accuracy and $F_1$, i.e., $\mathbf{\mbox{final ranking}}=.5(accuracy+F_1 )$. We will also provide rankings for each of the individual measurements (i.e., accuracy and $F_\beta$, with $\beta=.5,1,2$) as well as the number of times each algorithm wins in the classification of each AU using these evaluations. But only the final ranking will be used to order submissions from first to last.
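A minimal sketch of the final ranking rule, assuming each submission has already been reduced to its mean accuracy and mean $F_1$. The group names and scores below are invented for illustration and are not challenge results.

```python
def final_score(mean_accuracy, mean_f1):
    """Final ranking score: the average of mean accuracy and mean F1."""
    return 0.5 * (mean_accuracy + mean_f1)


def rank_submissions(submissions):
    """Order submissions from first (best) to last by final score.

    submissions is a hypothetical dict: name -> (mean_accuracy, mean_f1).
    Returns a list of (name, final_score) tuples, best first.
    """
    scored = [(name, final_score(acc, f1))
              for name, (acc, f1) in submissions.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Invented example values.
submissions = {
    "team_a": (0.80, 0.60),
    "team_b": (0.78, 0.66),
}

for name, score in rank_submissions(submissions):
    print(name, round(score, 3))
# team_b 0.72
# team_a 0.7
```

Here `team_b` ranks first despite its lower accuracy, because the final score weights accuracy and $F_1$ equally.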

A variety of additional experiments might be performed on the data to better understand the limitations of current algorithms, but these will have no effect on the final ranking.

## Track 2. Recognition of basic and compound emotions

This track is for algorithms that can recognize emotion categories in face images. You can identify the emotion category based on the detection of AUs, but you can also use any other system (e.g., one that uses shape or appearance, such as [2]). You do not need to participate in track 1 to be eligible to participate in this track. Of course, you may want to participate in track 1 and not in this one. Or you may wish to participate in both.

Training data: A subset of the images in the EmotioNet database correspond to basic and compound emotions. Release 1.2 (see below) will provide a file with these annotations. These annotations are given by the algorithms described in references [1,2]. You can use this dataset and any other manually or automatically annotated dataset to train your system. The database of [2] provides a large number of manually annotated images in lab conditions and can be downloaded by following this online form: http://cbcsl.ece.ohio-state.edu/dbform_compound.html. Note that only post-doctoral researchers and faculty members can apply for this dataset.

Verification data: Many images in EmotioNet have been manually annotated as expressing basic and compound emotions (release 1.2). Use this set to determine how well your algorithm works on images in the wild.

Testing data: A week before the challenge deadline, participants will be provided with a large (previously unseen) test image set, which will be used to evaluate participants' algorithms.

Evaluation: The same evaluation used in track 1 will be used, with the obvious difference that $i$ will now mean emotion category $i$ rather than AU $i$.

## Data Releases

The EmotioNet database can be accessed here: http://cbcsl.ece.ohio-state.edu/dbform_emotionet.html

EmotioNet 1.0 (August 2016): 975,000 images of facial expressions in the wild were made available in August 2016. This dataset is defined in reference [1]. Manual annotations of 11 AUs on 25,000 images are also included. This is the verification set.

EmotioNet 1.1 (November 25, 2016): AU annotations for a total of 950,000 images will be released. These annotations were obtained using the algorithm described in [1]. This set can be used to train the system, but it should be understood that these annotations are not perfect.

EmotioNet 1.2 (December 2016): File with manual and automatic annotations of emotion categories. Automatic annotations were obtained using the algorithms described in [1-2]. The automatic set is not perfect but still very accurate. You may want to use this set to train your system.

## EmotioNet Challenge submissions and results

Registration: In order to participate, you need to first complete the registration form.

Submissions: After the release of the test set, participants will have to submit a text file with the format specified in the following link: Guidelines for submitting results. Examples of the files for Track 1 can be found here and for Track 2 can be found here. You must use these templates.

Test set release date: February 8, 2017.

Submissions deadline: February 12, 2017, 5 pm EST.

Results announced: February 24, 2017.

IMPORTANT NOTE: All results will be posted on the EmotioNet website and might be published in papers and included in press releases. By participating in this challenge, you and your institution assume all responsibilities and liability. You can only participate using algorithms developed by you and your co-authors. Analyses of the results given by all algorithms might be extended without prior notice and published on websites, papers and press releases. These will NOT change the outcome of the challenge (i.e., the rankings will be determined using the methodology described above) but are useful statistics that will help the community better understand the strengths and limitations of each algorithm and the area as a whole. By participating you agree to all terms and conditions stated in this website/email/post.

## 2017 Results

The results of the EmotioNet challenge are summarized here (reference [4]).

Last November we announced the 2017 EmotioNet Challenge. Training and verification data were made available in mid November and early December for tracks 1 and 2, respectively.

Research groups and companies interested in participating in the challenge were required to register. 37 groups signed up and received the training and verification sets.

The testing data was sent to the 37 participants in early February. Teams had a few days to process the data and submit their automatic annotations – AUs in track 1 and emotion categories in track 2.

Of the initial 37 groups, only 4 groups successfully completed track 1 on time. Only 2 groups completed track 2 before the deadline. The results are summarized below. Additional results and a detailed analysis of the results will be published in a working paper within a few weeks.

## Track 1

| Group | Final score |
|---|---|
| I2R-CCNU-NTU-2 | .728985 |
| JHU | .710087 |
| I2R-CCNU-NTU-1 | .702322 |
| I2R-CCNU-NTU-3 | .69608 |

Note: Final score takes a value between 0 and 1, with 1 the best possible score and 0 the worst one. The final score is the average of accuracy and F1 score.

| Group | Accuracy |
|---|---|
| I2R-CCNU-NTU-2 | .8215 |
| I2R-CCNU-NTU-1 | .783667 |
| I2R-CCNU-NTU-3 | .776583 |
| JHU | .771417 |

| Group | F1 | F2 | F.5 |
|---|---|---|---|
| JHU | .6405 | .635416667 | .638083333 |
| I2R-CCNU-NTU-2 | .639833333 | .624916667 | .64325 |
| I2R-CCNU-NTU-1 | .629583333 | .625416667 | .635083333 |
| I2R-CCNU-NTU-3 | .622833333 | .620333333 | .626583333 |

## Track 2

| Group | Final score |
|---|---|
| NTechLab | 0.596767708 |
| JHU | 0.479914583 |

Note: Final score takes a value between 0 and 1, with 1 the best and 0 the worst possible scores, respectively. The final score is the average of accuracy and F1 score. In parentheses, we show the final score for those images the group was able to analyze.

| Group | Accuracy |
|---|---|
| NTechLab | 0.9415 |
| JHU | 0.8358125 |

| Group | F1 | F2 | F.5 |
|---|---|---|---|
| NTechLab | 0.254969 | 0.25981875 | 0.257669 |
| JHU | 0.142375 | 0.12695625 | 0.181725 |

## FAQ

Do I need to publish my results/algorithm?

No. Participating in this challenge does NOT mean you need to publish any paper describing your algorithm or results. Of course, you are welcome to publish papers describing any algorithm you have developed to participate in this challenge or the results obtained on an existing algorithm and submit it to the journal, conference, workshop or symposium of your choice.

Where can I publish my algorithm and results?

You can publish your algorithm and results wherever you feel is best suited. Or you can decide not to publish them. Your results will however be posted on websites and might appear in articles and press releases.

Can I participate using an already published algorithm?

Yes. There are no restrictions on who or which algorithm can participate, but the algorithm must be yours. You cannot participate using an algorithm derived/implemented by someone else. Your implementation can of course use open access code available on websites (e.g., OpenCV, GitHub).

Can I participate anonymously?

Only with prior approval. Participation in this challenge requires registration. This includes your name and the University, Institution or Company where you work. Only the name of your institution will be made available online and in papers, and only after the results deadline (February 15, 2017). If there are multiple entries from the same institution, the name of the institution is going to be followed by a dash and a number, e.g., OSU-1, OSU-2, etc. People wanting to remain anonymous should contact us at martinez.158@osu.edu asap. The results of these groups will be reported using a pseudonym.

Can companies participate?

Yes, but the results will be made publicly available on websites and in papers and press releases after February 15, 2017. Companies wanting to remain anonymous (using a pseudonym) should contact us at martinez.158@osu.edu asap.

I have detected an error/typo in a file, what should I do?