Similar-looking LFW (SLLFW) Database


Motivation

Welcome to the Similar-looking LFW (SLLFW) database, a renovation of Labeled Faces in the Wild (LFW), the de facto standard testbed for unconstrained face verification. Note that we previously referred to the SLLFW database as the Fine-grained LFW (FGLFW) database.

As performance on some aspects of the LFW benchmark approaches 100% accuracy, there is intense debate over whether the unconstrained face verification problem has already been solved. Common face verification mainly addresses large intra-class variations, such as pose, illumination, and expression. Inspecting the LFW database, however, reveals a main limiting factor of its unconstrained face verification task: almost all the negative face pairs are quite easy to distinguish. The negative pairs are randomly selected from different individuals, and two random individuals commonly differ greatly in appearance; many pairs even differ in gender. Because the LFW pairs were collected under the assumption of a random impostor attack, verification on LFW is by nature a problem in which many examples are very easy, with large inter-class variance. In practice, however, a determined impostor is likely to try to spoof a genuine user by seeking out a similar-looking person. To simulate this deliberate impostor attack, we construct the SLLFW database, in which 3000 similar-looking face pairs, selected within the original image folders by human crowdsourcing, replace the random negative pairs in LFW.

We are committed to maintaining the protocols, dataset size, and image ensemble (fold) of the LFW database, in order to encourage fair and meaningful comparisons and to allow easy replication of results. Since SLLFW only modifies the negative face pairs defined in the standard protocol, the original training and testing paradigms of LFW can be used directly. More information about the standard LFW protocol is available at Labeled Faces in the Wild (LFW).


We expect that SLLFW will encourage algorithms that make reliable verification judgements, and help close the large gap between reported benchmark performance and performance on real-world tasks.


Performance comparison

We evaluate several state-of-the-art metric learning methods, face descriptors, and deep learning methods on the new SLLFW database; their accuracy drops by about 10–20% compared with the corresponding LFW performance.

Verification accuracy (%) on LFW and SLLFW under the image-unrestricted setting with labeled outside data.

Method          Training images   LFW       SLLFW
DeepFace        0.5M              92.87%    78.78%
DeepID2         0.2M              95.00%    78.25%
VGG-Face        2.6M              96.70%    85.78%
DCMN            0.5M              98.03%    91.00%
Noisy Softmax   0.5M              99.18%    94.50%
Human           n/a               99.85%    92%
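
For readers who want to compute accuracy figures of this kind for their own method, the following is a minimal sketch of a 10-fold verification evaluation. It is not the authors' evaluation code. It assumes each pair already has a similarity score (higher means "more likely the same person"), that the decision threshold is chosen on the nine training folds and applied to the held-out fold, and that the result is reported as mean accuracy with the standard error across folds. The function names `accuracy`, `best_threshold`, and `cross_validated_accuracy` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def accuracy(scores, labels, threshold):
    """Fraction of pairs whose thresholded score agrees with the label."""
    return mean(1.0 if (s >= threshold) == y else 0.0
                for s, y in zip(scores, labels))


def best_threshold(scores, labels):
    """Pick the score value that maximizes accuracy on the given pairs.

    Brute force over all distinct scores (quadratic time); adequate for
    a sketch, but a sorted sweep would be faster on large folds.
    """
    return max(sorted(set(scores)),
               key=lambda t: accuracy(scores, labels, t))


def cross_validated_accuracy(folds):
    """folds: list of (scores, labels), one entry per fold.

    labels are booleans: True for a matched pair, False for a mismatched one.
    Returns (mean accuracy, standard error of the mean) over the folds.
    """
    fold_accuracies = []
    for i, (test_scores, test_labels) in enumerate(folds):
        # Pool the remaining folds to select the threshold.
        train_scores = [s for j, (ss, _) in enumerate(folds) if j != i for s in ss]
        train_labels = [y for j, (_, yy) in enumerate(folds) if j != i for y in yy]
        threshold = best_threshold(train_scores, train_labels)
        fold_accuracies.append(accuracy(test_scores, test_labels, threshold))
    standard_error = stdev(fold_accuracies) / sqrt(len(fold_accuracies))
    return mean(fold_accuracies), standard_error
```

With SLLFW, `folds` would hold 10 entries of 600 scored pairs each (300 matched, 300 mismatched), and the scores would come from whatever face representation is being evaluated.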

Reference

Please cite as:

Weihong Deng, Jiani Hu, Nanhai Zhang, Binghui Chen, and Jun Guo.
Fine-grained face verification: FGLFW database, baselines, and human-DCMN partnership.
Pattern Recognition, 2017, 66:63-73.

BibTeX entry:
@article{deng2017fine,
   title={Fine-grained face verification: FGLFW database, baselines, and human-DCMN partnership},
   author={Deng, Weihong and Hu, Jiani and Zhang, Nanhai and Chen, Binghui and Guo, Jun},
   journal={Pattern Recognition},
   volume={66},
   pages={63--73},
   year={2017},
   }


Nanhai Zhang, Weihong Deng.
Fine-grained LFW database.
International Conference on Biometrics. IEEE, 2016:1-6.

@inproceedings{Zhang2016Fine,
   title={Fine-grained LFW database},
   author={Zhang, Nanhai and Deng, Weihong},
   booktitle={International Conference on Biometrics},
   pages={1--6},
   year={2016},
   }

Download the database

Since SLLFW only modifies the negative face pairs defined in the standard protocol, the original LFW images can be used directly; you can download them from Labeled Faces in the Wild (LFW). We provide the SLLFW pair list, pair_SLLFW.txt, organized for 10-fold cross-validation using the splits defined by LFW. There are 10 sets, each consisting of 300 matched pairs and 300 mismatched pairs. The list is formatted as follows: the 10 sets are arranged in order, and each pair occupies two consecutive lines. The first 600 lines give the 300 matched pairs of set 1 in the following format:

name1/name1_n1.jpg
name1/name1_n2.jpg

which means the matched pair consists of the "n1" and "n2" images for the person with "name1". For instance,

Abel_Pacheco/Abel_Pacheco_0001.jpg
Abel_Pacheco/Abel_Pacheco_0004.jpg

would mean that the pair consists of images Abel_Pacheco_0001.jpg and Abel_Pacheco_0004.jpg.

The next 600 lines give the 300 mismatched pairs of set 1 in the following format:

name1/name1_n1.jpg
name2/name2_n2.jpg

which means the mismatched pair consists of the "n1" image of person "name1" and the "n2" image of person "name2". For instance,

Jeffrey_Archer/Jeffrey_Archer_0002.jpg
Luis_Ernesto_Derbez_Bautista/Luis_Ernesto_Derbez_Bautista_0003.jpg

would mean that the pair consists of images Jeffrey_Archer_0002.jpg and Luis_Ernesto_Derbez_Bautista_0003.jpg.

This procedure is then repeated 9 more times to give the pairs for the next 9 sets.
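
The layout described above can be read with a few lines of code. The following is a minimal sketch, not an official loader: it assumes the file contains exactly one image path per line (blank lines are ignored) and no header, and the names `parse_sllfw_pairs` and `load_sllfw_pairs` are hypothetical.

```python
from typing import Iterable, List, Tuple

# (first image path, second image path, True if matched pair)
Pair = Tuple[str, str, bool]


def parse_sllfw_pairs(lines: Iterable[str],
                      n_folds: int = 10,
                      pairs_per_class: int = 300) -> List[List[Pair]]:
    """Group the lines of pair_SLLFW.txt into folds of labeled pairs.

    Each fold is 2 * pairs_per_class lines of matched pairs followed by
    2 * pairs_per_class lines of mismatched pairs, two lines per pair.
    """
    paths = [line.strip() for line in lines if line.strip()]
    lines_per_fold = 2 * 2 * pairs_per_class  # 1200 with the defaults
    expected = n_folds * lines_per_fold
    if len(paths) != expected:
        raise ValueError(f"expected {expected} paths, found {len(paths)}")

    folds: List[List[Pair]] = []
    for f in range(n_folds):
        block = paths[f * lines_per_fold:(f + 1) * lines_per_fold]
        fold: List[Pair] = []
        for i in range(0, lines_per_fold, 2):
            # The first half of each block holds the matched pairs.
            is_matched = i < 2 * pairs_per_class
            fold.append((block[i], block[i + 1], is_matched))
        folds.append(fold)
    return folds


def load_sllfw_pairs(path: str = "pair_SLLFW.txt") -> List[List[Pair]]:
    """Read the pair list from disk; the default path is an assumption."""
    with open(path, encoding="utf-8") as handle:
        return parse_sllfw_pairs(handle)
```

Calling `load_sllfw_pairs()` returns 10 folds of 600 pairs each. A quick sanity check is that the directory part of the two paths (the text before `/`) is identical for every matched pair and different for every mismatched pair.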


Download the list pair_SLLFW.txt, and please cite the papers listed under Reference.


Contact

Please contact Yaoyao Zhong and Weihong Deng for questions about the database.