------------SUFR_ver1.3-------------------------
Joel Z. Leibo, Qianli Liao, and Tomaso Poggio

---Contents:
1. SUFR-W
2. SUFR

---Description:
This package contains SUFR-W, a dataset of "in the wild" natural images of
faces gathered from the internet. The protocol used to create the dataset is
described in Leibo, Liao and Poggio (2014). It also contains the full set of
SUFR synthetic datasets, called the "Subtasks of Unconstrained Face
Recognition Challenge" in Leibo, Liao and Poggio (2014).
------------------------------------------------

---Details:

1. ------ SUFR-W

** SUFR_in_the_wild/SUFR_in_the_wild_info.mat
matlab struct "info" contains two fields:
- id   : the ID of the person depicted by each image
- name : the name of the person depicted by each image

** SUFR_in_the_wild/SUFR_in_the_wild_info.txt
Contains the same information as SUFR_in_the_wild_info.mat, but in plain text

** SUFR_in_the_wild/splits_10_folds.mat
i. matlab struct "sufr_train_val_test_names" contains three fields:
   train
   val
   test
   Each field contains a 1x10 cell. The i-th element of a cell contains the
   names (IDs) of the i-th training/test/validation fold. The "names" run
   from 1 to 400; they are actually the IDs of the people.
ii. matlab struct "sufr_train_val_test" contains three fields:
   train
   val
   test
   Each field contains a 1x10 cell. The i-th element of a cell contains the
   training/test/validation pairs (and labels) of the i-th fold.
   The first two columns are the image indices of the
   training/test/validation pairs.
   The last column is the label. 1: same person, -1: different people.

** SUFR_in_the_wild/splits_10_folds_text
A folder containing the text version of
SUFR_in_the_wild/splits_10_folds.mat

** Note:
Similar to the protocol of LFW (Huang et al. 2007), use 10-fold cross
validation. Training, validation, and test data are provided for each fold.
They do not overlap: individual people appearing in the test set do not
appear in the training or validation set.
That is, if any image of person X appears in the training set, then no images
of person X will appear in the test set.
-------------------------------------------------

2. ------ SUFR

Each dataset contains the following annotations:
**** Information is provided in two formats: .txt and .mat

** info.mat
a matlab struct that contains:
-- sku: names of the 3D models we used to build the dataset
-- id : object ID of each image
-- angle: rotation angle of each image
-- ilum: illumination info
-- shift: translation
-- scale: size of the face
-- affine: the affine transformation matrix
-- background: background ID

** info.txt
text version of info.mat

** bounding_box_info.txt
The bounding box of the face in each image

** splits.mat
-- sufr_train_test_sets
   Training and testing pairs and labels.
   The first two columns are the image indices of the
   training/test/validation pairs.
   The last column is the label. 1: same person, -1: different people.
-- sufr_train_val_sets
   Training and validation pairs and labels
-- sufr_train_val_test_names
   Training, validation and testing IDs.
   The IDs correspond to the "id" field in info.mat

** test.txt, test_names.txt, train.txt, train_names.txt, val.txt, val_names.txt
text version of splits.mat

** Note:
Given the large number of synthetic datasets, we do not require 10-fold
cross-validation. The model should be developed using only the training and
validation sets. The test set should be used only once, when reporting
results.
------------------------------------------------

---Version history:
- This is ver1.3 of SUFR-W.
- The following two papers report results on slightly older versions of the
  dataset. The differences between versions are minor: a few label mistakes
  were corrected and slightly different training/test splits were used.
1.1: Liao Q, Leibo JZ, Poggio T. Learning invariant representations and
applications to face verification (2013). Advances in Neural Information
Processing Systems (NIPS). Lake Tahoe, NV.
1.2: Liao Q, Leibo JZ, Mroueh Y, Poggio T. Can a biologically-plausible
hierarchy effectively replace face detection, alignment, and recognition
pipelines? (2013) arXiv:1311.4082, November 16, 2013.
- There is only one version of the synthetic datasets (this one).
- We do NOT anticipate any further changes to either SUFR-W or SUFR. This
  version (1.3) is the first to be publicly released, so going forward all
  reported results will be on version 1.3.

-------------------------------------------------
Reference---------------------------------------
-------------------------------------------------
Please cite as:
Leibo J. Z., Liao Q., and Poggio T. Subtasks of Unconstrained Face
Recognition (2014). 9th International Joint Conference on Computer Vision,
Imaging and Computer Graphics Theory and Applications (VISAPP). Lisbon,
Portugal.
-- Available from: http://cbcl.mit.edu/publications/ps/Leibo_Liao_Poggio_VISAPP_2014.pdf
-- Presentation available at: http://cbcl.mit.edu/publications/ps/Subtasks_Presentation_VISAPP2014.pdf
-- Bibtex at: http://www.jzleibo.com/bio/subtasks

-------------------------------------------------
Acknowledgment---------------------------------
-------------------------------------------------
This material is based upon work supported by the Center for Minds, Brains
and Machines (CBMM), funded by NSF STC award CCF-1231216.
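
The pair format and the non-overlap rule described under Details can be
checked with a few lines of code. The following Python sketch is not part of
the release. It assumes the pairs and the "id" field have already been read
into plain Python lists, that image indices are 1-based (the MATLAB
convention), and that image index i refers to the i-th entry of "id". The
function names and the toy data are hypothetical.

```python
from typing import List, Sequence, Set, Tuple

# One row of a split: (image index a, image index b, label).
# Label convention from this README: 1 = same person, -1 = different people.
Pair = Tuple[int, int, int]


def identities_in_pairs(pairs: Sequence[Pair], image_ids: Sequence[int],
                        index_base: int = 1) -> Set[int]:
    """Collect the person IDs of every image referenced by the pairs.

    index_base=1 is an assumption (MATLAB-style indexing).
    """
    people: Set[int] = set()
    for a, b, _label in pairs:
        people.add(image_ids[a - index_base])
        people.add(image_ids[b - index_base])
    return people


def mislabeled_pairs(pairs: Sequence[Pair], image_ids: Sequence[int],
                     index_base: int = 1) -> List[Pair]:
    """Return the pairs whose label disagrees with the person IDs."""
    bad: List[Pair] = []
    for a, b, label in pairs:
        same = image_ids[a - index_base] == image_ids[b - index_base]
        expected = 1 if same else -1
        if label != expected:
            bad.append((a, b, label))
    return bad


def shared_identities(train: Sequence[Pair], test: Sequence[Pair],
                      image_ids: Sequence[int],
                      index_base: int = 1) -> Set[int]:
    """Return person IDs that occur in both splits (empty if disjoint)."""
    return (identities_in_pairs(train, image_ids, index_base)
            & identities_in_pairs(test, image_ids, index_base))


# Toy data (hypothetical): six images of three people.
toy_ids = [1, 1, 2, 2, 3, 3]
toy_train = [(1, 2, 1), (1, 3, -1)]
toy_test = [(5, 6, 1)]

if __name__ == "__main__":
    print(mislabeled_pairs(toy_train, toy_ids))
    print(shared_identities(toy_train, toy_test, toy_ids))
```

For a split that follows the protocol, shared_identities returns an empty set
and mislabeled_pairs returns an empty list. If the indices turn out to be
0-based after loading, pass index_base=0.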