/****************************************************
 * ImageNet attribute labels, v 1.0, August 2nd 2010 *
 ****************************************************/

These labels were collected using Amazon Mechanical Turk for [1].

ATTRIBUTES
==========
attributes is a list of the 25 attribute names.

IMAGES
======
images is a list of the 9600 image names (without extensions).

The images are available at www.image-net.org for academic use only.
You will have to request an account to download the original JPEGs.

The ImageNet naming convention is n12345678_123, where n12345678 is the
synset id and n12345678_123.JPEG is the name of the image.

BBOXES
======
All of the labeled images have bounding box annotations available for
download at www.image-net.org. The attribute labels correspond to the
*first bounding box* within each image. We used the spring10 release of
ImageNet.

The bounding boxes are also provided here for convenience. Each bounding
box contains the fields x1, x2, y1, y2, all normalized to lie between
0 and 1 (a sketch for converting them back to pixel coordinates is
given under EXAMPLES below).

LABELS
======
labels is a matrix of size (# images) x (# attributes). The labels mean:

   1 = positive for that attribute
  -1 = negative for that attribute
   0 = ambiguous

(See EXAMPLES below for a small usage sketch.)

The labels were obtained using Amazon Mechanical Turk in 2 steps:

(1) First, (9600 images) / (100 images per task) * (25 attributes per
image) = 2400 tasks were submitted to AMT; each task covered 100 images
for a single attribute and was assigned to 3 workers. The images were
cropped using the bounding box annotations, so that only the part of
the image inside the first bounding box was visible to the worker.

For color attributes, such as red, the instructions were:

  Consider the _object_ in the image (_not_ the background).
  Is a _significant_ part of the object red?

with two possible answers:

  "Yes, at least a quarter (25%) of this object is red"
  "No, this object contains little or no red"

For non-color attributes, such as round, the instructions were:

  Consider the _object_ in the image (_not_ the background).
  Would you describe the object as a _whole_ as round?

with two possible answers:

  "Yes, this object is round"
  "No, this object is not round"

There were 6 "gold standard" images (with known, unambiguous labels)
planted randomly among the 100 images in each task. A worker was
allowed to make at most 2 errors on those; after more than 2 errors,
the worker was forced to restart.

(2) After the images were labeled, many attributes were still
ambiguous, i.e. there was some disagreement among the 3 workers (as
discussed in [1]). All of the ambiguous images were resubmitted for
labeling by one additional worker, with tighter quality controls (at
most 1 error was allowed on the 6 gold standard images). A label was
assigned to the image if 3 out of the 4 total workers agreed on it;
otherwise, it was still considered ambiguous. This produced a
significantly cleaner dataset (the number of ambiguous labels dropped
by almost 60%). A sketch of this decision rule is given under EXAMPLES
below.

REFERENCE
=========
[1] Olga Russakovsky and Li Fei-Fei. Attribute learning in large-scale
datasets. Parts and Attributes Workshop at the European Conference on
Computer Vision (ECCV), 2010.
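
EXAMPLES
========
The sketches below are illustrative only; they are not part of the
original distribution, and the function and variable names are our own.
How you load the files into Python depends on the format of the release.

Since the bounding boxes are normalized, recovering pixel coordinates
requires the dimensions of the original JPEG. A minimal sketch,
assuming a box has been loaded into a Python dict with the fields
x1, x2, y1, y2 described under BBOXES:

    # Convert a normalized bounding box to pixel coordinates.
    # Illustrative sketch; the dataset itself does not ship this code.

    def bbox_to_pixels(bbox, width, height):
        """bbox has fields x1, x2, y1, y2 in [0, 1]; width and height
        are the dimensions of the original JPEG in pixels."""
        return (int(round(bbox['x1'] * width)),    # left
                int(round(bbox['y1'] * height)),   # top
                int(round(bbox['x2'] * width)),    # right
                int(round(bbox['y2'] * height)))   # bottom

    # Example: a box covering the central quarter of a 640x480 image.
    box = {'x1': 0.25, 'x2': 0.75, 'y1': 0.25, 'y2': 0.75}
    print(bbox_to_pixels(box, 640, 480))   # (160, 120, 480, 360)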
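
The labels matrix is indexed by image along the rows and by attribute
along the columns, in the same order as the images and attributes
lists. A sketch using tiny stand-in data (assuming the real arrays are
loaded from the distributed files; the loading step is omitted):

    import numpy as np

    # Stand-in data; in practice, load images, attributes, and labels
    # from the distributed files. Names follow the README convention.
    images = ['n12345678_123', 'n12345678_456', 'n87654321_7']
    attributes = ['red', 'round']
    labels = np.array([[ 1, -1],
                       [-1,  0],
                       [ 1,  1]])

    def positives_for(attribute):
        """Names of all images labeled positive (1) for the attribute;
        negative (-1) and ambiguous (0) entries are excluded."""
        col = attributes.index(attribute)
        return [img for img, v in zip(images, labels[:, col]) if v == 1]

    def synset_of(image_name):
        """Recover the synset id from the naming convention, e.g.
        'n12345678_123' -> 'n12345678'."""
        return image_name.split('_')[0]

    print(positives_for('red'))        # ['n12345678_123', 'n87654321_7']
    print(synset_of('n12345678_123'))  # n12345678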
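
The agreement rule used in the two labeling steps can be stated
compactly. The sketch below restates the decision rule described under
LABELS; it is not the original labeling code, and the vote encoding
(+1/-1 per worker) is an assumption:

    def consensus(votes):
        """votes is a list of per-worker answers, each +1 or -1
        (3 votes after step (1), 4 after relabeling in step (2)).
        Returns 1 or -1 when at least 3 workers agree, else 0."""
        if votes.count(1) >= 3:
            return 1
        if votes.count(-1) >= 3:
            return -1
        return 0

    # After step (1), 3 votes: any disagreement leaves the label 0.
    # After step (2), 4 votes: 3 of 4 must agree.
    print(consensus([1, 1, 1]))         # 1  (unanimous in step 1)
    print(consensus([1, -1, 1, 1]))     # 1  (3 of 4 agree in step 2)
    print(consensus([1, -1, 1, -1]))    # 0  (still ambiguous)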