6.2 Human-labelled Images

Another important type of ground-truth data for evaluating visual attention models is human-labelled images. The areas of a visual scene that attract the human eye usually correspond to salient objects, and many studies have used saliency maps to detect objects in natural images [3, 4, 7, 9]. Such detection can be evaluated quantitatively when an appropriate database with ground truth is available. One widely used database of this type contains 5000 images whose salient objects were marked with bounding boxes by nine subjects [7]. Some sample images from the database are shown in Figure 6.2(a). The human-labelled ground-truth data and the saliency maps of these images from the visual attention model in [15] are shown in Figure 6.2(b) and (c), respectively. Subjects can also mark salient objects more precisely than with the bounding boxes of Figure 6.2(b) [10]. Additional human-labelled databases can be found in [7, 10].
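As an illustration of how bounding-box labels might be used for quantitative evaluation, the following minimal sketch combines the boxes drawn by several subjects into one binary ground-truth mask and scores a thresholded saliency map against it with precision, recall and F-measure. The voting rule (`min_votes`), the fixed `threshold`, the function names and the toy data are all assumptions made for this example; they are not the procedure used in [7] or [15].

```python
from typing import List, Tuple

# (left, top, right, bottom); right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def consensus_mask(boxes: List[Box], width: int, height: int,
                   min_votes: int) -> List[List[bool]]:
    """Mark a pixel as salient if at least min_votes boxes cover it.

    The voting rule is an assumption of this sketch, not taken from [7].
    """
    votes = [[0] * width for _ in range(height)]
    for left, top, right, bottom in boxes:
        # Clip each box to the image before counting votes.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                votes[y][x] += 1
    return [[v >= min_votes for v in row] for row in votes]


def precision_recall_f(saliency: List[List[float]],
                       truth: List[List[bool]],
                       threshold: float,
                       beta_sq: float = 1.0) -> Tuple[float, float, float]:
    """Score a saliency map, binarised at threshold, against a truth mask."""
    tp = fp = fn = 0
    for sal_row, truth_row in zip(saliency, truth):
        for s, t in zip(sal_row, truth_row):
            detected = s >= threshold
            if detected and t:
                tp += 1
            elif detected:
                fp += 1
            elif t:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = beta_sq * precision + recall
    f_measure = (1 + beta_sq) * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Toy 4x4 example: three hypothetical subjects, two of whom agree.
subject_boxes = [(0, 0, 2, 2), (1, 1, 3, 3), (1, 1, 3, 3)]
truth_mask = consensus_mask(subject_boxes, width=4, height=4, min_votes=2)

toy_saliency = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.8, 0.7],
    [0.1, 0.6, 0.2, 0.1],
    [0.1, 0.1, 0.1, 0.1],
]
print(precision_recall_f(toy_saliency, truth_mask, threshold=0.5))
```

A fixed threshold is only one option; sweeping the threshold over the saliency range and reporting the resulting precision-recall pairs gives a fuller picture of how well a map agrees with the labels.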

Figure 6.2 (a) Sample images (from [7]); (b) the ground-truth (human-labelled) images; (c) the corresponding saliency maps (from the model in [15]). Figure 6.2(b) Reproduced from T. Liu, J. Sun, N. Zheng, X. Tang and H. Y. Shum, ‘Learning to detect a salient object,’ Microsoft Research Asia, http://research.microsoft.com/en-us/um/people/jiansun/salientobject/salient_object.htm (accessed November 25, 2012); Figure 6.2(c) © 2012 IEEE. Reprinted, with permission, from Y. ...
