Look and Think Twice: Capturing Top-Down Visual Attention with Feedback Convolutional Neural Networks
While feedforward deep convolutional neural networks (CNNs) have been a great success in computer vision, it is important to note that the human visual cortex generally contains more feedback than feedforward connections. In this paper, we briefly introduce the background of feedback in the human visual cortex, which motivates us to develop a computational feedback mechanism in deep neural networks. In addition to the feedforward inference of traditional neural networks, a feedback loop is introduced to infer the activation status of hidden-layer neurons according to the "goal" of the network, e.g., high-level semantic labels. We liken this mechanism to "Look and Think Twice." The feedback networks help to visualize and understand how deep neural networks work, and to capture visual attention on expected objects, even in images with cluttered backgrounds and multiple objects. Experiments on the ImageNet dataset demonstrate its effectiveness in solving tasks such as image classification and object localization.
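The core idea in the abstract (a feedback pass that gates hidden neurons according to a target label) can be sketched in a few lines. The model below is a toy fully connected network with random weights, not the paper's architecture; the one-step, sign-based gate update is an illustrative simplification of the feedback inference, keeping only hidden neurons whose activation contributes positively to the chosen class score.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a CNN: input -> hidden (ReLU) -> class scores.
# Weights are random; this only illustrates the gating idea.
W1 = rng.normal(size=(8, 16))   # input-to-hidden weights
W2 = rng.normal(size=(16, 3))   # hidden-to-class weights

def forward(x, gate):
    h = np.maximum(x @ W1, 0.0)     # feedforward ReLU activations
    return (h * gate) @ W2, h       # gates switch hidden neurons on/off

x = rng.normal(size=(8,))
target = 1                          # the "goal": a chosen semantic label

gate = np.ones(16)                  # feedforward pass: all neurons active
scores, h = forward(x, gate)

# Feedback pass: keep a hidden neuron only if it contributes positively
# to the target class score (one-step approximation of the feedback loop).
contribution = h * W2[:, target]
gate = (contribution > 0).astype(float)

new_scores, _ = forward(x, gate)
# Dropping negative contributions can only raise the target score here.
assert new_scores[target] >= scores[target]
```

In the paper the gates are inferred layer by layer over the full network, so the surviving activations act as a top-down attention map over the image; this sketch collapses that to a single hidden layer for clarity.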
Text Reference
Chunshui Cao, Xianming Liu, Yi Yang, Yinan Yu, Jiang Wang, Zilei Wang, Yongzhen Huang, Liang Wang, Chang Huang, Wei Xu, Deva Ramanan, and Thomas S. Huang. Look and think twice: capturing top-down visual attention with feedback convolutional neural networks. In IEEE International Conference on Computer Vision. 2015.
BibTeX Reference
@INPROCEEDINGS{CaoLYYWWHXRH_ICCV_2015,
author = "Cao, Chunshui and Liu, Xianming and Yang, Yi and Yu, Yinan and Wang, Jiang and Wang, Zilei and Huang, Yongzhen and Wang, Liang and Huang, Chang and Xu, Wei and Ramanan, Deva and Huang, Thomas S.",
booktitle = "IEEE International Conference on Computer Vision",
title = "Look and Think Twice: Capturing Top-Down Visual Attention with Feedback Convolutional Neural Networks",
year = "2015"
}