Takao YAMANAKA Tatsuya SUZUKI Taiki NOBUTSUNE Chenjunlin WU
Omni-directional images have been used in a wide range of applications, including virtual/augmented reality, self-driving cars, robotics simulators, and surveillance systems. For these applications, it would be useful to estimate saliency maps representing probability distributions of gaze points observed with a head-mounted display, in order to detect important regions in the omni-directional images. This paper proposes a novel saliency-map estimation model for omni-directional images that extracts overlapping 2-dimensional (2D) plane images from the omni-directional images at various directions and angles of view. While 2D saliency maps tend to have high probability at the center of the image (center bias), the high-probability region appears in the horizontal directions in omni-directional saliency maps when a head-mounted display is used (equator bias). Therefore, the 2D saliency model with a center-bias layer was fine-tuned on an omni-directional dataset by replacing the center-bias layer with an equator-bias layer conditioned on the elevation angle at which the 2D plane image is extracted. The limited availability of omni-directional images in saliency datasets can be compensated for by using a well-established 2D saliency model pretrained on a large number of training images with ground-truth 2D saliency maps. In addition, this paper proposes a multi-scale estimation method that extracts 2D images at multiple angles of view to detect objects of various sizes with variable receptive fields. The saliency maps estimated at the multiple angles of view are integrated using pixel-wise attention weights calculated in an integration layer, which assigns the optimal scale to each object. The proposed method was evaluated using a publicly available dataset with evaluation metrics for omni-directional saliency maps, and it was confirmed that the proposed method improves the accuracy of the saliency maps.
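As a rough illustration of the equator bias, the sketch below builds a row-wise prior for a plane image extracted at a given elevation angle; the function name, the assumed vertical field of view, and the Gaussian form of the bias are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def equator_bias(height, width, elevation_deg, sigma_deg=20.0):
    """Row-wise equator-bias prior for a 2D plane image extracted from an
    omni-directional image at a given elevation angle: rows are weighted
    by a Gaussian centered on the equator (elevation 0 degrees)."""
    fov_deg = 60.0  # assumed vertical field of view of the extracted plane
    # Rough linear approximation of each pixel row's elevation angle.
    row_elev = elevation_deg + np.linspace(fov_deg / 2, -fov_deg / 2, height)
    weights = np.exp(-row_elev ** 2 / (2.0 * sigma_deg ** 2))
    return np.tile(weights[:, None], (1, width))

# A plane looking 30 degrees above the horizon receives lower prior
# weight than one looking straight ahead.
bias = equator_bias(256, 256, elevation_deg=30.0)
```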
Yoshiaki UEDA Seiichi KOJIMA Noriaki SUETAKE
In this letter, we propose a color quantization method based on saliency. In the proposed method, salient colors are preferentially selected as representative colors by using saliency values as weights. Through experiments, we verify the effectiveness of the proposed method.
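A minimal sketch of the idea, assuming a k-means-style quantizer with per-pixel saliency as sample weights (the letter's actual selection procedure may differ):

```python
import numpy as np
from sklearn.cluster import KMeans

def saliency_weighted_palette(image, saliency, k=16):
    """Saliency-weighted color quantization: highly salient pixels pull
    the cluster centers, so salient colors are preferentially kept as
    representative colors."""
    pixels = image.reshape(-1, 3).astype(np.float64)
    weights = saliency.reshape(-1).astype(np.float64) + 1e-6
    km = KMeans(n_clusters=k, n_init=4, random_state=0)
    labels = km.fit_predict(pixels, sample_weight=weights)
    return km.cluster_centers_[labels].reshape(image.shape).astype(image.dtype)
```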
Yibo JIANG Hui BI Hui LI Zhihao XU
3D measurement is widely required in modern industries. In this letter, a method based on RGBD saliency detection with depth range adjusting (RGBD-DRA) is proposed for 3D measurement. Using superpixels and prior maps, RGBD saliency detection is utilized to detect and measure the target object automatically. Meanwhile, the proposed depth range adjusting is performed during measurement to further improve the measuring accuracy. The experimental results demonstrate that the proposed method is automatic and accurate, with a maximum deviation value of 3 mm and a maximum deviation rate of 3.77%.
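The abstract leaves the details of the depth range adjusting open; the sketch below only illustrates the general pipeline of measuring a detected salient object from an RGB-D frame with a pinhole camera model. All names, the threshold, and the fixed depth margin are assumptions.

```python
import numpy as np

def measure_salient_object(depth_m, saliency, fx, fy, thresh=0.5):
    """Take the salient region as the target object, restrict the depth
    range around it, and convert its pixel extent to metric size."""
    mask = saliency > thresh                 # assumed non-empty
    obj_depth = np.median(depth_m[mask])     # robust object depth (m)
    ys, xs = np.nonzero(mask)
    width_px = xs.max() - xs.min() + 1
    height_px = ys.max() - ys.min() + 1
    # Pinhole back-projection of pixel extents to metres at that depth.
    width_m = width_px * obj_depth / fx
    height_m = height_px * obj_depth / fy
    # Illustrative "adjusted" depth range for a refinement pass.
    depth_range = (obj_depth - 0.05, obj_depth + 0.05)
    return width_m, height_m, depth_range
```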
Zhengxue CHENG Masaru TAKEUCHI Kenji KANAI Jiro KATTO
Image quality assessment (IQA) is a fundamental problem in the field of image processing. Recently, deep learning-based image quality assessment has attracted increased attention owing to its high prediction accuracy. In this paper, we propose a fully-blind and fast image quality predictor (FFIQP) using convolutional neural networks, incorporating two strategies. First, we propose a distortion clustering strategy based on the distribution function of intermediate-layer results in the convolutional neural network (CNN) to make IQA fully blind. Second, by analyzing the relationship between image saliency information and CNN prediction error, we utilize a pre-saliency map to skip non-salient patches and thereby accelerate IQA. Experimental results verify that our method achieves high accuracy (0.978) against subjective quality scores, outperforming existing IQA methods. Moreover, the proposed method is computationally efficient, offering a flexible complexity-performance trade-off by assigning different thresholds in the saliency map.
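A minimal sketch of the acceleration idea, assuming patches are skipped whenever their mean pre-saliency falls below a threshold (the distortion clustering and the quality CNN itself are omitted):

```python
import numpy as np

def salient_patches(image, saliency, patch=32, thresh=0.3):
    """Keep only patches whose mean pre-saliency exceeds the threshold;
    only these are forwarded to the quality predictor."""
    h, w = saliency.shape
    kept = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            if saliency[y:y + patch, x:x + patch].mean() > thresh:
                kept.append(image[y:y + patch, x:x + patch])
    return kept  # score these patches, then pool their predictions
```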
Hironori TAKIMOTO Syuhei HITOMI Hitoshi YAMAUCHI Mitsuyoshi KISHIHARA Kensuke OKUBO
It is estimated that 80% of the information entering the human brain is obtained through the eyes. It is therefore commonly believed that drawing human attention to particular objects is effective in assisting human activities. In this paper, we propose a novel image modification method for guiding user attention to specific regions of interest, using a saliency map model based on spatial frequency components. We modify the frequency components on the basis of the obtained saliency map to decrease the visual saliency outside the specified region. By applying our modification method to an image, human attention can be guided to the specified region, because the saliency inside the region becomes higher than that outside it. Using gaze measurements, we show that the proposed saliency map matches well with the distribution of actual human attention. Moreover, we evaluate the effectiveness of the proposed modification method using an eye-tracking system.
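As a rough illustration of modifying spatial frequency components, the sketch below low-passes a grayscale image in the Fourier domain and keeps the filtered version only outside the region of interest; the paper's modification is driven by its saliency map model rather than a fixed cutoff.

```python
import numpy as np

def suppress_outside_roi(gray, roi_mask, cutoff=0.08):
    """Attenuate high spatial frequencies outside the ROI so that the
    region of interest becomes relatively more salient."""
    h, w = gray.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    lowpass = (np.sqrt(fx ** 2 + fy ** 2) < cutoff).astype(float)
    blurred = np.real(np.fft.ifft2(np.fft.fft2(gray) * lowpass))
    return np.where(roi_mask, gray, blurred)
```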
Most unsupervised video segmentation algorithms have difficulty handling object extraction in dynamic real-world scenes with large displacements, because the foreground hypothesis is often initialized with no explicit mutual constraint on top-down spatio-temporal coherency, even though such a constraint may be imposed on the segmentation objective. To handle such situations, we propose a multiscale saliency flow (MSF) model that jointly learns both foreground and background features of multiscale salient evidences, allowing temporally coherent top-down information in one frame to be propagated throughout the remaining frames. In particular, the top-down evidences are detected by combining the saliency signature within a certain range of higher scales of approximation coefficients in the wavelet domain. Saliency flow is then estimated by Gaussian kernel correlation of non-maximally suppressed multiscale evidences, which are characterized by HOG descriptors in a high-dimensional feature space. We build the proposed MSF model in accordance with the primary object hypothesis, which jointly integrates temporally consistent constraints of the saliency maps estimated at multiple scales into the objective. We demonstrate the effectiveness of the proposed multiscale saliency flow for segmenting dynamic real-world scenes with large displacements caused by uniform sampling of video sequences.
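A minimal sketch of the kernel-correlation step, assuming two same-sized candidate evidences (patches from consecutive frames) compared via HOG descriptors; the wavelet-domain evidence detection and non-maximal suppression are omitted.

```python
import numpy as np
from skimage.feature import hog

def saliency_flow_score(patch_t, patch_t1, sigma=1.0):
    """Gaussian kernel correlation of HOG descriptors of two candidate
    salient evidences; patches must share the same size so that the
    descriptors align."""
    f1 = hog(patch_t, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    f2 = hog(patch_t1, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    return np.exp(-np.sum((f1 - f2) ** 2) / (2.0 * sigma ** 2))
```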
Chen proposed an image quality assessment method that evaluates image quality based on the ratio of noise in an image. However, Chen's method has some drawbacks: unnoticeable noise is reflected in the evaluation, and noise positions are not accurately detected. Therefore, in this paper, we propose a new image quality measurement scheme using the mean-centered WLNI (Weber's Law Noise Identifier) and the saliency map. The experimental results show that the proposed method outperforms Chen's and agrees more consistently with human visual judgment.
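The exact WLNI formulation is not given in the abstract; the sketch below only illustrates the general idea of flagging noise whose Weber ratio against a mean-centered local background exceeds a just-noticeable difference, and weighting the flags by the saliency map. The threshold and filter size are assumptions.

```python
import numpy as np
from scipy.ndimage import median_filter

def saliency_weighted_noise(gray, saliency, jnd=0.02):
    """Flag pixels whose deviation from the local background, relative
    to the background luminance (Weber's law), is noticeable; weight the
    flags by saliency so noise in salient areas counts more."""
    background = median_filter(gray.astype(np.float64), size=3)
    weber_ratio = np.abs(gray - background) / (background + 1e-6)
    noticeable = weber_ratio > jnd
    return np.sum(noticeable * saliency) / (np.sum(saliency) + 1e-6)
```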
Takatsugu HIRAYAMA Toshiya OHIRA Kenji MASE
Intelligent information systems are attracting people's attention. Examples of such systems include driving support vehicles capable of sensing driver state and communication robots capable of interacting with humans. Modeling how people search for visual information is indispensable for designing these kinds of systems. In this paper, we focus on human visual attention, which is closely related to visual search behavior. We propose a computational model to estimate human visual attention while carrying out a visual target search task. Existing models estimate visual attention using the ratio between a representative value of a visual feature of the target stimulus and that of the distractors or background. These models, however, often cannot achieve good performance on difficult search tasks that require a sequential spotlighting process. For such tasks, the linear separability effect of a visual feature distribution should be considered. Hence, we introduce this effect into spatially localized activation. Concretely, our top-down model estimates target-specific visual attention using Fisher's variance ratio between the visual feature distribution of a local region in the field of view and that of the target stimulus. We confirm the effectiveness of our computational model through a visual search experiment.
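A minimal sketch of the core statistic, assuming scalar features; how the ratio is mapped onto an attention value follows the paper's model and is not reproduced here.

```python
import numpy as np

def fisher_variance_ratio(local_feats, target_feats):
    """Fisher's variance ratio between the visual feature distribution
    of a local region in the field of view and that of the target
    stimulus: between-class separation over within-class scatter."""
    m_l, m_t = local_feats.mean(), target_feats.mean()
    v_l, v_t = local_feats.var(), target_feats.var()
    return (m_l - m_t) ** 2 / (v_l + v_t + 1e-6)
```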
Hironori TAKIMOTO Tatsuhiko KOKUI Hitoshi YAMAUCHI Mitsuyoshi KISHIHARA Kensuke OKUBO
It is commonly believed that, to improve interaction between humans and electronic devices, it is effective to draw the viewer's attention to a particular object. Augmented reality (AR) applications can call attention to real objects by overlaying highlight effects or visual stimuli (such as arrows) on a physical scene. Sometimes, more subtle effects are desirable, in which case it is necessary to smoothly and naturally guide the user's gaze without external stimuli. Here, a novel image modification method is proposed for directing a viewer's gaze to specific regions of interest. The proposed method uses saliency analysis and color modulation to create modified images in which the region of interest is the most salient region in the entire image. The saliency map model used during saliency analysis reduces computational costs and improves the naturalness of the image by using the LAB color space and simplified normalization. During color modulation, the modulation value of each LAB component is determined by considering the relationship between the LAB components and the saliency value. With the image obtained in this manner, the viewer's attention is smoothly and naturally attracted to the specific region. Gaze measurements as well as subjective experiments were conducted to prove the effectiveness of the proposed method. The results show that a viewer's visual attention is indeed attracted toward the specified region without any sense of discomfort or disruption when the proposed method is used.
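A rough sketch of the color-modulation step, assuming fixed shifts of the L*a*b* components inside the region of interest; in the paper, the modulation value of each component is determined from its relationship to the saliency value.

```python
import numpy as np
from skimage import color

def modulate_roi(rgb, roi_mask, shifts=(8.0, 6.0, 6.0)):
    """Shift the L*, a*, b* components inside the ROI so that it becomes
    the most salient region, then convert back to RGB. `rgb` is a float
    image in [0, 1]; the shift amounts are illustrative assumptions."""
    lab = color.rgb2lab(rgb)
    for ch, delta in enumerate(shifts):
        lab[..., ch] = np.where(roi_mask, lab[..., ch] + delta, lab[..., ch])
    return np.clip(color.lab2rgb(lab), 0.0, 1.0)
```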
Xing ZHANG Keli HU Lei WANG Xiaolin ZHANG Yingguan WANG
In this study, we address the problem of salient region detection. Recently, contrast-based saliency detection approaches have been shown to give promising results. However, different individual features exhibit different performance. In this paper, we show that the combination of color uniqueness and color spatial distribution is an effective way to detect saliency. A Color Adaptive Thresholding Watershed Fusion Segmentation (CAT-WFS) method is first given to retain boundary information and delete unnecessary details. Based on the segmentation, color uniqueness and color spatial distribution are defined separately. The color uniqueness denotes the color rareness of the salient object, while the color spatial distribution represents the color attribute of the background. Aiming at highlighting the salient object and downplaying the background, we combine the two characteristics to generate the final saliency map. Experimental results demonstrate that the proposed algorithm outperforms existing salient object detection methods.
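A minimal sketch of combining the two cues over N pre-segmented regions, loosely following the abstract's description; the exact definitions and the CAT-WFS segmentation are not reproduced.

```python
import numpy as np

def combine_cues(seg_colors, seg_positions, k=6.0):
    """seg_colors: (N, 3) mean colors; seg_positions: (N, 2) centroids.
    Uniqueness rewards rare colors; spatial distribution penalizes
    colors spread widely over the image (background-like)."""
    diff = seg_colors[:, None, :] - seg_colors[None, :, :]
    color_dist = np.linalg.norm(diff, axis=2)            # (N, N)
    uniqueness = color_dist.sum(axis=1)
    sim = np.exp(-color_dist ** 2)                       # color similarity
    mu = (sim[:, :, None] * seg_positions[None]).sum(1) / sim.sum(1, keepdims=True)
    spread = (sim * np.linalg.norm(seg_positions[None] - mu[:, None], axis=2) ** 2).sum(1) / sim.sum(1)
    return uniqueness * np.exp(-k * spread / (spread.max() + 1e-6))
```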
Xin HE Huiyun JING Qi HAN Xiamu NIU
Existing salient object detection methods either simply use a threshold to detect desired salient objects from a saliency map or search for the most promising rectangular window covering salient objects on the saliency map. There are two problems with the existing methods: 1) the performance of threshold-dependent methods depends on threshold selection, and it is difficult to select an appropriate threshold value; 2) a rectangular window not only covers the salient object but also contains background pixels, which leads to imprecise salient object detection. To solve these problems, a novel saliency threshold-free method for detecting a salient object with a well-defined boundary is proposed in this paper. We propose a novel window search algorithm to locate a rectangular window on our saliency map that contains as many pixels belonging to the salient object, and as few background pixels, as possible. Once the window is determined, GrabCut is applied to extract the salient object with a well-defined boundary. Compared with existing methods, our approach does not need any threshold to binarize the saliency map or any additional operations. Experimental results show that our approach outperforms 4 state-of-the-art salient object detection methods, yielding higher precision and a better F-measure.
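Once the window is determined, the GrabCut step can be sketched directly with OpenCV (the window search algorithm itself is the paper's contribution and is omitted here):

```python
import cv2
import numpy as np

def extract_salient_object(bgr, window):
    """Refine a rectangular window (x, y, w, h) into a salient object
    mask with a well-defined boundary using GrabCut."""
    mask = np.zeros(bgr.shape[:2], np.uint8)
    bgd = np.zeros((1, 65), np.float64)  # background GMM parameters
    fgd = np.zeros((1, 65), np.float64)  # foreground GMM parameters
    cv2.grabCut(bgr, mask, window, bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
    # Definite and probable foreground pixels form the object.
    return ((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)).astype(np.uint8)
```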
Xin HE Huiyun JING Qi HAN Xiamu NIU
We propose a novel saliency detection model based on Bayes' theorem. The model integrates the two parts of Bayes' equation to measure saliency, each of which was considered separately in previous models. The proposed model measures saliency by computing the local kernel density estimation of features in the center-surround region and the global kernel density estimation of features at each pixel across the whole image. Under the proposed model, a saliency detection method is presented that extracts the DCT (Discrete Cosine Transform) magnitude of the local region around each pixel as the feature. Experiments show that the proposed model not only performs competitively on psychological patterns and better than current state-of-the-art models on human visual fixation data, but is also robust against signal uncertainty.
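A minimal sketch of the two density terms, assuming low-frequency DCT-magnitude features and Gaussian KDE; the way the local and global terms are combined below is one plausible reading of the abstract, not the paper's exact equation.

```python
import numpy as np
from scipy.fftpack import dct
from scipy.stats import gaussian_kde

def patch_feature(patch):
    # DCT-magnitude feature of a local patch (low frequencies kept).
    c = dct(dct(patch, axis=0, norm='ortho'), axis=1, norm='ortho')
    return np.abs(c[:2, :2]).ravel()

def bayes_saliency_at(features, center_idx, surround_idx):
    """features: (n, d) array of patch features over the image.
    Combine the local density of the center feature under its surround
    with its global rarity over the whole image. The KDE needs more
    samples than feature dimensions to stay non-singular."""
    f = features[center_idx]
    p_local = gaussian_kde(features[surround_idx].T)(f)[0]
    p_global = gaussian_kde(features.T)(f)[0]
    return p_local / (p_global + 1e-9)  # globally rare => salient
```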
Jingjing ZHONG Siwei LUO Jiao WANG
The key problem of object-based attention is the definition of objects, while contour grouping methods aim at detecting the complete boundaries of objects in images. In this paper, we develop a new contour grouping method with several notable characteristics. First, it is guided by global saliency information. By detecting multiple boundaries in a hierarchical way, we in effect construct an object-based attention model. Second, it is optimized by a grouping cost, which is determined both by Gestalt cues of directed tangents and by region saliency. Third, it gives a new definition of Gestalt cues for tangents that includes image information as well as tangent information. In this way, we improve the robustness of our model against noise. Experimental results are presented, with comparisons against another grouping model and a space-based attention model.
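As a rough illustration, a grouping cost between two directed tangents might combine Gestalt proximity and continuity cues with region saliency as sketched below; the weighting and exact terms are assumptions, not the paper's formulation.

```python
import numpy as np

def grouping_cost(p1, d1, p2, d2, region_saliency, alpha=0.5):
    """p1, p2: tangent base points; d1, d2: unit direction vectors.
    Smooth, nearby continuations are cheap (proximity/continuity), and
    the cost drops further when the enclosed region is salient."""
    gap = p2 - p1
    proximity = np.linalg.norm(gap)
    # Penalize turning away from a straight continuation (0 = smooth).
    continuity = 2.0 - d1 @ (gap / (proximity + 1e-6)) - d1 @ d2
    return alpha * (proximity + continuity) + (1.0 - alpha) * (1.0 - region_saliency)
```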