Keyword Search Result

[Keyword] visual attention (19 hits)

1-19 hits
  • Prediction of Driver's Visual Attention in Critical Moment Using Optical Flow

    Rebeka SULTANA  Gosuke OHASHI  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2023/01/26
      Vol:
    E106-D No:5
      Page(s):
    1018-1026

    In recent years, driver's visual attention has been actively studied for driving automation technology. However, few models provide insight into the driver's attention at various moments. All attention models process multi-level image representations with a two-stream/multi-stream network, which increases the computational cost because of the larger number of model parameters. Nevertheless, multi-level image representations such as optical flow play a vital role in tasks involving videos. Therefore, to reduce the computational cost of a two-stream network while still using multi-level image representations, this work proposes a single-stream driver's visual attention model for critical situations. The experiment was conducted using a publicly available critical driving dataset named BDD-A. Qualitative results confirm the effectiveness of the proposed model. Moreover, quantitative results show that the proposed model outperforms state-of-the-art visual attention models according to CC and SIM. Extensive ablation studies examine the presence of optical flow in the model, the position of optical flow in the spatial network, the convolution layers used to process optical flow, and the computational cost compared to a two-stream model.
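
The abstract above reports results in terms of CC and SIM. As a rough illustration, CC is commonly computed as the Pearson correlation between a predicted and a ground-truth attention map, and SIM as the histogram intersection of the two maps after each is normalized to sum to 1. The sketch below assumes those common definitions and small maps stored as nested lists; the function names are hypothetical and this is not the paper's evaluation code.

```python
import math


def _flatten(saliency_map):
    """Flatten a 2-D list of non-negative floats into a 1-D list."""
    return [v for row in saliency_map for v in row]


def cc(pred, gt):
    """Pearson linear correlation coefficient between two maps."""
    p, g = _flatten(pred), _flatten(gt)
    n = len(p)
    mean_p, mean_g = sum(p) / n, sum(g) / n
    cov = sum((a - mean_p) * (b - mean_g) for a, b in zip(p, g))
    norm_p = math.sqrt(sum((a - mean_p) ** 2 for a in p))
    norm_g = math.sqrt(sum((b - mean_g) ** 2 for b in g))
    if norm_p == 0 or norm_g == 0:
        return 0.0  # assumption: a constant map is treated as uncorrelated
    return cov / (norm_p * norm_g)


def sim(pred, gt):
    """Histogram intersection of the two maps, each normalized to sum to 1."""
    p, g = _flatten(pred), _flatten(gt)
    sum_p, sum_g = sum(p), sum(g)
    if sum_p == 0 or sum_g == 0:
        return 0.0  # assumption: an all-zero map has no overlap
    return sum(min(a / sum_p, b / sum_g) for a, b in zip(p, g))
```

Under these definitions, identical maps give CC and SIM values of 1, while maps with no overlapping mass give a SIM of 0.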

  • Image Modification Based on Spatial Frequency Components for Visual Attention Retargeting

    Hironori TAKIMOTO  Syuhei HITOMI  Hitoshi YAMAUCHI  Mitsuyoshi KISHIHARA  Kensuke OKUBO  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2017/03/15
      Vol:
    E100-D No:6
      Page(s):
    1339-1349

    It is estimated that 80% of the information entering the human brain is obtained through the eyes. Therefore, it is commonly believed that drawing human attention to particular objects is effective in assisting human activities. In this paper, we propose a novel image modification method for guiding user attention to specific regions of interest by using a novel saliency map model based on spatial frequency components. We modify the frequency components on the basis of the obtained saliency map to decrease the visual saliency outside the specified region. By applying our modification method to an image, human attention can be guided to the specified region because the saliency inside the region is higher than that outside the region. Using gaze measurements, we show that the proposed saliency map matches well with the distribution of actual human attention. Moreover, we evaluate the effectiveness of the proposed modification method by using an eye tracking system.

  • Top-Down Visual Attention Estimation Using Spatially Localized Activation Based on Linear Separability of Visual Features

    Takatsugu HIRAYAMA  Toshiya OHIRA  Kenji MASE  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2015/09/10
      Vol:
    E98-D No:12
      Page(s):
    2308-2316

    Intelligent information systems captivate people's attention. Examples of such systems include driving support vehicles capable of sensing driver state and communication robots capable of interacting with humans. Modeling how people search visual information is indispensable for designing these kinds of systems. In this paper, we focus on human visual attention, which is closely related to visual search behavior. We propose a computational model to estimate human visual attention while carrying out a visual target search task. Existing models estimate visual attention using the ratio between a representative value of a visual feature of a target stimulus and that of distractors or background. These models, however, often fail to perform well on difficult search tasks that require a sequential spotlighting process. For such tasks, the linear separability effect of a visual feature distribution should be considered. Hence, we introduce this effect to spatially localized activation. Concretely, our top-down model estimates target-specific visual attention using Fisher's variance ratio between a visual feature distribution of a local region in the field of view and that of a target stimulus. We confirm the effectiveness of our computational model through a visual search experiment.
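
The abstract above names Fisher's variance ratio between two feature distributions. A minimal sketch for one-dimensional features follows, assuming the common form (difference of means squared, divided by the sum of the variances). The function name and the sample values are hypothetical, and the abstract does not state how the paper maps this ratio to an attention value.

```python
from statistics import fmean, pvariance


def fisher_ratio(region_features, target_features):
    """Fisher's variance ratio for two 1-D samples of a visual feature.

    A large value means the two distributions are well separated;
    a value of 0 means their means coincide.
    """
    mu_r, mu_t = fmean(region_features), fmean(target_features)
    var_r, var_t = pvariance(region_features), pvariance(target_features)
    denom = var_r + var_t
    if denom == 0:
        # Assumption: two zero-variance samples are either identical
        # (ratio 0) or perfectly separated (ratio infinity).
        return 0.0 if mu_r == mu_t else float("inf")
    return (mu_r - mu_t) ** 2 / denom


# Hypothetical hue samples from a local region and from the target stimulus.
local_region = [1.0, 2.0, 3.0]
target = [7.0, 8.0, 9.0]
ratio = fisher_ratio(local_region, target)
```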

  • Image Modification Based on a Visual Saliency Map for Guiding Visual Attention

    Hironori TAKIMOTO  Tatsuhiko KOKUI  Hitoshi YAMAUCHI  Mitsuyoshi KISHIHARA  Kensuke OKUBO  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2015/08/13
      Vol:
    E98-D No:11
      Page(s):
    1967-1975

    It is commonly believed that, to improve interaction between humans and electronic devices, it is effective to draw the viewer's attention to a particular object. Augmented reality (AR) applications can call attention to real objects by overlaying highlight effects or visual stimuli (such as arrows) on a physical scene. Sometimes, more subtle effects would be desirable, in which case it would be necessary to smoothly and naturally guide the user's gaze without external stimuli. Here, a novel image modification method is proposed for directing a viewer's gaze to specific regions of interest. The proposed method uses saliency analysis and color modulation to create modified images in which the region of interest is the most salient region in the entire image. The proposed saliency map model that is used during saliency analysis reduces computational costs and improves the naturalness of the image using the LAB color space and simplified normalization. During color modulation, the modulation value of each LAB component is determined in order to consider the relationship between the LAB components and the saliency value. With the image obtained in this manner, the viewer's attention is smoothly and naturally attracted to a specific region. Gaze measurements as well as subjective experiments were conducted to demonstrate the effectiveness of the proposed method. These results show that a viewer's visual attention is indeed attracted toward the specified region without any sense of discomfort or disruption when the proposed method is used.

  • Selective Attention Mechanisms for Visual Quality Assessment

    Ulrich ENGELKE  

     
    INVITED PAPER

      Vol:
    E98-A No:8
      Page(s):
    1681-1688

    Selective visual attention is an integral mechanism of the human visual system that is often neglected when designing perceptually relevant image and video quality metrics. Disregarding attention mechanisms assumes that all distortions in the visual content impact equally on the overall quality perception, which is typically not the case. Over the past years we have performed several experiments to study the effect of visual attention on quality perception. In addition to gaining a deeper scientific understanding of this matter, we were also able to use this knowledge to further improve various quality prediction models. In this article, I review our work with the aim of increasing awareness of the importance of visual attention mechanisms for the effective design of quality prediction models.

  • Hybrid Integration of Visual Attention Model into Image Quality Metric

    Chanho JUNG  

     
    LETTER-Image Processing and Video Processing

      Publicized:
    2014/08/22
      Vol:
    E97-D No:11
      Page(s):
    2971-2973

    Integrating the visual attention (VA) model into an objective image quality metric is a rapidly evolving area in modern image quality assessment (IQA) research due to the significant opportunities the VA information presents. So far, the literature has suggested using either a task-free saliency map or a quality-task one for integration into a quality metric. A hybrid integration approach that takes advantage of both saliency maps is presented in this paper. We compare our hybrid integration scheme with existing integration schemes using simple quality metrics. Results show that the proposed method performs better than the previous techniques in terms of prediction accuracy.

  • Distribution of Attention in Augmented Reality: Comparison between Binocular and Monocular Presentation Open Access

    Akihiko KITAMURA  Hiroshi NAITO  Takahiko KIMURA  Kazumitsu SHINOHARA  Takashi SASAKI  Haruhiko OKUMURA  

     
    INVITED PAPER

      Vol:
    E97-C No:11
      Page(s):
    1081-1088

    This study investigated the distribution of attention to frontal space in augmented reality (AR). We conducted two experiments to compare binocular and monocular observation when an AR image was presented. According to a previous study, when participants observed an AR image in monocular presentation, they perceived the AR image as more distant than in binocular vision. Therefore, we predicted that attention would need to be shifted between the AR image and the background in the binocular observation but not in the monocular one. This would enable an observer to distribute his/her visual attention across a wider space in the monocular observation. In the experiments, participants performed two tasks concurrently to measure the size of the useful field of view (UFOV). One task was letter/number discrimination in which an AR image was presented in the central field of view (the central task). The other task was luminance change detection in which dots were presented in the peripheral field of view (the peripheral task). A depth difference existed between the AR image and the location of the peripheral task in Experiment 1 but not in Experiment 2. The results of Experiment 1 indicated that the UFOV became wider in the monocular observation than in the binocular observation. In Experiment 2, the size of the UFOV in the monocular observation was equivalent to that in the binocular observation. It becomes difficult for a participant to observe the stimuli on the background in the binocular observation when there is a depth difference between the AR image and the background. These results indicate that the monocular presentation in AR is superior to binocular presentation, and that even under the conditions most favorable to binocular presentation, the monocular presentation is equivalent to it in terms of the UFOV.

  • Salient Region Detection Based on Color Uniqueness and Color Spatial Distribution

    Xing ZHANG  Keli HU  Lei WANG  Xiaolin ZHANG  Yingguan WANG  

     
    LETTER-Image Recognition, Computer Vision

      Vol:
    E97-D No:7
      Page(s):
    1933-1936

    In this study, we address the problem of salient region detection. Recently, saliency detection with contrast-based approaches has been shown to give promising results. However, different individual features exhibit different performance. In this paper, we show that the combination of color uniqueness and color spatial distribution is an effective way to detect saliency. A Color Adaptive Thresholding Watershed Fusion Segmentation (CAT-WFS) method is first given to retain boundary information and delete unnecessary details. Based on the segmentation, color uniqueness and color spatial distribution are defined separately. The color uniqueness denotes the color rareness of the salient object, while the color spatial distribution represents the color attribute of the background. Aiming at highlighting the salient object and downplaying the background, we combine the two characteristics to generate the final saliency map. Experimental results demonstrate that the proposed algorithm outperforms existing salient object detection methods.

  • Indoor Scene Classification Based on the Bag-of-Words Model of Local Feature Information Gain

    Rong WANG  Zhiliang WANG  Xirong MA  

     
    LETTER-Image Recognition, Computer Vision

      Vol:
    E96-D No:4
      Page(s):
    984-987

    For the problem of Indoor Home Scene Classification, this paper proposes the BOW Model of Local Feature Information Gain. The experimental results show that performance is improved and computation is reduced. Consequently, this method outperforms the state-of-the-art approach.

  • Computational Models of Human Visual Attention and Their Implementations: A Survey Open Access

    Akisato KIMURA  Ryo YONETANI  Takatsugu HIRAYAMA  

     
    INVITED SURVEY PAPER

      Vol:
    E96-D No:3
      Page(s):
    562-578

    We humans are easily able to instantaneously detect the regions in a visual scene that are most likely to contain something of interest. Exploiting this pre-selection mechanism called visual attention for image and video processing systems would make them more sophisticated and therefore more useful. This paper briefly describes various computational models of human visual attention and their development, as well as related psychophysical findings. In particular, our objective is to carefully distinguish several types of studies related to human visual attention and saliency as a measure of attentiveness, and to provide a taxonomy from several viewpoints such as the main objective, the use of additional cues and mathematical principles. This survey finally discusses possible future directions for research into human visual attention and saliency computation.

  • Skeleton Modulated Topological Perception Map for Rapid Viewpoint Selection

    Zhenfeng SHI  Liyang YU  Ahmed A. ABD EL-LATIF  Xiamu NIU  

     
    LETTER-Computer Graphics

      Vol:
    E95-D No:10
      Page(s):
    2585-2588

    Incorporating insights from human visual perception into 3D object processing has become an important research field in computer graphics during the past decades. Many computational models for different applications have been proposed, such as mesh saliency, mesh roughness and mesh skeleton. In this letter, we present a novel Skeleton Modulated Topological Visual Perception Map (SMTPM) integrated with visual attention and visual masking mechanisms. A new skeletonisation map is presented and used to modulate the weight of saliency and roughness. Inspired by salient viewpoint selection, a new Loop subdivision stencil decision based rapid viewpoint selection algorithm using our new visual perception is also proposed. Experimental results show that the SMTPM scheme can capture richer visual perception information and that our rapid viewpoint selection achieves high efficiency.

  • Global-Context Based Salient Region Detection in Nature Images

    Hong BAO  De XU  Yingjun TANG  

     
    LETTER-Image Recognition, Computer Vision

      Vol:
    E95-D No:5
      Page(s):
    1556-1559

    Visual saliency detection provides an alternative methodology to image description in many applications such as adaptive content delivery and image retrieval. One of the main aims of visual attention in computer vision is to detect and segment the salient regions in an image. In this paper, we employ matrix decomposition to detect salient objects in nature images. To efficiently eliminate high contrast noise regions in the background, we integrate global context information into saliency detection. Therefore, the most salient region can be easily selected as the one which is globally most isolated. The proposed approach intrinsically provides an alternative methodology to model attention with low implementation complexity. Experiments show that our approach achieves much better performance than existing state-of-the-art methods.

  • A Novel Bayes' Theorem-Based Saliency Detection Model

    Xin HE  Huiyun JING  Qi HAN  Xiamu NIU  

     
    LETTER-Image Recognition, Computer Vision

      Vol:
    E94-D No:12
      Page(s):
    2545-2548

    We propose a novel saliency detection model based on Bayes' theorem. The model integrates the two parts of Bayes' equation to measure saliency, each part of which was considered separately in the previous models. The proposed model measures saliency by computing local kernel density estimation of features in the center-surround region and global kernel density estimation of features at each pixel across the whole image. Under the proposed model, a saliency detection method is presented that extracts DCT (Discrete Cosine Transform) magnitude of local region around each pixel as the feature. Experiments show that the proposed model not only performs competitively on psychological patterns and better than the current state-of-the-art models on human visual fixation data, but also is robust against signal uncertainty.

  • A Novel Saliency-Based Graph Learning Framework with Application to CBIR

    Hong BAO  Song-He FENG  De XU  Shuoyan LIU  

     
    LETTER-Image Recognition, Computer Vision

      Vol:
    E94-D No:6
      Page(s):
    1353-1356

    Localized content-based image retrieval (LCBIR) has recently emerged as a hot topic because, in the CBIR scenario, the user is interested in a portion of the image and the rest of the image is irrelevant. In this paper, we propose a novel region-level relevance feedback method to solve the LCBIR problem. Firstly, the visual attention model is employed to measure the regional saliency of each image in the feedback image set provided by the user. Secondly, the regions in the image set are used to form an affinity matrix, and a novel propagation energy function is defined which takes both low-level visual features and regional significance into consideration. After the iteration, regions in the positive images with high confidence scores are selected as the candidate query set to conduct the next-round retrieval task until the retrieval results are satisfactory. Experimental results on the SIVAL dataset demonstrate the effectiveness of the proposed approach.

  • Adaptive Non-linear Intensity Mapping Based Salient Region Extraction

    Congyan LANG  De XU  Shuoyan LIU  Ning LI  

     
    LETTER-Image Recognition, Computer Vision

      Vol:
    E92-D No:4
      Page(s):
    753-756

    Salient Region Extraction provides an alternative methodology to image description in many applications such as adaptive content delivery and image retrieval. In this paper, we propose a robust approach to extracting the salient region based on bottom-up visual attention. The main contributions are twofold: 1) Instead of the feature parallel integration, the proposed saliencies are derived by serial processing between texture and color features. Hence, the proposed approach intrinsically provides an alternative methodology to model attention with low implementation complexity. 2) A constructive approach is proposed for rendering an image by a non-linear intensity mapping, which can efficiently eliminate high contrast noise regions in the image. The salient map can then be robustly generated for a variety of nature images. Experiments show that the proposed algorithm is effective and can characterize human perception well.

  • Combining Attention Model with Hierarchical Graph Representation for Region-Based Image Retrieval

    Song-He FENG  De XU  Bing LI  

     
    LETTER-Image Recognition, Computer Vision

      Vol:
    E91-D No:8
      Page(s):
    2203-2206

    The manifold-ranking algorithm has been successfully adopted in content-based image retrieval (CBIR) in recent years. However, while the global low-level features are widely utilized in current systems, region-based features have received little attention. In this paper, a novel attention-driven transductive framework based on a hierarchical graph representation is proposed for region-based image retrieval (RBIR). This approach can be characterized by two key properties: (1) Since region significance is the key problem in region-based retrieval, a visual attention model is chosen here to measure the regions' significance. (2) A hierarchical graph representation which combines region-level with image-level similarities is utilized for the manifold-ranking method. A novel propagation energy function is defined which takes both low-level visual features and regional significance into consideration. Experimental results demonstrate that the proposed approach shows satisfactory retrieval performance compared to the global-based and the block-based manifold-ranking methods.

  • Modeling Bottom-Up Visual Attention for Color Images

    Congyan LANG  De XU  Ning LI  

     
    LETTER-Image Processing and Video Processing

      Vol:
    E91-D No:3
      Page(s):
    869-872

    Modeling visual attention provides an alternative methodology to image description in many applications such as adaptive content delivery and image retrieval. In this paper, we propose a robust approach to modeling bottom-up visual attention. The main contributions are twofold: 1) We use a principal component analysis (PCA) to transform the RGB color space into three principal components, which intrinsically leads to an opponent representation of colors to ensure good saliency analysis. 2) A practicable framework for modeling visual attention is presented based on a region-level reliability analysis for each feature map. The salient map can then be robustly generated for a variety of nature images. Experiments show that the proposed algorithm is effective and can characterize human perception well.

  • A Visual Attention Based Region-of-Interest Determination Framework for Video Sequences

    Wen-Huang CHENG  Wei-Ta CHU  Ja-Ling WU  

     
    PAPER-Image Processing and Multimedia Systems

      Vol:
    E88-D No:7
      Page(s):
    1578-1586

    This paper presents a framework for automatic video region-of-interest determination based on a visual attention model. We view this work as a preliminary step towards the solution of high-level semantic video analysis. To address this challenging issue, we make a set of attempts at using video attention features and knowledge of computational media aesthetics. The three types of visual attention features we used are intensity, color, and motion. Referring to aesthetic principles, these features are combined according to camera motion types on the basis of a newly proposed video analysis unit, the frame-segment. We conduct subjective experiments on several kinds of video data and demonstrate the effectiveness of the proposed framework.

  • An Adaptive Visual Attentive Tracker with HMM-Based TD Learning Capability for Human Intended Behavior

    Minh Anh Thi HO  Yoji YAMADA  Yoji UMETANI  

     
    PAPER-Artificial Intelligence, Cognitive Science

      Vol:
    E86-D No:6
      Page(s):
    1051-1058

    In this study, we build a system called Adaptive Visual Attentive Tracker (AVAT) for the purpose of developing a non-verbal communication channel between the system and an operator who presents intended movements. In the system, we constructed an HMM (Hidden Markov Model)-based TD (Temporal Difference) learning algorithm to track and zoom in on an operator's behavioral sequence which represents his/her intention. AVAT extracts human intended movements from ordinary walking behavior based on the following two algorithms: the first models the movements of human body parts using an HMM algorithm, and the second learns the model of the tracker's action using a model-based TD learning algorithm. In the paper, we describe the integrated algorithm of the above two methods, whose linkage is established by assigning the state transition probability in the HMM as a reward in TD learning. Experimental results of extracting an operator's hand sign action sequence during her natural walking motion are shown, which demonstrate the function of AVAT as it is developed within the framework of perceptual organization. Identification of the sign gesture context through wavelet analysis autonomously provides a reward value for optimizing AVAT's action patterns.
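
The abstract above links the two algorithms by using an HMM state transition probability as the reward in TD learning. The sketch below illustrates that idea with a plain tabular TD(0) value update, which is a simplification and not the model-based algorithm of the paper. The transition matrix, the learning rate `alpha`, the discount factor `gamma`, and the function name are all hypothetical.

```python
def td0_update(values, state, next_state, reward, alpha=0.1, gamma=0.9):
    """Apply one tabular TD(0) update in place and return the TD error.

    V(s) <- V(s) + alpha * (reward + gamma * V(s') - V(s))
    """
    td_error = reward + gamma * values[next_state] - values[state]
    values[state] += alpha * td_error
    return td_error


# Hypothetical 2-state HMM transition matrix: transition[i][j] = P(j | i).
transition = [
    [0.8, 0.2],
    [0.3, 0.7],
]

# Value estimates for the two states, initialized to zero.
values = [0.0, 0.0]

# Observed move from state 0 to state 1; the reward is taken to be the
# HMM transition probability of that move, as the abstract describes.
error = td0_update(values, state=0, next_state=1, reward=transition[0][1])
```

With these numbers, a likely transition yields a larger reward than an unlikely one, so state values grow faster along sequences the HMM considers probable.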