
Keyword Search Result

[Keyword] perception(41hit)

1-20hit(41hit)

  • Learning Pixel Perception for Identity and Illumination Consistency Face Frontalization in the Wild

    Yongtang BAO  Pengfei ZHOU  Yue QI  Zhihui WANG  Qing FAN  

     
    PAPER-Person Image Generation

      Publicized:
    2022/06/21
      Vol:
    E106-D No:5
      Page(s):
    794-803

    Synthesizing a frontal, realistic face image from a single profile face image has a wide range of applications in face recognition. Although frontalization methods based on deep learning have made substantial progress in recent years, there is still no guarantee that the generated face preserves identity consistency and illumination consistency under large poses. This paper proposes a novel pixel-based feature regression generative adversarial network (PFR-GAN), which can learn to recover local high-frequency details and preserve the identity and illumination of frontal face images in an uncontrolled environment. We first propose a Reslu block to obtain richer feature representations and improve the convergence speed of training. We then introduce a feature conversion module to reduce the artifacts caused by face rotation discrepancy, enhance image generation quality, and preserve more high-frequency details of the profile image. We also construct a 30,000 face pose dataset to learn about various uncontrolled field environments. Our dataset covers different ages, races, and in-the-wild backgrounds, allowing us to handle other datasets and obtain better results. Finally, we introduce a discriminator used for recovering the facial structure of the frontal face images. Quantitative and qualitative experimental results show that our PFR-GAN can generate high-quality and high-fidelity frontal face images, and our results are better than the state-of-the-art results.

  • A Method for Generating Color Palettes with Deep Neural Networks Considering Human Perception

    Beiying LIU  Kaoru ARAKAWA  

     
    PAPER-Image, Vision, Neural Networks and Bioengineering

      Publicized:
    2021/09/30
      Vol:
    E105-A No:4
      Page(s):
    639-646

    A method to generate color palettes from images is proposed. Here, deep neural networks (DNNs) are utilized in order to consider human perception. Two aspects of human perception are considered: one is attention to the image, and the other is human preference for colors. This method first extracts N regions with dominant color categories from the image considering human attention. Here, N is the number of colors in a color palette. Then, the representative color is obtained from each region considering the human preference for color. Two deep neural-net systems are adopted here: one estimates the image area that attracts human attention, and the other estimates the colors that humans prefer from image regions to obtain representative colors. The former is trained with target images obtained by an eye tracker, and the latter is trained with a dataset of color selections made by humans. Objective and subjective evaluations are performed to show the high performance of the proposed system compared with conventional methods.

  • Effects of Initial Configuration on Attentive Tracking of Moving Objects Whose Depth in 3D Changes

    Anis Ur REHMAN  Ken KIHARA  Sakuichi OHTSUKA  

     
    PAPER-Vision

      Publicized:
    2021/02/25
      Vol:
    E104-A No:9
      Page(s):
    1339-1344

    In daily reality, people often pay attention to several objects that change positions while being observed. In the laboratory, this process is investigated by a phenomenon known as multiple object tracking (MOT) which is a task that evaluates attentive tracking performance. Recent findings suggest that the attentional set for multiple moving objects whose depth changes in three dimensions from one plane to another is influenced by the initial configuration of the objects. When tracking objects, it is difficult for people to expand their attentional set to multiple-depth planes once attention has been focused on a single plane. However, less is known about people contracting their attentional set from multiple-depth planes to a single-depth plane. In two experiments, we examined tracking accuracy when four targets or four distractors, which were initially distributed on two planes, come together on one of the planes during an MOT task. The results from this study suggest that people have difficulty changing the depth range of their attention during attentive tracking, and attentive tracking performance depends on the initial attentional set based on the configuration prior to attentive tracking.

  • Towards mmWave V2X in 5G and Beyond to Support Automated Driving [Open Access]

    Kei SAKAGUCHI  Ryuichi FUKATSU  Tao YU  Eisuke FUKUDA  Kim MAHLER  Robert HEATH  Takeo FUJII  Kazuaki TAKAHASHI  Alexey KHORYAEV  Satoshi NAGATA  Takayuki SHIMIZU  

     
    INVITED SURVEY PAPER-Terrestrial Wireless Communication/Broadcasting Technologies

      Publicized:
    2020/11/26
      Vol:
    E104-B No:6
      Page(s):
    587-603

    Millimeter wave provides high data rates for Vehicle-to-Everything (V2X) communications. This paper motivates millimeter wave to support automated driving and begins by explaining V2X use cases that support automated driving with references to several standardization bodies. The paper gives a classification of existing V2X standards: IEEE802.11p and LTE V2X, along with the status of their commercial deployment. Then, the paper provides a detailed assessment on how millimeter wave V2X enables the use case of cooperative perception. The explanations provide detailed rate calculations for this use case and show that millimeter wave is the only technology able to achieve the requirements. Furthermore, specific challenges related to millimeter wave for V2X are described, including coverage enhancement and beam alignment. The paper concludes with some results from three studies, i.e. IEEE802.11ad (WiGig) based V2X, extension of 5G NR (New Radio) toward mmWave V2X, and prototypes of intelligent street with mmWave V2X.

  • Perception and Saccades during Figure-Ground Segregation and Border-Ownership Discrimination in Natural Contours

    Nobuhiko WAGATSUMA  Mika URABE  Ko SAKAI  

     
    PAPER-Biocybernetics, Neurocomputing

      Publicized:
    2020/01/27
      Vol:
    E103-D No:5
      Page(s):
    1126-1134

    Figure-ground (FG) segregation has been considered as a fundamental step towards object recognition. We explored plausible mechanisms that estimate global figure-ground segregation from local image features by investigating the human visual system. Physiological studies have reported border-ownership (BO) selective neurons in V2 which signal the local direction of figure (DOF) along a border; however, how local BO signals contribute to global FG segregation has not been clarified. The BO and FG processing could be independent, dependent on each other, or inseparable. The investigation on the differences and similarities between the BO and FG judgements is important for exploring plausible mechanisms that enable global FG estimation from local clues. We performed psychophysical experiments that included two different tasks each of which focused on the judgement of either BO or FG. The perceptual judgments showed consistency between the BO and FG determination while a longer distance in gaze movement was observed in FG segregation than BO discrimination. These results suggest the involvement of distinct neural mechanism for local BO determination and global FG segregation.

  • Analysis of Relevant Quality Metrics and Physical Parameters in Softness Perception and Assessment System

    Zhiyu SHAO  Juan WU  Qiangqiang OUYANG  

     
    PAPER-Rehabilitation Engineering and Assistive Technology

      Publicized:
    2019/06/11
      Vol:
    E102-D No:10
      Page(s):
    2013-2024

    Many quality metrics have been proposed for compliance perception to assess haptic device performance and perceived results. Perceived compliance may be influenced by factors such as object properties, experimental conditions and human perceptual habits. In this paper, softness perception was analyzed to find the relevant quality metrics that dominate the compliance perception system and their correlation with perception results, by expressing these metrics in terms of the basic physical parameters that characterize these factors. Based on three psychophysical experiments, just noticeable differences (JNDs) for the perceived softness of combinations of different stiffness coefficients and damping levels rendered by haptic devices were analyzed. Data from the interaction process were recorded and analyzed. Preliminary experimental results show that the discrimination ability of softness perception changes with the ratio of damping to stiffness when subjects explore at their habitual speed. Analysis results indicate that the quality metrics of Rate-hardness, Extended Rate-hardness and the ratio of damping to stiffness correlate highly with the perceived results. Further analysis shows that parameters that reflect object properties (stiffness, damping), experimental conditions (force bandwidth) and human perceptual habits (initial speed, maximum force change rate) change these quality metrics, which in turn produce different perceptual feelings and finally change the discrimination ability. The findings in this paper may provide a better understanding of softness perception and useful guidance for improving haptic and teleoperation devices.

  • Objective Evaluation of Impression of Faces with Various Female Hairstyles Using Field of Visual Perception

    Naoyuki AWANO  Kana MOROHOSHI  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2018/03/22
      Vol:
    E101-D No:6
      Page(s):
    1648-1656

    Most people are concerned about their appearance, and the easiest way to change the appearance is to change the hairstyle. However, except for professional hairstylists, it is difficult to objectively judge which hairstyle suits them. Currently, oval faces are generally said to be the ideal facial shape in terms of suitability to various hairstyles. Meanwhile, field of visual perception (FVP), proposed recently in the field of cognitive science, has attracted attention as a model to represent the visual perception phenomenon. Moreover, a computation model for digital images has been proposed, and it is expected to be used in quantitative evaluation of sensibility and sensitivity called “kansei.” Quantitative evaluation of “goodness of patterns” and “strength of impressions” by evaluating distributions of the field has been reported. However, it is unknown whether the evaluation method can be generalized for use in various subjects, because it has been applied only to some research subjects, such as characters, text, and simple graphics. In this study, for the first time, we apply FVP to facial images with various hairstyles and verify whether it has the potential of evaluating impressions of female faces. Specifically, we verify whether the impressions of facial images that combine various facial shapes and female hairstyles can be represented using FVP. We prepare many combinational images of facial shapes and hairstyles and conduct a psychological experiment to evaluate their impressions. Moreover, we compute the FVP of each image and propose a novel evaluation method by analyzing the distributions. The conventional and proposed evaluation values correlated to the psychological evaluation values after normalization, and demonstrated the effectiveness of the FVP as an image feature quantity to evaluate faces.

  • Imperceptible On-Screen Markers for Mobile Interaction on Public Large Displays

    Goshiro YAMAMOTO  Luiz SAMPAIO  Takafumi TAKETOMI  Christian SANDOR  Hirokazu KATO  Tomohiro KURODA  

     
    PAPER

      Publicized:
    2017/06/14
      Vol:
    E100-D No:9
      Page(s):
    2027-2036

    We present a novel method to enable users to experience mobile interaction with digital content on external displays by embedding markers imperceptibly on the screen. Our method consists of two parts: marker embedding on external displays and marker detection. To embed markers, similar to previous work, we display complementary colors in alternating frames, which are selected by considering L*a*b color space in order to make the markers harder for humans to detect. Our marker detection process does not require mobile devices to be synchronized with the display, while certain constraints for the relation between camera and display update rate need to be fulfilled. In this paper, we have conducted three experiments. The results show 1) selecting complementary colors in the a*b* color plane maximizes imperceptibility, 2) our method is extremely robust when used with static contents and can handle animated contents up to certain optical flow levels, and 3) our method was proved to work well in case of small movements, but large movements can lead to loss of tracking.

  • Image Quality Assessment Based on Multi-Order Local Features Description, Modeling and Quantification

    Yong DING  Xinyu ZHAO  Zhi ZHANG  Hang DAI  

     
    PAPER-Pattern Recognition

      Publicized:
    2017/03/16
      Vol:
    E100-D No:6
      Page(s):
    1303-1315

    Image quality assessment (IQA) plays an important role in quality monitoring, evaluation and optimization for image processing systems. However, current quality-aware feature extraction methods for IQA can hardly balance accuracy and complexity. This paper introduces multi-order local description into image quality assessment for feature extraction. The first-order structure derivative and high-order discriminative information are integrated into local pattern representation to serve as the quality-aware features. Then joint distributions of the local pattern representation are modeled by spatially enhanced histogram. Finally, the image quality degradation is estimated by quantifying the divergence between such distributions of the reference image and those of the distorted image. Experimental results demonstrate that the proposed method outperforms other state-of-the-art approaches in consideration of not only accuracy that is consistent with human subjective evaluation, but also robustness and stability across different distortion types and various public databases. It provides a promising choice for image quality assessment development.

  • An Image Quality Assessment Using Mean-Centered Weber Ratio and Saliency Map

    Soyoung CHUNG  Min Gyo CHUNG  

     
    LETTER

      Publicized:
    2015/10/21
      Vol:
    E99-D No:1
      Page(s):
    138-140

    Chen proposed an image quality assessment method that evaluates image quality based on the ratio of noise in an image. However, Chen's method has two drawbacks: unnoticeable noise is reflected in the evaluation, and the noise position is not accurately detected. Therefore, in this paper, we propose a new image quality measurement scheme using the mean-centered WLNI (Weber's Law Noise Identifier) and the saliency map. The experimental results show that the proposed method outperforms Chen's and agrees more consistently with human visual judgment.

  • Estimation of Interpersonal Relationships in Movies

    Yuta OHWATARI  Takahiro KAWAMURA  Yuichi SEI  Yasuyuki TAHARA  Akihiko OHSUGA  

     
    PAPER

      Publicized:
    2015/11/05
      Vol:
    E99-D No:1
      Page(s):
    128-137

    In many movies, social conditions and awareness of the issues of the times are depicted in some form. Even though fantasy and science fiction works are far from reality, their character relationships do mirror the real world. Therefore, we try to understand the social conditions of the real world by analyzing movies. As a way to analyze the movies, we propose a method of estimating interpersonal relationships of the characters, using a machine learning technique called Markov Logic Network (MLN) from movie script databases on the Web. The MLN is a probabilistic logic network that can describe relationships between characters that are not necessarily satisfied on every line. In experiments, we confirmed that our proposed method can estimate favors between the characters in a movie with an F-measure of 58.7%. Finally, by comparing the relationships with social indicators, we discussed the relevance of the movies to the real world.

  • Analyzing Perceived Empathy Based on Reaction Time in Behavioral Mimicry

    Shiro KUMANO  Kazuhiro OTSUKA  Masafumi MATSUDA  Junji YAMATO  

     
    PAPER-Affective Computing

      Vol:
    E97-D No:8
      Page(s):
    2008-2020

    This study analyzes emotions established between people while interacting in face-to-face conversation. By focusing on empathy and antipathy, especially the process by which they are perceived by external observers, this paper aims to elucidate the tendency of their perception and from it develop a computational model that realizes the automatic inference of perceived empathy/antipathy. This paper makes two main contributions. First, an experiment demonstrates that an observer's perception of an interacting pair is affected by the time lags found in their actions and reactions in facial expressions and by whether their expressions are congruent or not. For example, a congruent but delayed reaction is unlikely to be perceived as empathy. Based on our findings, we propose a probabilistic model that relates the perceived empathy/antipathy of external observers to the actions and reactions of conversation participants. An experiment is conducted on ten conversations performed by 16 women in which the perceptions of nine external observers are gathered. The results demonstrate that timing cues are useful in improving the inference performance, especially for perceived antipathy.

  • Adaptive Block-Wise Compressive Image Sensing Based on Visual Perception

    Xue ZHANG  Anhong WANG  Bing ZENG  Lei LIU  Zhuo LIU  

     
    LETTER-Image Processing and Video Processing

      Vol:
    E96-D No:2
      Page(s):
    383-386

    Numerous examples in image processing have demonstrated that human visual perception can be exploited to improve processing performance. This paper presents another showcase in which some visual information is employed to guide adaptive block-wise compressive sensing (ABCS) for image data, i.e., a varying CS-sampling rate is applied on different blocks according to the visual contents in each block. To this end, we propose a visual analysis based on the discrete cosine transform (DCT) coefficients of each block reconstructed at the decoder side. The analysis result is sent back to the CS encoder, stage-by-stage via a feedback channel, so that we can decide which blocks should be further CS-sampled and what is the extra sampling rate. In this way, we can perform multiple passes of reconstruction to improve the quality progressively. Simulation results show that our scheme leads to a significant improvement over the existing ones with a fixed sampling rate.

  • Skeleton Modulated Topological Perception Map for Rapid Viewpoint Selection

    Zhenfeng SHI  Liyang YU  Ahmed A. ABD EL-LATIF  Xiamu NIU  

     
    LETTER-Computer Graphics

      Vol:
    E95-D No:10
      Page(s):
    2585-2588

    Incorporating insights from human visual perception into 3D object processing has become an important research field in computer graphics during the past decades. Many computational models for different applications have been proposed, such as mesh saliency, mesh roughness and mesh skeleton. In this letter, we present a novel Skeleton Modulated Topological Visual Perception Map (SMTPM) integrated with visual attention and visual masking mechanisms. A new skeletonisation map is presented and used to modulate the weight of saliency and roughness. Inspired by salient viewpoint selection, a new rapid viewpoint selection algorithm, based on a Loop subdivision stencil decision and using our new visual perception map, is also proposed. Experimental results show that the SMTPM scheme can capture richer visual perception information and that our rapid viewpoint selection achieves high efficiency.

  • A Multi-Scale Structural Degradation Metric for Perceptual Evaluation of 3D Mesh Simplification

    Zhenfeng SHI  Xiamu NIU  Liyang YU  

     
    PAPER-Computer Graphics

      Vol:
    E95-D No:7
      Page(s):
    1989-2001

    Visual degradation is usually introduced during 3D mesh simplification. The main issue in mesh simplification is to maximize the simplification ratio while minimizing the visual degradation. Therefore, effective and objective evaluation of the visual degradation is essential in order to select the simplification ratio. Some objective geometric and subjective perceptual metrics have been proposed. However, few objective metrics have taken human visual characteristics into consideration. To evaluate the visual degradation introduced by mesh simplification for a 3D triangular object, we integrate the structural degradation with mesh saliency and propose a new objective and multi-scale evaluation metric named Global Perceptual Structural Degradation (GPSD). The proper selection of the simplification ratio under a given distance-to-viewpoint is also discussed in this paper. The accuracy and validity of the proposed metric have been demonstrated through subjective experiments. The experimental results confirm that the GPSD metric shows better 3D model-based multi-scale perceptual evaluation capability.

  • Spectral Features for Perceptually Natural Phoneme Replacement by Another Speaker's Speech

    Reiko TAKOU  Hiroyuki SEGI  Tohru TAKAGI  Nobumasa SEIYAMA  

     
    PAPER-Speech and Hearing

      Vol:
    E95-A No:4
      Page(s):
    751-759

    The frequency regions and spectral features that can be used to measure the perceived similarity and continuity of voice quality are reported here. A perceptual evaluation test was conducted to assess the naturalness of spoken sentences in which either a vowel or a long vowel of the original speaker was replaced by that of another. Correlation analysis between the evaluation score and the spectral feature distance was conducted to select the spectral features that were expected to be effective in measuring the voice quality and to identify the appropriate speech segment of another speaker. The mel-frequency cepstrum coefficient (MFCC) and the spectral center of gravity (COG) in the low-, middle-, and high-frequency regions were selected. A perceptual paired comparison test was carried out to confirm the effectiveness of the spectral features. The results showed that the MFCC was effective for spectra across a wide range of frequency regions, the COG was effective in the low- and high-frequency regions, and the effective spectral features differed among the original speakers.

  • Depth Enhancement Considering Just Noticeable Difference in Depth

    Seung-Won JUNG  Sung-Jea KO  

     
    LETTER-Image

      Vol:
    E95-A No:3
      Page(s):
    673-675

    Recent advances in 3-D technologies have drawn interest in the just noticeable difference in depth (JNDD), which describes a perceptual threshold of depth differences. In this letter, we address a new application of the JNDD to depth image enhancement. In the proposed algorithm, a depth image is first segmented into multiple layers, and then the depth range of a layer is expanded if the depth difference between adjacent layers is smaller than the JNDD. Therefore, viewers can effectively perceive the depth differences between layers, and thus human depth perception can be improved. The proposed algorithm can be applied to any depth-based 3-D display application.
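
The layer-expansion idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes each segmented layer is reduced to a single representative depth value, sorted from near to far, and the function name `expand_layer_depths` and the parameter `jndd` are hypothetical names chosen for illustration.

```python
def expand_layer_depths(layer_depths, jndd):
    """Shift layer depths so adjacent layers differ by at least `jndd`.

    Assumption: `layer_depths` holds one representative depth per layer,
    sorted from near to far. The abstract does not specify how layers
    are represented, so this is only an illustration of the idea.
    """
    if not layer_depths:
        return []
    adjusted = [layer_depths[0]]
    for depth in layer_depths[1:]:
        # Keep the original depth when the gap is already perceptible;
        # otherwise push the layer just far enough to reach the threshold.
        adjusted.append(max(depth, adjusted[-1] + jndd))
    return adjusted


layers = [10, 12, 30, 31]
print(expand_layer_depths(layers, jndd=5))  # [10, 15, 30, 35]
```

In this sketch, gaps that already exceed the threshold are left untouched, so only layers that viewers could not distinguish are moved.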

  • Accuracy of Smooth Pursuit Eye Movement and Perception Rate of a False Contour While Pursuing a Rapidly Moving Image

    Yusuke HORIE  Yuta KAWAMURA  Akiyuki SEITA  Mitsuho YAMADA  

     
    LETTER-Vision

      Vol:
    E94-A No:2
      Page(s):
    542-547

    The purpose of this study was to clarify whether viewers can perceive a digitally deteriorated image while pursuing a rapidly moving, digitally compressed image. We studied the perception characteristics of false contours, among the various digital deteriorations, for four types of displays (CRT, PDP, EL, and LCD), using the gradation level and the speed of the moving image as parameters. It is known that 8 bits is not a high enough resolution for still images, and it is assumed that 8 bits is also not enough for an image moving at less than 5 deg/sec, since the tracking accuracy of smooth pursuit eye movement (SPEM) is very high for a target moving at less than 5 deg/sec. Given these facts, we focused on images moving at more than 5 deg/sec. In our results, images deteriorated by a false contour at gradation levels below 32 were perceived by every subject at almost all velocities, from 5 deg/sec to 30 deg/sec, for all four types of displays we used. However, the perception rate decreased drastically when the gradation level reached 64, with almost no subjects detecting deterioration for gradation levels above 64 at any velocity. Compared to the other displays, LCDs yielded relatively high recognition rates at a gradation level of 64, especially at lower velocities.

  • A Semi-Supervised Approach to Perceived Age Prediction from Face Images

    Kazuya UEKI  Masashi SUGIYAMA  Yasuyuki IHARA  

     
    LETTER-Image Recognition, Computer Vision

      Vol:
    E93-D No:10
      Page(s):
    2875-2878

    We address the problem of perceived age estimation from face images, and propose a new semi-supervised approach involving two novel aspects. The first novelty is an efficient active learning strategy for reducing the cost of labeling face samples. Given a large number of unlabeled face samples, we reveal the cluster structure of the data and propose to label cluster-representative samples for covering as many clusters as possible. This simple sampling strategy allows us to boost the performance of a manifold-based semi-supervised learning method only with a relatively small number of labeled samples. The second contribution is to take the heterogeneous characteristics of human age perception into account. It is rare to misjudge the age of a 5-year-old child as 15 years old, but the age of a 35-year-old person is often misjudged as 45 years old. Thus, magnitude of the error is different depending on subjects' age. We carried out a large-scale questionnaire survey for quantifying human age perception characteristics, and propose to utilize the quantified characteristics in the framework of weighted regression. Consequently, our proposed method is expressed in the form of weighted least-squares with a manifold regularizer, which is scalable to massive datasets. Through real-world age estimation experiments, we demonstrate the usefulness of the proposed method.
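
The weighted-regression framing in this abstract can be sketched as follows. This is a minimal illustration of plain weighted least squares for a one-dimensional linear model. It omits the manifold regularizer the abstract mentions, and the function name `weighted_linear_fit`, the single scalar feature, and the example weights are all assumptions made for illustration rather than details from the paper.

```python
def weighted_linear_fit(xs, ys, ws):
    """Fit y = slope * x + intercept by minimizing sum(w * residual**2).

    Assumption: one scalar feature per sample. In the age-estimation
    setting, a larger weight would mark a sample whose age error matters
    more (for example, a young subject).
    """
    total = sum(ws)
    x_mean = sum(w * x for w, x in zip(ws, xs)) / total
    y_mean = sum(w * y for w, y in zip(ws, ys)) / total
    # Weighted covariance of x and y, and weighted variance of x.
    cov = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - x_mean) ** 2 for w, x in zip(ws, xs))
    slope = cov / var
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Hypothetical data lying exactly on y = 2x + 1; the weights are illustrative.
slope, intercept = weighted_linear_fit([0, 1, 2, 3], [1, 3, 5, 7], [4, 1, 1, 0.25])
print(round(slope, 6), round(intercept, 6))
```

When the data do not lie on a single line, samples with larger weights pull the fit toward themselves, which is how age-dependent error tolerance could enter the estimate.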

  • The Influence of a Low-Level Color or Figure Adaptation on a High-Level Face Perception

    Miao SONG  Keizo SHINOMORI  Shiyong ZHANG  

     
    PAPER-Biocybernetics, Neurocomputing

      Vol:
    E93-D No:1
      Page(s):
    176-184

    Visual adaptation is a universal phenomenon associated with the human visual system. This adaptation affects not only the perception of low-level visual systems processing color, motion, and orientation, but also the perception of high-level visual systems processing complex visual patterns, such as facial identity and expression. Although the mechanism of mutual interaction between systems at different levels remains unclear, this issue is key to understanding the hierarchical neural coding and computation mechanism. Thus, we examined whether low-level adaptation influences the high-level aftereffect by means of a cross-level adaptation paradigm (i.e., color or figure adaptation versus facial identity adaptation). We measured the identity aftereffects within the real face test images under real face, color chip, and figure adapting conditions. The cross-level mutual influence was evaluated by the aftereffect size among different adapting conditions. The results suggest that adaptation to color and figure contributes to the high-level facial identity aftereffect. In addition, the real face adaptation produced a significantly stronger aftereffect than the color chip or the figure adaptation. Our results reveal the possibility of cross-level adaptation propagation and implicitly indicate a high-level holistic facial neural representation. Based on these results, we discussed the theoretical implications of cross-level adaptation propagation for understanding hierarchical sensory neural systems.
