The search functionality is under construction.

Keyword Search Result

[Keyword] human(269hit)

81-100hit(269hit)

  • Backchannel Prediction for Mandarin Human-Computer Interaction

    Xia MAO  Yiping PENG  Yuli XUE  Na LUO  Alberto ROVETTA  

     
    PAPER-Human-computer Interaction

      Publicized:
    2015/03/02
      Vol:
    E98-D No:6
      Page(s):
    1228-1237

    In recent years, researchers have tried to create unhindered human-computer interaction by giving virtual agents human-like conversational skills. Predicting backchannel feedback for agent listeners has become a novel research hot-spot. The main goal of this paper is to identify appropriate features and methods for backchannel prediction in Mandarin conversations. Firstly, multimodal Mandarin conversations are recorded for the analysis of backchannel behaviors. In order to eliminate individual differences in the original face-to-face conversations, more backchannels from different listeners are gathered together. These data confirm that backchannels occurring in the speakers' pauses form a vast majority in Mandarin conversations. Both prosodic and visual features are used in backchannel prediction. Four types of models based on the speakers' pauses are built by using support vector machine classifiers. An evaluation of the pause-based prediction model has shown relatively high accuracy in consideration of the optional nature of backchannel feedback. Finally, the results of the subjective evaluation validate that conversations between humans and virtual listeners using backchannels predicted by the proposed models are less hindered than those using other backchannel prediction methods.

  • Model for Estimating Effects of Human Body Shadowing in High Frequency Bands

    Ngochao TRAN  Tetsuro IMAI  Yukihiko OKUMURA  

     
    PAPER

      Vol:
    E98-B No:5
      Page(s):
    773-782

    In this paper, we propose a simple model for estimating the effects of human body shadowing (HBS) in high frequency bands. The model includes two factors: the shadowing width (SW), which is the width of the area with shadowing loss values greater than 0dB, and the median shadowing loss value (MSLV), which is obtained by taking the median of the shadowing loss values within the SW. These factors are determined by formulas using parameters, i.e. frequency, distance between the base station (BS) and human body, distance between the terminal and human body, BS antenna height, and direction of the human body. To obtain the formulas, a method for calculating the effects of HBS based on the uniform theory of diffraction (UTD) and a human body model comprising lossy dielectric flat plates is proposed and verified. Then, the general forms of the formulas are predicted using the theory of knife-edge diffraction (KE). A series of computer simulations using the proposed calculation method with random changes in parameters is conducted to verify the general formulas and derive coefficients for these formulas through regression formulas.

  • Contextual Max Pooling for Human Action Recognition

    Zhong ZHANG  Shuang LIU  Xing MEI  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2015/01/19
      Vol:
    E98-D No:4
      Page(s):
    989-993

    The bag-of-words model (BOW) has been extensively adopted by recent human action recognition methods. The pooling operation, which aggregates local descriptor encodings into a single representation, is a key determiner of the performance of the BOW-based methods. However, the spatio-temporal relationship among interest points has rarely been considered in the pooling step, which results in the imprecise representation of human actions. In this paper, we propose a novel pooling strategy named contextual max pooling (CMP) to overcome this limitation. We add a constraint term into the objective function under the framework of max pooling, which forces the weights of interest points to be consistent with their probabilities. In this way, CMP explicitly considers the spatio-temporal contextual relationships among interest points and inherits the positive properties of max pooling. Our method is verified on three challenging datasets (KTH, UCF Sports and UCF Films datasets), and the results demonstrate that our method achieves better results than the state-of-the-art methods in human action recognition.

  • A Service Design Method for Transmission Rate Control in Multitasking That Takes Attention Shift into Account

    Sumaru NIIDA  Satoshi UEMURA  Shigehiro ANO  

     
    PAPER

      Vol:
    E98-B No:1
      Page(s):
    71-78

    With the rapid growth of high performance ICT (Information Communication Technologies) devices such as smart phones and tablet PCs, multitasking has become one of the popular ways of using mobile devices. The reasons users have adopted multitask operation are that it reduces the level of dissatisfaction regarding waiting time and makes effective use of time by switching their attention from the waiting process to other content. This is a good solution to the problem of waiting; however, it may cause another problem, which is the increase in traffic volume due to the multiple applications being worked on simultaneously. Thus, an effective method to control throughput adapted to the multitasking situation is required. This paper proposes a transmission rate control method for web browsing that takes multitasking behavior into account and quantitatively demonstrates the effect of service by two different field experiments. The main contribution of this paper is to present a service design process for a new transmission rate control that takes into account human-network interaction based on the human-centered approach. We show that the degree of satisfaction in relation to waiting time did not degrade even when a field trial using a testbed showed that throughput of the background task was reduced by 40%.

  • Occlusion-Robust Human Tracking with Integrated Multi-View Depth Imagery

    Kenichiro FUKUSHI  Itsuo KUMAZAWA  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E97-D No:12
      Page(s):
    3181-3191

    In this paper, we present a computer vision-based human tracking system with multiple stereo cameras. Many widely used methods, such as KLT-tracker, update the trackers “frame-to-frame,” so that features extracted from one frame are utilized to update their current state. In contrast, we propose a novel optimization technique for the “multi-frame” approach that computes resultant trajectories directly from video sequences, in order to achieve high-level robustness against severe occlusion, which is known to be a challenging problem in computer vision. We developed a heuristic optimization technique to estimate human trajectories, instead of using dynamic programming (DP) or an iterative approach, which makes our method sufficiently computationally efficient to operate in real time. Six video sequences where one to six people walk in a narrow laboratory space are processed using our system. The results confirm that our system is capable of tracking cluttered scenes in which severe occlusion occurs and people are frequently in close proximity to each other. Moreover, only minimal information, rather than full camera images, needs to be communicated over the network for tracking. Hence, commonly used network devices are sufficient for constructing our tracking system.

  • Binaural Sound Source Localization in Noisy Reverberant Environments Based on Equalization-Cancellation Theory

    Thanh-Duc CHAU  Junfeng LI  Masato AKAGI  

     
    PAPER-Engineering Acoustics

      Vol:
    E97-A No:10
      Page(s):
    2011-2020

    Sound source localization (SSL), with a binaural input in practical environments, is a challenging task due to the effects of noise and reverberation. In the field of psychoacoustics, one of the theories that explains the mechanism of human perception in such environments is the well-known equalization-cancellation (EC) model. Motivated by the EC theory, this paper investigates a binaural SSL method by integrating EC procedures into a beamforming technique. The principal idea is that the EC procedures are first utilized to eliminate the sound signal component at each candidate direction respectively; the direction of the sound source is then determined as the direction at which the residual energy is minimal. The EC procedures applied in the proposed method differ from those in traditional EC models, in which the interference signals in rooms are accounted for in E and C operations based on limited prior known information. Experimental results demonstrate that our proposed method outperforms the traditional SSL algorithms in the presence of noise and reverberation simultaneously.

  • High Performance Activity Recognition Framework for Ambient Assisted Living in the Home Network Environment

    Konlakorn WONGPATIKASEREE  Azman Osman LIM  Mitsuru IKEDA  Yasuo TAN  

     
    PAPER

      Vol:
    E97-B No:9
      Page(s):
    1766-1778

    Activity recognition has recently been playing an important role in several research domains, especially within the healthcare system. It is important for physicians to know what their patients do in daily life. Nevertheless, existing research work has failed to adequately identify human activity because of the variety of human lifestyles. To address this shortcoming, we propose a high-performance activity recognition framework by introducing a new user context and activity location in the activity log (AL2). In this paper, the user's context comprises context-aware infrastructure and human posture. We propose a context sensor network to collect information from the surrounding home environment. We also propose a range-based algorithm to classify human posture for combination with the traditional user's context. For the recognition process, ontology-based activity recognition (OBAR) is developed. The ontology concept is the main approach used to define the semantic information and model human activity in OBAR. We also introduce a new activity log ontology, called AL2, for investigating activities that occur at the user's location at that time. Through experimental studies, the results reveal that the proposed context-aware activity recognition engine architecture can achieve an average accuracy of 96.60%.

  • Image Quality Assessment by Quantifying Discrepancies of Multifractal Spectrums

    Hang ZHANG  Yong DING  Peng Wei WU  Xue Tong BAI  Kai HUANG  

     
    PAPER-Image Processing and Video Processing

      Vol:
    E97-D No:9
      Page(s):
    2453-2460

    Visual quality evaluation is crucially important for various video and image processing systems. Traditionally, subjective image quality assessment (IQA) given by the judgments of people can be perfectly consistent with the human visual system (HVS). However, subjective IQA metrics are cumbersome and easily affected by the experimental environment. These problems further limit their application to evaluating large numbers of pictures. Therefore, objective IQA metrics are desired which can be incorporated into machines and automatically evaluate image quality. Effective objective IQA methods should predict accurate quality in accord with the subjective evaluation. Motivated by observations that the HVS is highly adapted to extract irregularity information of textures in a scene, we introduce multifractal formalism into an image quality assessment scheme in this paper. Based on multifractal analysis, statistical complexity features of natural images are extracted robustly. Then a novel framework for image quality assessment is further proposed by quantifying the discrepancies between multifractal spectrums of images. A total of 982 images are used to validate the proposed algorithm, including five types of distortion: JPEG2000 compression, JPEG compression, white noise, Gaussian blur, and Fast Fading. Experimental results demonstrate that the proposed metric is highly effective for evaluating perceived image quality and it outperforms many state-of-the-art methods.

  • Effects of Conversational Agents on Activation of Communication in Thought-Evoking Multi-Party Dialogues

    Kohji DOHSAKA  Ryota ASAI  Ryuichiro HIGASHINAKA  Yasuhiro MINAMI  Eisaku MAEDA  

     
    PAPER-Natural Language Processing

      Vol:
    E97-D No:8
      Page(s):
    2147-2156

    This paper presents an experimental study that analyzes how conversational agents activate human communication in thought-evoking multi-party dialogues between multi-users and multi-agents. A thought-evoking dialogue is a kind of interaction in which agents act to provoke user thinking, and it has the potential to activate multi-party interactions. This paper focuses on quiz-style multi-party dialogues between two users and two agents as an example of thought-evoking multi-party dialogues. The experimental results revealed that the presence of a peer agent significantly improved user satisfaction and increased the number of user utterances in quiz-style multi-party dialogues. We also found that agents' empathic expressions significantly improved user satisfaction, improved user ratings of the peer agent, and increased the number of user utterances. Our findings should be useful for activating multi-party communications in various applications such as pedagogical agents and community facilitators.

  • Mood-Learning Public Display: Adapting Content Design Evolutionarily through Viewers' Involuntary Gestures and Movements

    Ken NAGAO  Issei FUJISHIRO  

     
    PAPER-Interaction

      Vol:
    E97-D No:8
      Page(s):
    1991-1999

    Due to the recent development of underlying hardware technology and improvement in installation environments, public displays have become more common and are attracting more attention as a new type of signage. Any signage is required to make its content more attractive to its viewers by evaluating the current attractiveness on the fly, in order to deliver the message from the sender more effectively. However, most previous methods for public display require time to reflect the viewers' evaluations. In this paper, we present a novel system, called Mood-Learning Public Display, which automatically adapts its content design. This system utilizes viewers' involuntary behaviors as a sign of evaluation to make the content design more adapted to local viewers' tastes evolutionarily on site. The system removes the current gap between viewers' expectations and the content actually displayed, and enables efficient mutual transmission of information between the cyberworld and reality.

  • Tracking People with Active Cameras Using Variable Time-Step Decisions

    Alparslan YILDIZ  Noriko TAKEMURA  Maiya HORI  Yoshio IWAI  Kosuke SATO  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E97-D No:8
      Page(s):
    2124-2130

    In this study, we introduce a system for tracking multiple people using multiple active cameras. Our main objective is to surveil as many targets as possible, at any time, using a limited number of active cameras. In our context, an active camera is a statically located pan-tilt-zoom camera. In this research, we aim to optimize the camera configuration to achieve maximum coverage of the targets. We first devise a method for efficient tracking and estimation of target locations in the environment. Our tracking method is able to track an unknown number of targets and easily estimate multiple future time-steps, which is a requirement for active cameras. Next, we present an optimization of camera configuration with variable time-step that is optimal given the estimated object likelihoods for multiple future frames. We confirm our results using simulation and real videos, and show that, without introducing any significant computational complexity, it is possible to use active cameras to track and observe multiple targets very effectively.

  • Accurate Image Separation Method for Two Closely Spaced Pedestrians Using UWB Doppler Imaging Radar and Supervised Learning

    Kenshi SAHO  Hiroaki HOMMA  Takuya SAKAMOTO  Toru SATO  Kenichi INOUE  Takeshi FUKUDA  

     
    PAPER-Sensing

      Vol:
    E97-B No:6
      Page(s):
    1223-1233

    Recent studies have focused on developing security systems using micro-Doppler radars to detect human bodies. However, the resolution of these conventional methods is unsuitable for identifying bodies and, moreover, most of these conventional methods were designed for solitary or sufficiently well-spaced targets. This paper proposes a solution to these problems with an image separation method for two closely spaced pedestrian targets. The proposed method first develops an image of the targets using ultra-wide-band (UWB) Doppler imaging radar. Next, the targets in the image are separated using a supervised learning-based separation method trained on a data set extracted using a range profile. We experimentally evaluated the performance of the image separation using some representative supervised separation methods and selected the most appropriate method. Finally, we reject false points caused by target interference based on the separation result. The experiment, assuming two pedestrians with a body separation of 0.44 m, shows that our method accurately separates their images using a UWB Doppler radar with a nominal down-range resolution of 0.3 m. We describe applications using various target positions, establish the performance, and derive optimal settings for our method.

  • A Fast Parallel Algorithm for Indexing Human Genome Sequences

    Woong-Kee LOH  Kyoung-Soo HAN  

     
    LETTER-Data Engineering, Web Information Systems

      Vol:
    E97-D No:5
      Page(s):
    1345-1348

    A suffix tree is widely adopted for indexing genome sequences. While supporting highly efficient search, the suffix tree has a few shortcomings such as very large size and very long construction time. In this paper, we propose a very fast parallel algorithm to construct a disk-based suffix tree for human genome sequences. Our algorithm constructs a suffix array for part of the suffixes in the human genome sequence and then converts it into a suffix tree very quickly. It outperformed the previous algorithms by Loh et al. and Barsky et al. by up to 2.09 and 3.04 times, respectively.

  • QoS Analysis for Service Composition by Human and Web Services (Open Access)

    Donghui LIN  Toru ISHIDA  Yohei MURAKAMI  Masahiro TANAKA  

     
    PAPER

      Vol:
    E97-D No:4
      Page(s):
    762-769

    The availability of more and more Web services provides great variety for users to design service processes. However, there are situations in which services or service processes cannot meet users' requirements in functional QoS dimensions (e.g., translation quality in a machine translation service). In those cases, composing Web services and human tasks is expected to be a possible alternative solution. However, analyses of such practical efforts have rarely been reported in previous research, most of which focuses on the technology of embedding human tasks in software environments. Therefore, this study aims at analyzing the effects of composing Web services and human activities using a case study in the domain of language service with large-scale experiments. From the experiments and analysis, we find that (1) service implementation variety can be greatly increased by composing Web services and human activities for satisfying users' QoS requirements; (2) functional QoS of a Web service can be significantly improved by introducing human activities with limited cost and execution time, provided the human activities are of a certain quality; and (3) multiple QoS attributes of a composite service are affected in different ways by different qualities of human activities.

  • Topic-Based Knowledge Transfer Algorithm for Cross-View Action Recognition

    Changhong CHEN  Shunqing YANG  Zongliang GAN  

     
    LETTER-Pattern Recognition

      Vol:
    E97-D No:3
      Page(s):
    614-617

    Cross-view action recognition is a challenging research field for human motion analysis. Appearance-based features are not credible if the viewpoint changes. In this paper, a new framework is proposed for cross-view action recognition by topic-based knowledge transfer. First, spatio-temporal descriptors are extracted from the action videos and each video is modeled by a bag of visual words (BoVW) based on the codebook constructed by the k-means clustering algorithm. Second, Latent Dirichlet Allocation (LDA) is employed to assign topics for the BoVW representation. The topic distribution of visual words (ToVW) is normalized and taken to be the feature vector. Third, in order to bridge different views, we transform ToVW into bilingual ToVW by constructing bilingual dictionaries, which guarantee that the same action has the same representation from different views. We demonstrate the effectiveness of the proposed algorithm on the IXMAS multi-view dataset.

  • A Novel Method for the Bi-directional Transformation between Human Living Activities and Appliance Power Consumption Patterns

    Xinpeng ZHANG  Yusuke YAMADA  Takekazu KATO  Takashi MATSUYAMA  

     
    PAPER-Pattern Recognition

      Vol:
    E97-D No:2
      Page(s):
    275-284

    This paper describes a novel method for the bi-directional transformation between the power consumption patterns of appliances and human living activities. We have been proposing a demand-side energy management system that aims to cut down the peak power consumption and save electric energy in a household while maintaining the user's quality of life, based on the plan of electricity use and the dynamic priorities of the appliances. The plan of electricity use could be established in advance by predicting appliance power consumption. The priority of each appliance changes according to the user's daily living activities, such as cooking, bathing, or entertainment. To evaluate real-time appliance priorities, real-time living activity estimation is needed. In this paper, we address the problem of the bi-directional transformation between personal living activities and power consumption patterns of appliances. We assume that personal living activities and appliance power consumption patterns are related via the following two elements: personal appliance usage patterns, and the location of people. We first propose a Living Activity - Power Consumption Model as a generative model to represent the relationship between living activities and appliance power consumption patterns, via the two elements. We then propose a method for the bi-directional transformation between living activities and appliance power consumption patterns on the model, including the estimation of personal living activities from measured appliance power consumption patterns, and the generation of appliance power consumption patterns from given living activities. Experiments conducted on real daily life demonstrate that our method can estimate living activities that are almost consistent with the real ones. We also confirm through a case study that our method is applicable for simulating appliance power consumption patterns. Our contributions in this paper would be effective in saving electric energy, and may be applied to remotely monitor the daily living of older people.

  • Pose-Free Face Swapping Based on a Deformable 3D Shape Morphable Model

    Yuan LIN  Shengjin WANG  

     
    PAPER-Computer Graphics

      Vol:
    E97-D No:2
      Page(s):
    305-314

    Traditional face swapping technologies require that the faces of source images and target images have similar pose and appearance (usually frontal). To overcome this limitation in applications, this paper presents a pose-free face swapping method based on personalized 3D face modeling. By using a deformable 3D shape morphable model, a photo-realistic 3D face is reconstructed from a single frontal view image. With the aid of the generated 3D face, a virtual source image of the person with the same pose as the target face can be rendered, which is used as a source image for face swapping. To solve the problem of illumination difference between the target face and the source face, a color transfer merging method is proposed. It outperforms the original color transfer method in dealing with the illumination gap problem. An experiment shows that the proposed face reconstruction method is fast and efficient. In addition, we have conducted experiments of face swapping in a variety of scenarios such as children's story books, role play, and face de-identification (stripping facial information used for identification), and promising results have been obtained.

  • Methods of Estimating Return-Path Capacitance in Electric-Field Intrabody Communication

    Tadashi MINOTANI  Mitsuru SHINAGAWA  

     
    PAPER-Antennas and Propagation

      Vol:
    E97-B No:1
      Page(s):
    114-121

    This paper describes a very accurate method of estimating the return-path capacitance and validates the estimation based on low-error measurements for electric-field intrabody communication. The return-path capacitance, Cg, of a mobile transceiver is estimated in two ways. One uses the attenuation factor in transmission and the capacitance, Cb, between a human body and the earth ground. The other uses the attenuation factor in reception. To avoid the influence of the lead wire in the estimation of Cb, Cb is estimated from the attenuation factor measured with an amplifier with a low input capacitance. The attenuation factor in reception is derived by using the applied-voltage dependence of the reception rate. This approach avoids the influence of any additional instruments on the return-path capacitance and allows that capacitance to be estimated under the same conditions as actual intrabody communication. The estimates obtained by the two methods agree well with each other, which means that the estimation of Cb is valid. The results demonstrate the usefulness of the methods.

  • Design of Miniature Implantable Tag Antenna for Radio-Frequency Identification System at 2.45GHz and Received Power Analysis

    HoYu LIN  Masaharu TAKAHASHI  Kazuyuki SAITO  Koichi ITO  

     
    PAPER-Antennas and Propagation

      Vol:
    E97-B No:1
      Page(s):
    129-136

    In recent years, there have been rapid developments in radio-frequency identification (RFID) systems, and their industrial applications include logistics management, automatic object identification, access and parking management, etc. Moreover, RFID systems have also been introduced for the management of medical instruments in medical applications to improve the quality of medical services. In recent years, the combination of such a system with a biological monitoring system through permanent implantation in the human body has been suggested to reduce malpractice events and ameliorate patient suffering. This paper presents an implantable RFID tag antenna design that can match the conjugate impedance of most integrated circuit (IC) chips (9.3-j55.2 Ω at 2.45 GHz). The proposed antenna can be injected into the human body through a biological syringe, owing to its compact size of 9.3 mm × 1.0 mm × 1.0 mm. The input impedance, transmission coefficient, and received power are simulated by a finite element method (FEM). A three-layered phantom is used to confirm antenna performance.

  • Personalized Emotion Recognition Considering Situational Information and Time Variance of Emotion

    Yong-Soo SEOL  Han-Woo KIM  

     
    PAPER-Human-computer Interaction

      Vol:
    E96-D No:11
      Page(s):
    2409-2416

    To understand human emotion, it is necessary to be aware of the surrounding situation and individual personalities. In most previous studies, however, these important aspects were not considered. Emotion recognition has been considered as a classification problem. In this paper, we attempt new approaches to utilize a person's situational information and personality for use in understanding emotion. We propose a method of extracting situational information and building a personalized emotion model for reflecting the personality of each character in the text. To extract and utilize situational information, we propose a situation model using lexical and syntactic information. In addition, to reflect the personality of an individual, we propose a personalized emotion model using KBANN (Knowledge-based Artificial Neural Network). Our proposed system has the advantage of using a traditional keyword-spotting algorithm. In addition, we also reflect the fact that the strength of emotion decreases over time. Experimental results show that the proposed system can more accurately and intelligently recognize a person's emotion than previous methods.
