Hitomi YOKOYAMA Masano NAKAYAMA Hiroaki MURATA Kinya FUJITA
Aimed at long-term monitoring of daily office conversations without recording the conversational content, a system is presented for estimating acoustic nonverbal information such as utterance duration, utterance frequency, and turn-taking. The system combines a sound localization technique based on the sound energy distribution with 16 beam-forming microphone-array modules mounted in the ceiling to reduce the influence of multiple sound reflections. Furthermore, human detection using a wide field-of-view camera is integrated into the system for more robust speaker estimation. The system estimates the speaker for each utterance and calculates nonverbal information based on it. An evaluation analyzing data collected over ten 12-hour workdays in an office with three assigned workers showed that the system had 72% speech segmentation detection accuracy and 86% speaker identification accuracy when utterances were correctly detected. Even with false voice detection and incorrect speaker identification, and even in cases where the participants frequently made noise or where seven participants had gathered together for a discussion, the order of the amount of calculated acoustic nonverbal information uttered by the participants coincided with that based on human-coded acoustic nonverbal information. Continuous analysis of communication dynamics such as dominance and conversation participation roles through nonverbal information will reveal the dynamics of a group. The main contribution of this study is to demonstrate the feasibility of unconstrained long-term monitoring of daily office activity through acoustic nonverbal information.
Shengyu YAO Ruohua ZHOU Pengyuan ZHANG
This paper proposes a speaker-phonetic i-vector modeling method for text-dependent speaker verification with random digit strings, in which enrollment and test utterances are not of the same phrase. The core of the proposed method is making use of digit alignment information in the i-vector framework. By utilizing forced alignment information, verification scores of the testing trials can be computed as in the fixed-phrase situation, in which the compared speech segments of the enrollment and test utterances have the same phonetic content. Specifically, utterances are segmented into digits, then a unique phonetically-constrained i-vector extractor is applied to obtain a speaker and channel variability representation for every digit segment. Probabilistic linear discriminant analysis (PLDA) and s-norm are subsequently used for channel compensation and score normalization, respectively. The final score is obtained by combining the digit scores, which are computed by scoring individual digit segments of the test utterance against the corresponding ones of the enrollment. Experimental results on Part 3 of the Robust Speaker Recognition (RSR2015) database demonstrate that the proposed approach significantly outperforms GMM-UBM, by 52.3% and 53.5% relative in equal error rate (EER) for male and female speakers, respectively.
Naoki MATSUDA Hirotaka OKABE Ayako OMURA Miki NAKANO Koji MIYAKE Toshihiko NAGAMURA Hideki KAWAI
A hydrophobic DNA (H-DNA) nano-film was formed on a thin glass plate, 50 μm thick, working as a slab optical waveguide. Bromothymol blue (BTB) molecules were immobilized from aqueous solution by direct contact with the H-DNA nano-film for 20 minutes. From changes in the absorption spectra observed with the slab optical waveguide (SOWG) during 100 automated solution exchange (SE) processes, it was found that about 95% of the BTB molecules were immobilized in the H-DNA nano-film while keeping their color-change response to pH changes in the solution.
Kenta KUROISHI Toshinari DOI Yusuke YONAHA Iku KUSAJIMA Yasushiro NISHIOKA Satomitsu IMAI
Improving output and lifetime is a challenge for biofuel cells. A structure was adopted in which an agarose gel mixed with fuel (fructose) was sandwiched between electrodes made of graphene-coated carbon fiber. The electrode surface not in contact with the gel was exposed to air. In addition, grooves were added to the gel surface to further increase the oxygen supply. The power density of the fuel cell was examined in terms of the electrode area exposed to air. The output increased almost in proportion to the area of the electrode exposed to air. The optimal concentrations of fuel and gel and the amount of enzyme at the cathode were also examined. The maximum power density in the proposed system was approximately 121 μW/cm², approximately 2.5 times that obtained with liquid fuel. The fuel gel also gave a higher power density than the liquid fuel after 24 h.
Tomoya SATO Narendra SINGH Roland HÖNES Chihiro URATA Yasutaka MATSUO Atsushi HOZUMI
Copper (Cu) electroless plating was conducted on planar and microstructured polydimethylsiloxane (PDMS) substrates. In this study, organic thin films terminated with nitrogen (N)-containing groups, e.g., poly(dimethylaminoethyl methacrylate) brush (PDMAEMA), aminopropyl trimethoxysilane monolayer (APTES), and polydopamine (PDA), were used to anchor the palladium (Pd) catalyst. While electroless plating was successfully promoted on all sample surfaces, PDMAEMA was found to achieve the best adhesion strength to the PDMS surfaces, compared to APTES- and PDA-covered PDMS substrates, due to covalent bonding, the anchoring effects of polymer chains, and the high affinity of N atoms for Pd species. Our process was also successfully applied to the electroless plating of microstructured PDMS substrates.
Ou ZHAO Lin SHAN Wei-Shun LIAO Mirza GOLAM KIBRIA Huan-Bang LI Kentaro ISHIZU Fumihide KOJIMA
Large-scale distributed antenna systems (LS-DASs) are gaining increasing interest and emerging as highly promising candidates for future wireless communications. To improve the user's quality of service (QoS) in these systems, this study proposes a user cooperation aided clustering approach based on device-centric architectures; it enables multi-user multiple-input multiple-output transmissions with non-reciprocal setups. We actively use device-to-device communication techniques to share user information and form clusters on the user side, instead of the traditional way of performing clustering on the base station (BS) side in data offloading. We further adopt a device-centric architecture to break the limits of the classical BS-centric cellular structure. Moreover, we derive an approximate expression to calculate the user rate for LS-DASs that employ zero-forcing precoding, taking inter-cluster interference into consideration. Numerical results indicate that the approximate expression predicts the user rate at a lower computational cost than computer simulation, and that the proposed approach provides a better user experience, particularly for users with unacceptable QoS.
Mitsuhiko KATAGIRI Shofu MATSUDA Norio NAGAYAMA Minoru UMEDA
We describe the preparation of an α-phenyl-4'-(diphenylamino)stilbene (TPA) single crystal and the evaluation of its hole transport property. Based on the characterization using optical microscopy, polarizing microscopy, and X-ray diffraction, a large-scale TPA single crystal of dimensions 7.0×0.9×0.8mm is successfully synthesized using a solution method based on the solubility and supersolubility curves of the TPA. Notably, the current in the long-axis direction is larger than those in the short-axis and thickness directions (i(long) > i(short) > i(thickness)), which reveals the anisotropic charge transfer of the TPA single crystal. The observed anisotropic conductivity is well explained by the orientation of the triphenylamine unit in the TPA single crystal. Furthermore, the activation energy of the long-axis direction in the TPA single crystal is lower than that of the short-axis in TPA and all the axes in the α-phenyl-4'-[bis(4-methylphenyl)amino]stilbene single crystal reported in our previous study.
Xiangdong HUANG Jingwen XU Jiexiao YU Yu LIU
To optimize the performance of FIR filters that have low computation complexity, this paper proposes a hybrid design consisting of two optimization levels. The first optimization level is based on cyclic-shift synthesis, in which all possible sub filters (or windowed sub filters) with distinct cycle shifts are averaged to generate a synthesized filter. Because the ripples of these sub filters' transfer curves can be individually compensated, this synthesized filter attains improved performance (except that two uprushes occur at the edges of the transition band), and thus this synthesis actually plays the role of ‘natural optimization’. Furthermore, this synthesis process can be equivalently summarized into a 3-step closed-form procedure, which converts the multi-variable optimization into a single-variable optimization. Hence, to suppress the uprushes, the second optimization level (using the Differential Evolution (DE) algorithm) needs only to search for the optimum transition point, which incurs only minimal complexity. Owing to the combination of cyclic-shift synthesis and the DE algorithm, our hybrid design, unlike regular evolutionary computing schemes, is more attractive due to its narrowed search space and higher convergence speed. Numerical results also show that the proposed design is superior to the conventional DE design in both filter performance and design efficiency, and it is comparable to the Remez design.
Zhe LI Yili XIA Qian WANG Wenjiang PEI Jinguang HAO
A novel time-series relationship among four consecutive real-valued single-tone sinusoid samples is proposed based on their linear prediction property. To achieve unbiased frequency estimates for a real sinusoid in white noise, a constrained least squares cost function built on the proposed four-point time-series relationship is minimized under the unit-norm principle. Closed-form expressions for the variance and the asymptotic expression for the variance of the proposed frequency estimator are derived, facilitating a theoretical performance comparison with the existing three-point counterpart, called the reformed Pisarenko harmonic decomposer (RPHD). The region of performance advantage of the proposed four-point constrained least squares frequency estimator over the RPHD is also discussed. Computer simulations are conducted to support our theoretical development and to compare the performance of the proposed estimator with the RPHD as well as the Cramer-Rao lower bound (CRLB).
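As a rough illustration of what a four-point linear prediction relationship can look like, the sketch below uses the trigonometric identity x[n] + x[n-3] = (2cos(ω) - 1)(x[n-1] + x[n-2]), which holds for any noiseless sinusoid x[n] = A cos(ωn + φ). It is an assumption that this is the form used by the authors; the abstract does not state the relationship. The estimator shown is a plain least-squares fit, not the constrained unit-norm estimator of the paper, and it would be biased in noise. The function names are hypothetical.

```python
import numpy as np

def four_point_residual(x, omega):
    """Residual of x[n] + x[n-3] - (2cos(omega) - 1) * (x[n-1] + x[n-2])."""
    c = 2.0 * np.cos(omega) - 1.0
    return x[3:] + x[:-3] - c * (x[2:-1] + x[1:-2])

def estimate_frequency_ls(x):
    """Plain (unconstrained) least-squares fit of the four-point identity.

    Illustration only: this is NOT the constrained least squares estimator
    described in the abstract.
    """
    outer = x[3:] + x[:-3]      # x[n] + x[n-3]
    inner = x[2:-1] + x[1:-2]   # x[n-1] + x[n-2]
    c = np.dot(inner, outer) / np.dot(inner, inner)
    cos_omega = np.clip((c + 1.0) / 2.0, -1.0, 1.0)
    return float(np.arccos(cos_omega))

# Noiseless test tone (amplitude, frequency, and phase chosen arbitrarily).
n = np.arange(64)
tone = 1.3 * np.cos(0.7 * n + 0.4)
omega_hat = estimate_frequency_ls(tone)
```

On noiseless data the residual vanishes to machine precision and the fit recovers the frequency; the interesting behavior studied in the abstract concerns the noisy case.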
Shinichi MOGAMI Yoshiki MITSUI Norihiro TAKAMUNE Daichi KITAMURA Hiroshi SARUWATARI Yu TAKAHASHI Kazunobu KONDO Hiroaki NAKAJIMA Hirokazu KAMEOKA
In this letter, we propose a new blind source separation method, independent low-rank matrix analysis based on generalized Kullback-Leibler divergence. This method assumes a time-frequency-varying complex Poisson distribution as the source generative model, which yields convex optimization in the spectrogram estimation. The experimental evaluation confirms the proposed method's efficacy.
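For reference, the generalized Kullback-Leibler divergence between nonnegative arrays y and x is commonly defined as the sum of y log(y/x) - y + x over all entries. The sketch below computes only this divergence, under the usual convention that 0 log 0 = 0; it does not implement the separation method itself, and the function name and the small `eps` guard are assumptions made for illustration.

```python
import numpy as np

def generalized_kl(y, x, eps=1e-12):
    """Generalized KL divergence: sum(y * log(y / x) - y + x).

    eps is a numerical guard against division by zero (an assumption of
    this sketch); entries with y == 0 contribute only x.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    log_term = np.where(y > 0, y * np.log((y + eps) / (x + eps)), 0.0)
    return float(np.sum(log_term - y + x))
```

The divergence is zero when the two arrays coincide and positive otherwise, which is what makes it usable as a fitting criterion between an observed and a modeled spectrogram.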
Zheng FANG Tieyong CAO Jibin YANG Meng SUN
Saliency detection is widely used in many vision tasks such as image retrieval, compression, and person re-identification. Deep-learning methods have achieved great results, but most of them focus on performance and ignore model efficiency, which makes them hard to transplant into other applications. How to design an efficient model has therefore become the main problem. In this letter, we propose the parallel feature network, a saliency model built on a convolutional neural network (CNN) in a parallel manner. Parallel dilation blocks are first used to extract features from different layers of the CNN, then a parallel upsampling structure is adopted to upsample the feature maps. Finally, saliency maps are obtained by fusing summations and concatenations of feature maps. Our final model, built on VGG-16, is much smaller and faster than existing saliency models and also achieves state-of-the-art performance.
Nan SHA Mingxi GUO Yuanyuan GAO Lihua CHEN Kui XU
In this letter, a physical-layer network coding (PNC) scheme based on continuous phase modulation (CPM) signals using the tilted-phase model, i.e., TIP-CPM-PNC, is presented, and the combined tilted-phase state trellis for the superimposed CPM signal in TIP-CPM-PNC is discussed. Simulation results show that the proposed scheme, with low decoding complexity, can achieve the same error performance as CPM-PNC using the traditional-phase model.
Yusuke SAKUMOTO Tsukasa KAMEYAMA Chisa TAKANO Masaki AIDA
Spectral graph theory gives an algebraic approach to the analysis of the dynamics of a network by using the matrix that represents the network structure. However, it is not easy to apply spectral graph theory to social networks because the matrix elements cannot be given exactly to represent the structure of a social network. The matrix elements should be set on the basis of the relationships between persons, but these relationships cannot be quantified accurately from obtainable data (e.g., call history and chat history). To get around this problem, we utilize the universality of random matrices with the feature of social networks. As such a random matrix, we use the normalized Laplacian matrix of a network where link weights are randomly given. In this paper, we first clarify that the universality (i.e., the Wigner semicircle law) of the normalized Laplacian matrix appears in the eigenvalue frequency distribution regardless of the link weight distribution. Then, we analyze the information propagation speed by using spectral graph theory and the universality of the normalized Laplacian matrix. As a result, we show that the worst-case speed of information propagation changes by up to a factor of two if the structure (i.e., the relationships among people) of a social network changes.
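The central object above, a normalized Laplacian with random link weights, can be sketched in a few lines. The example below builds the standard symmetric normalized Laplacian I - D^(-1/2) W D^(-1/2) and computes its eigenvalues. The graph model (a complete graph on 200 nodes with weights drawn uniformly from [0.5, 1.5]) and the function names are assumptions chosen for illustration; they are not taken from the paper.

```python
import numpy as np

def normalized_laplacian(weights):
    """Symmetric normalized Laplacian I - D^{-1/2} W D^{-1/2}."""
    degree = weights.sum(axis=1)          # weighted degree of each node
    inv_sqrt = 1.0 / np.sqrt(degree)
    return np.eye(len(weights)) - inv_sqrt[:, None] * weights * inv_sqrt[None, :]

def random_weighted_graph(n, rng):
    """Complete graph with random symmetric link weights (assumed model)."""
    upper = np.triu(rng.uniform(0.5, 1.5, size=(n, n)), k=1)
    return upper + upper.T                # zero diagonal, symmetric

rng = np.random.default_rng(0)
W = random_weighted_graph(200, rng)
eigenvalues = np.linalg.eigvalsh(normalized_laplacian(W))
```

The eigenvalues of a normalized Laplacian always lie in [0, 2] and the smallest is 0 for a connected graph. A histogram of the remaining eigenvalues, repeated for different weight distributions, is one way to inspect the eigenvalue frequency distribution the abstract refers to.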
Kyu-Ha SONG San-Hae KIM Woo-Jin SONG
When time difference of arrival (TDOA)-based bearing measurements are used in passive triangulation, the accuracy of localization depends on the geometric relationship between the emitter and the sensors. In particular, the localization accuracy varies with the geometric conditions in TDOA-based direction finding (DF) for bearing measurement and in the crossing of lines of bearing (LOBs) for triangulation. To obtain an accurate estimate in passive triangulation using TDOA-based bearing measurements, these bearings should be used selectively by considering the geometric dilution of precision (GDOP) between the emitter and the sensors. To achieve this goal, we first define two GDOPs related to the TDOA-based DF and LOB crossing geometries, and then propose a new hybrid GDOP that combines these GDOPs for a better selection of bearings. Subsequently, the two bearings with the lowest hybrid GDOP are chosen as the inputs to a triangulation localization algorithm. In simulations, the proposed method improves the localization accuracy.
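The final step described above, triangulating from two selected bearings, amounts to intersecting two lines of bearing. The following is a minimal 2-D sketch of that step only; the GDOP definitions and the bearing selection rule of the paper are not reproduced. The function name and the convention that bearings are angles in radians measured counterclockwise from the x-axis are assumptions.

```python
import numpy as np

def intersect_bearings(s1, theta1, s2, theta2):
    """Intersect two lines of bearing from sensors s1 and s2 (2-D).

    Solves s1 + t1 * u1 = s2 + t2 * u2 for the ray parameters t1, t2.
    Raises numpy.linalg.LinAlgError if the bearings are parallel.
    """
    s1 = np.asarray(s1, dtype=float)
    s2 = np.asarray(s2, dtype=float)
    u1 = np.array([np.cos(theta1), np.sin(theta1)])
    u2 = np.array([np.cos(theta2), np.sin(theta2)])
    A = np.column_stack((u1, -u2))
    t = np.linalg.solve(A, s2 - s1)
    return s1 + t[0] * u1
```

When the two bearings are nearly parallel the linear system becomes ill-conditioned and small bearing errors produce large position errors, which is the geometric sensitivity that motivates selecting bearings by a GDOP criterion.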
Huu-Anh TRAN Heyan HUANG Phuoc TRAN Shumin SHI Huu NGUYEN
Word order is one of the most significant differences between Chinese and Vietnamese. In phrase-based statistical machine translation, the reordering model learns reordering rules from bilingual corpora. If the bilingual corpora are large and good enough, the reordering rules are accurate and have good coverage. However, because Chinese-Vietnamese is a low-resource language pair, the extraction of reordering rules is limited. As a result, the quality of reordering in Chinese-Vietnamese machine translation is not high. In this paper, we combine Chinese dependency relations and Chinese-Vietnamese word alignment results to pre-order Chinese sentences so that their word order suits that of Vietnamese. The experimental results show that our methodology improves machine translation performance compared to a translation system using only the reordering models of phrase-based statistical machine translation.
Xiao-yu WAN Xiao-na YANG Zheng-qiang WANG Zi-fu FAN
This paper investigates the energy-efficient resource allocation problem for wireless power transfer (WPT) enabled multi-user massive multiple-input multiple-output (MIMO) systems. In the considered systems, the sensor nodes (SNs) are first powered by WPT from a power beacon (PB) with a large number of antennas. Then, the SNs use the harvested energy to transmit data to the base station (BS) with multiple antennas. The problem of optimizing the energy efficiency objective is formulated with consideration of the maximum transmission power of the PB and the quality of service (QoS) of the SNs. By adopting fractional programming, the energy-efficient optimization problem is first converted into a subtractive form. Then, a joint power and time allocation algorithm based on block coordinate descent and the Dinkelbach method is proposed to maximize energy efficiency. Finally, simulation results show that the proposed algorithm achieves a good compromise between spectrum efficiency and total power consumption.
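The conversion of a fractional objective into a subtractive form via the Dinkelbach method can be illustrated on a single-variable toy problem: maximize log2(1 + g p) / (p + p_c) over 0 <= p <= p_max. This is a deliberately simplified sketch, not the joint power and time allocation algorithm of the paper; the channel gain `gain`, circuit power `p_circuit`, and the function name are hypothetical.

```python
import math

def dinkelbach_power(gain, p_circuit, p_max, tol=1e-10, max_iter=100):
    """Maximize log2(1 + gain * p) / (p + p_circuit) over 0 <= p <= p_max.

    Each iteration solves the subtractive problem
        max_p  log2(1 + gain * p) - lam * (p + p_circuit)
    in closed form, then updates lam to the achieved ratio.
    """
    lam = 0.0
    p = p_max
    for _ in range(max_iter):
        if lam > 0.0:
            # Stationary point of the subtractive objective, clipped to the box.
            p = 1.0 / (lam * math.log(2.0)) - 1.0 / gain
            p = min(max(p, 0.0), p_max)
        else:
            p = p_max  # with lam = 0 the objective is increasing in p
        rate = math.log2(1.0 + gain * p)
        gap = rate - lam * (p + p_circuit)
        if abs(gap) < tol:
            break
        lam = rate / (p + p_circuit)
    return p, lam

p_opt, ee_opt = dinkelbach_power(gain=2.0, p_circuit=0.5, p_max=10.0)
```

At convergence the subtractive objective is zero and `lam` equals the maximum energy efficiency. In a multi-variable setting such as the one in the abstract, the closed-form inner step would be replaced by an inner solver, for example block coordinate descent.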
Yuya TANAKA Takahiro MAKINO Hisao ISHII
On surfaces of tris-(8-hydroxyquinolate) aluminum (Alq) and tris(7-propyl-8-hydroxyquinolinato) aluminum (Al7p) thin films, positive and negative polarization charges appear, respectively, owing to the spontaneous orientation of these polar molecules. Alq is a typical electron transport material into which electrons are injected from the cathode. Because the polarization charge exists at the Alq/cathode interface, it is likely to affect the electron injection process through Coulomb interaction. To evaluate the impact of polarization charge on electron injection from the cathode, electron-only devices (EODs) composed of Alq or Al7p were prepared and evaluated by displacement current measurement. We found that the Alq-EOD has lower resistance than the Al7p-EOD, indicating that the positive polarization charge at the Alq/cathode interface enhances electron injection due to Coulomb attraction, while electron injection is suppressed by the negative polarization charge at the Al7p/Al interface. These results clearly suggest that it is necessary to design organic semiconductor devices by taking polarization charge into account.
Yuliang WEI Guodong XIN Wei WANG Fang LV Bailing WANG
Web person search often returns web pages related to several distinct namesakes. This paper proposes a new web page model for template-free person data extraction, and uses a Dirichlet process mixture model to solve name disambiguation. The results show that our method works best on web pages with complex structure.
Longfei CHEN Yuichi NAKAMURA Kazuaki KONDO Walterio MAYOL-CUEVAS
This paper presents an approach to analyze and model tasks of machines being operated. The executions of the tasks were captured through egocentric vision. Each task was decomposed into a sequence of physical hand-machine interactions, which are described with touch-based hotspots and interaction patterns. Modeling the tasks was achieved by integrating the experiences of multiple experts and using a hidden Markov model (HMM). Here, we present the results of more than 70 recorded egocentric experiences of the operation of a sewing machine. Our methods show good potential for the detection of hand-machine interactions and modeling of machine operation tasks.
Minseok LEE Jihoon AN Younghee LEE
Data generated from Internet of Things (IoT) devices in smart spaces are utilized in a variety of fields such as context recognition, service recommendation, and anomaly detection. However, missing values in the data streams of IoT devices remain a challenging problem owing to various missing patterns and heterogeneous data types from many different data streams. While analyzing a dataset collected from a smart space with multiple IoT devices, we found a continuous missing pattern that is quite different from the existing missing-value patterns. The pattern has blocks of consecutive missing values lasting from a few seconds up to a few hours. The pattern is therefore a vital factor in the availability and reliability of IoT applications; yet, it cannot be handled by the existing missing-value imputation methods, and a novel approach for missing-value imputation of the continuous missing pattern is required. We reason that even if the missing values of the continuous missing pattern occur in one data stream, missing-value imputation is possible through learning from other data streams correlated with this data stream. To address the continuous missing pattern, we analyzed multiple IoT data streams in a smart space and identified the correlations between them, that is, the interdependencies among the data streams of the IoT devices in a smart space. To impute missing values of the continuous missing pattern, we propose a deep learning-based missing-value imputation model exploiting correlation information, namely, the deep imputation network (DeepIN), in a smart space. DeepIN uses multiple long short-term memory (LSTM) networks that are constructed according to the correlation information of each IoT data stream. We evaluated DeepIN on a real dataset from our campus IoT testbed, and the experimental results show that our proposed approach improves the imputation performance by 57.36% over the state-of-the-art missing-value imputation algorithm. Thus, our approach can be a promising methodology that enables IoT applications and services with a reasonable missing-value imputation accuracy (80-85%) on average, even if a long-term block of values is missing in IoT environments.
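The continuous missing pattern described above, blocks of consecutive missing values, can be located in a single stream with a short helper. The sketch below only detects such blocks; it is not the DeepIN model and performs no imputation. It assumes missing samples are encoded as NaN, and the function name is hypothetical.

```python
import numpy as np

def missing_blocks(values):
    """Return (start_index, length) for each run of consecutive NaNs."""
    missing = np.isnan(np.asarray(values, dtype=float))
    # Pad with False so runs touching either end still produce two edges.
    padded = np.concatenate(([False], missing, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = changes[::2], changes[1::2]
    return [(int(s), int(e - s)) for s, e in zip(starts, ends)]
```

With a known sampling interval, the block lengths convert directly into durations, which makes it possible to check whether a stream contains gaps on the scale of seconds to hours.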