Recurrent neural networks (RNNs) are a powerful model for sequential data. RNNs that use long short-term memory (LSTM) cells have proven effective in handwriting recognition, language modeling, speech recognition, and language comprehension tasks. In this study, we propose LSTM conditional random fields (LSTM-CRF), an LSTM-based RNN model that uses output-label dependencies with transition features and a CRF-like sequence-level objective function. We also propose variations of the LSTM-CRF model using a gated recurrent unit (GRU) and a structurally constrained recurrent network (SCRN). Empirical results reveal that our proposed models attain state-of-the-art performance for named entity recognition.
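As a rough illustration of what a CRF-like sequence-level objective involves, the sketch below computes the log-likelihood of one label sequence from per-step label scores and a label-to-label transition table, using the standard forward algorithm for the normalizer. It is a minimal sketch of a generic linear-chain CRF, not the model from the abstract: the names `emissions` and `transitions` are assumptions, and in an LSTM-CRF the per-step scores would presumably come from the recurrent network rather than being given directly.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def sequence_score(emissions, transitions, labels):
    """Unnormalized score: per-step label scores plus transition scores."""
    score = emissions[0][labels[0]]
    for t in range(1, len(labels)):
        score += transitions[labels[t - 1]][labels[t]] + emissions[t][labels[t]]
    return score

def log_partition(emissions, transitions):
    """Forward algorithm: log of the summed exp-scores of all label sequences."""
    n_labels = len(transitions)
    alpha = list(emissions[0])
    for t in range(1, len(emissions)):
        alpha = [
            emissions[t][j]
            + logsumexp([alpha[i] + transitions[i][j] for i in range(n_labels)])
            for j in range(n_labels)
        ]
    return logsumexp(alpha)

def crf_log_likelihood(emissions, transitions, labels):
    """Sequence-level objective: log p(labels | input) for one sequence."""
    return sequence_score(emissions, transitions, labels) - log_partition(
        emissions, transitions
    )

# Hypothetical scores for a 3-step sequence with 2 labels.
emissions = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.9]]
transitions = [[0.2, -0.1], [0.0, 0.4]]
print(crf_log_likelihood(emissions, transitions, [0, 1, 1]))
```

Because the normalizer sums over every possible label sequence, the probabilities of all sequences add up to one, which is what distinguishes this objective from independent per-step classification.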
Takuya FUJIHASHI Yusuke HIROTA Takashi WATANABE
Multi-view video streaming plays an important role in new interactive and augmented video applications such as telepresence, remote surgery, and entertainment. For those applications, interactive multi-view video transmission schemes have been proposed that aim to reduce the amount of video traffic. Specifically, these schemes encode and transmit only the video frames that users may potentially display, based on periodic feedback from the users. However, existing schemes are vulnerable to frame loss, which often occurs during transmission, because they encode most video frames using inter prediction and inter-view prediction to reduce traffic. Frame losses induce significant quality degradation due to the collapse of the decoding operations. To improve the loss resilience, we propose an encoding/decoding system, Frame Popularity-based Multi-view Video Streaming (FP-MVS), for interactive multi-view video streaming services. The main idea of FP-MVS is to assign intra (I) frames in the prediction structure for less/more popular (i.e., observed by few/many users) potential frames in order to mitigate the impact of a frame loss. In addition, FP-MVS utilizes overlapping and non-overlapping areas between all users' potential frames to prevent redundant video transmission. Although each intra-frame has a large data size, the video traffic can be reduced within a network constraint by combining multicast and unicast for overlapping and non-overlapping area transmissions. Evaluations using Joint Multi-view Video Coding (JMVC) demonstrated that FP-MVS achieves higher video quality even in loss-prone environments. For example, our scheme improves video quality by 11.81dB compared to the standard multi-view video encoding schemes at a loss rate of 5%.
Fuan PU Guiming LUO Zhou JIANG
In this paper, a Boolean algebra approach is proposed to encode various acceptability semantics for abstract argumentation frameworks, where each semantics can be equivalently encoded into several Boolean constraint models based on Boolean matrices and a family of Boolean operations between them. Then, we show that these models can be easily translated into logic programs, and can be solved by a constraint solver over Boolean variables. In addition, we propose some querying strategies to accelerate the calculation of the grounded, stable and complete extensions. Finally, we describe an experimental study on the performance of our encodings according to different semantics and querying strategies.
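To make the Boolean-matrix view concrete, the sketch below represents an argumentation framework as a Boolean attack matrix and a candidate extension as a Boolean vector, then checks the stable semantics and computes the grounded extension by iterating the characteristic function. It relies only on the textbook definitions of these semantics; the function names and the matrix convention (`attacks[i][j]` is true when argument i attacks argument j) are assumptions and do not reproduce the constraint models or the querying strategies of the paper.

```python
def attacked_by(attacks, s):
    """Boolean vector of the arguments attacked by at least one member of s."""
    n = len(attacks)
    return [any(attacks[i][j] and s[i] for i in range(n)) for j in range(n)]

def is_stable(attacks, s):
    """s is stable iff the arguments it attacks are exactly those outside s."""
    return attacked_by(attacks, s) == [not member for member in s]

def grounded(attacks):
    """Least fixed point of the characteristic function, starting from the empty set."""
    n = len(attacks)
    s = [False] * n
    while True:
        defeated = attacked_by(attacks, s)
        # An argument is acceptable w.r.t. s if all of its attackers are defeated by s.
        nxt = [all(defeated[i] for i in range(n) if attacks[i][j]) for j in range(n)]
        if nxt == s:
            return s
        s = nxt

# Hypothetical framework: argument 0 attacks 1, and argument 1 attacks 2.
attacks = [
    [False, True, False],
    [False, False, True],
    [False, False, False],
]
extension = grounded(attacks)
print(extension, is_stable(attacks, extension))
```

In this small chain the grounded extension contains arguments 0 and 2, and it is also stable because the only argument outside it, argument 1, is attacked by argument 0.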
Given an undirected graph G, an edge dominating set is a subset F of edges such that each edge not in F is adjacent to some edge in F. Computing the minimum size of an edge dominating set is known to be NP-hard. Since the size of any edge dominating set is at least half of the maximum size µ(G) of a matching in G, we study the problem of testing whether a given graph G has an edge dominating set of size ⌈µ(G)/2⌉. In this paper, we prove that the problem is NP-complete, and we design an O*(2.0801^(µ(G)/2))-time and polynomial-space algorithm for the problem.
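The definition above can be checked directly on small graphs. The following sketch tests whether a set of edges is edge dominating and finds a minimum one by exhaustive search; it is an exponential-time illustration of the definition only, not the algorithm of the paper, and the function names are hypothetical.

```python
from itertools import combinations

def is_edge_dominating(edges, chosen):
    """True if every edge is in `chosen` or shares an endpoint with an edge in it."""
    covered = {v for e in chosen for v in e}
    return all(e in chosen or e[0] in covered or e[1] in covered for e in edges)

def min_edge_dominating_set(edges):
    """Smallest edge dominating set, found by trying subsets in order of size."""
    for k in range(len(edges) + 1):
        for subset in combinations(edges, k):
            if is_edge_dominating(edges, set(subset)):
                return list(subset)

# Path on four vertices: its maximum matching has size 2, and the middle
# edge alone dominates both outer edges.
path4 = [(0, 1), (1, 2), (2, 3)]
print(min_edge_dominating_set(path4))

# Path on five vertices: the maximum matching still has size 2, but no
# single edge dominates all the others.
path5 = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(len(min_edge_dominating_set(path5)))
```

The two paths show both outcomes of the decision problem: with µ(G) = 2 the bound ⌈µ(G)/2⌉ = 1 is attained on the four-vertex path but not on the five-vertex path.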
Vijay JOHN Qian LONG Yuquan XU Zheng LIU Seiichi MITA
Environment perception is an important task for intelligent vehicle applications. Typically, multiple sensors with different characteristics are employed to perceive the environment. To robustly perceive the environment, the information from the different sensors is often integrated or fused. In this article, we propose to perform the sensor fusion and registration of the LIDAR and stereo camera using the particle swarm optimization algorithm, without the aid of any external calibration objects. The proposed algorithm automatically calibrates the sensors and registers the LIDAR range image with the stereo depth image. The registered LIDAR range image functions as the disparity map for the stereo disparity estimation and results in an effective sensor fusion mechanism. Additionally, we perform image denoising using the modified non-local means filter on the input image during the stereo disparity estimation to improve the robustness, especially at night time. To evaluate our proposed algorithm, the calibration and registration algorithm is compared with baseline algorithms on multiple datasets acquired under varying illumination. Compared to the baseline algorithms, we show that our proposed algorithm demonstrates better accuracy. We also demonstrate that integrating the LIDAR range image within the stereo disparity estimation results in an improved disparity map with a significant reduction in computational complexity.
Wenhao JIANG Wenjiang FENG Xingcheng ZHAO Qing LUO Zhiming WANG
Spectrum sharing effectively improves spectrum usage by allowing secondary users (SUs) to dynamically and opportunistically share the licensed bands with primary users (PUs). The concept of cooperative spectrum sharing allows SUs to use portions of the PUs' radio resources for their own data transmission, under the condition that the SUs help the PUs' transmission. The key issue in designing such a scheme is how to split the resources of the network. In this paper, we propose a relay-based cooperative spectrum sharing scheme in which the network consists of one PU and multiple SUs. The PU asks the SUs to relay its data in order to improve its energy efficiency; in return, it rewards the SUs with a portion of its authorized spectrum. However, each SU is only allowed to transmit its data via the rewarded channel at a power level proportional to the contribution it makes to the PU. Since energy cost is considered, the SUs must carefully determine their power levels. This scheme forms a non-cooperative Stackelberg resource allocation game in which the strategy of the PU is the bandwidth it rewards and the strategy of each SU is the power level of its relay transmission. We first investigate the second-stage sub-game, which we refer to as the power allocation game. We prove that an equilibrium exists in the power allocation game and provide a sufficient condition for the uniqueness of the equilibrium. We further prove that a unique Stackelberg equilibrium exists in the resource allocation game. Distributed algorithms are proposed to help users with incomplete information reach the equilibrium point. Simulation results validate our analysis and show that our proposed scheme yields a significant utility improvement for both the PU and the SUs.
Zheng WEN Di ZHANG Keping YU Takuro SATO
We propose the node name routing (NNR) strategy for information-centric ad-hoc networks based on named-node networking (3N). This strategy is especially valuable for use in disaster areas because, when the Internet is out of service during a disaster, our strategy can be used to set up a self-organizing network via cell phones or other terminal devices that have a sharing ability, and it does not rely on a base station (BS) or similar providers. Our proposed strategy can solve the multiple-name problem that has arisen in prior 3N proposals, as well as the dead loop problems in both 3N ad-hoc networks and TCP/IP ad-hoc networks. To evaluate the NNR strategy, we compare it with the optimized link state routing protocol (OLSR) and the dynamic source routing (DSR) strategy. Comprehensive computer simulations showed that our NNR proposal exhibits better performance in this environment when all of the users are moving randomly. We further observed that, with a growing number of users, our NNR protocol performs better in terms of packet delivery, routing cost, etc.
Wanchun LI Ting YUAN Bin WANG Qiu TANG Yingxiang LI Hongshu LIAO
In this paper, we explore the relationship between Geometric Dilution of Precision (GDOP) and Cramer-Rao Bound (CRB) by tracing back to the original motivations for deriving these two indexes. The GDOP serves as a sensor-target geometric uncertainty analysis tool, whilst the CRB serves as a statistical performance evaluation tool based on the sensor observations originating from the target. The CRB is the inverse of the Fisher information matrix (FIM). Based on the original derivations for the same positioning application, we interpret their difference from a mathematical viewpoint to show that.
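One textbook special case helps to see how the two indexes relate. The sketch below assumes 2-D positioning from range measurements with independent Gaussian noise of equal standard deviation `sigma`; under that assumption the FIM is H^T H / sigma^2, where the rows of H are unit line-of-sight vectors, so the square root of the trace of the CRB equals sigma times the GDOP. The measurement model, the function names, and the sensor layout are all assumptions made for illustration and are not taken from the paper.

```python
import math

def normal_matrix_2d(sensors, target):
    """Entries (a, b, d) of the symmetric matrix H^T H = [[a, b], [b, d]]."""
    a = b = d = 0.0
    for sx, sy in sensors:
        dx, dy = target[0] - sx, target[1] - sy
        r = math.hypot(dx, dy)
        ux, uy = dx / r, dy / r  # unit line-of-sight vector (one row of H)
        a += ux * ux
        b += ux * uy
        d += uy * uy
    return a, b, d

def trace_of_inverse(a, b, d):
    """Trace of the inverse of the 2x2 symmetric matrix [[a, b], [b, d]]."""
    return (a + d) / (a * d - b * b)

def gdop_2d(sensors, target):
    """Purely geometric index: sqrt(trace((H^T H)^-1))."""
    return math.sqrt(trace_of_inverse(*normal_matrix_2d(sensors, target)))

def crb_trace_2d(sensors, target, sigma):
    """Trace of the CRB, i.e. of the inverse FIM, with FIM = H^T H / sigma^2."""
    a, b, d = normal_matrix_2d(sensors, target)
    s2 = sigma ** 2
    return trace_of_inverse(a / s2, b / s2, d / s2)

# Hypothetical layout: two sensors at right angles as seen from the target.
sensors = [(1.0, 0.0), (0.0, 1.0)]
target = (0.0, 0.0)
print(gdop_2d(sensors, target))
print(math.sqrt(crb_trace_2d(sensors, target, sigma=3.0)))
```

In this special case the GDOP depends only on geometry, while the CRB also carries the noise level; with unequal or correlated noise the two no longer differ by a simple scale factor.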
Card-based protocols enable us to easily perform cryptographic tasks such as secure multiparty computation using a deck of physical cards. Since the first card-based protocol appeared in 1989, many protocols have been designed. A protocol is usually described with a series of somewhat intuitive and verbal descriptions, such as “turn over this card,” “shuffle these two cards,” “apply a random cut to these five cards,” and so on. On the other hand, a formal computational model of card-based protocols via an abstract machine was constructed in 2014. By virtue of this formalization, card-based protocols can be treated more rigorously; for example, it enables one to discuss lower bounds on the number of cards required for secure computations. In this paper, an overview of the computational model with its applications to designing protocols and a survey of the recent progress in card-based protocols are presented.
The Even-Goldreich-Micali framework is a generic method for constructing secure digital signature schemes from weaker signature schemes and one-time signature schemes. Several variations are known, depending on the properties required of the underlying building blocks. A particularly interesting case is when the underlying signature scheme is a so-called F-signature scheme that admits different message spaces for signing and verification. In this paper, we review these variations in the literature and add a new one to the bucket.
Video data mining based on topic models is an emerging technique that has recently become a popular research topic. In this paper, we present a novel topic model named sequential correspondence hierarchical Dirichlet processes (Seq-cHDP) to learn the hidden structure within video data. The Seq-cHDP model can be regarded as an extended hierarchical Dirichlet processes (HDP) model with two important features: one is the time-dependency mechanism that connects neighboring video frames on the basis of a time-dependent Markovian assumption, and the other is the correspondence mechanism that provides a solution for dealing with multimodal data such as the mixture of visual words and speech words extracted from video files. A cascaded Gibbs sampling method is applied to perform inference in Seq-cHDP. We present a comprehensive experimental evaluation of Seq-cHDP and demonstrate that it outperforms other baseline models.
Ryoichi KAWAHARA Hiroshi SAITO
It is expected that a large number of different objects, such as sensor devices and consumer electronics, will be connected to future networks. For such networks, we propose a name resolution method for directly specifying a condition on a set of attribute-value pairs of real-world information without needing prior knowledge of the uniquely assigned name of a target object, e.g., a URL. For name resolution, we need an algorithm to find the target object(s) satisfying a query condition on multiple attributes. To address the problem that multi-attribute searching algorithms may not work well when the number of attributes (i.e., dimensions) d increases, which is related to the curse of dimensionality, we also propose a probabilistic searching algorithm to reduce searching time at the expense of a small probability of false positives. With this algorithm, we choose permutation pattern(s) of d attributes and use the first K (K ≪ d) of them to search for objects, so that they contain relevant attributes with a high probability. We argue that, under a certain condition, our algorithm can identify the target objects at a false positive rate of less than 10^-6 and at a few percent of the tree-searching cost of a naive d-dimensional search.
Junji TAKEMASA Yuki KOIZUMI Toru HASEGAWA
Energy efficiency is an important requirement for forthcoming NDN (Named Data Networking) networks, and caching inherent to NDN is a main driver of energy reduction in such networks. This paper addresses the research question “Does caching really reduce the energy consumption of the entire network?” To answer the question, we precisely estimate how caching reduces the energy consumption of forthcoming commercial NDN networks by carefully considering configurations of NDN routers. This estimation reveals that the energy reduction due to caching depends on the energy-proportionality of NDN routers.
Lixin WANG Yutong LU Wei ZHANG Yan LEI
File system workloads are increasingly write-heavy. The growing capacity of RAM in modern nodes allows many reads to be satisfied from memory, while writes must be persisted to disk. Today's sophisticated local file systems like Ext4, XFS and Btrfs optimize for reads but suffer under workloads dominated by microdata (including metadata and tiny files). In this paper, we present an LSM-tree-based file system, RFS, which aims to take advantage of the write optimization of the LSM-tree to provide enhanced microdata performance, while offering matching performance for large files. RFS incrementally partitions the namespace into several metadata columns on a per-directory basis, preserving disk locality for directories and reducing the write amplification of LSM-trees. A write-ordered log-structured layout is used to store small files efficiently, rather than embedding the contents of small files into inodes. We also propose an optimization of global Bloom filters for efficient point lookups. Experiments show that our library version of RFS can handle microwrite-intensive workloads 2-10 times faster than existing solutions such as Ext4, Btrfs and XFS.
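For readers unfamiliar with why Bloom filters help point lookups in an LSM-tree, the sketch below shows a minimal generic Bloom filter: a lookup that returns false proves the key is absent, so the corresponding on-disk table need not be read. This is a textbook illustration only; the class name, the sizes, and the hashing scheme are assumptions, and it does not reproduce the global Bloom filter optimization proposed for RFS.

```python
import hashlib

class BloomFilter:
    """Set-membership sketch with no false negatives and a tunable false-positive rate."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive several bit positions by hashing the key with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))

# Hypothetical usage: skip a table read when the filter rules the key out.
table_filter = BloomFilter()
table_filter.add("/home/user/notes.txt")
print(table_filter.might_contain("/home/user/notes.txt"))
```

The asymmetry is the point of the design: a small in-memory structure can rule out most unnecessary disk reads, at the cost of occasionally reading a table that turns out not to hold the key.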
Yuki KOIZUMI Suhwuk KIM Yuki URATA Toru HASEGAWA
This paper proposes an NDN-based message delivery protocol over a cellular network in disasters. Collaborative communication among cellular devices is integrated into the protocol so that the power consumed by battery-operated base stations (BSs) is reduced when a blackout occurs. A key idea is to reduce the consumed radio resources by having cellular devices with better radio propagation quality forward the messages of neighboring devices. The radio resource reduction contributes to reducing the power consumed by a battery-operated BS. We empirically and analytically evaluate how the proposed message delivery protocol reduces the power consumption of a BS, assuming a densely populated shelter.
Yin WAN Kosuke SANADA Nobuyoshi KOMURO Gen MOTOYOSHI Norio YAMAGAKI Shigeo SHIODA Shiro SAKATA Tutomu MURASE Hiroo SEKIYA
This paper presents an analytical model for the network throughput of WLANs that takes into account heterogeneous conditions, namely, that network nodes individually transmit frames of different lengths at different offered loads. The airtime concept, which is often used in multi-hop network analyses, is applied to WLAN analysis for the first time. The proposed analytical model can cover the situation in which saturated and non-saturated nodes coexist in the same network, which is a first in WLAN analysis. This paper shows the network throughput characteristics of four scenarios. Scenario 1 considers the saturation throughput for the case in which frames of one or two lengths are transmitted at an identical offered load. Scenarios 2 and 3 investigate the cases in which all network nodes transmit frames of different lengths at an identical offered load and frames of identical length at different offered loads, respectively. Heterogeneous conditions in both frame length and offered load are investigated in Scenario 4.
Takashi YANAGI Toru FUKASAWA Hiroaki MIYASHITA
In this paper, a measurement method for the impedance and mutual coupling of multi-antennas that we have proposed is summarized. Impedance and mutual coupling characteristics are obtained after reducing the influence of the coaxial cables by synthesizing the measured S-parameters under the condition that unbalanced currents on the outside of the coaxial cables are canceled at feed points. We apply the proposed method to two closely positioned monopole antennas mounted on a small ground plane and demonstrate the validity and effectiveness of the proposed method by simulation and experiment. The proposed method is significantly better in terms of the accuracy of the mutual coupling data. In the presented case, the errors at the resonant frequency of the antennas are only 0.5dB in amplitude and 1.8° in phase.
Surasak BOONKLA Masashi UNOKI Stanislav S. MAKHANOV Chai WUTIWIWATCHAI
We propose a speech analysis method based on the source-filter model using multivariate empirical mode decomposition (MEMD). The proposed method takes multiple adjacent frames of a speech signal into account by combining their log spectra into multivariate signals. The multivariate signals are then decomposed into intrinsic mode functions (IMFs). The IMFs are divided into two groups using the peak of the autocorrelation function (ACF) of an IMF. The first group characterized by a spectral fine structure is used to estimate the fundamental frequency F0 by using the ACF, whereas the second group characterized by the frequency response of the vocal-tract filter is used to estimate formant frequencies by using a peak picking technique. There are two advantages of using MEMD: (i) the variation in the number of IMFs is eliminated in contrast with single-frame based empirical mode decomposition and (ii) the common information of the adjacent frames aligns in the same order of IMFs because of the common mode alignment property of MEMD. These advantages make the analysis more accurate than with other methods. As opposed to the conventional linear prediction (LP) and cepstrum methods, which rely on the LP order and cut-off frequency, respectively, the proposed method automatically separates the glottal-source and vocal-tract filter. The results showed that the proposed method exhibits the highest accuracy of F0 estimation and correctly estimates the formant frequencies of the vocal-tract filter.
The most commonly used scattering parameters (S parameters) are normalized to a real reference resistance, typically 50Ω. In some cases, the use of S parameters normalized to some complex reference impedance is essential or convenient. But there are different definitions of complex-referenced S parameters that are incompatible with each other and serve different purposes. To make matters worse, different simulators implement different ones and which ones are implemented is rarely properly documented. What are possible scenarios in which using the right one matters? This tutorial-style paper is meant as an informal and not overly technical exposition of some such confusing aspects of S parameters, for those who have a basic familiarity with the ordinary, real-referenced S parameters.
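A one-port example illustrates how two definitions can disagree. The sketch below compares two commonly cited reflection-coefficient definitions for a complex reference impedance: the power-wave form, which vanishes for a conjugate-matched load, and the pseudo-wave form, which vanishes when the load equals the reference impedance. That these are among the definitions the tutorial has in mind is an assumption, and the function names and impedance values are hypothetical.

```python
def gamma_power_wave(z_load, z_ref):
    """Power-wave reflection coefficient: zero when z_load is the conjugate of z_ref."""
    z_load, z_ref = complex(z_load), complex(z_ref)
    return (z_load - z_ref.conjugate()) / (z_load + z_ref)

def gamma_pseudo_wave(z_load, z_ref):
    """Pseudo-wave reflection coefficient: zero when z_load equals z_ref."""
    z_load, z_ref = complex(z_load), complex(z_ref)
    return (z_load - z_ref) / (z_load + z_ref)

# Hypothetical complex reference and a conjugate-matched load.
z_ref = complex(50, 10)
z_load = z_ref.conjugate()
print(abs(gamma_power_wave(z_load, z_ref)))
print(abs(gamma_pseudo_wave(z_load, z_ref)))

# With a real reference such as 50 ohms the two definitions coincide.
print(gamma_power_wave(75, 50) == gamma_pseudo_wave(75, 50))
```

The same load therefore reads as perfectly matched under one definition and as reflecting under the other, which is why knowing which definition a simulator implements matters whenever the reference impedance is complex.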
Shinnosuke TAKAMICHI Tomoki TODA Graham NEUBIG Sakriani SAKTI Satoshi NAKAMURA
This paper presents a novel statistical sample-based approach for Gaussian Mixture Model (GMM)-based Voice Conversion (VC). Although GMM-based VC offers promising flexibility for model adaptation, the quality of converted speech is significantly worse than that of natural speech. This paper addresses the problem of inaccurate modeling, which is one of the main causes of the quality degradation. Recently, we proposed statistical sample-based speech synthesis using rich context models for high-quality and flexible Hidden Markov Model (HMM)-based Text-To-Speech (TTS) synthesis. This method makes it possible not only to produce high-quality speech by introducing ideas from unit selection synthesis, but also to preserve the flexibility of the original HMM-based TTS. In this paper, we apply this idea to GMM-based VC. The rich context models are first trained for individual joint speech feature vectors, and then we gather them mixture by mixture to form a Rich context-GMM (R-GMM). In conversion, an iterative generation algorithm using R-GMMs is used to convert speech parameters, after initialization using over-trained probability distributions. Because the proposed method utilizes individual speech features, and its formulation is the same as that of conventional GMM-based VC, it makes it possible to produce high-quality speech while keeping the flexibility of the original GMM-based VC. The experimental results demonstrate that the proposed method yields significant improvements in terms of speech quality and speaker individuality in converted speech.