This paper illustrates various content sharing systems that take advantage of the cloud's storage and computational resources, as well as the conventional technologies that support them. First, the basic technology concepts supporting cloud-based systems, from client-server to cloud computing, are shown together with their relationships and functional linkages. Second, the taxonomy of cloud-based system models is explained from the aspect of interoperability among multiple clouds. Interoperability can be categorized into provider-centric and client-centric scenarios. Each can be further divided into federated clouds, hybrid clouds, multi-clouds and aggregated service by broker. Third, practical cloud-based systems related to content sharing are reported and their characteristics are discussed. Finally, the future direction of cloud-based content sharing is suggested.
Masahiro NAKAGAWA Hiroshi HASEGAWA Ken-ichi SATO
Adaptive and flexible network control technology is considered essential for efficient network resource utilization. Moreover, such technology is becoming a key to cost-effectively meet diverse service requirements and accommodate heavier traffic with limited network resources; demands that conventional static operation cannot satisfy. To address this issue, we previously studied dynamic network control technology for large-capacity network services including on-demand broad bandwidth provisioning services and layer-one VPN. Our previous study introduced a simple weighting function for achieving fairness in terms of path length and proposed two dynamic Make Before Break Routing algorithms for reducing blocking probability. These algorithms enhance network utilization by rerouting existing paths to alternative routes while completely avoiding disruption for highly reliable services. However, the impact of this avoidance of service disruption on blocking probability has not been clarified. In this paper, we propose modified versions of the algorithms that enhance network utilization while slightly increasing disruption by rerouting, which enable us to elucidate the effectiveness of hitless rerouting. We also provide extensive evaluations including a comparison of original and modified algorithms. Numerical examples demonstrate that they achieve not only a high degree of fairness but also low service blocking probability. Hitless rerouting is achieved with a small increase in blocking probability.
Masayuki SUZUKI Ryo KUROIWA Keisuke INNAMI Shumpei KOBAYASHI Shinya SHIMIZU Nobuaki MINEMATSU Keikichi HIROSE
When synthesizing speech from Japanese text, correct assignment of accent nuclei for input text with arbitrary contents is indispensable for obtaining natural-sounding synthetic speech. A phenomenon called accent sandhi occurs in utterances of Japanese; when a word is uttered in a sentence, its accent nucleus may change depending on the contexts of preceding/succeeding words. This paper describes a statistical method for automatically predicting the accent nucleus changes due to accent sandhi. First, as the basis of the research, a database of Japanese text was constructed with labels of accent phrase boundaries and accent nucleus positions when uttered in sentences. A single native speaker of Tokyo dialect Japanese annotated all the labels for 6,344 Japanese sentences. Then, using this database, a conditional-random-field-based method was developed to predict accent phrase boundaries and accent nuclei. The proposed method predicted accent nucleus positions for accent phrases with 94.66% accuracy, clearly surpassing the 87.48% accuracy obtained using our rule-based method. A listening experiment was also conducted on synthetic speech obtained using the proposed method and that obtained using the rule-based method. The results show that our method significantly improved the naturalness of synthetic speech.
Nobuaki MINEMATSU Ibuki NAKAMURA Masayuki SUZUKI Hiroko HIRANO Chieko NAKAGAWA Noriko NAKAMURA Yukinori TAGAWA Keikichi HIROSE Hiroya HASHIMOTO
This paper develops an online and freely available framework to aid teaching and learning the prosodic control of Tokyo Japanese: how to generate its adequate word accent and phrase intonation. This framework is called OJAD (Online Japanese Accent Dictionary) [1] and it provides three features. 1) Visual, auditory, systematic, and comprehensive illustration of patterns of accent change (accent sandhi) of verbs and adjectives. Here only the changes caused by twelve fundamental conjugations are focused upon. 2) Visual illustration of the accent pattern of a given verbal expression, which is a combination of a verb and its postpositional auxiliary words. 3) Visual illustration of the pitch pattern of any given sentence and the expected positions of accent nuclei in the sentence. The third feature is technically implemented by using an accent change prediction module that we developed for Japanese Text-To-Speech (TTS) synthesis [2],[3]. Experiments show that accent nucleus assignment to given texts by the proposed framework is much more accurate than that by native speakers. Subjective assessment and objective assessment done by teachers and learners show extremely high pedagogical effectiveness of the developed framework.
Xiaoyun WANG Tsuneo KATO Seiichi YAMAMOTO
Recognition of second language (L2) speech is a challenging task even for state-of-the-art automatic speech recognition (ASR) systems, partly because pronunciation by L2 speakers is usually significantly influenced by the mother tongue of the speakers. Considering that the expressions of non-native speakers are usually simpler than those of native ones, and that second language speech usually includes mispronunciation and less fluent pronunciation, we propose a novel method that maximizes a unified acoustic and linguistic objective function to derive a phoneme set for second language speech recognition. We verify the efficacy of the proposed method using second language speech collected with a translation-game-type dialogue-based computer assisted language learning (CALL) system. In this paper, we examine the performance based on acoustic likelihood, linguistic discrimination ability and the integrated objective function for second language speech. Experiments demonstrate the validity of the phoneme set derived by the proposed method.
Hanxu YOU Zhixian MA Wei LI Jie ZHU
Traditional speech enhancement (SE) algorithms usually show fluctuating performance when they deal with different types of noisy speech signals. In this paper, we propose a multi-task Bayesian compressive sensing based speech enhancement (MT-BCS-SE) algorithm that achieves performance comparable to, and more stable than, that of traditional SE algorithms. The MT-BCS-SE algorithm utilizes the dependence information among compressive sensing (CS) measurements and the sparsity of speech signals to perform SE. To obtain sufficient sparsity of speech signals, we adopt an overcomplete dictionary to transform speech signals into sparse representations. The K-SVD algorithm is employed to learn various overcomplete dictionaries. The influence of the overcomplete dictionary on the MT-BCS-SE algorithm is evaluated through a large number of experiments, so that the most suitable dictionary can be adopted for obtaining the best performance. Experiments were conducted on the well-known NOIZEUS corpus to evaluate the performance of the proposed algorithm. On the NOIZEUS corpus, MT-BCS-SE is shown to be competitive with or even superior to traditional SE algorithms, such as optimally-modified log-spectral amplitude (OMLSA), multi-band spectral subtraction (SSMul), and minimum mean square error (MMSE), in terms of signal-to-noise ratio (SNR), speech enhancement gain (SEG) and perceptual evaluation of speech quality (PESQ), and to have better stability than traditional SE algorithms.
Xingge GUO Liping HUANG Ke GU Leida LI Zhili ZHOU Lu TANG
The quality assessment of screen content images (SCIs) has attracted attention recently. Unlike a natural image, an SCI is usually a mixture of pictures and text. Traditional quality metrics are mainly designed for natural images and do not fit SCIs well. Motivated by this, this letter presents a simple and effective method to naturalize SCIs, so that traditional quality models can be applied to SCI quality prediction. Specifically, bicubic interpolation-based up-sampling is proposed to achieve this goal. Extensive experiments and comparisons demonstrate the effectiveness of the proposed method.
Shin-ichi NAKAYAMA Shigeru MASUYAMA
Given a graph G=(V, E), where V and E are the vertex and edge sets of G, and a subset V_NT of vertices called a non-terminal set, the minimum spanning tree with a non-terminal set V_NT, denoted by MSTNT, is a connected and acyclic spanning subgraph of G that contains all vertices of V and has the minimum weight, where each vertex in the non-terminal set is not a leaf. On general graphs, the problem of finding an MSTNT of G is NP-hard. We show that if G is an outerplanar graph, then an MSTNT of G can be found in time linear in the number of vertices.
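The MSTNT definition above can be made concrete with a small feasibility check. The following is a minimal sketch, not the authors' linear-time algorithm: it only tests whether a given edge set is a spanning tree in which no non-terminal vertex is a leaf. The function name `is_feasible_mstnt_tree` and the edge-list representation are assumptions made for illustration.

```python
from collections import defaultdict


def is_feasible_mstnt_tree(vertices, tree_edges, non_terminals):
    """Check the MSTNT feasibility constraints (hypothetical helper).

    Returns True if `tree_edges` is a spanning tree of `vertices`
    and no vertex in `non_terminals` is a leaf (degree exactly 1).
    Weight minimality is NOT checked here.
    """
    vertices = set(vertices)
    if not vertices:
        return False
    # A spanning tree on n vertices has exactly n - 1 edges.
    if len(tree_edges) != len(vertices) - 1:
        return False

    adjacency = defaultdict(set)
    for u, v in tree_edges:
        if u == v or u not in vertices or v not in vertices:
            return False
        adjacency[u].add(v)
        adjacency[v].add(u)

    # Depth-first search: n - 1 edges plus connectivity implies acyclicity.
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for neighbor in adjacency[current]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    if seen != vertices:
        return False

    # Non-terminal constraint: no vertex of the non-terminal set is a leaf.
    return all(len(adjacency[v]) != 1 for v in non_terminals)
```

For a path a-b-c, the middle vertex b may be a non-terminal, while an endpoint may not, because the endpoints are leaves of the tree.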
Zedong XIE Xihong CHEN Xiaopeng LIU Lunsheng XUE Yu ZHAO
The impact of intersymbol interference (ISI) on single carrier frequency domain equalization with multiple input multiple output (MIMO-SCFDE) systems is severe. Most existing channel equalization methods fail to solve it completely. In this paper, given the drawbacks of error propagation and the gap from the matched filter bound (MFB), we introduce a decision feedback equalizer with frequency-domain bidirectional noise prediction (DFE-FDBiNP) to tackle ISI in MIMO-SCFDE systems. The equalizer consists of two parts: normal-mode and time-reversal-mode decision feedback equalization with noise prediction (DFE-NP). Equal-gain combining is used to realize greatly simplified, low-complexity diversity combining. Analysis and simulation results validate the improved performance of the proposed method in a quasi-static frequency-selective fading MIMO channel for a typical urban environment.
Junji YAMADA Ushio JIMBO Ryota SHIOYA Masahiro GOSHIMA Shuichi SAKAI
An 8-issue superscalar core generally requires a 24-port RAM for the register file. The area and energy consumption of a multiported RAM increase in proportion to the square of the number of ports. A register cache can reduce the area and energy consumption of the register file. However, earlier register cache systems suffer from lower IPC caused by register cache misses. Thus, we proposed the Non-Latency-Oriented Register Cache System (NORCS) to solve the IPC problem with a modified pipeline. We evaluated NORCS mainly from the viewpoint of microarchitecture in the original article, and showed that NORCS maintains almost the same IPC as conventional register files. Researchers at NVIDIA adopted the same idea for their GPUs. However, the evaluation was not sufficient from the viewpoint of LSI design. In the original article, we used CACTI to evaluate the area and energy consumption. CACTI is a design space exploration tool for cache design, and adopts some rough approximations. Therefore, this paper presents a design of NORCS with FreePDK45, an open source process design kit for 45nm technology. We performed manual layout of the memory cells and arrays of NORCS, and executed SPICE simulation with RC parasitics extracted from the layout. The results show that, compared with a full-port register file, an 8-entry NORCS achieves 75.2% and 48.2% reductions in area and energy consumption, respectively. The results also include the latency, which we did not present in our original article. The critical-path latencies are 307ps and 318ps for an 8-entry NORCS and a conventional multiported register file, respectively, when the same two cycles are allocated to register file read.
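The port count and the quadratic scaling mentioned above can be illustrated with a short calculation. This is a rough sketch only: it assumes each issued instruction needs two read ports and one write port (which yields the 24 ports quoted for an 8-issue core), and it models cost as purely proportional to the square of the port count. The function names are hypothetical, and the model does not reproduce the measured NORCS results.

```python
def register_file_ports(issue_width, reads_per_inst=2, writes_per_inst=1):
    """Ports needed by a full-port register file.

    Assumption for illustration: every issued instruction may read two
    source operands and write one destination in the same cycle.
    """
    return issue_width * (reads_per_inst + writes_per_inst)


def relative_cost(ports, baseline_ports):
    """Area or energy relative to a baseline, assuming cost ~ ports**2."""
    return (ports / baseline_ports) ** 2


full_ports = register_file_ports(8)          # 8-issue core -> 24 ports
half_cost = relative_cost(12, full_ports)    # halving the ports -> 0.25
```

Under this simple model, halving the number of ports cuts the array cost to one quarter, which is why reducing the ports of the main register file is attractive.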
Takashi HISAKADO Keisuke YOSHIDA Tohlu MATSUSHIMA Osami WADA
An equivalent-circuit model is an effective tool for the analysis and design of metamaterials. This paper describes a systematic and theoretical method for the circuit modeling of meta-atoms. We focus on the structures of wired metallic spheres and propose a method for deriving a sophisticated equivalent circuit that has the same topology as the wires using the partial element equivalent circuit (PEEC) method. Our model contains the effect of external electromagnetic coupling: excitation by an external field modeled by voltage sources and radiation modeled by the radiation resistances for each mode. The equivalent-circuit model provides the characteristics of meta-atoms such as the resonant frequencies and the resonant modes induced by the current distribution in the wires by an external excitation. Although the model is obtained by a very coarse discretization, it provides a good agreement with an electromagnetic simulation.
Takayoshi SHOUDAI Kazuhide AIKOH Yusuke SUZUKI Satoshi MATSUMOTO Tetsuhiro MIYAHARA Tomoyuki UCHIDA
An efficient means of learning tree-structural features from tree-structured data would enable us to construct effective mining methods for tree-structured data. Here, a pattern representing rich tree-structural features common to tree-structured data and a polynomial time algorithm for learning important tree patterns are necessary for mining knowledge from tree-structured data. As such a tree pattern, we introduce a term tree pattern t such that any edge label of t belongs to a finite alphabet Λ, any internal vertex of t has ordered children and t has a new kind of structured variable, called a height-constrained variable. A height-constrained variable has a pair of integers (i, j) as constraints, and it can be replaced with a tree whose trunk length is at least i and whose height is at most j. This replacement is called height-constrained replacement. A sequence of consecutive height-constrained variables is called a variable-chain. In this paper, we present polynomial time algorithms for solving the membership problem and the minimal language (MINL) problem for term tree patterns having no variable-chain. The membership problem for term tree patterns is to decide whether or not a given tree can be obtained from a given term tree pattern by applying height-constrained replacements to all height-constrained variables in the term tree pattern. The MINL problem for term tree patterns is to find a term tree pattern t such that the language generated by t is minimal among languages, generated by term tree patterns, which contain all given tree-structured data. Finally, we show that the class, i.e., the set of all term tree patterns having no variable-chain, is polynomial time inductively inferable from positive data if |Λ| ≥ 2.
In this paper, we present an FPGA hardware implementation for phylogenetic tree reconstruction with a maximum parsimony algorithm. We base our approach on a particular stochastic local search algorithm that uses the Progressive Neighborhood and the Indirect Calculation of Tree Lengths method. This method is widely used for the acceleration of the phylogenetic tree reconstruction algorithm in software. In our implementation, we define a tree structure and accelerate the search by parallel and pipeline processing. We show results for eight real-world biological datasets. We compare execution times against our previous hardware approach and against TNT, the fastest available parsimony program, which is also accelerated by the Indirect Calculation of Tree Lengths method. Acceleration rates between 34 and 45 per rearrangement, and between 2 and 6 for the whole search, are obtained against our previous hardware approach. Acceleration rates between 2 and 36 per rearrangement, and between 18 and 112 for the whole search, are obtained against TNT.
Shinya OHTANI Yu KATO Nobutaka KUROKI Tetsuya HIROSE Masahiro NUMA
This paper proposes image super-resolution techniques with multi-channel convolutional neural networks. In the proposed method, output pixels are classified into K×K groups depending on their coordinates. Those groups are generated from separate channels of a convolutional neural network (CNN). Finally, they are synthesized into a K×K magnified image. This architecture can enlarge images directly without bicubic interpolation. Experimental results of 2×2, 3×3, and 4×4 magnifications have shown that the average PSNR for the proposed method is about 0.2dB higher than that for the conventional SRCNN.
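The final synthesis step described above, where K×K groups of output pixels are merged into one magnified image, can be sketched as a simple interleaving. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `interleave_groups` and the channel ordering (group index a*K + b holds the output pixels whose row offset is a and column offset is b) are hypothetical choices.

```python
def interleave_groups(groups, k):
    """Merge k*k low-resolution planes into one k-times magnified image.

    Assumed layout (for illustration only): groups[a * k + b][y][x]
    is the output pixel at row k*y + a and column k*x + b.
    """
    height = len(groups[0])
    width = len(groups[0][0])
    output = [[0] * (k * width) for _ in range(k * height)]
    for a in range(k):
        for b in range(k):
            plane = groups[a * k + b]
            for y in range(height):
                for x in range(width):
                    output[k * y + a][k * x + b] = plane[y][x]
    return output


# Four 1x2 planes (K = 2) become one 2x4 image.
planes = [[[1, 5]], [[2, 6]], [[3, 7]], [[4, 8]]]
magnified = interleave_groups(planes, 2)
# magnified == [[1, 2, 5, 6], [3, 4, 7, 8]]
```

Because each output pixel is read from exactly one channel, the network can produce the enlarged image directly from the low-resolution input, with no interpolation step before the CNN.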
Wei HAN Xiongwei ZHANG Meng SUN Li LI Wenhua SHI
In this letter, we propose a novel speech separation method based on a perceptually weighted deep recurrent neural network (DRNN) which incorporates the masking properties of the human auditory system. In the supervised training stage, we first utilize the clean label speech of two different speakers to calculate two perceptual weighting matrices. Then, the two perceptual weighting matrices are utilized to adjust the mean squared error between the network outputs and the reference features of the two clean speech signals so that the two speech signals can mask each other. Experimental results on the TSP speech corpus demonstrate that the proposed speech separation approach can achieve significant improvements over the state-of-the-art methods when tested with different mixing cases.
Wei HAN Xiongwei ZHANG Gang MIN Xingyu ZHOU Meng SUN
In this letter, we explore joint optimization of perceptual gain function and deep neural networks (DNNs) for a single-channel speech enhancement task. A DNN architecture is proposed which incorporates the masking properties of the human auditory system to make the residual noise inaudible. This new DNN architecture directly trains a perceptual gain function which is used to estimate the magnitude spectrum of clean speech from noisy speech features. Experimental results demonstrate that the proposed speech enhancement approach can achieve significant improvements over the baselines when tested with TIMIT sentences corrupted by various types of noise, no matter whether the noise conditions are included in the training set or not.
Yong-An JUNG Yung-Lyul LEE Hyoung-Kyu SONG Young-Hwan YOU
In this letter, we propose an improved timing offset estimation scheme without making use of pilot symbols in the HomePlug Green PHY (HomePlug GP) standard. In contrast to the conventional decision-directed timing estimation scheme, the proposed scheme exploits the inherent repetition information of the HomePlug GP signals, thus not only removing the need for the estimated data or pilot symbols but also improving the timing estimation performance.
We present a novel receiver for reliable IoT communications. In this letter, it is assumed that IoT communications are based on ZigBee in frequency-selective indoor environments. ZigBee includes the IEEE 802.15.4 specification for low-power and low-cost communications. The presented receiver fully follows the specification. However, the specification exhibits extremely low performance in frequency-selective environments. Therefore, a channel estimation approach is proposed for reliable communications in frequency-selective fading indoor environments. The estimation method relies on FFT operations, which are usually embedded in cellular phones. We also suggest a correlation method for accurate recovery of the original information. The simulation results show that the proposed receiver is very suitable for IoT communications in frequency-selective indoor environments.
Kai-Feng XIA Bin WU Tao XIONG Tian-Chun YE Cheng-Ying CHEN
In this paper, a hardware-efficient design methodology for a configurable-point multiple-stream pipeline FFT processor is presented. We first compared the memory and arithmetic components of different pipeline FFT architectures, and concluded that the MDF architecture is more hardware efficient than MDC for the overall processor. Then, in order to reduce the computational complexity, a binary-tree representation was adopted to analyze the decomposition algorithm. Consequently, the coefficient multiplications are minimized among all the decomposition possibilities. In addition, an efficient output reorder circuit was designed for the multiple-stream architecture. A 128∼2048-point 4-stream FFT processor for an LTE system was designed in SMIC 55nm technology for evaluation. It occupies a core area of 1.09mm² and consumes 82.6mW at a clock frequency of 122.88MHz.
Bowei ZHANG Wenjiang FENG Qian XIAO Luran LV Zhiming WANG
In this paper, we study the degrees of freedom (DoF) of a multiple-input multiple-output (MIMO) multiway relay channel (mRC) with two relays, two clusters and K (K≥3) users per cluster. We consider a clustered full data exchange model, i.e., each user in a cluster sends a multicast (common) message to all other users in the same cluster and desires to acquire all messages from them. The DoF results of the mRC with a single relay have been reported. However, the DoF achievability of the mRC with multiple relays is still an open problem. Furthermore, we consider a more practical scenario where no channel state information at the transmitter (CSIT) is available to each user. We first give a DoF cut-set upper bound of the considered mRC. Then, we propose a distributed interference neutralization and retransmission (DINR) scheme to approach the DoF cut-set upper bound. In the absence of user cooperation, this method focuses on the beamforming matrix design at each relay. By investigating channel state information (CSI) acquisition, we show that the DINR scheme can be performed by distributed processing. Theoretical analyses and numerical simulations show that the DoF cut-set upper bound can be attained by the DINR scheme. It is shown that the DINR scheme can provide significant DoF gain over the conventional time division multiple access (TDMA) scheme. In addition, we show that the DINR scheme is superior to the existing single-relay schemes for the considered mRC.