The search functionality is under construction.

Keyword Search Result

[Keyword] SPE(2504hit)

281-300hit(2504hit)

  • Highly Efficient Adaptive Bandwidth Allocation Algorithm for WDM/OFDM-PON-Based Elastic Optical Access Networks

    Hiroyuki SAITO  Naoki MINATO  Hideaki TAMAI  Hironori SASAKI  

     
    PAPER

      Publicized:
    2017/10/18
      Vol:
    E101-B No:4
      Page(s):
    972-978

    Capital expenditure (CAPEX) reduction and efficient wavelength allocation are critical for future access networks. The elastic lambda aggregation network (EλAN), based on WDM and OFDM technologies, is expected to realize efficient wavelength allocation. In this paper, we propose an adaptive bandwidth allocation (ABA) algorithm for EλAN under crowded network conditions, in which the modulation format, symbol rate and number of sub-carriers are adaptively decided based on the distance of the PON section, the QoS and the bandwidth demand of each ONU. Network simulation results show that the proposed algorithm can effectively reduce the total bandwidth, achieve steadily high spectrum efficiency, and contribute to further reduction of the CAPEX of future optical access networks.

  • A 7GS/s Complete-DDFS-Solution in 65nm CMOS

    Abdel MARTINEZ ALONSO  Masaya MIYAHARA  Akira MATSUZAWA  

     
    PAPER

      Vol:
    E101-C No:4
      Page(s):
    206-217

    A 7GS/s complete-DDFS-solution featuring a two-times interleaved RDAC with 1.2Vpp-diff output swing was fabricated in 65nm CMOS. The frequency tuning and amplitude resolutions are 24 bits and 10 bits, respectively. The RDAC includes a mixed-signal, high-speed architecture for random swapping thermometer coding dynamic element matching (RSTC-DEM) that improves the narrowband SFDR by up to 8dB for output frequencies below 1.85GHz. The proposed techniques enable 7GS/s operation with a spurious-free dynamic range better than 32dBc over the full Nyquist bandwidth. The worst-case narrowband SFDR is 42dBc. The system consumes 87.9mW/(GS/s) from a 1.2V power supply when the RSTC-DEM method is enabled, resulting in a FoM of 458.9GS/s·2^(SFDR/6)/W. A proof-of-concept chip with an active area of only 0.22mm² was measured in prototypes encapsulated in a 144-pin low-profile quad flat package.

  • Pipelined Squarer for Unsigned Integers of Up to 12 Bits

    Seongjin CHOI  Hyeong-Cheol OH  

     
    LETTER-Computer System

      Publicized:
    2017/12/06
      Vol:
    E101-D No:3
      Page(s):
    795-798

    This paper proposes and analyzes a pipelining scheme for a hardware squarer that can square unsigned integers of up to 12 bits. Each stage is designed and adjusted so that the stage delays are well balanced and the critical path delay of the design does not exceed a reference value, which is set based on the analysis. The resulting design has a critical path delay of approximately 3.5 times a full-adder delay. In an implementation using an Intel Stratix V FPGA, the design operates at an approximately 23% higher frequency than the comparable pipelined squarer provided in the Intel library.

  • Deep Neural Network Based Monaural Speech Enhancement with Low-Rank Analysis and Speech Present Probability

    Wenhua SHI  Xiongwei ZHANG  Xia ZOU  Meng SUN  Wei HAN  Li LI  Gang MIN  

     
    LETTER-Noise and Vibration

      Vol:
    E101-A No:3
      Page(s):
    585-589

    A monaural speech enhancement method combining deep neural network (DNN) with low rank analysis and speech present probability is proposed in this letter. Low rank and sparse analysis is first applied on the noisy speech spectrogram to get the approximate low rank representation of noise. Then a joint feature training strategy for DNN based speech enhancement is presented, which helps the DNN better predict the target speech. To reduce the residual noise in highly overlapping regions and high frequency domain, speech present probability (SPP) weighted post-processing is employed to further improve the quality of the speech enhanced by trained DNN model. Compared with the supervised non-negative matrix factorization (NMF) and the conventional DNN method, the proposed method obtains improved speech enhancement performance under stationary and non-stationary conditions.

  • Effects of Automated Transcripts on Non-Native Speakers' Listening Comprehension

    Xun CAO  Naomi YAMASHITA  Toru ISHIDA  

     
    PAPER-Human-computer Interaction

      Publicized:
    2017/11/24
      Vol:
    E101-D No:3
      Page(s):
    730-739

    Previous research has shown that transcripts generated by automatic speech recognition (ASR) technologies can improve the listening comprehension of non-native speakers (NNSs). However, we still lack a detailed understanding of how ASR transcripts affect listening comprehension of NNSs. To explore this issue, we conducted two studies. The first study examined how the current presentation of ASR transcripts impacted NNSs' listening comprehension. 20 NNSs engaged in two listening tasks, each in different conditions: C1) audio only and C2) audio+ASR transcripts. The participants pressed a button whenever they encountered a comprehension problem, and explained each problem in the subsequent interviews. From our data analysis, we found that NNSs adopted different strategies when using the ASR transcripts; some followed the transcripts throughout the listening; some only checked them when necessary. NNSs also appeared to face difficulties following imperfect and slightly delayed transcripts while listening to speech - many reported difficulties concentrating on listening/reading or shifting between the two. The second study explored how different display methods of ASR transcripts affected NNSs' listening experiences. We focused on two display methods: 1) accuracy-oriented display which shows transcripts only after the completion of speech input analysis, and 2) speed-oriented display which shows the interim analysis results of speech input. We conducted a laboratory experiment with 22 NNSs who engaged in two listening tasks with ASR transcripts presented via the two display methods. We found that the more the NNSs paid attention to listening to the audio, the more they tended to prefer the speed-oriented transcripts, and vice versa. Mismatched transcripts were found to have negative effects on NNSs' listening comprehension. Our findings have implications for improving the presentation methods of ASR transcripts to more effectively support NNSs.

  • Clutter Rank Estimation for Diving Platform Radar

    Fengde JIA  Zishu HE  

     
    LETTER-Analog Signal Processing

      Vol:
    E101-A No:3
      Page(s):
    600-603

    A convenient formula for the estimation of the clutter rank of the diving platform radar is derived. Brennan's rule provides a general formula to estimate the clutter rank for the side looking radar with a linear array, which is normally called one-dimensional (1D) estimation problem. With the help of the clutter wavenumber spectrum, the traditional estimation of the clutter rank is extended to the diving scenario and the estimation problem is two-dimensional (2D). The proposed rule is verified by the numerical simulations.

  • Optimization of MAC-Layer Sensing Based on Alternating Renewal Theory in Cognitive Radio Networks

    Zhiwei MAO  Xianmin WANG  

     
    PAPER-Wireless Communication Technologies

      Publicized:
    2017/09/14
      Vol:
    E101-B No:3
      Page(s):
    865-876

    Cognitive radio (CR) is considered as the most promising solution to the so-called spectrum scarcity problem, in which channel sensing is an important problem. In this paper, the problem of determining the period of medium access control (MAC)-layer channel sensing in cognitive radio networks (CRNs) is studied. In our study, the channel state is statistically modeled as a continuous-time alternating renewal process (ARP) alternating between the OFF and ON states for the primary user (PU)'s communication activity. Based on the statistical ARP model, we analyze the CRNs with different SU MAC protocols, taking into consideration the effects of practical issues of imperfect channel sensing and non-negligible channel sensing time. Based on the analysis results, a constrained optimization problem to find the optimal sensing period is formulated and the feasibility of this problem is studied for systems with different OFF/ON channel state length distributions. Numerical results are presented to show the performance of the proposed sensing period optimization scheme. The effects of practical system parameters, including channel sensing errors and channel sensing time, on the performance and the computational complexity of the proposed sensing period optimization scheme are also investigated.

  • Improved MCAS Based Spectrum Sensing in Cognitive Radio

    Shusuke NARIEDA  

     
    PAPER-Terrestrial Wireless Communication/Broadcasting Technologies

      Publicized:
    2017/08/29
      Vol:
    E101-B No:3
      Page(s):
    915-923

    This paper presents a computationally efficient cyclostationarity detection based spectrum sensing technique for cognitive radio. Several cyclostationarity detection based spectrum sensing techniques with low computational complexity have been presented, e.g., the peak detector (PD) and maximum cyclic autocorrelation selection (MCAS). PD can be affected by noise uncertainty because it requires a noise floor estimation, whereas MCAS does not require the estimation. However, the computational complexity of MCAS is greater than that of PD because MCAS must compute several statistics for signal detection in place of the noise floor estimation, whereas PD must compute only one statistic. In the presented MCAS based technique, only one statistic must be computed; the other necessary statistics are obtained from the procedure that computes that statistic. Therefore, the computational complexity of the presented technique is almost the same as that of PD, and it does not require noise floor estimation for the threshold. Numerical examples are shown to validate the effectiveness of the presented technique.

  • Pose Estimation with Action Classification Using Global-and-Pose Features and Fine-Grained Action-Specific Pose Models

    Norimichi UKITA  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2017/12/08
      Vol:
    E101-D No:3
      Page(s):
    758-766

    This paper proposes an iterative scheme between human action classification and pose estimation in still images. Initial action classification is achieved only by global image features that consist of the responses of various object filters. The classification likelihood of each action weights human poses estimated by the pose models of multiple sub-action classes. Such fine-grained action-specific pose models allow us to robustly identify the pose of a target person under the assumption that similar poses are observed in each action. From the estimated pose, pose features are extracted and used with global image features for action re-classification. This iterative scheme can mutually improve action classification and pose estimation. Experimental results with a public dataset demonstrate the effectiveness of the proposed method both for action classification and pose estimation.

  • DNN-Based Speech Synthesis Using Speaker Codes

    Nobukatsu HOJO  Yusuke IJIMA  Hideyuki MIZUNO  

     
    PAPER-Speech and Hearing

      Publicized:
    2017/11/01
      Vol:
    E101-D No:2
      Page(s):
    462-472

    Deep neural network (DNN)-based speech synthesis can produce more natural synthesized speech than the conventional HMM-based speech synthesis. However, it is not revealed whether the synthesized speech quality can be improved by utilizing a multi-speaker speech corpus. To address this problem, this paper proposes DNN-based speech synthesis using speaker codes as a method to improve the performance of the conventional speaker dependent DNN-based method. In order to model speaker variation in the DNN, the augmented feature (speaker codes) is fed to the hidden layer(s) of the conventional DNN. This paper investigates the effectiveness of introducing speaker codes to DNN acoustic models for speech synthesis for two tasks: multi-speaker modeling and speaker adaptation. For the multi-speaker modeling task, the method we propose trains connection weights of the whole DNN using a multi-speaker speech corpus. When performing multi-speaker synthesis, the speaker code corresponding to the selected target speaker is fed to the DNN to generate the speaker's voice. When performing speaker adaptation, a set of connection weights of the multi-speaker model is re-estimated to generate a new target speaker's voice. We investigated the relationship between the prediction performance and architecture of the DNNs through objective measurements. Objective evaluation experiments revealed that the proposed model outperformed conventional methods (HMMs, speaker dependent DNNs and multi-speaker DNNs based on a shared hidden layer structure). Subjective evaluation experimental results showed that the proposed model again outperformed the conventional methods (HMMs, speaker dependent DNNs), especially when using a small number of target speaker utterances.

  • A Low-Power Pulse-Shaped Duobinary ASK Modulator for IEEE 802.11ad Compliant 60GHz Transmitter in 65nm CMOS

    Bangan LIU  Yun WANG  Jian PANG  Haosheng ZHANG  Dongsheng YANG  Aravind Tharayil NARAYANAN  Dae Young LEE  Sung Tae CHOI  Rui WU  Kenichi OKADA  Akira MATSUZAWA  

     
    PAPER-Electronic Circuits

      Vol:
    E101-C No:2
      Page(s):
    126-134

    An energy-efficient modulator for an ultra-low-power (ULP) 60-GHz IEEE transmitter is presented in this paper. The modulator consists of a differential duobinary coder and a semi-digital finite-impulse-response (FIR) pulse-shaping filter. By virtue of differential duobinary coding and pulse shaping, the transceiver successfully solves the adjacent-channel-power-ratio (ACPR) issue of conventional on-off-keying (OOK) transceivers. The proposed differential duobinary coder adopts an over-sampling precoder, which relaxes the timing requirement and reduces power consumption. The semi-digital FIR filter eliminates the power-hungry digital multipliers and accumulators, and improves the power efficiency through optimization of the filter parameters. Fabricated in a 65nm CMOS process, the modulator occupies a core area of 0.12mm². At throughputs of 1.7Gbps and 2.6Gbps, the power consumption of the modulator is 24.3mW and 42.8mW, respectively, while satisfying the IEEE 802.11ad spectrum mask.

  • Accurate Estimation of Personalized Video Preference Using Multiple Users' Viewing Behavior

    Yoshiki ITO  Takahiro OGAWA  Miki HASEYAMA  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2017/11/22
      Vol:
    E101-D No:2
      Page(s):
    481-490

    A method for accurate estimation of personalized video preference using multiple users' viewing behavior is presented in this paper. The proposed method uses three kinds of features: a video, user's viewing behavior and evaluation scores for the video given by a target user. First, the proposed method applies Supervised Multiview Spectral Embedding (SMSE) to obtain lower-dimensional video features suitable for the following correlation analysis. Next, supervised Multi-View Canonical Correlation Analysis (sMVCCA) is applied to integrate the three kinds of features. Then we can get optimal projections to obtain new visual features, “canonical video features” reflecting the target user's individual preference for a video based on sMVCCA. Furthermore, in our method, we use not only the target user's viewing behavior but also other users' viewing behavior for obtaining the optimal canonical video features of the target user. This unique approach is the biggest contribution of this paper. Finally, by integrating these canonical video features, Support Vector Ordinal Regression with Implicit Constraints (SVORIM) is trained in our method. Consequently, the target user's preference for a video can be estimated by using the trained SVORIM. Experimental results show the effectiveness of our method.

  • A RGB-Guided Low-Rank Method for Compressive Hyperspectral Image Reconstruction

    Limin CHEN  Jing XU  Peter Xiaoping LIU  Hui YU  

     
    PAPER-Image

      Vol:
    E101-A No:2
      Page(s):
    481-487

    Compressive spectral imaging (CSI) systems capture 3D spatiospectral data by measuring a 2D compressed focal plane array (FPA) coded projection, with the help of reconstruction algorithms that exploit the sparsity of signals. However, the contradiction between the multiple dimensions of the scenes and the limited dimensions of the sensors has limited the improvement of recovery performance. To solve this problem, a novel CSI system based on a coded aperture snapshot spectral imager, RGB-CASSI, is proposed, which has two branches: one for CASSI and another for RGB images. In addition, considering that conventional reconstruction algorithms lead to oversmoothing, an RGB-guided low-rank (RGBLR) method for compressive hyperspectral image reconstruction based on compressed sensing and the coded aperture spectral imaging system is presented. In this method, the available additional RGB information is used to guide the reconstruction, a low-rank regularization is used for compressive sensing, and a non-convex surrogate of the rank is used instead of the nuclear norm to seek a preferable solution. Experiments show that the proposed algorithm performs better in both PSNR and subjective effects compared with other state-of-the-art methods.

  • A Compact Matched Filter Bank for an Optical ZCZ Sequence Set with Zero-Correlation Zone 2z

    Yasuaki OHIRA  Takahiro MATSUMOTO  Hideyuki TORII  Yuta IDA  Shinya MATSUFUJI  

     
    LETTER

      Vol:
    E101-A No:1
      Page(s):
    195-198

    In this paper, we propose a new structure for a compact matched filter bank (MFB) for an optical zero-correlation zone (ZCZ) sequence set with Zcz=2z. The proposed MFB can reduce the number of operation elements such as 2-input adders and delay elements. The number of 2-input adders decreases from O(N^2) to O(N log_2 N), and the number of delay elements decreases from O(N^2) to O(N). In addition, the proposed MFBs for sequences of length 32, 64, 128 and 256 with Zcz=2, 4 and 8 are implemented on a field programmable gate array (FPGA). As a result, the numbers of logic elements (LEs) of the proposed MFBs for the sequences with Zcz=2 of length 32, 64, 128 and 256 are suppressed to about 76.2%, 84.2%, 89.7% and 93.4%, respectively, compared to those of the conventional MFBs.

  • Calibration Method for Multi Static Linear Array Radar with One Dimensional Array Antenna Arranged in Staggered Manner

    Yasunari MORI  Takayoshi YUMII  Yumi ASANO  Kyouji DOI  Christian N. KOYAMA  Yasushi IITSUKA  Kazunori TAKAHASHI  Motoyuki SATO  

     
    PAPER-Electromagnetic Theory

      Vol:
    E101-C No:1
      Page(s):
    26-34

    This paper presents a calibration method for RF switch channels of a near-range multistatic linear array radar. The method allows calibration of the channel transfer functions of the RF switches and antenna transfer functions in frequency domain data, without disconnecting the antennas from the radar system. In addition, the calibration of the channels is independent of the directivities of the transmitting and receiving antennas. We applied the calibration method to a 3D imaging step-frequency radar system at 10-20GHz suitable for the nondestructive inspection of the walls of wooden houses. The measurement range of the radar is limited to 0-240mm, shorter than the antenna array length of 480mm. This radar system allows 3D imaging data to be acquired with a single scan. Using synthetic aperture radar processing, the structural health of braces inside the walls of wooden houses can be evaluated from the obtained 3D volume images. Based on experimental results, we confirmed that the proposed calibration method significantly improves the subsurface 3D imaging quality. Low-intensity ghost images behind the brace target were suppressed, deformations of the target in the volume image were rectified, and errors in the range distance were corrected.

  • Oscillation Model for Describing Network Dynamics Caused by Asymmetric Node Interaction Open Access

    Masaki AIDA  Chisa TAKANO  Masayuki MURATA  

     
    POSITION PAPER-Fundamental Theories for Communications

      Publicized:
    2017/07/03
      Vol:
    E101-B No:1
      Page(s):
    123-136

    This paper proposes an oscillation model for analyzing the dynamics of activity propagation across social media networks. In order to analyze such dynamics, we generally need to model asymmetric interactions between nodes. In matrix-based network models, asymmetric interaction is frequently modeled by a directed graph expressed as an asymmetric matrix. Unfortunately, the dynamics of an asymmetric matrix-based model is difficult to analyze. This paper, first of all, discusses a symmetric matrix-based model that can describe some types of link asymmetry, and then proposes an oscillation model on networks. Next, the proposed oscillation model is generalized to arbitrary link asymmetry. We describe the outlines of four important research topics derived from the proposed oscillation model. First, we show that the oscillation energy of each node gives a generalized notion of node centrality. Second, we introduce a framework that uses resonance to estimate the natural frequency of networks. Natural frequency is important information for recognizing network structure. Third, by generalizing the oscillation model on directed networks, we create a dynamical model that can describe flaming on social media networks. Finally, we show the fundamental equation of oscillation on networks, which provides an important breakthrough for generalizing the spectral graph theory applicable to directed graphs.

  • BER Performance of SS System Using a Huffman Sequence against CW Jamming

    Takahiro MATSUMOTO  Hideyuki TORII  Yuta IDA  Shinya MATSUFUJI  

     
    PAPER

      Vol:
    E101-A No:1
      Page(s):
    167-175

    In this paper, we theoretically analyse the influence of intersymbol interference (ISI) and continuous wave interference (CWI) on the bit error rate (BER) performance of the spread spectrum (SS) system using a real-valued Huffman sequence under the additive white Gaussian noise (AWGN) environment. The aperiodic correlation function of the Huffman sequence has zero sidelobes except for the shift-end values at the left and right ends of the shift. The system can provide a unified communication and ranging system because the output of a matched filter (MF) is an ideal impulse when the transmitted signal of bit duration T=NTc, N=2^n, n=1,2,… is generated from the sequence of length M=2^kN+1, k=0,1,…, where Tc is the chip duration and N is the spreading factor. As a result, the BER performance of the system improves as the absolute value of the shift-end value decreases, and is not influenced by ISI if the shift-end value is almost zero. In addition, the BER performance of the system of bit duration T=NTc with CWI improves as the sequence length M=2^kN+1 increases, and the system can decrease the influence of CWI.

  • Regular Expression Filtering on Multiple q-Grams

    Seon-Ho SHIN  HyunBong KIM  MyungKeun YOON  

     
    LETTER-Information Network

      Publicized:
    2017/10/11
      Vol:
    E101-D No:1
      Page(s):
    253-256

    Regular expression matching is essential in network and big-data applications; however, it still has a serious performance bottleneck. The state-of-the-art schemes use a multi-pattern exact string-matching algorithm as a filtering module placed before a heavy regular expression engine. We design a new approximate string-matching filter using multiple q-grams; this filter not only achieves better space compactness, but it also has higher throughput than the existing filters.

  • Encoding Detection and Bit Rate Classification of AMR-Coded Speech Based on Deep Neural Network

    Seong-Hyeon SHIN  Woo-Jin JANG  Ho-Won YUN  Hochong PARK  

     
    LETTER-Speech and Hearing

      Publicized:
    2017/10/20
      Vol:
    E101-D No:1
      Page(s):
    269-272

    A method for encoding detection and bit rate classification of AMR-coded speech is proposed. For each texture frame, 184 features consisting of the short-term and long-term temporal statistics of speech parameters are extracted, which can effectively measure the amount of distortion due to AMR. The deep neural network then classifies the bit rate of speech after analyzing the extracted features. It is confirmed that the proposed features provide better performance than the conventional spectral features designed for bit rate classification of coded audio.

  • HMM-Based Maximum Likelihood Frame Alignment for Voice Conversion from a Nonparallel Corpus

    Ki-Seung LEE  

     
    LETTER-Speech and Hearing

      Publicized:
    2017/08/23
      Vol:
    E100-D No:12
      Page(s):
    3064-3067

    One of the problems associated with voice conversion from a nonparallel corpus is how to find the best match or alignment between the source and the target vector sequences without linguistic information. In a previous study, alignment was achieved by minimizing the distance between the source vector and the transformed vector. This method, however, yielded a sequence of feature vectors that were not well matched with the underlying speaker model. In this letter, the vectors were selected from the candidates by maximizing the overall likelihood of the selected vectors with respect to the target model in the HMM context. Both objective and subjective evaluations were carried out using the CMU ARCTIC database to verify the effectiveness of the proposed method.
