Keyword Search Result

[Keyword] LER(1184hit)

61-80hit(1184hit)

  • F-band Frequency Multipliers with Fundamental and Harmonic Rejection for Improved Conversion Gain and Output Power

    Ibrahim ABDO  Korkut Kaan TOKGOZ  Atsushi SHIRANE  Kenichi OKADA  

     
    PAPER-Electronic Circuits

    Publicized: 2021/09/29  Vol: E105-C No:3  Page(s): 118-125

    This paper introduces several design techniques to improve the performance of CMOS frequency multipliers that operate at the sub-THz band without increasing the complexity and the power consumption of the circuit. The proposed techniques are applied to a device nonlinearity-based frequency tripler and to a push-push frequency doubler. By utilizing the fundamental and second harmonic feedback cancellation, the tripler achieves -2.9dBm output power with a simple single-ended circuit architecture reducing the required area and power consumption. The tripler operates at frequencies from 103GHz to 130GHz. The introduced modified push-push doubler provides 2.3dB conversion gain including the balun losses and it has good tolerance against balun mismatches. The output frequency of the doubler is from 118GHz to 124GHz. Both circuits were designed and fabricated using CMOS 65nm technology.

  • L5-TSPP: A Protocol for Disruption Tolerant Networking in Layer-5

    Hiroki WATANABE  Fumio TERAOKA  

     
    PAPER

    Publicized: 2021/09/01  Vol: E105-B No:2  Page(s): 215-227

    TCP/IP, the foundation of the current Internet, assumes a sufficiently low packet loss rate for the links in a communication path. In communication services such as mobile and wireless communications, however, links tend to be disrupted. In this paper, we propose the Layer-5 temporally-spliced path protocol (L5-TSPP), which provides disruption tolerance in the L5 temporally-spliced path (L5-TSP), one of the communication paths provided by Layer-5 (L5-paths). We design and implement an API for using L5-paths (L5 API) that supports not only POSIX systems but also non-POSIX systems. The L5 API and L5-TSPP are implemented in user space in the Go language. The measurement results show that L5-TSP achieves lower and more stable connection establishment time and better end-to-end throughput in the presence of disruption than conventional communication paths.

  • Trace Representation of r-Ary Sequences Derived from Euler Quotients Modulo 2p

    Rayan MOHAMMED  Xiaoni DU  Wengang JIN  Yanzhong SUN  

     
    PAPER-Coding Theory

    Publicized: 2021/06/21  Vol: E104-A No:12  Page(s): 1698-1703

    We introduce the r-ary sequence with period 2p^2 derived from Euler quotients modulo 2p (p is an odd prime), where r is an odd prime divisor of (p-1). Then, based on cyclotomic theory and the theory of trace functions in finite fields, we give the trace representation of the proposed sequence by determining the corresponding defining polynomial. Our results will be helpful for the implementation and the analysis of the pseudo-random properties of the sequences.
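
    The Euler quotient underlying this sequence can be sketched in a few lines. The snippet below uses the standard definition Q_n(u) = ((u^φ(n) - 1) / n) mod n for gcd(u, n) = 1, and 0 otherwise, with n = 2p and φ(2p) = p - 1. The function name is hypothetical, and the abstract does not say how the r-ary sequence is derived from the quotient, so that mapping is left out. The sketch only checks numerically that the quotient repeats with period 2p^2.

```python
from math import gcd


def euler_quotient_2p(u: int, p: int) -> int:
    """Euler quotient of u modulo n = 2p (p an odd prime); 0 when gcd(u, n) != 1."""
    n = 2 * p
    if gcd(u, n) != 1:
        return 0
    phi = p - 1  # Euler's totient of 2p for an odd prime p
    # u**phi is congruent to 1 modulo n, so (u**phi - 1) is divisible by n.
    # Reducing modulo n*n first keeps the intermediate value small.
    return ((pow(u, phi, n * n) - 1) // n) % n


p = 5
period = 2 * p * p
values = [euler_quotient_2p(u, p) for u in range(2 * period)]
# True when the second block of 2p^2 values repeats the first block.
is_periodic = values[:period] == values[period:]
```

    For p = 3, for example, the quotient of u = 7 is ((49 - 1) / 6) mod 6 = 2.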

  • A Low-Latency Inference of Randomly Wired Convolutional Neural Networks on an FPGA

    Ryosuke KURAMOCHI  Hiroki NAKAHARA  

     
    PAPER

    Publicized: 2021/06/24  Vol: E104-D No:12  Page(s): 2068-2077

    Convolutional neural networks (CNNs) are widely used for image processing tasks in both embedded systems and data centers. In data centers, high accuracy and low latency are desired for various tasks such as image processing of streaming videos. We propose an FPGA-based low-latency CNN inference for randomly wired convolutional neural networks (RWCNNs), whose layer structures are based on random graph models. Because RWCNNs have several convolution layers that have no direct dependencies between them, our architecture can process them efficiently using a pipeline method. At each layer, we need to use the calculation results of multiple layers as the input. We use an FPGA with HBM2 to enable parallel access to the input data with multiple HBM2 channels. We schedule the order of execution of the layers to improve the pipeline efficiency. We build a conflict graph using the scheduling results. Then, we allocate the calculation results of each layer to the HBM2 channels by coloring the graph. Because the pipeline execution needs to be properly controlled, we developed an automatic generation tool for hardware functions. We implemented the proposed architecture on the Alveo U50 FPGA. We investigated a trade-off between latency and recognition accuracy for the ImageNet classification task by comparing the inference performances for different input image sizes. We compared our accelerator with a conventional accelerator for ResNet-50. The results show that our accelerator reduces the latency by 2.21 times. We also obtained 12.6 and 4.93 times better efficiency than CPU and GPU, respectively. Thus, our accelerator for RWCNNs is suitable for low-latency inference.
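
    The channel allocation step, coloring a conflict graph so that conflicting layer outputs land on different HBM2 channels, can be illustrated with a generic greedy coloring. This is only a sketch: the abstract does not state which coloring algorithm the authors use, and the layer names, the example graph, and the function `assign_channels` are hypothetical.

```python
def assign_channels(conflicts: dict[str, set[str]], num_channels: int) -> dict[str, int]:
    """Greedy coloring: give each layer the lowest channel unused by its neighbors."""
    channel_of: dict[str, int] = {}
    # Visit layers with the most conflicts first (a common greedy heuristic).
    for layer in sorted(conflicts, key=lambda name: -len(conflicts[name])):
        used = {channel_of[n] for n in conflicts[layer] if n in channel_of}
        free = [c for c in range(num_channels) if c not in used]
        if not free:
            raise ValueError(f"not enough channels for layer {layer!r}")
        channel_of[layer] = free[0]
    return channel_of


# Hypothetical conflict graph: an edge means two layer outputs are read at the
# same time and therefore should not share a memory channel.
conflicts = {
    "conv1": {"conv2", "conv3"},
    "conv2": {"conv1", "conv3"},
    "conv3": {"conv1", "conv2", "conv4"},
    "conv4": {"conv3"},
}
allocation = assign_channels(conflicts, num_channels=3)
```

    Greedy coloring does not guarantee the minimum number of channels, but it is simple and always yields a conflict-free assignment when it succeeds.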

  • Smaller Residual Network for Single Image Depth Estimation

    Andi HENDRA  Yasushi KANAZAWA  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2021/08/17  Vol: E104-D No:11  Page(s): 1992-2001

    We propose a new framework for estimating depth information from a single image. Our framework is relatively small and straightforward, employing a two-stage architecture: a residual network and a simple decoder network. Our residual network is a remodeled version of the original ResNet-50 architecture, which consists of only thirty-eight convolution layers in the residual block followed by a pair of two up-sampling and layers. The simple decoder network, a stack of five convolution layers, accepts the initial depth and refines it into the final output depth. During training, we monitor the loss behavior and adjust the learning rate hyperparameter in order to improve the performance. Furthermore, instead of using a single common pixel-wise loss, we also compute losses based on gradient direction and structural similarity. This setting can significantly reduce the number of network parameters and simultaneously yield a more accurate image depth map. The performance of our approach has been evaluated through both quantitative and qualitative comparisons with several prior related methods on the publicly available NYU and KITTI datasets.

  • Enhanced Sender-Based Message Logging for Reducing Forced Checkpointing Overhead in Distributed Systems

    Jinho AHN  

     
    LETTER-Dependable Computing

    Publicized: 2021/06/08  Vol: E104-D No:9  Page(s): 1500-1505

    Previous communication-induced checkpointing protocols may induce a considerable number of worthless forced checkpoints because each process receiving messages cannot obtain sufficient information about non-causal Z-paths. This paper presents an enhanced sender-based message logging protocol, applicable to any communication-induced checkpointing protocol, that greatly reduces the forced checkpointing overhead while permitting no useless checkpoint. The protocol allows each process sending a message to learn the exact timestamp of the receiver of the message in its logging procedures without any extra message. Simulation verifies that the protocol alleviates the overhead efficiently regardless of communication patterns.

  • Chromatic Art Gallery Problem with r-Visibility is NP-Complete

    Chuzo IWAMOTO  Tatsuaki IBUSUKI  

     
    PAPER-Algorithms and Data Structures

    Publicized: 2021/03/26  Vol: E104-A No:9  Page(s): 1108-1115

    The art gallery problem is to find a set of guards who together can observe every point of the interior of a polygon P. We study a chromatic variant of the problem, where each guard is assigned one of k distinct colors. The chromatic art gallery problem is to find a guard set for P such that no two guards with the same color have overlapping visibility regions. We study the decision version of this problem for orthogonal polygons with r-visibility when the number of colors is k=2. Here, two points are r-visible if the smallest axis-aligned rectangle containing them lies entirely within the polygon. In this paper, it is shown that determining whether there is an r-visibility guard set for an orthogonal polygon with holes such that no two guards with the same color have overlapping visibility regions is NP-hard when the number of colors is k=2.
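
    The r-visibility definition above can be made concrete with a small sketch. It assumes, for illustration only, that the orthogonal polygon is modeled as a set of unit grid cells and that points are cells. The names `r_visible` and `polygon` are hypothetical and not taken from the paper.

```python
from itertools import product

Cell = tuple[int, int]


def r_visible(a: Cell, b: Cell, polygon: set[Cell]) -> bool:
    """True if the smallest axis-aligned rectangle containing a and b lies inside the polygon."""
    xs = range(min(a[0], b[0]), max(a[0], b[0]) + 1)
    ys = range(min(a[1], b[1]), max(a[1], b[1]) + 1)
    return all((x, y) in polygon for x, y in product(xs, ys))


# Hypothetical L-shaped orthogonal polygon: a horizontal arm and a vertical arm
# that meet at cell (0, 0).
polygon = {(x, 0) for x in range(3)} | {(0, y) for y in range(3)}

same_arm = r_visible((0, 0), (2, 0), polygon)     # rectangle stays inside the arm
across_arms = r_visible((2, 0), (0, 2), polygon)  # rectangle covers cell (1, 1), outside
```

    In this model the two arm tips are not r-visible to each other, even though each is r-visible from the corner cell.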

  • TDM Based Reference Signal Multiplexing for OFDM Using Faster-than-Nyquist Signaling

    Tsubasa SHOBUDANI  Mamoru SAWAHASHI  Yoshihisa KISHIYAMA  

     
    PAPER

    Publicized: 2021/03/17  Vol: E104-B No:9  Page(s): 1079-1088

    This paper proposes time division multiplexing (TDM) based reference signal (RS) multiplexing for faster-than-Nyquist (FTN) signaling using orthogonal frequency division multiplexing (OFDM). We also propose a subframe structure in which a cyclic prefix (CP) is appended to only the TDM based RS block and the first FTN symbol to achieve accurate estimation of the channel response in a multipath fading channel with low CP overhead. Computer simulation results show that the loss in the required average received SNR satisfying the average block error rate (BLER) of 10^-2 using the proposed TDM based RS multiplexing from that with ideal channel estimation is suppressed to within approximately 1.2dB and 1.7dB for QPSK and 16QAM, respectively. This is compared to when the improvement ratio of the spectral efficiency from CP-OFDM is 1.31 with the rate-1/2 turbo code. We conclude that the TDM based RS multiplexing with the associated CP multiplexing is effective in achieving accurate channel estimation for FTN signaling using OFDM.

  • Design and Fabrication of PTFE Substrate Integrated Waveguide Coupler by SR Direct Etching (Open Access)

    Mitsuyoshi KISHIHARA  Masaya TAKEUCHI  Akinobu YAMAGUCHI  Yuichi UTSUMI  Isao OHTA  

     
    PAPER-Microwaves, Millimeter-Waves

    Publicized: 2021/03/15  Vol: E104-C No:9  Page(s): 446-454

    The microfabrication technique based on the synchrotron radiation (SR) direct etching process has recently been applied to construct PTFE microstructures. This paper proposes a PTFE substrate integrated waveguide (PTFE SIW). The PTFE SIW is expected to contribute to the improvement of the structural strength. A rectangular through-hole is introduced, taking advantage of the SR direct etching process. First, a PTFE SIW for the Q-band is designed. Then, a cruciform 3-dB directional coupler consisting of the PTFE SIW is designed and fabricated by the SR direct etching process. The validity of the PTFE SIW coupler is confirmed by measuring the frequency characteristics of the S-parameters. The mechanical strength of the PTFE SIW and the peeling strength of its Au film are also investigated.

  • Construction of Multiple-Valued Bent Functions Using Subsets of Coefficients in GF and RMF Domains

    Miloš M. RADMANOVIĆ  Radomir S. STANKOVIĆ  

     
    PAPER-Logic Design

    Publicized: 2021/04/21  Vol: E104-D No:8  Page(s): 1103-1110

    Multiple-valued bent functions are functions with the highest nonlinearity, which makes them interesting for multiple-valued cryptography. Since the general structure of bent functions is still unknown, methods for constructing bent functions are often based on some deterministic criteria. For practical applications, it is often necessary to be able to construct a bent function that does not belong to any specific class of functions. Thus, the construction criteria are combined with an exhaustive search over all possible functions, which can consume a great deal of CPU time. A solution is to restrict the search space by conditions that the produced bent functions should satisfy. In this paper, we propose a construction method based on spectral subsets of multiple-valued bent functions satisfying certain appropriately formulated restrictions in the Galois field (GF) and Reed-Muller-Fourier (RMF) domains. Experimental results show that the proposed method efficiently constructs ternary and quaternary bent functions by using these restrictions.

  • The Fractional-N All Digital Frequency Locked Loop with Robustness for PVT Variation and Its Application for the Microcontroller Unit

    Ryoichi MIYAUCHI  Akio YOSHIDA  Shuya NAKANO  Hiroki TAMURA  Koichi TANNO  Yutaka FUKUCHI  Yukio KAWAMURA  Yuki KODAMA  Yuichi SEKIYA  

     
    PAPER-Circuit Technologies

    Publicized: 2021/04/01  Vol: E104-D No:8  Page(s): 1146-1153

    This paper describes a Fractional-N All Digital Frequency Locked Loop (ADFLL) that is robust to PVT variation and its application to a microcontroller unit. It is difficult for the conventional FLL to achieve the required specification in a fine CMOS process. In particular, the conventional FLL suffers from problems such as unexpected operation and long lock time caused by PVT variation. To overcome these problems, we propose a new ADFLL that dynamically selects its digital filter coefficients. The proposed ADFLL was evaluated through HSPICE simulation and chips fabricated in a 0.13 µm CMOS process. The results show that the proposed ADFLL is robust to PVT variation thanks to the dynamic selection of digital filter coefficients, that the lock time is improved by up to 57%, and that the clock jitter is 0.85 nsec.

  • FCA-BNN: Flexible and Configurable Accelerator for Binarized Neural Networks on FPGA

    Jiabao GAO  Yuchen YAO  Zhengjie LI  Jinmei LAI  

     
    PAPER-Biocybernetics, Neurocomputing

    Publicized: 2021/05/19  Vol: E104-D No:8  Page(s): 1367-1377

    A series of Binarized Neural Networks (BNNs) show acceptable accuracy in image classification tasks and achieve excellent performance on field programmable gate arrays (FPGAs). Nevertheless, we observe that with existing BNN designs it is quite time-consuming to change the target BNN or to accelerate a new BNN. Therefore, this paper presents FCA-BNN, a flexible and configurable accelerator, which employs a layer-level configurable technique to seamlessly execute each layer of the target BNN. First, to save resources and improve energy efficiency, hardware-oriented optimal formulas are introduced to design an energy-efficient computing array for different sizes of padded-convolution and fully-connected layers. Moreover, to accelerate the target BNNs efficiently, we exploit an analytical model to explore the optimal design parameters for FCA-BNN. Finally, our proposed mapping flow changes the target network by entering order, and accelerates a new network by compiling and loading the corresponding instructions, without generating and loading a bitstream. The evaluations on three major structures of BNNs show that the differences between the inference accuracy of FCA-BNN and that of a GPU are just 0.07%, 0.31% and 0.4% for LFC, VGG-like and Cifar-10 AlexNet, respectively. Furthermore, our energy efficiency reaches 0.8× that of existing customized FPGA accelerators for LFC and 2.6× for VGG-like. For Cifar-10 AlexNet, FCA-BNN achieves 188.2× and 60.6× better energy efficiency than CPU and GPU, respectively. To the best of our knowledge, FCA-BNN is the most efficient design for changing the target BNN and accelerating a new BNN, while keeping competitive performance.

  • Enhancing the Business Model: Automating the Recommended Retail Price Calculation of Products

    Bahjat FAKIEH  

     
    PAPER-Office Information Systems, e-Business Modeling

    Publicized: 2021/04/15  Vol: E104-D No:7  Page(s): 970-980

    The purpose of this paper is to find an automated pricing algorithm that calculates the real cost of each product by considering the associated costs of the business. The methodology consists of two main stages: a brief semi-structured survey, and a mathematical calculation of the expenses, which are added to the original cost of the offered products and services. The output of this process is the minimum recommended selling price (MRSP), below which the business should not go, in order to increase the likelihood of generating profit and avoid unexpected loss. The contribution of this study lies in filling the gap by calculating the minimum recommended price automatically and helping businesses foresee future budgets. This contribution has a limitation: it cannot calculate the MRSP of products created in-house from raw materials. It calculates the MRSP only for products bought from a wholesaler to be sold by the retailer.

  • Recovering Faulty Non-Volatile Flip Flops for Coarse-Grained Reconfigurable Architectures

    Takeharu IKEZOE  Takuya KOJIMA  Hideharu AMANO  

     
    PAPER

    Publicized: 2020/12/14  Vol: E104-C No:6  Page(s): 215-225

    Recent IoT devices require extremely low standby power consumption, while a certain level of performance is needed during the active time, and Coarse-Grained Reconfigurable Arrays (CGRAs) have received attention because of their high energy efficiency. To further reduce the standby energy consumption of CGRAs, the leakage power of their configuration memory must be reduced. Although power gating is a common technique, the data lost in flip-flops and memory must be restored after wake-up. Recovering everything requires numerous state transitions and considerable overhead in both execution time and energy. To address this problem, the Non-volatile Cool Mega Array (NVCMA), a CGRA providing non-volatile flip-flops (NVFFs) with spin transfer torque type non-volatile memory (NVM) technology, has been developed. However, non-volatile memory technologies in general have reliability problems. Some NVFFs are stuck at 0 or 1 and, with a certain probability, cannot store data. To improve the chip yield, we propose a mapping algorithm that avoids processing elements of the CGRA made faulty by erroneous configuration data. We also propose a method to add an error-correcting code (ECC) mechanism to the NVFFs for the configuration and constant memory. The proposed method was applied to NVCMA to evaluate the availability rate and the reduction of write time. By using both methods, the average availability ratio of 94.2% was achieved, while the average availability ratio of the nine applications was 0.056% when the probability of failure of the FF was 0.01. The energy for storing data increases to about 2.3 times because of the hardware overhead of ECC, but the proposed method can save 8.6% of the writing power on average.

  • Instruction Prefetch for Improving GPGPU Performance

    Jianli CAO  Zhikui CHEN  Yuxin WANG  He GUO  Pengcheng WANG  

     
    PAPER-VLSI Design Technology and CAD

    Publicized: 2020/11/16  Vol: E104-A No:5  Page(s): 773-785

    Like many processors, the GPGPU suffers from the memory wall. The traditional solution to this issue is to use efficient schedulers to hide long memory access latency or to use a data prefetch mechanism to reduce the latency caused by data transfer. In this paper, we study the instruction fetch stage of the GPU pipeline and analyze the relationship between the capacity of the GPU kernel and the instruction miss rate. We improve the next-line prefetch mechanism to fit the SIMT model of the GPU and determine the optimal parameters of the prefetch mechanism on the GPU through experiments. The experimental results show that the prefetch mechanism achieves a 12.17% performance improvement on average. Compared with enlarging the I-Cache, the prefetch mechanism has the advantages of more beneficiaries and lower cost.
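
    As a rough illustration of why next-line prefetching lowers the instruction miss rate, the sketch below simulates a tiny LRU instruction cache on a purely sequential fetch stream. It models plain next-line prefetch only, not the SIMT-adapted variant proposed in the paper, and the cache size, line size, and function name are assumptions chosen for the example.

```python
from collections import OrderedDict


def miss_rate(addresses, line_size=64, num_lines=4, prefetch=False):
    """Fraction of fetches that miss in a small LRU cache, with optional next-line prefetch."""
    cache = OrderedDict()  # keys are line numbers, ordered from least to most recently used

    def touch(line):
        cache[line] = True
        cache.move_to_end(line)
        if len(cache) > num_lines:
            cache.popitem(last=False)  # evict the least recently used line

    misses = 0
    for addr in addresses:
        line = addr // line_size
        if line not in cache:
            misses += 1
        touch(line)
        if prefetch:
            touch(line + 1)  # bring in the following line ahead of its first use
    return misses / len(addresses)


# Hypothetical straight-line code: 1024 four-byte instructions spanning 64 cache lines.
stream = list(range(0, 4096, 4))
baseline = miss_rate(stream)
with_prefetch = miss_rate(stream, prefetch=True)
```

    On this idealized stream every new line misses without prefetching, while with prefetching only the very first fetch misses. Real kernels with branches and many concurrent warps would show a smaller gap.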

  • Physical Cell ID Detection Probability Using NR Synchronization Signals in 28-GHz Band

    Kyogo OTA  Mamoru SAWAHASHI  Satoshi NAGATA  

     
    PAPER-Wireless Communication Technologies

    Publicized: 2020/10/22  Vol: E104-B No:4  Page(s): 436-445

    This paper presents the physical-layer cell identity (PCID) detection probability using the primary synchronization signal (PSS) and secondary synchronization signal (SSS) for the New Radio (NR) radio interface considering a large frequency offset and high Doppler frequency in multipath Rayleigh fading channels in the 28-GHz band. Simulation results show that cross-correlation based PSS detection after compensating for the frequency offset achieves higher PCID detection probability than autocorrelation based PSS detection at the average received signal-to-noise power ratio (SNR) values below approximately 0dB for the frequency stability of a user equipment (UE) oscillator of ϵ =5ppm. Meanwhile, both methods achieve almost the same PCID detection probability for average received SNR values higher than approximately 0dB. We also show that even with the large frequency offset caused by ϵ =20 ppm, the high PCID detection probability of approximately 90 (97)% and 90 (96)% is achieved for the cross-correlation or autocorrelation based PSS detection method, respectively, at the average received SNR of 0dB for the subcarrier spacing of 120 (240)kHz. We conclude that utilizing the multiplexing scheme for the PSS and SSS and their sequences is effective in achieving a high PCID detection probability considering a large frequency offset even with the frequency deviation of ϵ =20ppm in the 28-GHz band.

  • An Energy-Efficient Defense against Message Flooding Attacks in Delay Tolerant Networks

    Hiromu ASAHINA  Keisuke ARAI  Shuichiro HARUTA  P. Takis MATHIOPOULOS  Iwao SASASE  

     
    PAPER-Fundamental Theories for Communications

    Publicized: 2020/10/06  Vol: E104-B No:4  Page(s): 348-359

    Delay Tolerant Networks (DTNs) are vulnerable to message flooding attacks in which a very large number of malicious messages are sent so that network resources are depleted. To address this problem, previous studies mainly focused on constraining the number of messages that nodes can generate per time slot by allowing nodes to monitor the other nodes' communication history. Since the adversaries may hide their attacks by claiming a false history, nodes exchange their communication histories and detect an attacker who has presented an inconsistent communication history. However, this approach increases node energy consumption since the number of communication histories increases every time a node communicates with another node. To deal with this problem, in this paper, we propose an energy-efficient defense against such message flooding attacks. The main idea of the proposed scheme is to time limit the communication history exchange so as to reduce the volume while ensuring the effective detection of inconsistencies. The advantage of this approach is that, by removing communication histories after they have revealed such inconsistencies, the energy consumption is reduced. To estimate such expiration time, analytical expressions based upon a Markov chain based message propagation model, are derived for the probability that a communication history reveals such inconsistency in an arbitrary time. Extensive performance evaluation results obtained by means of computer simulations and several performance criteria verify that the proposed scheme successfully improves the overall energy efficiency. For example, these performance results have shown that, as compared to other previously known defenses against message flooding attacks, the proposed scheme extends by at least 22% the battery lifetime of DTN nodes, while maintaining the same levels of protection.

  • Partial Scrambling Overlapped Selected Mapping PAPR Reduction for OFDM/OQAM Systems

    Tomoya KAGEYAMA  Osamu MUTA  

     
    PAPER-Wireless Communication Technologies

    Publicized: 2020/09/24  Vol: E104-B No:3  Page(s): 338-347

    Offset quadrature amplitude modulation based orthogonal frequency division multiplexing (OFDM/OQAM) is a promising multi-carrier modulation technique to achieve a low-sidelobe spectrum while maintaining orthogonality among subcarriers. However, a major shortcoming of OFDM/OQAM systems is the high peak-to-average power ratio (PAPR) of the transmit signal. To resolve the high-PAPR issue of traditional OFDM, a self-synchronized-scrambler-based selected-mapping has been investigated, where the transmit sequence is scrambled to reduce PAPR. In this method, the receiver must use a descrambler to recover the original data. However, the descrambling process leads to error propagation, which degrades the bit error rate (BER). As described herein, a partial scrambling overlapped selected mapping (PS-OSLM) scheme is proposed for PAPR reduction of OFDM/OQAM signals, where candidate sequences are generated using partial scrambling of original data. The best candidate, the one that minimizes the peak amplitude within multiple OFDM/OQAM symbols, is selected. In the proposed method, an overlap search algorithm for SLM is applied to reduce the PAPR of OFDM/OQAM signals. Numerical results demonstrate that our PS-OSLM proposal achieves better BER than full-scrambling overlapped SLM (FS-OSLM) in OFDM/OQAM systems while maintaining almost equivalent PAPR reduction capability as FS-OSLM and better PAPR than SLM without overlap search. Additionally, we derive a theoretical lower bound expression for OFDM/OQAM with PS-OSLM, and clarify the effectiveness of the proposed scheme.
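
    The basic selected-mapping idea referred to above can be sketched as follows. This is conventional phase-rotation SLM on plain OFDM without oversampling, not the partial-scrambling overlapped scheme for OFDM/OQAM that the paper proposes. The candidate count, subcarrier count, and all function names are assumptions made for the illustration.

```python
import cmath
import math
import random


def idft(symbols):
    """Naive inverse DFT mapping frequency-domain symbols to time-domain samples."""
    n = len(symbols)
    return [
        sum(s * cmath.exp(2j * cmath.pi * k * t / n) for k, s in enumerate(symbols)) / n
        for t in range(n)
    ]


def papr_db(signal):
    """Peak-to-average power ratio of a sampled signal, in dB."""
    powers = [abs(x) ** 2 for x in signal]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))


def slm(symbols, num_candidates, rng):
    """Return (PAPR in dB, index) of the candidate with the lowest PAPR."""
    best_papr, best_index = None, None
    for index in range(num_candidates):
        if index == 0:
            phases = [1] * len(symbols)  # candidate 0 is the unmodified symbol
        else:
            phases = [rng.choice([1, -1, 1j, -1j]) for _ in symbols]
        value = papr_db(idft([s * ph for s, ph in zip(symbols, phases)]))
        if best_papr is None or value < best_papr:
            best_papr, best_index = value, index
    return best_papr, best_index


rng = random.Random(0)
qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
symbols = [rng.choice(qpsk) for _ in range(64)]  # one OFDM symbol, 64 subcarriers
original_papr = papr_db(idft(symbols))
selected_papr, selected_index = slm(symbols, 8, rng)
```

    Because the unmodified symbol is always among the candidates, the selected PAPR can never exceed the original one. The receiver needs to know which candidate was chosen, which is the side-information problem that scrambler-based variants aim to avoid.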

  • 180-Degree Branch Line Coupler Composed of Two Types of Iris-Loaded Waveguides

    Hidenori YUKAWA  Yu USHIJIMA  Naofumi YONEDA  Moriyasu MIYAZAKI  

     
    PAPER-Microwaves, Millimeter-Waves

    Publicized: 2020/08/14  Vol: E104-C No:2  Page(s): 85-92

    We propose a 180-degree branch line coupler composed of two types of iris-loaded waveguides. The proposed coupler consists of two main transmission lines and branch lines with different electrical lengths. Based on optimal electrical lengths, a 180-degree output phase difference can be achieved without additional phase shifters. The two main lines with different electrical lengths are realized by capacitive and inductive iris-loaded waveguides. The size of the proposed coupler is nearly half that of the conventional 180-degree branch line coupler with additional phase shifters. Thus, the proposed coupler has an advantage over the conventional one. We designed the proposed coupler in the K-band for satellite communication systems. The measurement results demonstrate a reflection of -20 dB, isolation of -20 dB, coupling response of -3.1+0.1 dB/-0.1 dB, and phase differences of 0+0.1 deg/-1.4 deg and -180+0.5 deg/-2.3 deg at a bandwidth of 8% in the K-band.

  • A Novel Multi-Knowledge Distillation Approach

    Lianqiang LI  Kangbo SUN  Jie ZHU  

     
    LETTER-Artificial Intelligence, Data Mining

    Publicized: 2020/10/19  Vol: E104-D No:1  Page(s): 216-219

    Knowledge distillation approaches can transfer information from a large network (teacher network) to a small network (student network) to compress and accelerate deep neural networks. This paper proposes a novel knowledge distillation approach called multi-knowledge distillation (MKD). MKD consists of two stages. In the first stage, it employs autoencoders to learn compact and precise representations of the feature maps (FM) from the teacher network and the student network. These representations can be treated as the essence of the FM, i.e., EFM. In the second stage, MKD utilizes multiple kinds of knowledge, i.e., the magnitude of an individual sample's EFM and the similarity relationships among several samples' EFM, to enhance the generalization ability of the student network. Compared with previous approaches that employ FM or handcrafted features from FM, the EFM learned from autoencoders can be transferred more efficiently and reliably. Furthermore, the rich information provided by the multiple kinds of knowledge helps the student network mimic the teacher network as closely as possible. Experimental results also show that MKD is superior to state-of-the-art approaches.
