The search functionality is under construction.

Keyword Search Result

[Keyword] tin(3578hit)

341-360hit(3578hit)

  • A Generalized Theory Based on the Turn Model for Deadlock-Free Irregular Networks

    Ryuta KAWANO  Ryota YASUDO  Hiroki MATSUTANI  Michihiro KOIBUCHI  Hideharu AMANO  

     
    PAPER-Computer System

      Publicized:
    2019/10/08
      Vol:
    E103-D No:1
      Page(s):
    101-110

    Recently proposed irregular networks can reduce the latency for both on-chip and off-chip systems with a large number of computing nodes and thus can improve the performance of parallel applications. However, these networks usually suffer from deadlocks in routing packets when using a naive minimal path routing algorithm. To solve this problem, we focus on a recently proposed theory that generalizes the turn model to maintain network performance while guaranteeing deadlock freedom. Applying its theorems to arbitrary topologies, including fully irregular networks, remains a challenge. In this paper, we advance the theorems to completely general ones. Moreover, we provide a feasible implementation of a deadlock-free routing method based on our advanced theorem. Experimental results show that the routing method based on our proposed theorem can improve the network throughput by up to 138% compared to a conventional deterministic minimal routing method. Moreover, when utilized as the escape path in Duato's protocol, it can improve the throughput by up to 26.3% compared with the conventional up*/down* routing.

  • Decentralized Local Scaling Factor Control for Backoff-Based Opportunistic Routing (Open Access)

    Taku YAMAZAKI  Ryo YAMAMOTO  Genki HOSOKAWA  Tadahide KUNITACHI  Yoshiaki TANAKA  

     
    PAPER-Information Network

      Publicized:
    2019/07/17
      Vol:
    E102-D No:12
      Page(s):
    2317-2328

    In wireless multi-hop networks such as ad hoc networks and sensor networks, backoff-based opportunistic routing protocols, which make a forwarding decision based on backoff time, have been proposed. In these protocols, each potential forwarder calculates the backoff time as the product of a weight and a global scaling factor. The weight prioritizes potential forwarders and is calculated from the hop counts of the sender and the receiver to the destination. The global scaling factor is a predetermined value that maps the weight to the actual backoff time. However, there are three common issues derived from the global scaling factor. First, the predetermined global scaling factor must be shared properly among all terminals in a centralized manner for the backoff time calculation. Second, it is almost impossible to change the global scaling factor while the network is in use. Third, it is difficult to set the global scaling factor to an appropriate value, since the appropriate value differs in the local surroundings of each forwarder. To address these issues, this paper proposes a novel decentralized local scaling factor control that does not rely on a predetermined global scaling factor. The proposed method consists of the following three mechanisms: (1) a sender-centric local scaling factor setting mechanism that works in a decentralized manner instead of using the global scaling factor, (2) an adaptive scaling factor control mechanism that adapts the local scaling factor to the local surroundings of each forwarder, and (3) a mitigation mechanism that suppresses excessive increases of the local scaling factor so that it converges. Finally, this paper evaluates the backoff-based opportunistic routing protocol with and without the proposed method using computer simulations.
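    The backoff calculation described in this abstract (backoff time = weight × scaling factor, with the shortest timer winning the forwarding decision) can be illustrated with a minimal sketch. The abstract does not give the weight formula, so `hop_weight` below is a hypothetical choice made only for illustration, and the names `backoff_time` and `select_forwarder` are assumptions, not the authors' protocol.

```python
def hop_weight(sender_hops, receiver_hops):
    """Hypothetical weight: receivers closer to the destination get a smaller weight.

    This formula is an assumption for illustration; the abstract only says the
    weight is derived from the hop counts of the sender and the receiver.
    """
    return max(1, receiver_hops - sender_hops + 2)


def backoff_time(weight, scaling_factor):
    """Backoff time as the product of a weight and a scaling factor (seconds)."""
    return weight * scaling_factor


def select_forwarder(sender_hops, candidates, scaling_factor):
    """Return the candidate whose backoff timer expires first, plus all timers.

    candidates maps a node name to its hop count to the destination.
    """
    timers = {
        node: backoff_time(hop_weight(sender_hops, hops), scaling_factor)
        for node, hops in candidates.items()
    }
    winner = min(timers, key=timers.get)
    return winner, timers


if __name__ == "__main__":
    # Sender is 5 hops from the destination; three neighbours overhear the packet.
    winner, timers = select_forwarder(5, {"A": 4, "B": 5, "C": 6}, scaling_factor=0.01)
    print(winner, timers)
```

    In this sketch the scaling factor is a plain argument, so a global scheme would pass the same value everywhere, while a local scheme could pass a different value per sender.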

  • Design of Low-Cost Approximate Multipliers Based on Probability-Driven Inexact Compressors

    Yi GUO  Heming SUN  Ping LEI  Shinji KIMURA  

     
    PAPER

      Vol:
    E102-A No:12
      Page(s):
    1781-1791

    Approximate computing has emerged as a promising approach for error-tolerant applications to improve hardware performance at the cost of some loss of accuracy. Multiplication is a key arithmetic operation in these applications. In this paper, we propose a low-cost approximate multiplier design by employing new probability-driven inexact compressors. This compressor design is introduced to reduce the height of partial product matrix into two rows, based on the probability distribution of the sum result of partial products. To compensate the accuracy loss of the multiplier, a grouped error recovery scheme is proposed and achieves different levels of accuracy. In terms of mean relative error distance (MRED), the accuracy losses of the proposed multipliers are from 1.07% to 7.86%. Compared with the Wallace multiplier using 40nm process, the most accurate variant of the proposed multipliers can reduce power by 59.75% and area by 42.47%. The critical path delay reduction is larger than 12.78%. The proposed multiplier design has a better accuracy-performance trade-off than other designs with comparable accuracy. In addition, the efficiency of the proposed multiplier design is assessed in an image processing application.
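    The accuracy metric quoted in this abstract, mean relative error distance (MRED), is commonly computed as the average of |approximate − exact| / |exact| over the evaluated input pairs. The sketch below assumes that common definition and evaluates it exhaustively for unsigned operands. `truncating_mul` is a hypothetical stand-in for an approximate multiplier (it simply clears low-order product bits); it is not the probability-driven compressor design of the paper.

```python
def mred(approx_mul, bits=8):
    """Mean relative error distance over all nonzero unsigned operand pairs.

    Pairs with a zero operand are skipped because the exact product is zero
    and the relative error is undefined there.
    """
    total = 0.0
    count = 0
    for a in range(1, 1 << bits):
        for b in range(1, 1 << bits):
            exact = a * b
            total += abs(approx_mul(a, b) - exact) / exact
            count += 1
    return total / count


def truncating_mul(a, b, dropped_bits=4):
    """Hypothetical approximate multiplier: clear the lowest product bits."""
    return ((a * b) >> dropped_bits) << dropped_bits


if __name__ == "__main__":
    print("exact multiplier MRED:", mred(lambda a, b: a * b))
    print("truncating multiplier MRED:", mred(truncating_mul))
```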

  • Memory Efficient Load Balancing for Distributed Large-Scale Volume Rendering Using a Two-Layered Group Structure

    Marcus WALLDEN  Stefano MARKIDIS  Masao OKITA  Fumihiko INO  

     
    PAPER-Computer Graphics

      Publicized:
    2019/09/09
      Vol:
    E102-D No:12
      Page(s):
    2306-2316

    We propose a novel compositing pipeline and a dynamic load balancing technique for volume rendering which utilizes a two-layered group structure to achieve effective and scalable load balancing. The technique enables each process to render data from non-contiguous regions of the volume with minimal impact on the total render time. We demonstrate the effectiveness of the proposed technique by performing a set of experiments on a modern GPU cluster. The experiments show that using the technique results in up to a 35.7% lower worst-case memory usage as compared to a dynamic k-d tree load balancing technique, whilst simultaneously achieving similar or higher render performance. The proposed technique was also able to lower the amount of transferred data during the load balancing stage by up to 72.2%. The technique has the potential to be used in many scenarios where other dynamic load balancing techniques have proved to be inadequate, such as during large-scale visualization.

  • An Improvement of Non-Binary Single b-Burst of Insertion/Deletion Correcting Code

    Toyohiko SAEKI  Takayuki NOZAKI  

     
    PAPER-Coding Theory

      Vol:
    E102-A No:12
      Page(s):
    1591-1599

    This paper constructs non-binary codes correcting a single b-burst of insertions or deletions with large cardinalities. This paper also provides insertion and deletion correcting algorithms of the constructed codes and evaluates a lower bound of the cardinalities of the constructed codes. Moreover, we evaluate a non-asymptotic upper bound on the cardinalities of arbitrary codes which correct a single b-burst of insertions or deletions.

  • Packet-Oriented Erasure Correcting Codes by Bit-Level Shift Operation and Exclusive OR

    Yuta HANAKI  Takayuki NOZAKI  

     
    PAPER-Erasure Correction

      Vol:
    E102-A No:12
      Page(s):
    1622-1630

    This paper constructs packet-oriented erasure correcting codes and their systematic forms for the distributed storage systems. The proposed codes are encoded by exclusive OR and bit-level shift operation. By the shift operation, the encoded packets are slightly longer than the source packets. This paper evaluates the extra length of the encoded packets, called overhead, and shows that the proposed codes have smaller overheads than the zigzag decodable codes, which are existing codes using bit-level shift operation and exclusive OR.

  • Characteristics and Applicability of Frequency Sharing Criteria in the Broadcasting Satellite Link (Open Access)

    Kazuyoshi SHOGEN  Thong PHAM VIET  

     
    PAPER-Satellite Communications

      Publicized:
    2019/06/17
      Vol:
    E102-B No:12
      Page(s):
    2297-2303

    Two frequency sharing criteria for BSS (Broadcasting-Satellite Service) are set out in Sect. 1 of Annex 1 to Appendix 30 of the Radio Regulations. These two criteria are the pfd (power flux-density) and EPM (Equivalent Protection Margin) values. In this paper, the two criteria are compared and studied from the viewpoint of their applicability to sharing cases between BSS and BSS. In particular, it is shown that in some cases the EPM criterion helps alleviate the problem of the “sensitive satellite network”, i.e., one that has relatively low transmission power, is very vulnerable to interference, and blocks new satellites from entering. Disclaimer: The views and positions expressed by the authors are strictly personal and do not constitute, nor can be interpreted as, the position of the International Telecommunication Union on the topics addressed in this paper.

  • On-Chip Cache Architecture Exploiting Hybrid Memory Structures for Near-Threshold Computing

    Hongjie XU  Jun SHIOMI  Tohru ISHIHARA  Hidetoshi ONODERA  

     
    PAPER

      Vol:
    E102-A No:12
      Page(s):
    1741-1750

    This paper focuses on the power-area trade-off axis of memory systems. In contrast to the power-performance-area trade-off applied to traditional high-performance caches, this paper targets the edge processing environment, which is becoming more and more important in the Internet of Things (IoT) era. A new power-oriented trade-off is proposed for on-chip cache architecture. As a case study, this paper exploits the good energy efficiency of Standard-Cell Memory (SCM) operating in a near-threshold voltage region and the good area efficiency of Static Random Access Memory (SRAM). A hybrid 2-level on-chip cache structure is first introduced as a replacement of 6T-SRAM cache as L0 cache to save the energy consumption. This paper proposes a method for finding the best capacity combination for SCM and SRAM, which minimizes the energy consumption of the hybrid cache under a specific cache area constraint. The simulation result using a 65-nm process technology shows that energy consumption is reduced by up to 80% without increasing the die area by replacing the conventional SRAM instruction cache with the hybrid 2-level cache. The result shows that energy consumption can be reduced if the area constraint for the proposed hybrid cache system is less than the area which is equivalent to an 8kB SRAM. If the target operating frequency is less than 100MHz, energy reduction can be achieved, which implies that the proposed cache system is suitable for low-power systems where a moderate processing speed is required.

  • Adaptive-Partial Template Update with Center-Shifting Recovery for High Frame Rate and Ultra-Low Delay Deformation Matching

    Songlin DU  Yuhao XU  Tingting HU  Takeshi IKENAGA  

     
    PAPER-Image

      Vol:
    E102-A No:12
      Page(s):
    1872-1881

    High frame rate and ultra-low delay matching system plays an important role in various human-machine interactive applications, which demands better performance in matching deformable and out-of-plane rotating objects. Although many algorithms have been proposed for deformation tracking and matching, few of them are suitable for hardware implementation due to complicated operations and large time consumption. This paper proposes a hardware-oriented template update and recovery method for high frame rate and ultra-low delay deformation matching system. In the proposed method, the new template is generated in real time by partially updating the template descriptor and adding new keypoints simultaneously with the matching process in pixels (proposal #1), which avoids the large inter-frame delay. The size and shape of region of interest (ROI) are made flexible and the Hamming threshold used for brute-force matching is adjusted according to pixel position and the flexible ROI (proposal #2), which solves the problem of template drift. The template is recovered by the previous one with a relative center-shifting vector when it is judged as lost via region-wise difference check (proposal #3). Evaluation results indicate that the proposed method successfully achieves the real-time processing of 784fps at the resolution of 640×480 on field-programmable gate array (FPGA), with a delay of 0.808ms/frame, as well as achieves satisfactory deformation matching results in comparison with other general methods.

  • Interworking Layer of Distributed MQTT Brokers

    Ryohei BANNO  Jingyu SUN  Susumu TAKEUCHI  Kazuyuki SHUDO  

     
    PAPER-Information Network

      Publicized:
    2019/07/30
      Vol:
    E102-D No:12
      Page(s):
    2281-2294

    MQTT is one of the promising protocols for various data exchange in IoT environments. Typically, those environments have a characteristic called “edge-heavy”, which means that things at the network edge generate a massive volume of data with high locality. For handling such edge-heavy data, an architecture that places multiple MQTT brokers at the network edges and makes them cooperate with each other is quite effective. It can provide higher throughput and lower latency, as well as reduce consumption of cloud resources. However, under this kind of architecture, heterogeneity could be a vital issue. Namely, the appropriate MQTT broker product could vary according to the environment of each network edge, yet different products can hardly cooperate because the MQTT specification provides no interoperability between brokers. In this paper, we propose Interworking Layer of Distributed MQTT brokers (ILDM), which enables arbitrary kinds of MQTT brokers to cooperate with each other. ILDM, designed as a generic mechanism independent of any specific cooperation algorithm, provides APIs to facilitate development of a variety of algorithms. By using the APIs, we also present two basic cooperation algorithms. To evaluate the usefulness of ILDM, we introduce a benchmark system which can be used for both a single broker and multiple brokers. Experimental results show that the throughput of five brokers running together with ILDM is up to 4.3 times that of a single broker.

  • On the Distribution of p-Error Linear Complexity of p-Ary Sequences with Period p^n

    Miao TANG  Juxiang WANG  Minjia SHI  Jing LIANG  

     
    LETTER-Fundamentals of Information Systems

      Publicized:
    2019/09/02
      Vol:
    E102-D No:12
      Page(s):
    2595-2598

    Linear complexity and the k-error linear complexity of periodic sequences are important security indices of stream cipher systems. This paper focuses on the distribution of p-error linear complexity of p-ary sequences with period p^n. For p-ary sequences of period p^n with linear complexity p^n-p+1, n≥1, we present all possible values of the p-error linear complexity, and derive the exact formulas to count the number of the sequences with any given p-error linear complexity.

  • Passage of Faulty Nodes: A Novel Approach for Fault-Tolerant Routing on NoCs

    Yota KUROKAWA  Masaru FUKUSHI  

     
    PAPER

      Vol:
    E102-A No:12
      Page(s):
    1702-1710

    This paper addresses the problem of developing an efficient fault-tolerant routing method for 2D mesh Network-on-Chips (NoCs) to realize dependable and high performance many-core systems. Existing fault-tolerant routing methods have two critical problems: high communication latency and low node utilization. Unlike almost all existing methods, in which packets always detour around faulty nodes, we propose a novel and unique approach in which packets can pass through faulty nodes. For this approach, we enhance the common NoC architecture by adding switches and links around each node and propose a fault-tolerant routing method with no virtual channels based on the well-known simple XY routing method. Simulation results show that the proposed method reduces average communication latency by about 97.1% compared with the existing method, without sacrificing fault-free nodes.

  • Simulation Study of Low-Latency Network Model with Orchestrator in MEC (Open Access)

    Krittin INTHARAWIJITR  Katsuyoshi IIDA  Hiroyuki KOGA  Katsunori YAMAOKA  

     
    PAPER-Network

      Publicized:
    2019/05/16
      Vol:
    E102-B No:11
      Page(s):
    2139-2150

    Most latency-sensitive mobile applications depend on computational resources provided by a cloud computing service. The problem with relying on cloud computing is that the physical locations of cloud servers are sometimes distant from mobile users, so the communication latency is long. As a result, the concept of a distributed cloud service, called mobile edge computing (MEC), is being introduced in the 5G network. However, MEC can reduce only the communication latency. The computing latency in MEC must also be considered to satisfy the required total latency of services. In this research, we study the impact of both latencies in the MEC architecture with regard to latency-sensitive services. We also consider a centralized model, in which we use a controller to manage flows between users and mobile edge resources, to analyze MEC in a practical architecture. Simulations show that the interval and controller latency trigger some blocking and errors in the system. However, a permissive system that relaxes latency constraints and chooses the edge server with the lowest total latency can improve the system performance considerably.

  • Performance Improvement of the Catastrophic CPM Scheme with New Split-Merged MNSED

    Richard Hsin-Hsyong YANG  Chia-Kun LEE  Shiunn-Jang CHERN  

     
    PAPER-Transmission Systems and Transmission Equipment for Communications

      Publicized:
    2019/05/16
      Vol:
    E102-B No:11
      Page(s):
    2091-2103

    Continuous phase modulation (CPM) is a very attractive digital modulation scheme, with a constant envelope and high efficiency in meeting power and bandwidth requirements. CPM signals with pairs of input sequences that differ in an infinite number of positions and map into pairs of transmitted signals with finite Euclidean distance (ED) are called catastrophic. In the CPM scheme, data sequences that have the catastrophic property are called catastrophic sequences; they are periodic difference data patterns. The catastrophic sequences usually have a shorter merger length. The corresponding minimum normalized squared ED (MNSED) is smaller and below the distance bound. Two important CPM schemes, viz., the LREC and LRC schemes, are known to be catastrophic for most cases; they have poor overall power and bandwidth performance. In the literature, it has been shown that the probability of generating such catastrophic sequences is negligible; therefore, the asymptotic error performance (AEP) of those well-known catastrophic CPM schemes evaluated with the corresponding MNSED, over AWGN channels, might be too pessimistic. To deal with this problem in the AWGN channel, this paper presents a new split-merged MNSED and provides criteria to explore which conventional catastrophic CPM schemes could effectively increase the length of mergers with split-merged non-periodic events. For comparison, we investigate the exact power and bandwidth performance for LREC and LRC CPM for the same bandwidth occupancy. Computer simulation results verify that the AEP evaluated with the split-merged MNSED could achieve up to 3dB gain over the conventional approach.

  • QSL: A Specification Language for E-Questionnaire, E-Testing, and E-Voting Systems

    Yuan ZHOU  Yuichi GOTO  Jingde CHENG  

     
    PAPER-Data Engineering, Web Information Systems

      Publicized:
    2019/08/19
      Vol:
    E102-D No:11
      Page(s):
    2159-2175

    Many kinds of questionnaires, tests, and votes are carried out entirely electronically as question-and-answer Web applications on the Internet, i.e., e-questionnaire systems, e-testing systems, and e-voting systems. Because there is no unified communication tool among the stakeholders of e-questionnaire, e-testing, and e-voting systems, until now all such systems have been designed, developed, used, and maintained in various ad hoc ways. As a result, it is difficult for the stakeholders to communicate when implementing the systems, because there is neither an exhaustive requirement list that gives a grasp of the overall e-questionnaire, e-testing, and e-voting systems nor a standardized terminology for these systems to avoid ambiguity. A general-purpose specification language that provides a unified way of specifying various e-questionnaire, e-testing, and e-voting systems can solve these problems: the stakeholders can refer to and use the complete requirements and standardized terminology for better communication, can easily and unambiguously specify all the requirements of e-questionnaire, e-testing, and e-voting systems and services, and can even implement the systems. In this paper, we propose the first specification language, named “QSL,” with a standardized, consistent, and exhaustive list of requirements for specifying various e-questionnaire, e-testing, and e-voting systems such that the specifications can be used as the precondition for automatically generating e-questionnaire, e-testing, and e-voting systems. The paper presents our design, showing that QSL can specify all the requirements of various e-questionnaire, e-testing, and e-voting systems in a structured way, evaluates its effectiveness, carries out real applications of QSL to e-questionnaire, e-testing, and e-voting systems, and shows various QSL applications for providing convenient QSL services to stakeholders.

  • Accelerating Stochastic Simulations on GPUs Using OpenCL

    Pilsung KANG  

     
    LETTER-Fundamentals of Information Systems

      Publicized:
    2019/07/23
      Vol:
    E102-D No:11
      Page(s):
    2253-2256

    Since first introduced in 2008 with the 1.0 specification, OpenCL has steadily evolved over the decade to increase its support for heterogeneous parallel systems. In this paper, we accelerate stochastic simulation of biochemical reaction networks on modern GPUs (graphics processing units) by means of the OpenCL programming language. In implementing the OpenCL version of the stochastic simulation algorithm, we carefully apply its data-parallel execution model to optimize the performance provided by the underlying hardware parallelism of the modern GPUs. To evaluate our OpenCL implementation of the stochastic simulation algorithm, we perform a comparative analysis in terms of the performance using the CPU-based cluster implementation and the NVidia CUDA implementation. In addition to the initial report on the performance of OpenCL on GPUs, we also discuss applicability and programmability of OpenCL in the context of GPU-based scientific computing.

  • Further Results on the Separating Redundancy of Binary Linear Codes

    Haiyang LIU  Lianrong MA  

     
    LETTER-Coding Theory

      Vol:
    E102-A No:10
      Page(s):
    1420-1425

    In this letter, we investigate the separating redundancy of binary linear codes. Using analytical techniques, we provide a general lower bound on the first separating redundancy of binary linear codes and show the bound is tight for a particular family of binary linear codes, i.e., cycle codes. In other words, the first separating redundancy of cycle codes can be determined. We also derive a deterministic and constructive upper bound on the second separating redundancy of cycle codes, which is shown to be better than the general deterministic and constructive upper bounds for the codes.

  • A Deep Learning Approach to Writer Identification Using Inertial Sensor Data of Air-Handwriting

    Yanfang DING  Yang XUE  

     
    LETTER-Pattern Recognition

      Publicized:
    2019/07/18
      Vol:
    E102-D No:10
      Page(s):
    2059-2063

    To the best of our knowledge, there are few studies on air-handwriting character-level writer identification that employ only acceleration and angular velocity data. In this paper, we propose a deep learning approach to writer identification using only inertial sensor data of air-handwriting. In particular, we separate different representations of degree of freedom (DoF) of air-handwriting to extract local dependency and interrelationship in different CNNs separately. Experiments on a public dataset achieve good average performance without any extra hand-designed feature extraction.

  • LEF: An Effective Routing Algorithm for Two-Dimensional Meshes

    Thiem Van CHU  Kenji KISE  

     
    PAPER-Computer System

      Publicized:
    2019/07/09
      Vol:
    E102-D No:10
      Page(s):
    1925-1941

    We design a new oblivious routing algorithm for two-dimensional mesh-based Networks-on-Chip (NoCs) called LEF (Long Edge First) which offers high throughput with low design complexity. LEF's basic idea comes from conventional wisdom in choosing the appropriate dimension-order routing (DOR) algorithm for supercomputers with asymmetric mesh or torus interconnects: routing longest dimensions first provides better performance than other strategies. In LEF, we combine the XY DOR and the YX DOR. When routing a packet, which DOR algorithm is chosen depends on the relative position between the source node and the destination node. Decisions of selecting the appropriate DOR algorithm are not fixed to the network shape but instead made on a per-packet basis. We also propose an efficient deadlock avoidance method for LEF in which the use of virtual channels is more flexible than in the conventional method. We evaluate LEF against O1TURN, another effective oblivious routing algorithm, and a minimal adaptive routing algorithm based on the odd-even turn model. The evaluation results show that LEF is particularly effective when the communication is within an asymmetric mesh. In a 16×8 NoC, LEF even outperforms the adaptive routing algorithm in some cases and delivers from around 4% up to around 64.5% higher throughput than O1TURN. Our results also show that the proposed deadlock avoidance method helps to improve LEF's performance significantly and can be used to improve O1TURN's performance. We also examine LEF in large-scale NoCs with thousands of nodes. Our results show that, as the NoC size increases, the performance of the routing algorithms becomes more strongly influenced by the resource allocation policy in the network and the effect is different for each algorithm. This is evident in that results of middle-scale NoCs with around 100 nodes cannot be applied directly to large-scale NoCs.
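    The per-packet choice between XY and YX dimension-order routing described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the packet first travels along whichever dimension has the larger source-to-destination offset, and ties fall back to XY. The tie-breaking rule, the function name `lef_route`, and the coordinate representation are assumptions; the paper's virtual-channel-based deadlock avoidance is not modeled.

```python
def lef_route(src, dst):
    """Return the list of (x, y) nodes visited from src to dst in a 2D mesh.

    The longer offset is routed first (X then Y, or Y then X), decided per
    packet from the relative position of source and destination.
    Ties are routed XY first, which is an assumption of this sketch.
    """
    x, y = src
    tx, ty = dst
    x_first = abs(tx - x) >= abs(ty - y)
    order = ("x", "y") if x_first else ("y", "x")

    path = [(x, y)]
    for axis in order:
        if axis == "x":
            while x != tx:
                x += 1 if tx > x else -1
                path.append((x, y))
        else:
            while y != ty:
                y += 1 if ty > y else -1
                path.append((x, y))
    return path


if __name__ == "__main__":
    print(lef_route((0, 0), (3, 1)))  # X offset is longer, so X is routed first
    print(lef_route((0, 0), (1, 2)))  # Y offset is longer, so Y is routed first
```

    Because the decision depends only on the source and destination, the route stays oblivious: it does not react to network load.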

  • 120-W Ku-Band GaN SSPA with Diode Linearizer for Future Broadcasting Satellites (Open Access)

    Masafumi NAGASAKA  Masaaki KOJIMA  Takuma TORII  Hiromitsu UTSUMI  Koji YAMANAKA  Shintaro SHINJO  Mitsuhiro SHIMOZAWA  Hisashi SUJIKAI  

     
    PAPER

      Vol:
    E102-C No:10
      Page(s):
    717-724

    Satellite broadcasting of 4K/8K ultra-high definition television (UHDTV) was launched in Japan in December 2018. Because this system uses the amplitude and phase shift keying (APSK) modulation scheme, there is a need to improve the non-linear characteristics of the satellite transponders. To meet this requirement, we have been developing a 120-W-class Ku-band solid state power amplifier (SSPA) as a replacement for the currently used traveling wave tube amplifier (TWTA). In this study, we developed a gallium-nitride (GaN) SSPA and linearizer (LNZ). The SSPA achieved an output power of 120W while maintaining a power added efficiency (PAE) of 31%. We evaluated the transmission performance of 16APSK in this SSPA channel in comparison with that in the TWTA channel.
