Author Search Result

[Author] Takashi OHSAWA(9hit)

1-9hit
  • A Fully Analog Deep Neural Network Inference Accelerator with Pipeline Registers Based on Master-Slave Switched Capacitors

    Yaxin MEI  Takashi OHSAWA  

     
    PAPER-Integrated Electronics

      Publicized:
    2023/03/08
      Vol:
    E106-C No:9
      Page(s):
    477-485

    A fully analog pipelined deep neural network (DNN) accelerator is proposed, which is constructed by using pipeline registers based on master-slave switched capacitors. The idea of the master-slave switched capacitors is an analog equivalent of the delayed flip-flop (D-FF), which has been used as a digital pipeline register. To estimate the performance of the pipeline register, it is applied to a conventional DNN that performs non-pipeline operation. Compared with the conventional DNN, the cycle time is reduced by 61.5% and the data rate is increased by 160%. The accuracy reaches 99.6% in an MNIST classification test. The energy consumption per classification is reduced by 88.2% to 0.128 µJ, achieving an energy efficiency of 1.05 TOPS/W and a throughput of 0.538 TOPS in a 180 nm technology node.
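
As a rough consistency check on the figures in the abstract above, the sketch below assumes that the data rate is simply the inverse of the cycle time, and that the quoted throughput and energy efficiency refer to the same operating point. Both are assumptions, and the function names are illustrative only.

```python
def data_rate_gain(cycle_time_reduction: float) -> float:
    """Fractional data-rate increase, assuming data rate = 1 / cycle time."""
    return 1.0 / (1.0 - cycle_time_reduction) - 1.0


def implied_power_watts(throughput_tops: float, efficiency_tops_per_w: float) -> float:
    """Power implied by dividing throughput by energy efficiency."""
    return throughput_tops / efficiency_tops_per_w


gain = data_rate_gain(0.615)               # 61.5% shorter cycle time
power = implied_power_watts(0.538, 1.05)   # figures quoted in the abstract

print(f"data rate increase: {gain:.1%}")
print(f"implied power: {power:.3f} W")
```

Under these assumptions the gain comes out at roughly 160%, in line with the abstract, and the implied power is about half a watt.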

  • A New Read Scheme for High-Density Emerging Memories

    Takashi OHSAWA  

     
    PAPER-Electronic Circuits

      Vol:
    E101-C No:6
      Page(s):
    423-429

    Several new memories are being studied as candidates to succeed DRAM, which appears difficult to scale further. However, the read signal in these new memories needs to be amplified in a single-end manner with a supplied reference signal if they are intended for high-density main memory. This scheme, which is fortunately not necessary in DRAM's 1/2Vdd pre-charge sense amp, can become a serious bottleneck in new memory development, because the device electrical parameters in these new memory cells are, without exception, prone to large cell-to-cell variations. Furthermore, the extent to which a parameter fluctuates for data “1” is generally not the same as for data “0”. For these situations, a new sensing scheme is proposed that can minimize the sensing error rate for high-density single-end emerging memories such as STT-MRAM, ReRAM and PCRAM. The scheme is based on averaging multiple dummy cell pairs, written with “1” and “0”, in a weighted manner according to the fluctuation imbalance between “1” and “0”. A detailed analysis shows that this scheme is effective in designing a 128Mb 1T1MTJ STT-MRAM: compared with the arithmetic averaging method, the required TMR ratio of an MTJ can be relaxed from 130% to 90% for a 6% sigma-to-average ratio of MTJ resistance in a 16-pair dummy cell averaging case.
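
The abstract above does not state the weighting rule, so the following is only a minimal sketch of one plausible choice: place the reference the same number of standard deviations away from the “0” and “1” distributions. The function name, the resistance values and the sigma values are all hypothetical.

```python
from statistics import mean


def weighted_reference(r0_dummies, r1_dummies, sigma0, sigma1):
    """Reference level that sits equally many sigmas from both data states.

    Assumed rule (not taken from the paper): weight each dummy-cell average
    by the spread of the opposite state.
    """
    mu0, mu1 = mean(r0_dummies), mean(r1_dummies)
    w0 = sigma1 / (sigma0 + sigma1)   # weight on the "0" average
    w1 = sigma0 / (sigma0 + sigma1)   # weight on the "1" average
    return w0 * mu0 + w1 * mu1


# Hypothetical example: 16 dummy pairs, 6% sigma-to-average ratio per state.
r0 = [1000.0] * 16     # "0" resistance in ohms (assumed)
r1 = [2000.0] * 16     # "1" resistance in ohms (assumed)
sigma0, sigma1 = 60.0, 120.0

ref_weighted = weighted_reference(r0, r1, sigma0, sigma1)
ref_arithmetic = (mean(r0) + mean(r1)) / 2

# Margin to each state, expressed in units of that state's sigma.
z0 = (ref_weighted - mean(r0)) / sigma0
z1 = (mean(r1) - ref_weighted) / sigma1
worst_arithmetic = min((ref_arithmetic - mean(r0)) / sigma0,
                       (mean(r1) - ref_arithmetic) / sigma1)

print(f"weighted reference: {ref_weighted:.1f} ohm, margins {z0:.2f} / {z1:.2f} sigma")
print(f"arithmetic reference: {ref_arithmetic:.1f} ohm, worst margin {worst_arithmetic:.2f} sigma")
```

With unequal spreads, the arithmetic midpoint leaves the wider distribution with the smaller margin, whereas the weighted reference in this sketch equalizes the two margins.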

  • Compact Model of Magnetic Tunnel Junctions for SPICE Simulation Based on Switching Probability

    Haoyan LIU  Takashi OHSAWA  

     
    PAPER-Semiconductor Materials and Devices

      Publicized:
    2020/09/08
      Vol:
    E104-C No:3
      Page(s):
    121-127

    In this paper, we propose a compact magnetic tunnel junction (MTJ) model for circuit simulation with SPICE, the de facto standard. It is implemented in the Verilog-A language, which makes it easy to simulate MTJs together with other standard devices. Based on the switching probability, we smoothly connect the adiabatic precessional model and the thermal activation model by using an interpolation technique based on the cubic spline method. The model can predict the switching time after a current is applied. In addition, we use appropriate physical models to describe other MTJ characteristics. Simulation results validate that the model is consistent with experimental data and effective for MTJ/CMOS hybrid circuit simulation.
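
To illustrate the general idea of joining two switching regimes smoothly, the sketch below bridges a textbook-style thermal activation law and a simplified precessional law with a single cubic Hermite segment in log switching time. It is not the paper's model: the paper interpolates based on the switching probability and is written in Verilog-A, whereas the functional forms, parameter values and bridge limits here are assumptions chosen only for illustration.

```python
import math

# All values below are illustrative assumptions, not taken from the paper.
TAU0 = 1e-9          # attempt time in seconds
DELTA = 60.0         # thermal stability factor
C_PREC = 1e-10       # prefactor of the simplified precessional law, in seconds
I_A, I_B = 0.8, 1.2  # bridge limits, in units of the critical current


def ln_tau_thermal(i: float) -> float:
    """ln(switching time) below the critical current (thermal activation)."""
    return math.log(TAU0) + DELTA * (1.0 - i)


def ln_tau_precessional(i: float) -> float:
    """ln(switching time) above the critical current (simplified 1/(i-1) law)."""
    return math.log(C_PREC) - math.log(i - 1.0)


def ln_tau(i: float) -> float:
    """Thermal law, cubic Hermite bridge, then precessional law."""
    if i <= I_A:
        return ln_tau_thermal(i)
    if i >= I_B:
        return ln_tau_precessional(i)
    h = I_B - I_A
    t = (i - I_A) / h
    # Match value and slope of each law at its end of the bridge.
    ya, ma = ln_tau_thermal(I_A), -DELTA
    yb, mb = ln_tau_precessional(I_B), -1.0 / (I_B - 1.0)
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    return h00 * ya + h10 * h * ma + h01 * yb + h11 * h * mb


def switching_time(i: float) -> float:
    return math.exp(ln_tau(i))


for i in (0.7, 0.9, 1.0, 1.1, 1.5):
    print(f"I/Ic0 = {i:.1f}  ->  tau = {switching_time(i):.3e} s")
```

The bridge limits were picked so that the interpolated curve stays monotonic for these particular parameters; other parameter choices may need slope limiting or a different interpolation variable.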

  • Low Power Nonvolatile Counter Unit with Fine-Grained Power Gating

    Shuta TOGASHI  Takashi OHSAWA  Tetsuo ENDOH  

     
    PAPER

      Vol:
    E95-C No:5
      Page(s):
    854-859

    In this paper, we propose a new low power nonvolatile counter unit based on Magnetic Tunnel Junctions (MTJs) with fine-grained power gating. The proposed counter unit consists of only a single latch with two MTJs. We verify the basic operation and estimate the power consumption of the proposed counter unit. The operating power consumption of the proposed nonvolatile counter unit is smaller than that of the conventional one below 140 kHz. The power of the proposed unit is 74.6% smaller than that of the conventional one at low frequency.

  • A Low-Cost Training Method of ReRAM Inference Accelerator Chips for Binarized Neural Networks to Recover Accuracy Degradation due to Statistical Variabilities

    Zian CHEN  Takashi OHSAWA  

     
    PAPER-Integrated Electronics

      Publicized:
    2022/01/31
      Vol:
    E105-C No:8
      Page(s):
    375-384

    A new software-based in-situ training (SBIST) method is proposed to achieve high accuracy in binarized neural network inference accelerator chips, in which measured offsets in the sense amplifiers (activation binarizers) are transformed into biases in the training software. To expedite this individual training, the initial values of the weights are taken from the results of a common forming training process, which is conducted in advance using the offset fluctuation distribution averaged over the fabrication line. SPICE simulation inference results for the accelerator predict that the accuracy recovers to higher than 90% after only a few epochs of the individual training, even when the amplifier offset is as large as 40 mV.
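
The core idea of transforming a sense-amplifier offset into a software bias can be sketched as follows. This is a minimal illustration, assuming the offset is already expressed in the same normalized units as the weighted sum; the function names and numbers are hypothetical, and the conversion from millivolts to those units is not covered.

```python
from itertools import product


def hardware_neuron(weights, inputs, offset):
    """Binarizer whose decision threshold is shifted by a measured offset."""
    s = sum(w * x for w, x in zip(weights, inputs))
    return 1 if s > offset else -1


def software_neuron(weights, inputs, bias):
    """Ideal binarizer (threshold at zero) with an explicit bias term."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if s > 0 else -1


weights = [1, -1, 1, 1]    # binarized weights (assumed)
measured_offset = 0.5      # hypothetical offset in normalized units

# Folding the offset into the bias makes the software model mimic the chip.
bias = -measured_offset

matches = all(
    hardware_neuron(weights, x, measured_offset) == software_neuron(weights, x, bias)
    for x in product((-1, 1), repeat=len(weights))
)
print("software model reproduces hardware decisions:", matches)
```

Once the software model mirrors each chip's offsets in this way, training in software can adapt the weights to that specific chip.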

  • Co-Design of Binary Processing in Memory ReRAM Array and DNN Model Optimization Algorithm

    Yue GUAN  Takashi OHSAWA  

     
    PAPER-Integrated Electronics

      Publicized:
    2020/05/13
      Vol:
    E103-C No:11
      Page(s):
    685-692

    In recent years, deep neural networks (DNNs) have achieved considerable results on many artificial intelligence tasks, e.g. natural language processing. However, the computational complexity of DNNs is extremely high. Furthermore, the performance of the traditional von Neumann computing architecture has been slowing down due to the memory wall problem. Processing in memory (PIM), which places computation within memory and reduces data movement, breaks the memory wall. ReRAM PIM is thought to be an available architecture for DNN accelerators. In this work, a novel design of a ReRAM neuromorphic system is proposed to process a DNN fully in the array efficiently. The binary ReRAM array is composed of 2T2R storage cells and current mirror sense amplifiers. A dummy BL reference scheme is proposed for reference voltage generation. A binary DNN (BDNN) model is then constructed and optimized on the MNIST dataset. The model reaches a validation accuracy of 96.33% and is deployed to the ReRAM PIM system. A co-design model optimization method between the hardware device and the software algorithm is proposed, with the idea of utilizing hardware variance information as uncertainty in the optimization procedure. The analysis shows that this method achieves a feasible hardware design and a generalizable model. Deployed with such a co-design model, the ReRAM array processes the DNN with high robustness against fabrication fluctuation.

  • Folded Bitline Architecture for a Gigabit-Scale NAND DRAM

    Shinichiro SHIRATAKE  Daisaburo TAKASHIMA  Takehiro HASEGAWA  Hiroaki NAKANO  Yukihito OOWAKI  Shigeyoshi WATANABE  Takashi OHSAWA  Kazunori OHUCHI  

     
    PAPER

      Vol:
    E80-C No:4
      Page(s):
    573-581

    A new memory cell arrangement for a gigabit-scale NAND DRAM is proposed. Although the conventional NAND DRAM in which memory cells are connected in series realizes the small die size, it faces a crucial array noise problem in the 1 gigabit generation and beyond because of its inherent noise of the open bitline arrangement. By introducing the new cell arrangement to a NAND DRAM, the folded bitline scheme is realized, resulting in good noise immunity. The basic operation of the proposed folded bitline scheme was successfully verified using the 64 kbit test chip. The die size of the proposed NAND DRAM with the folded bitline scheme (F-NAND DRAM) at the 1 Gbit generation is reduced to 63% of that of the conventional 1 Gbit DRAM with the folded bitline scheme, assuming the bitlines and the wordlines are fabricated with the same pitch. The new 4/4 bitline grouping scheme in which cell data are read out to four neighboring bitlines is also introduced to reduce the bitline-to-bitline coupling noise to half of that of the conventional folded bitline scheme. The array noise of the proposed F-NAND DRAM with the 4/4 bitline grouping scheme at 1 Gbit generation is reduced to 10% of the read-out signal, while that of the conventional NAND DRAM with open bitline scheme is 29%, and that of the conventional DRAM with the folded bitline scheme is 22%.

  • Array Design of High-Density Emerging Memories Making Clamped Bit-Line Sense Amplifier Compatible with Dummy Cell Average Read Scheme

    Ziyue ZHANG  Takashi OHSAWA  

     
    PAPER-Integrated Electronics

      Publicized:
    2020/02/26
      Vol:
    E103-C No:8
      Page(s):
    372-380

    The reference current used in sense amplifiers is a crucial factor in a single-end read manner for emerging memories. The dummy cell average read scheme uses multiple pairs of dummy cells inside the array to generate an accurate reference current for data sensing. The previous research adopts a current mirror sense amplifier (CMSA), which is compatible with the dummy cell average read scheme. However, the clamped bit-line sense amplifier (CBLSA) has higher sensing speed and lower power consumption than the CMSA. Therefore, applying the CBLSA to the dummy cell average read scheme is expected to enhance the performance. This paper reveals that a direct combination of the CBLSA and the dummy cell average read scheme leads to sense margin degradation. To solve this problem, a new array design is proposed to make the CBLSA compatible with the dummy cell average read scheme. A current mirror structure is employed to prevent the CBLSA from being short-circuited directly. The simulation result shows that the minimum sensible tunnel magnetoresistance ratio (TMRR) can be extended from 14.3% down to 1%. The access time of the proposed sensing scheme is less than 2 ns when the TMRR is 70% or larger, which is about twice as fast as the previous research. In addition, this circuit design consumes only half of the energy in one read cycle compared with the previous research. In the proposed array architecture, all the dummy cells can always be short-circuited in a totally isolated area by low-resistance metal wiring instead of using controlling transistors. This structure contributes to increasing the dummy cell averaging effect. Moreover, the array-level simulation validates that the array design is accessible to every data cell. This design is generally applicable to any kind of resistance-variable emerging memory, including STT-MRAM.

  • A 250 mV Bit-Line Swing Scheme for 1-V Operating Gigabit Scale DRAMs

    Tsuneo INABA  Daisaburo TAKASHIMA  Yukihito OOWAKI  Tohru OZAKI  Shigeyoshi WATANABE  Takashi OHSAWA  Kazunori OHUCHI  Hiroyuki TANGO  

     
    PAPER

      Vol:
    E79-C No:12
      Page(s):
    1699-1706

    This paper proposes a small 1/4Vcc bit-line swing scheme and a related sense amplifier scheme for low power 1 V operating DRAM. Using the proposed small bit-line swing scheme, the stress bias of memory cell transistor and capacitor is reduced to half that of the conventional DRAM, resulting in improvement of device reliability. The proposed sense amplifier scheme achieves high speed and stable sensing/restoring operation at 250 mV bit-line swing, which is much smaller than threshold voltage. The proposed scheme reduces the total power dissipation of bit-line sensing/restoring operation to 40% of the conventional one. This paper also proposes a small 4F² size memory cell and a new twisted bit-line scheme. The array noise is reduced to 8.6% of the conventional DRAM.