1-5hit |
Tetsuya YAMADA Naohiko IRIE Takanobu TSUNODA Takahiro IRITA Kenji KITAGAWA Ryohei YOSHIDA Keisuke TOYAMA Motoaki SATOYAMA
We have developed a hardware accelerator for Java platforms, integrated on a SuperH microprocessor core, using a 130-nm CMOS process. The Java accelerator, a bytecode translation unit (BTU), is tightly coupled with the CPU to share resources. The BTU supports 159 basic bytecodes and 5 or 6 optional bytecodes. It supports both connected device configuration (CDC) 1.0 and connected limited device configuration (CLDC) 1.0.4 technologies. The BTU corresponds to the dual-issued superscalar CPU and applies a new method, control-sharing. With this method, the BTU always grasps the pipeline status of the CPU, and the Java program is processed by both the BTU and the CPU. To implement this method, we developed some acceleration techniques: fast branch requests, enhanced CPU instructions, Java runtime exception detection hardware, and fewer overhead cycles of handover between the BTU and the CPU. In particular, the BTU can detect Java runtime exceptions in parallel with other processing, such as an array access. With previous methods, there is a disadvantage in that CPU efficiency decreases for Java-specific processing, such as array index bounds checking. The sample chip was fabricated in Renesas 130-nm, five-layer Cu, dual-vth low-power CMOS technology. The chip runs at 216 MHz and 1.2 V. The BTU has 75 kG. The benchmark on an evaluation board showed 6.55 embedded caffeine marks (ECM)/MHz on the CLDC 1.0.4 configuration, a tenfold speed increase without the BTU for roughly the same power consumption. In other words, power savings of 90 percent with the same performance were achieved.
Osamu NISHII Yoichi YUYAMA Masayuki ITO Yoshikazu KIYOSHIGE Yusuke NITTA Makoto ISHIKAWA Tetsuya YAMADA Junichi MIYAKOSHI Yasutaka WADA Keiji KIMURA Hironori KASAHARA Hideo MAEJIMA
We built a 12.4 mm12.4 mm, 45-nm CMOS, chip that integrates eight 648-MHz general purpose cores, two matrix processor (MX-2) cores, four flexible engine (FE) cores and media IP (VPU5) to establish heterogeneous multi-core chip architecture. The general purpose core had its IPC (instructions per cycle) performance enhanced by adding 32-bit instructions to the existing 16-bit fixed-length instruction set and executing up to two 32-bit instructions per cycle. Considering these five-to-seven years of embedded LSI and increasing trend of access-master within LSI, we predict that the memory usage of single core will not exceed 32-bit physical area (i.e. 4 GB), but chip-total memory usage will exceed 4 GB. Based on this prediction, the physical address was expanded from 32-bit to 40-bit. The fabricated chip was tested and a parallel operation of eight general purpose cores and four FE cores and eight data transfer units (DTU) is obtained on AAC (Advanced Audio Coding) encode processing.
Tetsuya YAMADA Masahide ABE Yusuke NITTA Kenji OGURA Manabu KUSAOKE Makoto ISHIKAWA Motokazu OZAWA Kiwamu TAKADA Fumio ARAKAWA Osamu NISHII Toshihiro HATTORI
A low-power SuperHTM embedded processor core, the SH-X2, has been designed in 90-nm CMOS technology. The power consumption was reduced by using hierarchical fine-grained clock gating to reduce the power consumption of the flip-flops and the clock-tree, synthesis and a layout that supports the implementation of the clock gating, and several-level power evaluations for RTL refinement. With this clock gating and RTL refinement, the power consumption of the clock-tree and flip-flops was reduced by 35% and 59%, including the process shrinking effects, respectively. As a result, the SH-X2 achieved 6,000 MIPS/W using a Renesas low-power process with a lowered voltage. Its performance-power efficiency was 25% better than that of a 130-nm-process SH-X.
Tetsuya YAMADA Makoto ISHIKAWA Yuji OGATA Takanobu TSUNODA Takahiro IRITA Saneaki TAMAKI Kunihiko NISHIYAMA Tatsuya KAMEI Ken TATEZAWA Fumio ARAKAWA Takuichiro NAKAZAWA Toshihiro HATTORI Kunio UCHIYAMA
A 32-bit embedded RISC microprocessor core integrating a DSP has been developed using a 0.18-µm five-layer-metal CMOS technology. The integrated DSP has a single-MAC and exploits CPU resources to reduce hardware. The DSP occupies only 0.5 mm2. The processor core includes a large on-chip 128 kB SRAM called U-memory. A large capacity on-chip memory decreases the amount of traffic with an external memory. And it is effective for low-power and high-performance operation. To realize low-power dissipation for the U-memory access, the active ratio of U-memory's access is reduced. The critical path is a load path from the U-memory, and we optimized the path through the whole chip. The chip achieves 0.79 mA/MHz executing Dhrystone 1.1 at 108 MHz, which is suitable for mobile applications.
Takahide MIZUNO Kousuke KAWAHARA Kazuhiko YAMADA Yukio KAMATA Tetsuya YAMADA Hitoshi KUNINAKA
Hayabusa returned to Earth on June 13, 2010, becoming the world's first explorer to complete a round-trip voyage to an asteroid. After being released from the spacecraft, the sample return capsule landed in the Woomera Prohibited Area in the desert of South Australia. The capsule recovery team from JAXA found the capsule within 1 h of its landing. The beacon tracking system that was developed by JAXA played an important role in the tracking and discovery of the sample return capsule. The system has flexibility regarding the landing position of the capsule, because it does not rely on primary radar. In this paper, we describe the beacon tracking system and evaluate the system by discussing the results of preliminary examination and of operation on the day of re-entry.