Ryochi KATAOKA Kentaro NISHIMORI Takefumi HIRAGURI Naoki HONMA Tomohiro SEKI Ken HIRAGA Hideo MAKINO
A novel analog decoding method using only 90-degree phase shifters is proposed to simplify decoding for short-range multiple-input multiple-output (MIMO) transmission. In short-range MIMO transmission, an optimal element spacing that maximizes the channel capacity exists for a given distance between the transmitter and receiver. We focus on the fact that the zero-forcing (ZF) weight matrix at the optimal element spacing can be realized with dividers and 90-degree phase shifters because it can be expressed as a unitary matrix. The channel capacity achieved by the proposed method is then derived in order to evaluate its exact limit. Moreover, it is shown that an optimal weight when using directional antennas can be expressed by using only dividers, 90-degree phase shifters, and attenuators, regardless of the beam width of the directional antenna. Finally, bit error rate and channel capacity evaluations by both simulation and measurement confirm the effectiveness of the proposed method.
Yijian GONG Manuel MURBACH Teruo ONISHI Myles CAPSTICK Toshio NOJIMA Niels KUSTER
The objective of this paper is to extend the dosimetric assessment of 35mm Petri dishes exposed in the standing wave of R18 waveguides operated at 1950MHz to a medium-oil two-layer configuration for cells in monolayer and in suspension. The culture medium inside the Petri dish is covered by oil, which prevents evaporation and seals the cells in the medium below. The exposure of the cells was analyzed for one suspension-medium configuration, two different suspension-multilayer configurations, and one monolayer-multilayer configuration. The numerical dosimetry is verified by dosimetric temperature measurements. The non-uniformity of the specific absorption rate (SAR) distribution is 30% for the monolayer configuration and 59-75% for the suspension configurations. The latter should be taken into account when biological experiments are performed.
Masamitsu TANAKA Atsushi KITAYAMA Masakazu OKADA Tomohito KOUKETSU Takumi TAKINAMI Masato ITO Akira FUJIMAKI
We report the successful operation of a low-power arithmetic logic unit (ALU) based on a low-voltage rapid single-flux-quantum (LV-RSFQ) logic circuit, whereby a dc bias current is fed to circuits from lowered constant-voltage sources through small resistors. Both the static and dynamic energy consumptions are reduced because of the reduction in the amplitudes of voltage pulses across the Josephson junctions, with a trade-off of slightly slower switching speeds. The designed bias voltage was set to 0.25mV, which is one-tenth that of our standard RSFQ circuit design. We investigated several issues related to such low-voltage operation, including margins and timing design. To achieve successful operation, we tuned the circuit parameters in the logic gate design and carefully controlled the timing by considering the interference of pulse signals. We show test results for the low-voltage ALU in on-chip high-speed testing. The circuit was fabricated using the AIST Nb/AlOx/Nb Advanced Process with a critical current density of 10kA/cm². We verified that arithmetic and logical operations were correctly implemented and obtained dc bias margins of 18% at a target clock frequency of 20GHz and achieved a maximum clock frequency of 28GHz with a power consumption of 28µW. These experimental results indicate energy efficiency of 3.6 times that of the standard RSFQ circuit design.
Kyosuke SANO Yuki YAMANASHI Nobuyuki YOSHIKAWA
We have been developing a superconducting time-of-flight mass spectrometry (TOF-MS) system, which utilizes a superconductive strip ion detector (SSID) and a single-flux-quantum (SFQ) multi-stop time-to-digital converter (TDC). The SFQ multi-stop TDC can measure the time intervals between multiple input signals and directly convert them into binary data. In this study, we designed and implemented 24-bit SFQ multi-stop TDCs with a 3×24-bit FIFO buffer using the AIST Nb standard process (STP2); their time resolution and dynamic range are 100ps and 1.6ms, respectively. The timing jitter of the TDC was investigated by comparing two types of TDCs: one uses an on-chip SFQ clock generator (CG) and the other uses a microwave oscillator at room temperature. We confirmed the correct operation of both TDCs and evaluated their timing jitter. The experimentally obtained timing jitter is about 40ns and 700ps for the TDCs with and without the on-chip SFQ CG, respectively, for a measured time interval of 50µs, and it increases linearly with the measured time interval.
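The relation between word length, time resolution, and dynamic range can be checked with a little integer arithmetic. The sketch below assumes that the TDC simply counts clock periods of one resolution step each (an assumption made here, not a detail stated in the abstract); the function names are hypothetical.

```python
def dynamic_range_ps(bits: int, resolution_ps: int) -> int:
    """Longest measurable interval in picoseconds, assuming a plain binary counter."""
    return (2 ** bits) * resolution_ps


def interval_to_code(interval_ps: int, resolution_ps: int, bits: int) -> str:
    """Binary output word for an interval, assuming truncation to whole clock periods."""
    count = interval_ps // resolution_ps
    if count >= 2 ** bits:
        raise ValueError("interval exceeds the dynamic range")
    return format(count, f"0{bits}b")


if __name__ == "__main__":
    # 24 bits at 100 ps per count
    print(dynamic_range_ps(24, 100) / 1e9, "ms")
    # a 50 us interval expressed as a 24-bit word
    print(interval_to_code(50_000_000, 100, 24))
```

Under this assumption a 24-bit counter at 100ps spans roughly 1.68ms, which is close to the 1.6ms quoted above.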
Yoshitaka TAKAHASHI Hiroshi SHIMADA Masaaki MAEZAWA Yoshinao MIZUGAKI
We present the design and operation of a 6-bit quasi-triangle voltage waveform generator comprising three circuit blocks: an improved variable Pulse Number Multiplier (variable-PNM), a Code Generator (CG), and a Double-Flux-Quantum Amplifier (DFQA). They are integrated into a single chip using a niobium Josephson junction technology. While the multiplication factor of our previous m-bit variable-PNM was limited to values between 2^(m-1) and 2^m, that of the improved one extends from 1 to 2^m. Correct operation of the 6-bit variable-PNM is confirmed in low-speed testing with respect to the codes from the CG, whereas generation of a 6-bit, 0.20mVpp quasi-triangle voltage waveform is demonstrated with the 10-fold DFQA in high-speed testing.
Zhao-xin XIONG Min CAI Xiao-Yong HE Yun YANG
A digital background calibration technique using signal-dependent dithering is proposed to correct the nonlinear errors that result from capacitor mismatches and finite opamp gain in pipelined analog-to-digital converters (ADCs). Large-magnitude dithers are used to measure and correct both errors simultaneously in the background. In the proposed calibration system, the 2.5-bit capacitor-flip-over multiplying digital-to-analog converter (MDAC) stage is modified for the injection of large-magnitude dithering by adding six comparators, and thus only three correction parameters in every stage subjected to correction are measured and extracted by a simple calibration algorithm with a multibit first stage. Behavioral simulation results show that, using the proposed calibration technique, the signal-to-noise-and-distortion ratio improves from 63.3 to 79.3dB and the spurious-free dynamic range increases from 63.9 to 96.4dB after calibrating the first two stages, in a 14-bit 100-MS/s pipelined ADC with σ=0.2% capacitor mismatches and 60dB nonideal opamp gain. The time required to calibrate the first two stages is around 1.34 seconds for the modeled ADC.
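To put the reported improvement in perspective, the signal-to-noise-and-distortion ratio (SNDR) figures can be converted to an effective number of bits (ENOB). The sketch below uses the standard relation ENOB = (SNDR - 1.76) / 6.02 for a full-scale sine input; this conversion is not part of the abstract and is added here only as an illustration.

```python
def enob(sndr_db: float) -> float:
    """Effective number of bits from SNDR in dB (full-scale sine-wave assumption)."""
    return (sndr_db - 1.76) / 6.02


if __name__ == "__main__":
    for label, sndr in (("before calibration", 63.3), ("after calibration", 79.3)):
        print(f"{label}: SNDR = {sndr} dB, ENOB = {enob(sndr):.2f} bits")
```

Under this conversion, the simulated calibration moves the modeled 14-bit ADC from a little over 10 effective bits to nearly 13.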
Kazuki TERAOKA Kohei HATANO Eiji TAKIMOTO
We consider the Monte Carlo tree search problem, a variant of the Min-Max tree search problem in which the score of each leaf is the expectation of some Bernoulli variable and is not given explicitly but can be estimated through (random) playouts. The goal of this problem is, given a game tree and an oracle that returns the outcome of a playout, to find a child node of the root that attains an approximate min-max score. This problem arises in two-player games such as computer Go. We propose a simple and efficient algorithm for the Monte Carlo tree search problem.
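The problem setting can be made concrete with a naive baseline that samples every leaf the same number of times and then runs ordinary min-max on the estimates. This is only a sketch of the problem statement, not the algorithm proposed by the authors; the tree encoding (nested lists with float leaves holding the hidden Bernoulli mean) and all function names are assumptions made for illustration.

```python
import random


def playout(p: float, rng: random.Random) -> int:
    """Oracle: one random playout from a leaf whose hidden win probability is p."""
    return 1 if rng.random() < p else 0


def estimate(node, maximize: bool, n_playouts: int, rng: random.Random) -> float:
    """Min-max value of a subtree, with leaf scores estimated by uniform sampling."""
    if isinstance(node, float):  # leaf: average the playout outcomes
        return sum(playout(node, rng) for _ in range(n_playouts)) / n_playouts
    values = [estimate(child, not maximize, n_playouts, rng) for child in node]
    return max(values) if maximize else min(values)


def best_root_child(tree, n_playouts: int, rng: random.Random):
    """Index of the root child with the highest estimated score (root is a max node)."""
    scores = [estimate(child, False, n_playouts, rng) for child in tree]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


if __name__ == "__main__":
    # Root (max) with two min-node children; true min-max scores are 0.2 and 0.6.
    tree = [[0.9, 0.2], [0.6, 0.7]]
    print(best_root_child(tree, 2000, random.Random(0)))
```

Uniform sampling spends the same effort on every leaf, including leaves that clearly cannot affect the root decision; reducing that waste is what a more efficient algorithm for this problem has to address.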
Masaki KAWABATA Takao NISHIZEKI
Let G be a graph with a single source w, assigned a positive integer called the supply. Every vertex other than w is a sink, assigned a nonnegative integer called the demand. Every edge is assigned a positive integer called the capacity. Then a spanning tree T of G is called a spanning distribution tree if the capacity constraint holds when, for every sink v, an amount of flow, equal to the demand of v, is sent from w to v along the path in T between them. The spanning distribution tree problem asks whether a given graph has a spanning distribution tree or not. In the paper, we first observe that the problem is NP-complete even for series-parallel graphs, and then give a pseudo-polynomial time algorithm to solve the problem for a given series-parallel graph G. The computation time is bounded by a polynomial in n and D, where n is the number of vertices in G and D is the sum of all demands in G.
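The capacity constraint in the definition above can be verified for a single candidate tree in linear time: the flow on a tree edge equals the total demand in the subtree hanging below it. The sketch below checks only that; it does not search for a tree and is not the pseudo-polynomial algorithm of the paper. The input is assumed to already be a spanning tree, and the optional supply check (total demand not exceeding the supply) is an assumption about how the supply is used.

```python
from collections import defaultdict


def is_distribution_tree(tree_edges, source, demand, supply=None):
    """tree_edges: list of (u, v, capacity) forming a tree; demand: dict sink -> int."""
    adj = defaultdict(list)
    for u, v, cap in tree_edges:
        adj[u].append((v, cap))
        adj[v].append((u, cap))

    # Depth-first traversal from the source to orient the tree.
    parent = {source: None}
    cap_to_parent = {}
    order = []
    stack = [source]
    while stack:
        u = stack.pop()
        order.append(u)
        for v, cap in adj[u]:
            if v not in parent:
                parent[v] = u
                cap_to_parent[v] = cap
                stack.append(v)

    # Accumulate subtree demands bottom-up; each is the flow on the edge to the parent.
    load = {v: demand.get(v, 0) for v in order}
    for v in reversed(order):
        if v == source:
            continue
        if load[v] > cap_to_parent[v]:
            return False
        load[parent[v]] += load[v]

    # Assumption: the total demand must not exceed the supply of the source.
    if supply is not None and load[source] > supply:
        return False
    return True


if __name__ == "__main__":
    edges = [("w", "a", 5), ("a", "b", 2), ("w", "c", 3)]
    print(is_distribution_tree(edges, "w", {"a": 2, "b": 2, "c": 3}))
```

Checking one tree is easy; the difficulty of the problem lies in choosing among the spanning trees of G.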
In a convex drawing of a plane graph, all edges are drawn as straight-line segments without any edge-intersection and all facial cycles are drawn as convex polygons. In a convex grid drawing, all vertices are put on grid points. A plane graph G has a convex drawing if and only if G is internally triconnected, and an internally triconnected plane graph G has a convex grid drawing on an (n-1)×(n-1) grid if either G is triconnected or the triconnected component decomposition tree T(G) of G has two or three leaves, where n is the number of vertices in G. An internally triconnected plane graph G has a convex grid drawing on a 2n×2n grid if T(G) has exactly four leaves. In this paper, we show that an internally triconnected plane graph G has a convex grid drawing on a 6n×n² grid if T(G) has exactly five leaves. We also present an algorithm to find such a drawing in linear time. This is the first algorithm that finds a convex grid drawing of such a plane graph G in a grid of polynomial size.
The Discrete Memory Machine (DMM) and the Unified Memory Machine (UMM) are theoretical parallel computing models that capture the essence of the shared memory and the global memory of GPUs. It is assumed that warps (or groups of threads) on the DMM and the UMM work synchronously in a round-robin manner. However, warps work asynchronously in real GPUs, in the sense that they are randomly (or arbitrarily) dispatched for execution. The first contribution of this paper is to introduce asynchronous versions of these models in which warps are arbitrarily dispatched. In addition, we assume that threads can execute the “syncthreads” instruction for barrier synchronization. Since the barrier synchronization operation may be costly, we should evaluate and minimize the number of barrier synchronization operations executed by parallel algorithms. The second contribution of this paper is to show a parallel algorithm that computes the sum of n numbers in optimal computing time and with few barrier synchronization steps. Our parallel algorithm computes the sum of n numbers in O(n/w + l log n) time units and O(log l/log w + log log w) barrier synchronization steps using wl threads on the asynchronous UMM with width w and latency l. Since the computation of the sum takes at least Ω(n/w + l log n) time units, this algorithm is time optimal. Finally, we show that the prefix-sums of n numbers can also be computed in O(n/w + l log n) time units and O(log l/log w + log log w) barrier synchronization steps using wl threads.
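Why the number of barrier steps matters can be seen from a textbook pairwise reduction, simulated sequentially below. This is a naive baseline and not the algorithm of the paper: each thread first sums a strided slice of the input, and the partial sums are then combined in rounds, with one barrier assumed before every round so that all threads see the results of the previous phase. The function name and the counting convention are assumptions made for illustration.

```python
def simulated_sum(values, num_threads):
    """Return (sum, number of barrier steps) for a naive pairwise tree reduction."""
    # Phase 1: thread t accumulates elements t, t + p, t + 2p, ... (strided access).
    partial = [sum(values[t::num_threads]) for t in range(num_threads)]

    barriers = 0
    stride = 1
    while stride < num_threads:
        barriers += 1  # all threads must finish the previous phase first
        for t in range(0, num_threads, 2 * stride):
            if t + stride < num_threads:
                partial[t] += partial[t + stride]
        stride *= 2
    return partial[0], barriers


if __name__ == "__main__":
    print(simulated_sum(list(range(100)), 8))
```

With p threads this baseline needs about log2 p barrier steps, whereas the bound quoted above, O(log l/log w + log log w), grows far more slowly in the number of threads wl.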
Shuichi INOKUCHI Takahiro ITO Mitsuhiko FUJIO Yoshihiro MIZOGUCHI
We introduce the notions of 'Composition', 'Union' and 'Division' of cellular automata on groups. A kind of composition was investigated by Sato [10] and Manzini [6] for linear cellular automata; we extend the notion to general cellular automata on groups and investigate its properties. We examine all unions and compositions generated by one-dimensional 2-neighborhood cellular automata over Z_2, including non-linear cellular automata. Next, we prove that composition is right-distributive over union but not left-distributive. Finally, we conclude by showing a reformulation of our definition for cellular automata on groups that admit more than three states. We also show that our formulation contains the representation using formal power series for linear cellular automata in Manzini [6].
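As a small illustration of composition for one-dimensional 2-neighborhood cellular automata over Z_2, the sketch below treats composition as ordinary composition of global maps on a cyclic configuration. That reading, the neighborhood convention (cell i together with cell i+1), and the function names are assumptions made here; the paper's definitions on general groups, and its notions of union and division, are not reproduced.

```python
from itertools import product


def step(rule, config, width):
    """Apply a local rule of the given neighborhood width to a cyclic configuration."""
    n = len(config)
    return [rule(*(config[(i + k) % n] for k in range(width))) for i in range(n)]


def compose(f, g):
    """Local rule of 'apply g, then f' for 2-neighborhood rules (3-neighborhood result)."""
    return lambda a, b, c: f(g(a, b), g(b, c))


def xor_rule(a, b):
    """Linear rule: addition in Z_2."""
    return (a + b) % 2


def and_rule(a, b):
    """Non-linear rule: multiplication in Z_2."""
    return a & b


if __name__ == "__main__":
    fg = compose(and_rule, xor_rule)
    for config in product((0, 1), repeat=5):
        config = list(config)
        # The composed local rule agrees with applying the two global maps in turn.
        assert step(fg, config, 3) == step(and_rule, step(xor_rule, config, 2), 2)
    print("composition check passed")
```

For the linear rule, composing it with itself gives the local rule a + c (mod 2), which matches squaring the polynomial 1 + x over Z_2 in the formal power series view of linear cellular automata.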
Hiroshi YAMADA Shuntaro TONOSAKI Kenji KONO
Infrastructure as a Service (IaaS), a form of cloud computing, is gaining attention for its ability to enable efficient server administration in dynamic workload environments. In such environments, however, updating the software stack or content files of virtual machines (VMs) is a time-consuming task, discouraging administrators from frequently enhancing their services and fixing security holes. This is because the administrator has to upload the whole new disk image to the cloud platform via the Internet, which is not yet fast enough that large amounts of data can be transferred smoothly. Although the administrator can apply incremental updates directly to the running VMs, he or she has to carefully consider the type of update and perform operations on all running VMs, such as application restarts. This is a tedious and error-prone task. This paper presents a technique for synchronizing VMs with less time and lower administrative burden. We introduce the Virtual Disk Image Repository, which runs on the cloud platform and automatically updates the virtual disk image and the running VMs with only the incremental update information. We also show a mechanism that performs necessary operations on the running VM such as restarting server processes, based on the types of files that are updated. We implement a prototype on Linux 2.6.31.14 and Amazon Elastic Compute Cloud. An experiment shows that our technique can synchronize VMs in an order-of-magnitude shorter time than the conventional disk-image-based VM method. Also, we discuss limitations of our technique and some directions for more efficient VM updates.
Traditionally, in computer systems, file I/O has been a major performance bottleneck for I/O-intensive applications. The recent advent of non-volatile byte-addressable memory (NVM) technologies such as STT-MRAM and PCM provides a chance to store persistent data with performance close to that of DRAM. However, as the persistent storage device moves closer to the CPU, the overheads of the system software layers involved in accessing the data, such as the file system layer (including the virtual file system layer) and the device driver, are no longer negligible. In this paper, we propose a light-weight user-level persistent storage, called UStore, which is physically allocated on the NVM and is mapped directly into the virtual address space of an application. UStore makes it possible for the application to access the persistent data quickly, without the system software overheads and without extra data copies between user space and kernel space. We show how UStore can be applied to existing applications with little effort and evaluate its performance improvement through several benchmark tests.
Trung Thanh NGO Yasushi MAKIHARA Hajime NAGAHARA Yasuhiro MUKAIGAWA Yasushi YAGI
Gait-based owner authentication using accelerometers has recently been extensively studied owing to the development of wearable electronic devices. An actual gait signal is always subject to change due to many factors, including variation in sensor attachment. In this research, we tackle the practical problem of sensor-orientation inconsistency, in which signal sequences are captured at different sensor orientations. We present an iterative signal matching algorithm based on a phase-registration technique to simultaneously estimate the relative sensor orientation and register the 3D acceleration signals. The iterative framework is initialized by using 1D orientation-invariant resultant signals, which are computed from the 3D signals. As a result, the matching algorithm is robust to any initial sensor orientation. This matching algorithm is used to match a probe signal and a gallery signal in the proposed owner authentication method. Experiments using actual gait signals under various conditions such as different days, sensors, weights being carried, and sensor orientations show that our authentication method achieves positive results.
Yoichi TOMIOKA Ryota TAKASU Takashi AOKI Eiichi HOSOYA Hitoshi KITAZAWA
Hardware acceleration is an essential technique for extracting and tracking moving objects in real time. It is desirable to design tracking algorithms such that they are applicable for parallel computations on hardware. Exclusive block matching methods are designed for hardware implementation, and they can realize detailed motion extraction as well as robust moving object tracking. In this study, we develop tracking hardware based on an exclusive block matching method on FPGA. This tracking hardware is based on a two-dimensional systolic array architecture, and can realize robust moving object extraction and tracking at more than 100 fps for QVGA images using the high parallelism of an exclusive block matching method, synchronous shift data transfer, and special circuits to accelerate searching the exclusive correspondence of blocks.
Mingfu XUE Wei LIU Aiqun HU Youdong WANG
The hardware Trojan (HT) has emerged as an impending security threat to hardware systems. However, conventional functional tests fail to detect HTs since Trojans are triggered by rare events. Most of the existing side-channel based HT detection techniques simply compare and analyze a circuit's parameters and offer no signal calibration or error correction, so they suffer from the interference of large process variations (PV) and noise in modern nanotechnology, which can completely mask the Trojan's contribution to the circuit. This paper presents a novel HT detection method based on a subspace technique that can detect tiny HT characteristics under large PV and noise. First, we formulate the HT detection problem as a weak signal detection problem, and then we model it as a feature extraction model. After that, we propose a novel subspace HT detection technique based on a time-domain constrained estimator. It is proved that we can distinguish the weak HT from variations and noise through particular subspace projections and analysis of the reconstructed clean signal. The reconstructed clean signal of the proposed algorithm can also be used for accurate parameter estimation of circuits, e.g. power estimation. The proposed technique is a general method that related HT detection schemes can use to eliminate noise and PV. Both simulations on benchmarks and hardware implementation validations on FPGA boards show the effectiveness and high sensitivity of the new HT detection technique.
Solutions to social problems such as a rapidly aging society are expected from sports medicine. Since their origin, humans have become widely distributed across the earth by acquiring the abilities to walk in an upright position and to adapt themselves to various natural environments. However, seeking a ‘comfortable environment’ in modern civilization has deteriorated these genetic characteristics of humans, and the consumption of resources and energy to acquire such a ‘comfortable environment’ has induced global warming-associated natural disasters and the destruction of social order. To halt this vicious cycle, we may reactivate these genetic characteristics in humans through exercise. To do this, we have developed a health promotion program for middle-aged and older people, the Jukunen Taiikudaigaku Program, in cooperation with the Japanese government, developed high-intensity interval walking training (IWT), and examined its physical and mental effects on 5,400 people over the past 10 years. We found that IWT for 4 months increased physical fitness by 10-20% and decreased the indices of life-style related diseases by 10-20%. Since a prescription of IWT can be delivered by using an IT network system called the e-Health Promotion System, the participants in the program were able to receive the prescription even if they lived far from trainers, enabling them to perform IWT at their preferred places and times, and also at low cost. Moreover, we found some single nucleotide polymorphisms closely related to inter-individual differences in the responses to IWT. Further, the system enables us to assess the inactivation/activation of genes for inflammatory responses, which have been suggested to be involved in life-style related diseases. The system also enables us to search for foods that promote health when they are consumed during exercise training. Thus, the system has strong potential to promote the health of middle-aged and older people in an advanced aging society.
Yasushi IGARASHI Tadashi CHIBA Shin-ichi O'UCHI Meishoku MASAHARA Kunihiro SAKAMOTO
Voltage multiplier (VM) circuits for RF (2.45GHz)-to-DC conversion are developed for battery-less sensor nodes. The converted DC power charges a storage capacitor before driving a wireless sensor module. The charging time of the storage capacitor with the proposed VM circuits is reduced to 1/10 of that with conventional VM circuits, because the proposed circuits have constant-current characteristics owing to self-control of the body bias in diode-connected SOI MOSFETs. A wireless sensor system composed of the fabricated VM chip and a commercially available sensor module is operated using an RF signal from a wireless LAN modem (2.45GHz) as a power source.
Fanxin ZENG Xiaoping ZENG Zhenyu ZHANG Guixin XUAN
This letter presents three methods for producing 8-QAM+ sequences. The first method transforms a ternary complementary sequence set (CSS) with an even number of sub-sequences into an 8-QAM+ periodic CSS, leaving both the period and the number of sub-sequences unaltered. The second method yields an 8-QAM+ aperiodic CSS without constraining either the period or the number of sub-sequences. The third method produces 8-QAM+ periodic sequences whose autocorrelation function has an ideal real part. The proposed sequences can potentially be applied to the suppression of multiple access interference or to synchronization in a communication system.