Hiroki FUJISAWA Takeshi SAKATA Tomonori SEKIGUCHI Kazuyoshi TORII Katsutaka KIMURA Kazuhiko KAJIGAYA
A small data-line-swing read/write scheme is described for half-Vcc plate nonvolatile DRAMs with ferroelectric capacitors designed to achieve high reliability for read/write operations. In this scheme, the normal read/write operation holds the data as a charge with a small data-line-swing, and the store operation provides sufficient polarization with a full data-line-swing. This scheme enables high read/write endurance, because the small data-line-swing reduces the fatigue of the ferroelectric capacitor. Two circuit technologies are used in this scheme to increase the operating margin. The first is a plate voltage control technique that solves the polarization retention problem of half-Vcc plate nonvolatile DRAM technologies. The second is a doubled data-line-capacitance recall technique that connects two data lines to a cell and enlarges the readout signal compared to normal operation, when only one data line is connected to a cell. These techniques and circuits improve the write-cycle endurance by almost three orders of magnitude, while reducing the array power consumption during read/write operations to one-third that of conventional nonvolatile DRAMs.
Tree task structures occur frequently in many applications where parallelization may be desirable. We present a formal treatment of non-preemptively scheduling task trees on distributed memory multiprocessors and show that the fundamental problems of scheduling (i) a task tree in absence of any inter-task communication on a fixed number of processors and (ii) a task tree with inter-task communication on an unbounded number of processors are NP-complete. For task trees that satisfy certain constraints, we present an optimal scheduling algorithm. The algorithm is shown optimal over a wider set of task trees than previous works.
Thanyapat SAKUNKONCHAK Sawasd TANTARATANA
In this paper, we propose a high-speed multiplier-free realization that uses ROMs to store the results of coefficient scalings, in combination with a higher signal rate and pipelined operations, without the need for hardware multipliers. By varying some parameters, the proposed structure provides various combinations of hardware and clock speed (or throughput). Examples are given comparing the proposed realization with the distributed arithmetic (DA) realization and the direct-form realization with power-of-two coefficients. Results show that, with proper choices of the parameters, the proposed structure achieves a faster processing speed with less hardware than the DA realization, while it is much faster than the direct-form realization with slightly more hardware.
Kirilka NIKOLOVA Atusi MAEDA Masahiro SOWA
A parallel program with a fixed degree of parallelism cannot be executed efficiently, or at all, by a parallel computer with a different degree of parallelism. This will cause a problem in the distribution of software applications in the near future, when parallel computers with various degrees of parallelism will be widely used. In this paper we propose a way to make the machine code of programs parallelism-independent, i.e. executable in minimum time on parallel computers with any degree of parallelism. We propose and evaluate three parallelism-independent scheduling algorithms for directed acyclic graphs (DAGs) of tasks with non-uniform execution times. To assess their efficiency, we performed simulations both with random DAGs and with DAGs extracted from real applications. We evaluate the algorithms in terms of schedule length, computation time and size of the scheduled program. Their results are compared to those of the traditional CP/MISF algorithm, which is applied separately for each number of processors.
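As a rough illustration of the CP/MISF baseline mentioned above, the following sketch runs list scheduling for one fixed number of processors. It assumes the usual reading of CP/MISF (priority by longest path to an exit task, ties broken by the number of immediate successors); the function name `cp_misf_length` and the dictionary-based DAG encoding are hypothetical and not taken from the paper.

```python
import heapq

def cp_misf_length(succ, cost, num_procs):
    """Schedule length of a CP/MISF-style list schedule (illustrative sketch)."""
    level = {}

    def lvl(t):
        # Longest path from task t to an exit task, including t's own cost.
        if t not in level:
            level[t] = cost[t] + max((lvl(s) for s in succ.get(t, [])), default=0)
        return level[t]

    # Count unfinished predecessors of every task.
    indeg = {t: 0 for t in cost}
    for t in cost:
        for s in succ.get(t, []):
            indeg[s] += 1

    ready = [t for t in cost if indeg[t] == 0]
    running = []  # min-heap of (finish_time, task)
    time, free = 0, num_procs
    while ready or running:
        # Highest level first; ties go to the task with more immediate successors.
        ready.sort(key=lambda t: (-lvl(t), -len(succ.get(t, []))))
        while ready and free > 0:
            t = ready.pop(0)
            heapq.heappush(running, (time + cost[t], t))
            free -= 1
        # Advance to the next finish time and release all tasks finishing then.
        time = running[0][0]
        while running and running[0][0] == time:
            _, t = heapq.heappop(running)
            free += 1
            for s in succ.get(t, []):
                indeg[s] -= 1
                if indeg[s] == 0:
                    ready.append(s)
    return time

# Small example DAG: a -> c, a -> d, b -> c.
succ = {"a": ["c", "d"], "b": ["c"]}
cost = {"a": 2, "b": 1, "c": 3, "d": 1}
for p in (1, 2):
    print(p, cp_misf_length(succ, cost, p))
```

Because the schedule is recomputed for each value of `num_procs`, the sketch also shows why a fixed-parallelism schedule has to be regenerated per machine, which is the cost the parallelism-independent algorithms aim to avoid.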
Masayoshi NABESHIMA Takashi SHIMIZU Ikuo YAMASAKI
The differentiated services (diffserv) architecture has been proposed for implementing scalable service differentiation in the Internet. Expedited forwarding and assured forwarding have been standardized as Per-Hop Behaviors (PHB) in diffserv. Assured forwarding can be utilized to realize a service that provides each user with a minimum guaranteed rate and a fair share of the residual bandwidth. We call this guaranteed rate (GR) service. With GR service, each packet of flow i is marked in or out based on a comparison between the sending rate and the minimum guaranteed rate. When congestion occurs in the network, out packets are dropped more aggressively than in packets. Recently, several fair queuing schemes have been proposed for core stateless networks. They can achieve fairer bandwidth allocation than random early detection (RED). However, there have not been any studies that consider in/out bit usage to support GR service. This paper proposes a way to extend the schemes proposed for core stateless networks so that they support in/out bit usage. We present the performance of one of the extended schemes and compare it to RED with in/out bit (RIO) in terms of fair bandwidth allocation.
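The in/out marking step described above can be sketched with a token bucket that refills at the minimum guaranteed rate. This is only one possible way to compare the sending rate with the guaranteed rate; the class name `InOutMarker`, the bucket depth parameter, and the token-bucket mechanism itself are assumptions for illustration, not the marker used in the paper.

```python
from dataclasses import dataclass

@dataclass
class InOutMarker:
    """Token-bucket in/out marker (assumed mechanism, illustrative only)."""
    guaranteed_rate: float  # minimum guaranteed rate, bytes per second
    burst: float            # bucket depth in bytes
    tokens: float = 0.0     # tokens currently available, in bytes
    last_time: float = 0.0  # arrival time of the previous packet, seconds

    def mark(self, now: float, size: int) -> str:
        # Refill at the guaranteed rate, capped at the bucket depth.
        elapsed = now - self.last_time
        self.tokens = min(self.burst, self.tokens + elapsed * self.guaranteed_rate)
        self.last_time = now
        # Traffic within the guaranteed rate is "in"; the excess is "out".
        if self.tokens >= size:
            self.tokens -= size
            return "in"
        return "out"

# A flow sending 200 bytes/s against a 100 bytes/s guarantee.
marker = InOutMarker(guaranteed_rate=100.0, burst=200.0, tokens=200.0)
print([marker.mark(float(t), 200) for t in range(4)])
```

In this example the flow sends at twice its guaranteed rate, so every second packet is marked out and becomes the preferred candidate for dropping under congestion.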
In this paper we propose a new algorithm that approximates the solution of the Hamilton-Jacobi-Bellman equation using a three-layer neural network for affine and general nonlinear systems. The resulting state feedback controller makes the closed-loop system suboptimal within a restricted training domain. Matrix calculus is used to obtain the gradients of the training error with respect to the weight parameter matrices of the neural network. Using a pattern-mode learning algorithm, many examples show the effectiveness of the proposed method.
Yoshiaki KAMIGAKI Shin'ichi MINAMI
We have manufactured large-scale, highly reliable MNOS EEPROMs over the last twenty years. In particular, at the present time, the smart-card microcontroller incorporating an embedded 32-kB MNOS EEPROM is rapidly expanding the markets for mobile applications. It might be said that we have established the conventional MNOS nonvolatile semiconductor memory technology. This paper describes the device design concepts of the MNOS memory, which include the optimization and control of the tunnel oxide film thickness (1.8 nm), and the scaling guideline that considers the charge distribution in the trapping nitride film. We have developed a high-performance MONOS structure and have not found any failure due to the MONOS devices in high-density EEPROM products during 10-year data retention tests after 10^5 erase/write cycles. The future development of this highly reliable MNOS-type memory will be focused on the high-density cell structure and high-speed programming method. Recently, some promising ideas for utilizing an MNOS-type memory device, such as a 1-Tr/bit cell for byte-erasable full-featured EEPROMs and a 2-bit/Tr cell for flash EEPROMs, have been proposed. We are convinced that MNOS technology will advance into the area of nonvolatile semiconductor memories because of its high reliability and high yield of products.
Eun Hye CHOI Tatsuhiro TSUCHIYA Tohru KIKUNO
We propose a two-level hierarchical method for dependability evaluation of distributed systems with replicated programs and data files. Since Markov modeling is limited only to each component in this method, state explosion can be circumvented successfully. Simulation results show that the method can accomplish evaluation even for large systems for which Markov modeling is not feasible.
Kyung-Seok SEO Chang-Joon PARK Sang-Hyun CHO Heung-Moon CHOI
A high-speed context-free marker controlled and minima imposition-free watershed transform is proposed for efficient multi-object detection and segmentation from a complex background. The context-free markers are extracted from a complex backgrounded multi-object image using a noise tolerant attention operator. These make high speed marker-controlled watershed possible without over-segmentation and region merging. The proposed method presents a marker-constrained labeling that can speed up the segmentation of the marker-controlled watershed transform by eliminating the necessity of the minima imposition. Simulation results show that the proposed method can efficiently detect and segment multiple objects from a complex background while reducing the over-segmentation and computation time.
In this paper, we first propose a new speech enhancement preprocessing algorithm that combines the power subtraction method with the maximal ratio combining technique. We then apply it to both energy-based and statistical model-based voice activity detection (VAD) algorithms to improve their performance even in low-SNR conditions. We also perform extensive computer simulations to demonstrate the performance improvement of the VAD algorithms employing the proposed speech enhancement preprocessing algorithm under various background noise environments.
Takeshi IKENAGA Kenji KAWAHARA Yuji OIE
In QoS networks, routing algorithms for QoS traffic have to provide a transmission path that satisfies the QoS requirement of each flow while achieving high utilization of network resources. Server-based QoS routing algorithms would therefore be more effective than the distributed routing algorithms that are common on the Internet. Furthermore, we believe that a rerouting function enhances the advantage of server-based algorithms: an already accepted flow with an established path is moved to another path in order to accept a newly arriving transmission request that could not be accepted otherwise. In this paper, we propose a rerouting algorithm for server-based QoS routing and evaluate its performance in terms of the blocking probability by computer simulation. In addition, we investigate the impact of the amount of high-priority traffic on the performance in several network topologies. Based on the simulation results, we also discuss some issues that arise in improving the effectiveness of rerouting.
Given a graph G, a designated vertex r and a natural number k, we wish to find k "independent" spanning trees of G rooted at r, that is, k spanning trees such that, for any vertex v, the k paths connecting r and v in the k trees are internally disjoint in G. In this paper we give a linear-time algorithm to find k independent spanning trees in a k-connected maximal planar graph rooted at any designated vertex.
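The independence condition defined above can be checked directly. The sketch below does not construct the trees (it is not the linear-time algorithm of the paper); it only verifies that a given set of spanning trees, each encoded as a hypothetical child-to-parent map, yields internally disjoint paths to the root. It assumes a simple graph and that every map really is a spanning tree rooted at `root`.

```python
def path_to_root(parent, v, root):
    """Vertices on the tree path from v up to root, both endpoints included."""
    path = [v]
    while path[-1] != root:
        path.append(parent[path[-1]])
    return path

def are_independent(trees, root):
    """True if, for every vertex v, the k tree paths from v to root
    share no vertex other than v and root (and are pairwise distinct)."""
    vertices = set(trees[0]) | {root}
    for v in vertices - {root}:
        seen_internal = set()
        direct_paths = 0
        for parent in trees:
            path = path_to_root(parent, v, root)
            internal = set(path[1:-1])
            if internal & seen_internal:
                return False
            seen_internal |= internal
            # In a simple graph, two paths without internal vertices are
            # the same single edge, so at most one tree may use it.
            if len(path) == 2:
                direct_paths += 1
                if direct_paths > 1:
                    return False
    return True

# Triangle on vertices 0, 1, 2 with root 0 and k = 2.
tree_a = {1: 0, 2: 1}  # path 2 -> 1 -> 0
tree_b = {2: 0, 1: 2}  # path 1 -> 2 -> 0
print(are_independent([tree_a, tree_b], root=0))
```

A checker of this kind is mainly useful for testing an implementation of a construction algorithm on small graphs.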
This is a study of a certain group-theoretic property of the set of encryption functions of a block cipher. We show how to construct a subset with this property in a given symmetric group using the computer algebra software GAP4.2 (Groups, Algorithms, and Programming, Version 4.2). These observations on the group structures of block ciphers suggest that it may be possible to set a trapdoor based on the meet-in-the-middle attack on block ciphers.
Byung In MOON Dong Ryul RYU Jong Wook HONG Tae Young LEE Sangook MOON Yong Surk LEE
We have designed a 32-bit RISC microprocessor with 16-/32-bit fixed-point DSP functionality. This processor, called YD-RISC, combines both general-purpose microprocessor and digital signal processor (DSP) functionality using the reduced instruction set computer (RISC) design principles. It has functional units for arithmetic operation, digital signal processing (DSP) and memory access. They operate in parallel in order to remove stall cycles after DSP or load/store instructions, which usually need one or more issue latency cycles in addition to the first issue cycle. High performance was achieved with these parallel functional units while adopting a sophisticated five-stage pipeline structure. The pipelined DSP unit can execute one 32-bit multiply-accumulate (MAC) or 16-bit complex multiply instruction every one or two cycles through two 17-b × 17-b multipliers and an operand examination logic circuit. Power-saving techniques such as power-down mode and disabling execution blocks allow low power consumption. In the design of this processor, we use logic synthesis and automatic place-and-route. This top-down approach shortens design time, while a high clock frequency is achieved by refining the processor architecture.
Seong-Moo YOO Hee Yong YOUN Hyunseung CHOO
Among several multiprocessor topologies, the two-dimensional (2D) mesh topology has become popular due to its simplicity and efficiency. Even though a number of scheduling and processor allocation schemes for 2D meshes have been proposed in the literature, few studies have targeted real-time environments. In this paper, we propose an on-line scheduling and allocation scheme for real-time tasks that require the exclusive use of submeshes in a 2D mesh system. By effectively manipulating the information on allocated or reserved submeshes, the proposed scheme can quickly identify the earliest available time of a free submesh for a newly arrived task. We employ a limited preemption approach to reduce the complexity of the search for a feasible schedule. Computer simulation reveals that the proposed scheme allows high throughput by decreasing the number of tasks rejected.
Chang-Zheng SUN Bing XIONG Guo-Peng WEN Yi LUO Tong-Ning LI Yoshiaki NAKANO
The effect of wavelength detuning on the device performance of identical-epitaxial-layer (IEL) electroabsorption (EA) modulator integrated distributed feedback (DFB) lasers is studied in detail. Based on the lasing behavior of integrated devices with different amount of wavelength detuning and the photocurrent spectra under different reverse biases, the optimal wavelength detuning is experimentally determined to be around 30-40 nm for our IEL integrated devices. By adopting gain-coupled DFB laser section, integrated devices with optimal wavelength detuning have demonstrated excellent single mode performances. The extinction ratio is measured to be greater than 15 dB at -3 V, and the modulation bandwidth is around 8 GHz.
Hiroshi NAGAMOCHI Koji MOCHIZUKI Toshihide IBARAKI
We consider a single-vehicle scheduling problem on a tree T, where each vertex has a job with a release time and a processing time and each edge has a travel time. There is a single vehicle which starts from a start vertex s and reaches a goal vertex g after finishing all jobs. In particular, s is called a home location if s = g. The objective of the problem is to find a depth-first routing on T so as to minimize the completion time. In this paper, we first show that the minimum completion times of the problem for all home locations s ∈ V can be simultaneously computed in O(n) time, once the problem with a specified home location s ∈ V has been solved, where n is the number of vertices. We also show that given a specified start vertex s, the minimum completion times for all goal vertices g can be computed in O(n) time.
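To make the problem setting concrete, the following sketch evaluates the completion time of one given depth-first route with home location s = g. It is not the O(n) algorithm of the paper. It assumes that the vehicle processes the job at a vertex on first arrival (waiting for the release time if necessary), then serves each child subtree completely in the listed order before returning; the function name and the dictionary encoding are hypothetical.

```python
def completion_time(children, travel, release, proc, v, t=0):
    """Time at which the vehicle is back at v with all jobs in v's subtree done.

    children[v] lists v's children in visiting order, travel[(v, c)] is the
    travel time of edge (v, c), release[v] and proc[v] describe the job at v.
    """
    # Wait for the job at v to be released, then process it.
    t = max(t, release[v]) + proc[v]
    for c in children.get(v, []):
        # Go down, finish the whole subtree of c, and come back up.
        t = completion_time(children, travel, release, proc, c, t + travel[(v, c)])
        t += travel[(v, c)]
    return t

# Star with home location 0; the job at vertex 1 is released late.
travel = {(0, 1): 1, (0, 2): 2}
release = {0: 0, 1: 5, 2: 0}
proc = {0: 1, 1: 1, 2: 1}
print(completion_time({0: [1, 2]}, travel, release, proc, 0))  # 12
print(completion_time({0: [2, 1]}, travel, release, proc, 0))  # 9
```

Visiting vertex 2 first avoids idle waiting at vertex 1, which shows how the visiting order of subtrees alone changes the completion time.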
Yoshiyuki SHINKAWA Masao J. MATSUMOTO
Adaptation of software components to the requirements is one of the key concerns in Component Based Software Development (CBSD). In this paper, we propose a formal approach to composing component-based systems that are adaptable to the requirements. We focus on the functional aspects of software components and requirements, which are expressed as S-sorted functions. These S-sorted functions are transformed into Colored Petri Net (CPN) models in order to evaluate the connectivity between the components and the adaptability of the composed systems to the requirements. The connectivity is measured based on colors, or data types, in CPN, while the adaptability is measured based on functional equivalency. We introduce simple glue code to connect the components to each other. The paper focuses on business applications; however, the proposed approach can be applied to any other domain as far as functional adaptability is concerned.
David Chee Kheong SIEW Gang FENG
The problem of finding a minimum-cost multicast tree (Steiner tree) is known to be NP-complete. Heuristic algorithms that achieve good performance on this problem are usually time-consuming. In this paper, we propose a new strategy called tree-caching for efficient multicast connection setup in connection-oriented networks. In this scheme, the tree topologies that have been computed are cached in a database at the source nodes. This reduces the connection establishment time for subsequent connection requests that have some multicast members in common, because cached trees are reused without re-running a multicast routing algorithm for the whole group. The method thus avoids, whenever possible, the expensive tree computation that otherwise has to be performed when setting up a multicast connection. We first formulate the problem of tree-caching and then propose a tree-caching algorithm that reduces the complexity of the tree computations when a new connection is to be established. Through simulations, we find that the proposed tree-caching strategy performs very well and can significantly reduce the computational complexity of setting up multicast connections.
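The reuse idea behind tree-caching can be sketched as a per-source cache that returns the part of a previously computed tree still needed by a new request, together with the members that remain to be attached. This is a minimal sketch under assumptions: the class `TreeCache`, the child-to-parent tree encoding, and the "largest overlap" selection rule are hypothetical and are not the algorithm proposed in the paper.

```python
class TreeCache:
    """Per-source cache of multicast trees (illustrative sketch)."""

    def __init__(self):
        # frozenset of destination members -> tree as a child-to-parent map
        self._trees = {}

    def store(self, members, parent):
        self._trees[frozenset(members)] = dict(parent)

    def lookup(self, members):
        """Return (reused_subtree, missing_members) for a new request."""
        wanted = set(members)
        # Assumed rule: reuse the cached tree sharing the most members.
        best = max(self._trees, key=lambda k: len(k & wanted), default=None)
        if best is None or not (best & wanted):
            return {}, wanted
        parent = self._trees[best]
        reused = {}
        for m in best & wanted:
            # Keep only the branch from each shared member up to the source.
            v = m
            while v in parent and v not in reused:
                reused[v] = parent[v]
                v = parent[v]
        return reused, wanted - best

cache = TreeCache()
# Tree rooted at source "s" serving members b, c and d.
cache.store({"b", "c", "d"}, {"a": "s", "b": "a", "c": "a", "d": "s"})
subtree, missing = cache.lookup({"b", "d", "e"})
print(subtree, missing)
```

Only the members reported as missing would then need to be attached by a routing computation, rather than recomputing the tree for the whole group.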