Akimasa YOSHIDA Ken'ichi KOSHIZUKA Wataru OGATA Hironori KASAHARA
This paper proposes a data-localization scheduling scheme inside a processor-cluster for multigrain parallel processing, which hierarchically exploits parallelism among coarse-grain tasks like loops, medium-grain tasks like loop iterations and near-fine-grain tasks like statements. The proposed scheme assigns near-fine-grain or medium-grain tasks inside coarse-grain tasks onto processors inside a processor-cluster so that maximum parallelism is exploited and inter-processor data transfer is minimized after data-localization for coarse-grain tasks across processor-clusters. Performance evaluation on the OSCAR multiprocessor system shows that multigrain parallel processing with the proposed data-localization scheduling can reduce execution time for application programs by 10% compared with multigrain parallel processing without data-localization.
SeongSik LEE Jeong Woo JWA HwangSoo LEE
We propose an improved orthogonal frequency division multiplexing (OFDM) signal detector which uses minimum mean-square error (MMSE) noise feedback equalization (NFE). The input bit stream is trellis-coded to form OFDM signal blocks, and maximal ratio combining (MRC) is adopted at the receiver in order to improve the performance of the detector. As a result, we obtain significantly improved detection performance compared with conventional OFDM receivers, as follows. Using the proposed MMSE-NFE in the receiver, we obtain a performance gain of about 1.5 dB and 2 dB in symbol energy to noise power spectral density (Es/No) for Doppler frequencies of fd=20 and 100 Hz, respectively, over the receiver with MMSE linear equalization (LE) alone at a symbol error rate (SER) of about 10^-3. With MRC and trellis coding, a performance gain of about 11 dB in Es/No is obtained for fd=20 and 100 Hz at an SER of about 10^-3.
Mitsuru KAWAMOTO Kiyotoshi MATSUOKA Masahiro OYA
This paper proposes a new method for recovering the original signals from their linear mixtures observed by the same number of sensors. It is performed by identifying the linear transform from the sources to the sensors, using only the sensor signals. The only assumption about the source signals is that they are statistically mutually independent. In order to perform the 'blind' identification, time-correlational information in the observed signals is utilized. The most important feature of the method is that the full information of the available time-correlation data (second-order statistics) is evaluated, as opposed to the conventional methods. To this end, an information-theoretic cost function is introduced, and the unknown linear transform is found by minimizing it. The proposed method gives a more stable solution than the conventional methods.
Byungho KIM Boseob KWON Hyunsoo YOON Jung Wan CHO
Multipath interconnection networks can support higher bandwidth than nonblocking networks by passing multiple packets to the same output simultaneously; these packets are then buffered in the output buffer. The delay-throughput performance of the output buffer in multipath networks is closely related to the output traffic distribution, that is, the packet arrival process at each output link connected to a given output buffer. The output traffic distributions differ according to the input traffic patterns. Focusing on nonuniform output traffic distributions, this paper develops a new, general analytic model of the output buffer in multipath networks, which enables us to investigate the delay-throughput performance of the output buffer under various input traffic patterns. This paper also introduces the Multipath Crossbar network as a representative multipath network, which is the base architecture of our analysis. It is shown that output buffer performance measures such as packet loss probability and delay improve as the nonuniformity of the output traffic distribution becomes larger.
Shigeki SAKAGUCHI Shin-ichi TODOROKI
We propose low Rayleigh scattering Na2O-MgO-SiO2 (NMS) glass as a candidate material for low-loss optical fibers. This glass exhibits Rayleigh scattering which is only 0.4 times that of silica glass, and a theoretical evaluation suggests that it is dominated by density fluctuation. An investigation of the optical properties of NMS glass reveals that a minimum loss of 0.06 dB/km is expected at a wavelength of 1.6 µm and that the zero-material dispersion wavelength is found in the 1.5 µm band. To establish the waveguide structure, we evaluated the feasibility of using F-doped NMS (NMS-F) glass as a cladding layer for an NMS core and found that it is suitable because it exhibits low relative scattering (e.g. 0.7) and is versatile in terms of viscosity matching. We also describe an attempt to draw optical fibers using the double crucible technique.
In this paper, a new method capable of effectively coding arbitrarily-shaped image regions is presented. The image region is extended to fill the 8×8 rectangular block and its intermediate luminances are interpolated. After all luminances in the 8×8 block are obtained from pixels in the region, they are transformed by the 8×8 DCT. The proposed extension/interpolation (EL) method is compared with conventional ones, such as SA-DCT and mean stuffing, under three aspects: peak signal-to-noise ratio (PSNR), hardware complexity, and flexibility for performance improvement. Simulation results show that the performance of the proposed method is superior to that of the conventional ones. In addition, we introduce an improved version that applies the EL method repetitively.
Hirohisa YOKOTA Emiko OKITSU Yutaka SASAKI
Thermally-diffused expanded core (TEC) techniques produce fibers whose mode fields are expanded by thermal diffusion of core dopants. The techniques are effective in reducing splice or connection losses between different kinds of fibers, and are applied to the integration of thin film optical devices in fiber networks, the fabrication of chirped fiber gratings, and so on. In the practical use of TEC techniques, the fibers are heated to a high temperature of about 1650°C because processing by microburners lasts only a short period of time. The mode field diameter expansion (MFDE) ratio, which is defined as the ratio of the mode field diameter in the fiber section with the expanded core to that in the unexpanded section, is desired to be more than 2.0 from the viewpoint of loss reduction in industrial uses of the TEC techniques. When the TEC techniques are applied to polarization-maintaining optical fibers (PM fibers), such as PANDA fibers, both core dopants and stress applying part (SAP) dopants diffuse simultaneously. As a result, the MFDE ratio attainable without mode field deformation is less than 2.0 in the conventional PANDA fibers that are practically used as PM fibers. In this paper, a PANDA fiber design suitable for the TEC techniques is newly proposed. The fiber has a cutoff wavelength of 1.28 µm, and its mode field diameter is about 11 µm before core expansion at a wavelength of 1.3 µm.
In this paper, we propose general fast one-dimensional (1-D) and two-dimensional (2-D) slant transform algorithms. By introducing simple and structural permutations, the computationally heavy operations are centralized into standardized and localized processing units. The total numbers of multiplications for the proposed fast 1-D and 2-D slant transforms are less than those of the existing methods. With the advantages of convenient formulation and efficient computation, the proposed fast slant transforms are suitable for applications in signal compression and pattern recognition.
This paper deals with an orthogonal functional expansion of a non-linear stochastic functional of a stationary binary sequence taking ±1 with equal probability. Several mathematical formulas, such as multi-variate orthogonal polynomials, a recurrence formula and a generating function, are given in explicit form. A simple example of the orthogonal functional expansion and a stationary random sequence generated by the stationary binary sequence are discussed.
Shinichiro SHIRATAKE Daisaburo TAKASHIMA Takehiro HASEGAWA Hiroaki NAKANO Yukihito OOWAKI Shigeyoshi WATANABE Takashi OHSAWA Kazunori OHUCHI
A new memory cell arrangement for a gigabit-scale NAND DRAM is proposed. Although the conventional NAND DRAM in which memory cells are connected in series realizes the small die size, it faces a crucial array noise problem in the 1 gigabit generation and beyond because of its inherent noise of the open bitline arrangement. By introducing the new cell arrangement to a NAND DRAM, the folded bitline scheme is realized, resulting in good noise immunity. The basic operation of the proposed folded bitline scheme was successfully verified using the 64 kbit test chip. The die size of the proposed NAND DRAM with the folded bitline scheme (F-NAND DRAM) at the 1 Gbit generation is reduced to 63% of that of the conventional 1 Gbit DRAM with the folded bitline scheme, assuming the bitlines and the wordlines are fabricated with the same pitch. The new 4/4 bitline grouping scheme in which cell data are read out to four neighboring bitlines is also introduced to reduce the bitline-to-bitline coupling noise to half of that of the conventional folded bitline scheme. The array noise of the proposed F-NAND DRAM with the 4/4 bitline grouping scheme at 1 Gbit generation is reduced to 10% of the read-out signal, while that of the conventional NAND DRAM with open bitline scheme is 29%, and that of the conventional DRAM with the folded bitline scheme is 22%.
A method of planar curve classification that is invariant to rotation, scaling and translation, using the zero-crossings representation of the wavelet transform, is introduced. The object is described by taking the ratio between two adjacent boundary points, so the description is invariant to object rotation, translation and size. After this signal is transformed to a zero-crossings representation using the wavelet transform, the minimum distance between the object and the model, taken over shifts of the signals relative to each other, can be used as the classification parameter.
We frequently use a polynomial to approximate a complex function. This study presents a method that determines the optimum coefficients and the number of terms of the polynomial, and estimates the error of the polynomial.
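The abstract above does not say how the coefficients, the number of terms, or the error are obtained. As a rough illustration only, the sketch below assumes an ordinary least-squares fit on sampled points and raises the degree until the largest sampled error falls below a tolerance. The function name `fit_polynomial`, the tolerance, and the choice of `exp` on [-1, 1] as the target are all hypothetical and are not taken from the paper.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def fit_polynomial(f, a, b, tol, max_degree=15, samples=200):
    """Least-squares fit of f on [a, b], raising the degree until the
    maximum error on the sample points is at most tol.

    Assumption: least squares stands in for the (unspecified) optimality
    criterion of the paper.
    """
    x = np.linspace(a, b, samples)
    y = f(x)
    for degree in range(max_degree + 1):
        # Coefficients are returned from lowest to highest order.
        coeffs = P.polyfit(x, y, degree)
        # Error estimate: largest absolute deviation on the sample grid.
        err = float(np.max(np.abs(P.polyval(x, coeffs) - y)))
        if err <= tol:
            return coeffs, degree, err
    raise ValueError("no polynomial up to max_degree meets the tolerance")


coeffs, degree, err = fit_polynomial(np.exp, -1.0, 1.0, tol=1e-4)
print(f"degree={degree}, terms={len(coeffs)}, max sampled error={err:.2e}")
```

The error reported here is measured only on the sample grid, so it is an estimate rather than a bound on the whole interval.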
Kazuhito SHIDA Kaoru OHNO Masayuki KIMURA Yoshiyuki KAWAZOE
A large-scale simulation of polymer chains in a good solvent is performed. Implementation techniques for efficient parallel execution, optimization, and load balancing are discussed for this practical application. Finally, a simple performance model is proposed.
Qing-An ZENG Kaiji MUKUMOTO Akira FUKUDA
In this paper, we propose a handoff scheme with two-level priority for the reservation of handoff request calls in mobile cellular radio systems. We assume two types of mobile subscribers with different distributions of moving speed, that is, users with low average moving speed (e.g., pedestrians) and users with high average moving speed (e.g., people in moving cars). A fixed number of channels in each cell are reserved exclusively for handoff request calls. Of these channels, some are reserved exclusively for the high speed handoff request calls. The remaining channels are shared by both the originating and handoff request calls. In the proposed scheme, each kind of handoff request call forms its own queue. The system is modeled by a three-dimensional Markov chain. We apply the Successive Over-Relaxation (SOR) method to obtain the equilibrium state probabilities. Blocking probabilities of calls, forced termination probabilities and average queue lengths of handoff calls of each type are evaluated. We can make the forced termination probabilities of handoff request calls smaller than the blocking probability of originating calls. Moreover, we can make the forced termination probability of high speed handoff request calls smaller than that of the low speed ones. The necessary queue sizes for the two kinds of handoff request calls are also estimated.
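To illustrate how SOR can yield equilibrium state probabilities, the sketch below solves the balance equations of a small one-dimensional birth-death chain (an M/M/1/K queue). This toy chain is a stand-in chosen for brevity, not the three-dimensional chain of the paper. The function names, the rates, and the relaxation factor 1.2 are hypothetical choices.

```python
def sor_stationary(Q, omega=1.2, tol=1e-12, max_sweeps=10_000):
    """Solve pi * Q = 0 with sum(pi) = 1 by SOR sweeps.

    Q is a generator matrix (rows sum to zero) given as a list of lists.
    """
    n = len(Q)
    pi = [1.0 / n] * n
    for _ in range(max_sweeps):
        old = pi[:]
        for j in range(n):
            # Balance equation for state j: inflow = pi[j] * (-Q[j][j]).
            inflow = sum(pi[i] * Q[i][j] for i in range(n) if i != j)
            gauss_seidel = inflow / -Q[j][j]
            # Over-relaxed update mixing the old value and the new estimate.
            pi[j] = (1.0 - omega) * pi[j] + omega * gauss_seidel
        total = sum(pi)
        pi = [p / total for p in pi]  # renormalize to a probability vector
        if max(abs(p - o) for p, o in zip(pi, old)) < tol:
            return pi
    raise RuntimeError("SOR did not converge")


def mm1k_generator(lam, mu, K):
    """Generator of an M/M/1/K queue with states 0..K (toy example)."""
    Q = [[0.0] * (K + 1) for _ in range(K + 1)]
    for s in range(K + 1):
        if s < K:
            Q[s][s + 1] = lam   # arrival
        if s > 0:
            Q[s][s - 1] = mu    # departure
        Q[s][s] = -sum(Q[s])
    return Q


lam, mu, K = 1.0, 2.0, 5
pi = sor_stationary(mm1k_generator(lam, mu, K))

# Closed-form solution of the toy chain, used only as a check.
rho = lam / mu
exact = [(1 - rho) * rho**s / (1 - rho ** (K + 1)) for s in range(K + 1)]
print(pi)
```

For a multi-dimensional chain, the states would first be mapped to a single index; the sweep itself is unchanged.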
Barbara M. CHAPMAN Piyush MEHROTRA Hans P. ZIMA
Highly parallel scalable multiprocessing systems (HMPs) are powerful tools for solving large-scale scientific and engineering problems. However, these machines are difficult to program since algorithms must exploit locality in order to achieve high performance. Vienna Fortran was the first fully specified data-parallel language for HMPs that provided features for the specification of data distribution and alignment at a high level of abstraction. In this paper we outline the major elements of Vienna Fortran and compare it to High Performance Fortran (HPF), a de-facto standard in this area. A significant weakness of HPF is its lack of support for many advanced applications, which require irregular data distributions and dynamic load balancing. We introduce HPF+, an extension of HPF based on Vienna Fortran, that provides the required functionality.
MPO optical backplane connectors using multi-fiber push-on plugs (MPO plugs) have been developed. An MPO optical backplane connector connects a printed board to a backplane using an MPO plug. The MPO plug is held in the housing by a self-retentive mechanism. To obtain the same optical performance as a standard MPO connector, dimensional precision and a mechanism for an appropriate connecting-disconnecting sequence are necessary. We have developed a new backplane housing and printed board housing based on the previously reported MU connector. The optical performance is similar to that of MPO connectors.
InHwan KIM Takayuki NAKACHI Nozomu HAMADA
In the adaptive lattice estimation process, it is well known that the convergence speed of each successive stage is affected by the estimation errors of the reflection coefficients in its preceding stages. In this paper, we propose block estimation methods for the two-dimensional (2-D) adaptive lattice filter. The convergence speed of the proposed algorithm is significantly enhanced by improving the adaptive performance of the preceding stages. Furthermore, this process can be simply realized. The modeling of a 2-D AR field and a texture image is demonstrated through computer simulations.
In high-performance compilers for pointer-handling programs, precise pointer alias analysis helps the compiler generate efficient object code. It is well known that most compiler techniques, such as data flow analysis, dependence analysis, side effect analysis and optimizations, are related to the alias problem. However, without data structure information, there is a limit on the precision of the alias analysis. Even though the automatic data structure detection problem is complex, when pointer manipulation satisfies some restrictions, some data structures can be detected automatically by compilers with some knowledge of aliases. In this paper, we propose an automatic data structure detection method for Pascal and Fortran 90. Linear list, tree and dag data structures are detected. The detected data structure information can be used not only to raise the precision of alias analysis but also directly in some optimizing techniques for pointer-handling programs.
Yoshimasa OHNISHI Yoshinari SUGIMOTO Toshinori SUEYOSHI
We have conducted research and development of the Distributed Supercomputing Environment (DSE), a cluster computing environment based on the distributed shared memory model that provides parallel processing facilities. The shared memory model and the message passing model are the well-known typical models of parallel processing. A hybrid programming environment is desirable because it can make the best use of the prominent features of both models. Consequently, we add a new message passing mechanism to the present DSE and create a prototype called Hybrid DSE as a hybrid-model-based cluster computing environment. In this paper, we describe the implementation of the message passing mechanism on DSE and the performance evaluation of Hybrid DSE.
Dingchao LI Akira MIZUNO Yuji IWAHORI Naohiro ISHII
This paper describes a new approach to the scheduling problem that assigns tasks of a parallel program described as a task graph onto parallel machines. The approach handles interprocessor communication and heterogeneity, based on using both the theoretical results developed so far and a lookahead scheduling strategy. The experimental results on randomly generated task graphs demonstrate the effectiveness of this scheduling heuristic.
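The abstract above gives no algorithmic detail, so the following is only a generic sketch of task-graph scheduling with interprocessor communication and heterogeneous machines. It uses a plain greedy earliest-finish-time rule, which is an assumption and deliberately omits the lookahead strategy the paper mentions. The function name `greedy_schedule`, the task names, the costs, and the communication delays are all hypothetical.

```python
def greedy_schedule(cost, preds):
    """Greedy earliest-finish-time placement of a task graph.

    cost:  {task: [run time on machine 0, run time on machine 1, ...]},
           with tasks listed in topological order (assumed by this sketch).
    preds: {task: {predecessor: communication delay}}; the delay is paid
           only when the two tasks run on different machines.
    Returns ({task: (machine, finish time)}, makespan).
    """
    n_machines = len(next(iter(cost.values())))
    machine_free = [0] * n_machines  # time at which each machine is idle
    placed = {}
    for task, times in cost.items():
        best = None
        for m in range(n_machines):
            # Earliest time all input data is available on machine m.
            ready = 0
            for p, comm in preds.get(task, {}).items():
                p_machine, p_finish = placed[p]
                ready = max(ready, p_finish + (0 if p_machine == m else comm))
            finish = max(ready, machine_free[m]) + times[m]
            if best is None or finish < best[1]:
                best = (m, finish)
        placed[task] = best
        machine_free[best[0]] = best[1]
    makespan = max(finish for _, finish in placed.values())
    return placed, makespan


# Hypothetical diamond-shaped task graph on two heterogeneous machines.
cost = {"A": [2, 3], "B": [3, 1], "C": [3, 2], "D": [2, 2]}
preds = {"B": {"A": 1}, "C": {"A": 1}, "D": {"B": 1, "C": 1}}
placed, makespan = greedy_schedule(cost, preds)
print(placed, makespan)
```

A lookahead variant would score each candidate machine by its effect on the successors of the task as well, rather than by the finish time of the task alone.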