Kazumine MATOBA Shingo ATA Masayuki MURATA
Network dimensioning is an important issue to provide stable and QoS-rich communication services. A reliable estimation of bandwidths of links between the end-to-end path is a first step towards the network dimensioning. Pathchar is one of such tools for the bandwidth estimation for every link between two end hosts. However, pathchar still has several problems. If unexpectedly large errors are included or if route alternation is present during the measurement, the obtained estimation is much far from the correct one. We investigate the method to eliminate those errors in estimating the bandwidth. To increase the reliability on the estimation, the confidence interval for the estimated bandwidth is important. For this purpose, two approaches, parametric and nonparametric approaches, are investigated to add the confidence intervals. Another important issue is the method for controlling the measurement period to eliminate the measurement overheads. In this paper, we propose a measurement method to adaptively control the number of measurement data sets. Through experimental results, we show that our statistical approaches can provide the robust estimation regardless of the network conditions.
Various shadows are one of main factors that cause errors in vision based vehicle detection. In this paper, two simple methods, land mark based method and BS & Edge method, are proposed for vehicle detection and shadow rejection. In the experiments, the accuracy of vehicle detection is higher than 98%, during which the shadows arisen from roadside buildings grew considerably. Based on these two methods, vehicle counting, tracking, classification, and speed estimation are achieved so that real-time traffic parameters concerning traffic flow can be extracted to describe the load of each lane.
Kirilka NIKOLOVA Atusi MAEDA Masahiro SOWA
A parallel program with a fixed degree of parallelism cannot be executed efficiently, or at all, by a parallel computer with a different degree of parallelism. This will cause a problem in the distribution of software applications in the near future when parallel computers with various degrees of parallelism will be widely used. In this paper we propose a way to make the machine code of the programs parallelism-independent, i.e. executable in minimum time on parallel computers with any degree of parallelism. We propose and evaluate three parallelism-independent scheduling algorithms for direct acyclic graphs (DAGs) of tasks with non-uniform execution times. To prove their efficiency, we performed simulations both with random DAGs and DAGs extracted from real applications. We evaluate them in terms of schedule length, computation time and size of the scheduled program. Their results are compared to those of the traditional CP/MISF algorithm which is used separately for each number of processors.
Kouhei FUJIMOTO Shingo ATA Masayuki MURATA
A packet transmission delay is an important quality characteristic for various applications including real-time and data applications. In particular, it is necessary to investigate not only a whole distribution of the packet transmission delay, but also the tail part of the distribution, in order to detect the packet loss. In this paper, we analyze the characteristics of the tail part of packet delay distributions by statistical analytic approach. Our analytic results show that the Pareto distribution is most appropriate in 95-99.9% region of the cumulative distribution of packet transmission delays. Based on our statistical analysis, we next propose an adaptive playout control algorithm, which is suitable to real-time applications. Numerical examples show that our algorithm provides the stable packet loss ratio independently on traffic fluctuations.
In this paper, we propose an efficient linear ordering algorithm for netlist partitioning. The proposed algorithm incrementally merges two segments which are selected based on the proposed cost function until only one segment remains. The final resultant segment then corresponds to the linear order. Compared to the earlier work, the proposed algorithm yields an average of 11.4% improvement for the ten-way scaled cost partitioning.
Hiro ITO Hideyuki UEHARA Mitsuo YOKOYAMA
Let m 2, n 2, and q 2 be positive integers. Let Sr and Sb be two disjoint sets of points in the plane such that no three points of Sr Sb are collinear, |Sr| = nq, and |Sb| = mq. This paper shows that Kaneko and Kano's conjecture is true, i.e., there are q disjoint convex regions of the plain such that each region includes n points of Sr and m points of Sb. This is a generalization of 2-dimension Ham Sandwich Theorem.
Multi-level divide-and-conquer (MDC) is a generalized divide-and-conquer technique, which consists of more than one division step organized hierarchically. In this paper, we investigate the paradigm of the MDC and show that it is an efficient technique for designing parallel algorithms. The following parallel algorithms are used for studying the MDC: finding the convex hull of discs, finding the upper envelope of line segments, finding the farthest neighbors of a convex polygon and finding all the row maxima of a totally monotone matrix. The third and the fourth algorithms are newly presented. Our discussion is based on the EREW PRAM, but the methods discussed here can be applied to any parallel computation models.
In this paper, we present deterministic parallel algorithms for the convex hull of sorted points and their application to a related problem. The algorithms are proposed for the coarse grained multicomputer (CGM) model. We first propose a cost optimal parallel algorithm for computing the problem with a constant number of communication rounds for n/p p2, where n is the size of an input and p is the number of processors. Next we propose a cost optimal algorithm, which is more complicated, for n/p pε, where 0 < ε < 2. From the above two results, we can compute the convex hull of sorted points with O(n/p) computation time and a constant number of communication rounds for n/p pε, where ε > 0. Finally we show an application of our convex hull algorithms. We solve the convex layers for d lines in O((n log n)/p) computation time with a constant number of communication rounds. The algorithm is also cost optimal for the problem.
Carla Denise CASTANHO Wei CHEN Koichi WADA Akihiro FUJIWARA
P-complete problems seem to have no parallel algorithm which runs in polylogarithmic time using a polynomial number of processors. A P-complete problem is in the class EP (Efficient and Polynomially fast) if and only if there exists a cost optimal algorithm to solve it in T(n) = O(t(n)ε) (ε < 1) using P(n) processors such that T(n) P(n) = O(t(n)), where t(n) is the time complexity of the fastest sequential algorithm which solves the problem. The goal of our research is to find EP parallel algorithms for some P-complete problems. In this paper first we consider the convex layers problem. We give an algorithm for computing the convex layers of a set S of n points in the plane. Let k be the number of the convex layers of S. When 1 k nε/2 (0 ε < 1) our algorithm runs in O((n log n)/p) time using p processors, where 1 p n1-ε/2, and it is cost optimal. Next, we consider the envelope layers problem of a set S of n line segments in the plane. Let k be the number of the envelope layers of S. When 1 k nε/2 (0 ε < 1), we propose an algorithm for computing the envelope layers of S in O((n α(n) log3 n)/p) time using p processors, where 1 p n1-ε/2, and α(n) is the functional inverse of Ackermann's function which grows extremely slowly. The computational model we use in this paper is the CREW-PRAM. Our first algorithm, for the convex layers problem, belongs to EP, and the second one, for the envelope layers problem, belongs to the class EP if a small factor of log n is ignored.
Elsaid Mohamed ABDELRAHIM Takashi YAHAGI
Chen et al., have proposed a new estimation method for the membership values in fuzzy sets. The proposed scheme takes input from empirical/experimental data, which reflect the expert's knowledge on the relative degree of belonging of the members, and then searches for the best fit membership values of the element. Through the estimation of the practical case (Sect. 3 in [1]) the algorithm suggests to normalize the estimated membership values if there is any among them more than one and change some condition to guarantee its positiveness. In this paper, we show how to use the same imposed condition to guarantee that the estimated membership values will be within the unit interval without normalization.
Bin-Chul IHM Dong-Jo PARK Young-Hyun KWON
We propose a new intelligent blind source separation algorithm for the mixture of sub-Gaussian and super-Gaussian sources. The algorithm consists of an update equation of the separating matrix and an adjustment equation of nonlinear functions. To verify the validity of the proposed algorithm, we compare the proposed algorithm with extant methods.
We present a new family of algorithms that solve the bias problem in the equation-error based adaptive infinite impulse response (IIR) filtering. A novel constraint, called the constant-norm constraint, unifies the quadratic constraint and the monic one. By imposing the monic constraint on the mean square error (MSE) optimization, the merits of both constraints are inherited and the shortcomings are overcome. A new cost function based on the constant-norm constraint and Lagrange multiplier is defined. Minimizing the cost function gives birth to a new family of bias-free adaptive IIR filtering algorithms. For example, two efficient algorithms belonging to the family are proposed. The analysis of the stationary points is presented to show that the proposed methods can indeed produce bias-free parameter estimates in the presence of white noise. The simulation results demonstrate that the proposed methods indeed produce unbiased parameter estimation, while being simple both in computation and implementation.
Tatsuya HAGIWARA Hiroki DOI Hideki TODE Hiromasa IKEDA
Recent studies on traffic measurement analysis in the various networks (LAN, MAN, WAN) have shown that packet traffic exhibits Self-Similarity. The packet traffic represents some behavior quite different from what it has been assumed. Some papers reported that Self-Similarity degrades the network performance, such as buffer overflow and network congestion. Thus, we need new network control scheme considering Self-Similar properties. The control scheme requires high-speed calculation method of Hurst Parameter. In this paper, we propose high-speed calculation method of Hurst Parameter based on the Variance-Time Plot method, and show its performance. Furthermore, we try to apply this method to the simple network control, in order to show effectiveness of the network control with Self-Similarity.
Hyoung-Il LEE Han-You JEONG Seung-Woo SEO
In this paper, we investigate the performance characteristics of parallel switching architectures constructed by a stack of multistage switching networks. We first find that the performances of the previously proposed parallel switching architectures are much worse than the expected ones from analytic models which are based on the assumption that traffic is uniformly distributed at each stage of a switching network. We show that this phenomenon is closely related to a traffic-distribution capability of a parallel switching system and has a large influence on the performance. From these results, we then propose an architectural solution based on the Generalized Shuffle Network (GSN) and analyze its performance by proposing a new iterative analysis method. The proposed architecture uses self-routing and deflection routing, and inherently has a traffic-distribution capability to improve switch performances such as cell loss and delay in a cost-effective manner. From the comparison of simulation and analysis results, it is shown that the developed models are quite accurate in predicting the performance of a new parallel switching system.
Kengo R. AZEGAMI Atsushi TAKAHASHI Yoji KAJITANI
We improve the algorithm to obtain the min-cut graph of a hyper-graph and show an application to the sub-network extraction problem. The min-cut graph is a directed acyclic graph whose directed cuts correspond one-to-one to the min-cuts of the hyper-graph. While the known approach trades the exactness of the min-cut graph for some speed improvement, our proposed algorithm gives an exact one without substantial computation overhead. By using the exact min-cut graph, an exhaustive algorithm finds an optimal sub-circuit that is extracted by a min-cut from the circuit. By experiments with the industrial data, the proposing method showed a performance enough for practical use.
Byung In MOON Dong Ryul RYU Jong Wook HONG Tae Young LEE Sangook MOON Yong Surk LEE
We have designed a 32-bit RISC microprocessor with 16-/32-bit fixed-point DSP functionality. This processor, called YD-RISC, combines both general-purpose microprocessor and digital signal processor (DSP) functionality using the reduced instruction set computer (RISC) design principles. It has functional units for arithmetic operation, digital signal processing (DSP) and memory access. They operate in parallel in order to remove stall cycles after DSP or load/store instructions, which usually need one or more issue latency cycles in addition to the first issue cycle. High performance was achieved with these parallel functional units while adopting a sophisticated five-stage pipeline structure. The pipelined DSP unit can execute one 32-bit multiply-accumulate (MAC) or 16-bit complex multiply instruction every one or two cycles through two 17-b 17-b multipliers and an operand examination logic circuit. Power-saving techniques such as power-down mode and disabling execution blocks allow low power consumption. In the design of this processor, we use logic synthesis and automatic place-and-route. This top-down approach shortens design time, while a high clock frequency is achieved by refining the processor architecture.
Takahiro ASAI Tadashi MATSUMOTO
This paper presents the outline of the systolic array recursive least-squares (RLS) processor prototyped primarily with the aim of broadband mobile communication applications. To execute the RLS algorithm effectively, this processor uses an orthogonal triangularization technique known in matrix algebra as QR decomposition for parallel pipelined processing. The processor board comprises 19 application-specific integrated circuit chips, each with approximately one million gates. Thirty-two bit fixed-point signal processing takes place in the processor, with which one cycle of internal cell signal processing requires approximately 500 nsec, and boundary cell signal processing requires approximately 80 nsec. The processor board can estimate up to 10 parameters. It takes approximately 35 µs to estimate 10 parameters using 41 known symbols. To evaluate signal processing performance of the prototyped systolic array processor board, processing time required to estimate a certain number of parameters using the prototyped board was comapred with using a digital signal processing (DSP) board. The DSP board performed a standard form of the RLS algorithm. Additionally, we conducted minimum mean-squared error adaptive array in-lab experiments using a complex baseband fading/array response simulator. In terms of parameter estimation accuracy, the processor is found to produce virtually the same results as a conventional software engine using floating-point operations.
A simple and an efficient algorithm for polygon morphing is proposed in this paper. We adopt the parametric curve representation based on Fourier parameter estimation to transfer the traditional morphing process in spatial domain to a process in the parametric space instead. The principles are to express the polygon as the union of matching segments that are described by the estimated Fourier parameters. We have also designed a data resampling method that effectively controls the shape morphing according to the corresponding curvature values. Intermediate objects in-between the source and target polygons are then constructed based on the interpolation of Fourier parameters of the two polygons. Fourier parameters of the resampled polygons can be obtained efficiently by using the fast Fourier transform (FFT) algorithm. The experimental results show that the appearances of the morphed objects are superior to the ones obtained by the methods available.
Yasushi SHIZUKI Ken ONODERA Kazuhiro ARAI Masaaki ISHIDA Shigeru WATANABE
A K-band MMIC subharmonically pumped mixer integrating local oscillator (LO) amplifier has been developed. For up-converter application, it is necessary to reduce the leakage of second harmonic component of LO frequency to RF port, which is generated by nonlinear operation of LO amplifier. A quasi-lumped short-circuited stub using microstrip structure has been successfully applied to the MMIC mixer to enhance 2fLO-suppression. We propose a new configuration of a quasi-lumped short-circuited stub, which reduces the influence of parasitic elements of via-holes. The developed MMIC has a one-stage LO amplifier and it has shown about 10 dB-improvement of 2fLO-suppression compared to conventional configuration using a quarter-wavelength short-circuited stub.
Norio TAGAWA Atsuya INAGAKI Akihiro MINAGAWA
Since the detection of optical flow (two-dimensional motion field on an image) from image sequences is essentially an ill-posed problem, most of the conventional methods use a smoothness constraint for optical flow heuristically and detect reasonable optical flow. However, little discussion exists regarding the degree of smoothness. Furthermore, to recover the relative three-dimensional motion and depth between a camera and a rigid object, in general at first, the optical flow is detected without a rigid motion constraint, and next, the motion and depth are estimated using the detected optical flow. Rigorously speaking, the optical flow should be detected with such a constraint, and consequently three-dimensional motion and depth should be determined. To solve these problems, in this paper, we apply a parametric model to an optical flow, and construct an estimation algorithm based on this model.