In order to accommodate periodic and bursty sources into ATM networks effectively, we propose phase assignment control (PAC), which actively controls the phase of the new connection at its connection setup phase. To realize PAC, we develop an algorithm to find a good phase of the new connection in a short time. Simulation results show that the PAC can improve the system performance.
In a multisystem data sharing environment (MDSE), the computing nodes are locally coupled via a high-speed network and share a common database at the disk level. To reduce the amount of expensive and slow disk I/O, each node caches database pages in its main memory buffer. This paper focuses on the MDSE that uses record-level locking as a concurrency control. While the record-level locking can guarantee higher concurrency than page-level locking, it may result in heavy message traffic. In this paper, we first propose a cache coherency scheme that can reduce the message traffic in the standard locking. Then the scheme is extended to the context where lock caching and lock de-escalation are adopted. Using a distributed database simulation model, we evaluate the performance of the proposed schemes under a wide variety of database workloads.
Takuya ASAKA Hiroyoshi MIWA Yoshiaki TANAKA
Distributed Web caching allows multiple clients to quickly access a pool of popular Web pages. Conventional distributed Web caching schemes, e. g. , the Internet cache protocol and hash routing, require the sending of many query messages among cache servers and/or impose a large load on the cache servers when they are widely dispersed. To overcome these problems, we propose a hash-based query caching method using both a hash function and a query caching method. This method can find cached objects among several cache servers by using only one query message, enabling the construction of an efficient large-scale distributed Web cache server. Compared to conventional methods, this method reduces cache server overhead and object retrieval latency.
A scheme is proposed to exploit statistical multiplexing to support connections with diverse characteristics and requirements. Previous designs on measurement-based admission control mainly focused on strategies that consider the worst case traffic source model to guarantee QoS bounds for all connections. In this paper, we develop a simple mechanism in which statistical multiplexing gain as well as QoS is considered to achieve higher bandwidth utilization. An accurate formula for the cell loss probability which combines measurments with the Gaussian approximation is presented for a new traffic model. Furthermore we enhance the performance of this mechanism through real-time measurements of traffic and monitoring of QoS.
Doo Seop EOM Masashi SUGANO Masayuki MURATA Hideo MIYAHARA
In this paper, we investigate a call admission control (CAC) problem in a multimedia wireless ATM network that supports various multimedia applications based on micro/pico cellular architectures. Due to reduced wireless cell size (compared to conventional cellular networks), forced termination of calls in progress becomes a more serious problem in the wireless ATM network. Another problem specific to the multimedia wireless network is how to avoid an excessive delay of non-realtime applications under the presence of various realtime applications with priority over non-realtime applications. We consider two service classes; CBR for realtime applications and UBR for non-realtime applications, and then propose a new CAC scheme that addresses above two problems while minimizing a blocking probability of newly arriving calls of CBR. Through the analytical methods, we derive the blocking probabilities and forced termination probabilities of CBR calls and the average packet delay of UBR connections. We also present a method that decides the optimal CAC threshold values in our CAC scheme.
Sujata BANERJEE Panos K. CHRYSANTHIS
The advent of high-speed networks with quality of service guarantees, will enable the deployment of data-server distributed systems over wide-area networks. Most implementations of data-server systems have been over local area networks. Thus it is important, in this context, to study the performance of existing distributed data management protocols in the new networking environment, identify the performance bottlenecks and develop protocols that are capable of taking advantage of the high speed networking technology. In this paper, we examine and compare the scalability of the server-based two-phase locking protocol (s-2PL), and the group two-phase locking protocol (g-2PL). The s-2PL protocol is the most widely used concurrency control protocol, while the g-2PL protocol is an optimized version of the s-2PL protocol, tailored for high-speed wide-area network environments. The g-2PL protocol reduces the effect of the network latency by message grouping, client-end caching and data migration. Detailed simulation results indicate that g-2PL indeed scales better than s-2PL. For example, upto 28% improvement in response time is reported.
Piya TANTHAWICHIAN Akihiro FUJII Yoshiaki NEMOTO
Major problems of traffic control in ATM networks include how to decide whether a network accepts a new call or not in real time and how to select the best set of Dual Leaky Bucket (DLB) parameter values. To solve these problems, it is necessary to determine the amount of network bandwidth required by the call. In this paper, we present an analysis based on bounding technique to derive an upper bound on bandwidth requirement when the call is characterized by a set of DLB parameters. Consequently, a new definition of the upper bound on bandwidth requirement and simple formulae used for computing the upper bound have been obtained. To clarify the advantages of the derived upper bound, we demonstrate its two applications, one to select the best set of DLB parameter values from candidates for minimizing the amount of bandwidth to be allocated to the call and the other to establish a Connection Admission Control (CAC) scheme. The upper bound-based CAC scheme is fast enough to process in real time due to its simplicity and provides a significant improvement of network utilization compared to the peak rate-based CAC scheme.
Hiroyuki TOMIYAMA Tohru ISHIHARA Akihiko INOUE Hiroto YASUURA
In many embedded systems, a significant amount of power is consumed for off-chip driving because off-chip capacitances are much larger than on-chip capacitances. This paper proposes instruction scheduling techniques to reduce power consumed for off-chip driving. The techniques minimize the switching activity of a data bus between an on-chip cache and a main memory when instruction cache misses occur. The scheduling problem is formulated and two scheduling algorithms are presented. Experimental results demonstrate the effectiveness and the efficiency of the proposed algorithms.
In a shared-memory multiprocessor, shared data are usually accessed in a critical section that is protected by a lock variable. Therefore, the order of accesses by multiple processors to the shared data corresponds to the order of acquiring the ownership of the lock variable. This paper presents a selective write-update protocol, where data modified in a critical section are stored in a write cache and, at a synchronization point, they are transferred only to the processor that will execute the critical section following the current processor. By using QOLB synchronization primitives, the next processor can be determined at the execution time. We prove that the selective write-update protocol ensures data coherency of parallel programs that comply with release consistency, and evaluate the performance of the protocol by analytical modeling and program-driven simulation. The simulation results show that our protocol can reduce the number of coherence misses in a critical section while avoiding the multicast of write-update requests on an interconnection network. In addition, we observe that synchronization latency can be decreased by reducing both the execution time of a critical section and the number of write-update requests. From the simulation results, it is shown that our protocol provides better performance than a write-invalidate protocol and a write-update protocol as the number of processors increases.
Kenichi MASE Takuya ASAKA Yoshiaki TANAKA Hideyoshi TOMINAGA
An architecture is presented for efficient and reliable delivery of multimedia contents from a primary center (PC) to secondary centers (SCs). Requested contents are delivered from the PC to the SCs through a satellite broadcast channel, or from one SC to another SC through a terrestrial channel. Cycling methods are presented that enable sharing of the contents directory of each SC. Several fundamental models and algorithms are introduced for possible consideration during the planning and design of a contents-delivery system. Simulation has shown that using both satellite broadcast and terrestrial channels for contents delivery is superior in terms of cost to the conventional use of only a satellite network.
Hiroyuki YOKOYAMA Hajime NAKAMURA
Priority control and call admission control are indispensable traffic management methods to guarantee each QoS requirement of connections in ATM networks. The key technique of call admission control under priority control is to estimate required bandwidth of each connection to satisfy all QoSs of calls in progress. In this paper, we propose a novel approximation method to calculate the required bandwidth of ATM connections through priority queues and show a practical call admission control scheme using the proposed method. The essence of the approximation method is to model prioritized parallel queues as a series of queues in tandem with no priority control by focusing on the number of cells in queues. The tandem queue approximation method enables us to model each queue under priority control as a single non-priority FIFO queue in terms of its queue length. This results in that effective bandwidth techniques are applicable to priority queues. The effectiveness of the proposed scheme is evaluated by some numerical examples.
Minoru SUZUKI Shin-ichi KARIMOTO
We describe several properties of very thin stacks of 10 to 20 intrinsic Josephson junctions fabricated on the surface of Bi2Sr2CaCu2O8+δ single crystals. We show that the Joule heating is significantly reduced in these stacks and that the gap structure is clearly observable in the quasiparticle current-voltage (I-V) characteristics. The I-V curves are characterized by a large subgap conductance and a significant gap suppression due to the injection of quasiparticle current. It is found that the IcRn product of these intrinsic Josephson junction stacks is significantly small compared with that expected from the BCS theory. It is also found that there is a tendency that IcRn decreases with increasing c-axis resistivity. Both Ic and the gap voltage exhibit unsaturated temperature dependence at low temperatures. The behavior presents a sharp contrast to that of Josephson junctions made of conventional superconductors. The characteristics are discussed in relation to the d-wave symmetry of the order parameter.
Distributed web caching reduces retrieval latency of World Wide Web (WWW) objects such as text and graphics. Conventional distributed web caching methods, however, require many query messages among cache servers, which limits their scalability and reliability. To overcome these problems, we propose a query caching method in which each cache server caches not only WWW objects but also a query history. This method of finding cached objects can reduce the number of query messages among cache servers, making it possible to construct a large-scale distributed web cache server. We also propose an algorithm for constructing efficient query relationships among cache servers.
Koji INOUE Koji KAI Kazuaki MURAKAMI
Merged DRAM/logic LSIs could provide high on-chip memory bandwidth by interconnecting logic portions and DRAM with wider on-chip buses. For merged DRAM/logic LSIs with the memory hierarchy including cache memory, we can exploit such high on-chip memory bandwidth by means of replacing a whole cache line (or cache block) at a time on cache misses. This approach tends to increase the cache-line size if we attempt to improve the attainable memory bandwidth. Larger cache lines, however, might worsen the system performance if programs running on the LSIs do not have enough spatial locality of references and cache misses frequently take place. This paper describes a novel cache architecture suitable for merged DRAM/logic LSIs, called variable line-size cache or VLS cache, for resolving the above-mentioned dilemma. The VLS cache can make good use of the high on-chip memory bandwidth by means of larger cache lines and, at the same time, alleviate the negative effects of larger cache-line size by partitioning each large cache line into multiple sub-lines and allowing every sub-line to work as an independent cache line. The number of sub-lines involved when a cache replacement occurs can be determined depending on the characteristics of programs. This paper also evaluates the cost/performance improvements attainable by the VLS cache and compares it with those of conventional cache architectures. As a result, it is observed that a VLS cache reduces the average memory-access time by 16. 4% while it increases the hardware cost by only 13%, compared to a conventional direct-mapped cache with fixed 32-byte lines.
This paper describes a real-time connection admission control scheme for supporting multiple service categories. The scheme is based on a real-time cell-loss ratio evaluation algorithm for VBR based on peak/sustainable cell rates and maximum burst size. The algorithm is based on a notion of Allan variance of VP utilization. The most remarkable characteristics of the admission control scheme are that it terminates within constant time, a few milliseconds, and that its time is independent of both the number of VCs and the capacity of a cell buffer.
Masaki AIDA Noriyuki TAKAHASHI Tetsuya ABE
This paper proposes the Dual Zipfian Model addressing how to describe HTTP access trends in large-scale data communication networks, and discusses how to design the capacity of address cache tables in an edge router of the networks. We show that destination addresses of packets can be characterized by two types of Zipf's law. Fundamental concept of the Dual Zipfian Model is in complementary use of these laws, and we can derive the relationship between the number of accesses and the number of destination addresses. Experimental results show that the relation gives a good approximation. Applying this relation, we derive cache hit probabilities of the address cache table that incorporates high-speed address resolution. Using the probabilities, design issues including the capacity of the cache tables and aging algorithms of cache entries are also discussed.
Yonghwan LEE Wookyeong JEONG Yongsurk LEE
A unified tag by which both TLBs and caches can be accessed is presented. This architecture reduces the chip area of conventional cache tags and also improves the speed of cache systems. In addition, it has expanded to support snoop accesses for multiprocessor environments. To validate the proposed architecture, we measured the area and speed based on VLSI circuits.
In this paper, we apply the Semi-markov Memory and Cache coherence Interference (SMCI) model, which we had proposed for invalidating based cache coherent parallel computers, to an updating based protocol. The model proposed here, the SMCI/Dragon model, can predict performance of cache coherent parallel computers with the Dragon protocol as well as the original SMCI model for the Synapse protocol. Conventional analytic models by stochastic processes to describe parallel computers have the problem of numerical explosion in the number of states necessary as the system size increases. We have already shown that the SMCI model achieved both the small number of states to describe parallel computers with the Synapse protocol and the inexpensive computation cost to predict their performance. In this paper, we demonstrate generality of the SMCI model by applying it to the another cache coherence protocol, Dragon, which has opposite characteristics than Synapse. We show the number of states required by constructing the SMCI/Dragon model is only 21 which is as small as SMCI/Synapse, and the computation cost is also the order of microseconds. Using the SMCI/Dragon model, we investigate several comparative experiments with widely known simulation results. We found that there is only a 5. 4% differences between the simulation and the SMCI/Dragon model.
Toshiaki TSUCHIYA Hiroshi SAITO
We introduce the concepts of conservative cell loss ratio (CLR) estimation and worst case cell arrival patterns, and apply them to cell arrival patterns that conform to the generic cell rate algorithm (GCRA). We define new sets of cell arrival patterns which contain the worst case cell arrival patterns for conforming cell arrival patterns. Based on these sets, we propose an upper bound formula using the burst tolerance as well as peak cell rate and sustainable cell rate, and develope a connection admission control method that guarantees cell loss ratio performance satisfying its objective.
Pier Luigi CONTI Hiroshi SAITO Livia DE GIOVANNI
In this paper an algorithm of Connection Admission Control in ATM is considered. It is shown that it works under many different kinds of dependence among arrivals, including long range dependence. This point is relevant, since recent papers show that ATM traffic is characterised by self-similarity, and hence by long range dependence. An upper bound for CLR is given, without assuming any specific cell arrival process. Applications to simulated and real data (obtained by segmenting and shaping Ethernet packets) are considered. They show the goodness and the tightness of the considered upper bound.