The search functionality is under construction.

Author Search Result

[Author] Zhe ZHANG(9hit)

1-9hit
  • Efficient Distributed Web Crawling Utilizing Internet Resources

    Xiao XU  Weizhe ZHANG  Hongli ZHANG  Binxing FANG  

     
    PAPER-Data Engineering, Web Information Systems

      Vol:
    E93-D No:10
      Page(s):
    2747-2762

    Internet computing is proposed to exploit personal computing resources across the Internet in order to build large-scale Web applications at lower cost. In this paper, a DHT-based distributed Web crawling model based on the concept of Internet computing is proposed. Also, we propose two optimizations to reduce the download time and waiting time of the Web crawling tasks in order to increase the system's throughput and update rate. Based on our contributor-friendly download scheme, the improvement on the download time is achieved by shortening the crawler-crawlee RTTs. In order to accurately estimate the RTTs, a network coordinate system is combined with the underlying DHT. The improvement on the waiting time is achieved by redirecting the incoming crawling tasks to light-loaded crawlers in order to keep the queue on each crawler equally sized. We also propose a simple Web site partition method to split a large Web site into smaller pieces in order to reduce the task granularity. All the methods proposed are evaluated through real Internet tests and simulations showing satisfactory results.

  • Energy-Aware Task Scheduling for Real-Time Systems with Discrete Frequencies

    Dejun QIAN  Zhe ZHANG  Chen HU  Xincun JI  

     
    PAPER-Software System

      Vol:
    E94-D No:4
      Page(s):
    822-832

    Power-aware scheduling of periodic tasks in real-time systems has been extensively studied to save energy while still meeting the performance requirement. Many previous studies use the probability information of tasks' execution cycles to assist the scheduling. However, most of these approaches adopt heuristic algorithms to cope with realistic CPU models with discrete frequencies and cannot achieve the globally optimal solution. Sometimes they even show worse results than non-stochastic DVS schemes. This paper presents an optimal DVS scheme for frame-based real-time systems under realistic power models in which the processor provides only a limited number of speeds and no assumption is made on power/frequency relation. A suboptimal DVS scheme is also presented in this paper to work out a solution near enough to the optimal one with only polynomial time expense. Experiment results show that the proposed algorithm can save at most 40% more energy compared with previous ones.

  • Fast Persistent Heap Based on Non-Volatile Memory

    Wenzhe ZHANG  Kai LU  Xiaoping WANG  Jie JIAN  

     
    PAPER-Software System

      Pubricized:
    2017/02/01
      Vol:
    E100-D No:5
      Page(s):
    1035-1045

    New volatile memory (e.g. Phase Change Memroy) presents fast access, large capacity, byte-addressable, and non-volatility features. These features will bring impacts on the design of current software system. It has become a hot research topic of how to manage it and provide what kind of interface for upper application to use it. This paper proposes FP-Heap. FP-Heap supports direct access to non-volatile memory through a persistent heap interface. With FP-Heap, traditional persistent object systems can benefit directly from the byte-persistency of non-volatile memory. FP-Heap extends current virtual memory manager (VMM) to manage non-volatile memory and maintain a persistent mapping relationship. Also, FP-Heap offers a lightweight transaction mechanism to support atomic update of persistent data, a simple namespace to facilitate data indexing, and a basic access control mechanism to support data sharing. Compared with previous work Mnemosyne, FP-Heap achieves higher performance by its customized VMM and optimized transaction mechanism.

  • Dynamic Voltage Scaling for Real-Time Systems with System Workload Analysis

    Zhe ZHANG  Xin CHEN  De-jun QIAN  Chen HU  

     
    PAPER-Electronic Circuits

      Vol:
    E93-C No:3
      Page(s):
    399-406

    Dynamic Voltage Scaling (DVS) is a well-known low-power design technique, which adjusts the clock speed and supply voltage dynamically to reduce the energy consumption of real-time systems. Previous studies considered the probabilistic distribution of tasks' workloads to assist DVS in task scheduling. These studies use probability information for intra-task frequency scheduling but do not sufficiently explore the opportunities for the system workload to save more energy. This paper presents a novel DVS algorithm for periodic real-time tasks based on the analysis of the system workload to reduce its power consumption. This algorithm takes full advantage of the probabilistic distribution characteristics of the system workload under priority-driven scheduling such as Earliest-Deadline-First (EDF). Experimental results show that the proposed algorithm reduces processor idle time and spends more busy time in lower-power speeds. The measurement indicates that compared to the relative DVS algorithms, this algorithm saves energy by at least 30% while delivering statistical performance guarantees.

  • Deterministic Message Passing for Distributed Parallel Computing

    Xu ZHOU  Kai LU  Xiaoping WANG  Wenzhe ZHANG  Kai ZHANG  Xu LI  Gen LI  

     
    PAPER-Fundamentals of Information Systems

      Vol:
    E96-D No:5
      Page(s):
    1068-1077

    The nondeterminism of message-passing communication brings challenges to program debugging, testing and fault-tolerance. This paper proposes a novel deterministic message-passing implementation (DMPI) for parallel programs in the distributed environment. DMPI is compatible with the standard MPI in user interface, and it guarantees the reproducibility of message with high performance. The basic idea of DMPI is to use logical time to solve message races and control asynchronous transmissions, and thus we could eliminate the nondeterministic behaviors of the existing message-passing mechanism. We apply a buffering strategy to alleviate the performance slowdown caused by mismatch of logical time and physical time. To avoid deadlocks introduced by deterministic mechanisms, we also integrate DMPI with a lightweight deadlock checker to dynamically detect and solve these deadlocks. We have implemented DMPI and evaluated it using NPB benchmarks. The results show that DMPI could guarantee determinism with incurring modest runtime overhead (14% on average).

  • A Built-in Reseeding Technique for LFSR-Based Test Pattern Generation

    Youhua SHI  Zhe ZHANG  Shinji KIMURA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-Timing Verification and Test Generation

      Vol:
    E86-A No:12
      Page(s):
    3056-3062

    Reseeding technique is proposed to improve the fault coverage in pseudo-random testing. However most of previous works on reseeding is based on storing the seeds in an external tester or in a ROM. In this paper we present a built-in reseeding technique for LFSR-based test pattern generation. The proposed structure can run both in pseudorandom mode and in reseeding mode. Besides, our method requires no storage for the seeds since in reseeding mode the seeds can be generated automatically in hardware. In this paper we also propose an efficient grouping algorithm based on simulated annealing to optimize test vector grouping. Experimental results for benchmark circuits indicate the superiority of our technique against other reseeding methods with respect to test length and area overhead. Moreover, since the theoretical properties of LFSRs are preserved, our method could be beneficially used in conjunction with any other techniques proposed so far.

  • Exploring Social Relations for Personalized Tag Recommendation in Social Tagging Systems

    Kaipeng LIU  Binxing FANG  Weizhe ZHANG  

     
    PAPER

      Vol:
    E94-D No:3
      Page(s):
    542-551

    With the emergence of Web 2.0, social tagging systems become highly popular in recent years and thus form the so-called folksonomies. Personalized tag recommendation in social tagging systems is to provide a user with a ranked list of tags for a specific resource that best serves the user's needs. Many existing tag recommendation approaches assume that users are independent and identically distributed. This assumption ignores the social relations between users, which are increasingly popular nowadays. In this paper, we investigate the role of social relations in the task of tag recommendation and propose a personalized collaborative filtering algorithm. In addition to the social annotations made by collaborative users, we inject the social relations between users and the content similarities between resources into a graph representation of folksonomies. To fully explore the structure of this graph, instead of computing similarities between objects using feature vectors, we exploit the method of random-walk computation of similarities, which furthermore enable us to model a user's tag preferences with the similarities between the user and all the tags. We combine both the collaborative information and the tag preferences to recommend personalized tags to users. We conduct experiments on a dataset collected from a real-world system. The results of comparative experiments show that the proposed algorithm outperforms state-of-the-art tag recommendation algorithms in terms of prediction quality measured by precision, recall and NDCG.

  • Data Spoofing Attacks by IPv6 Tunnels

    Yu CUI  Zhi-Hong TIAN  Bin-Xing FANG  Hong-Li ZHANG  Wei-Zhe ZHANG  

     
    PAPER-Internet

      Vol:
    E96-B No:11
      Page(s):
    2875-2882

    Tunneling is one of the main methods for the transition from IPv4 to IPv6 networks. By encapsulating IPv6 packets in IPv4 or UDP packets, tunnels like 6to4, Isatap and Teredo provide a feasible way for IPv4 hosts to establish IPv6 connections to hosts in IPv6 internet or IPv6 islands. For IPv4 internet, the use of tunnels varies the traffic and increases the type of packets, making the network environment more complex. In addition to common tunnels, various types of tunnels with more layers are tested in this paper. The results of successful connections prove the usefulness of multi-layer packets with diverse layer-count and type on the internet. To ensure the security of internal networks, the influence on traffic analysis in dual-stack IDS devices caused by the diversity is studied. Three spoofing attacks of “data insertion”, “data evasion” and “attacks using UDP” are proposed to show the influence on IDS caused by tunnels. Compared to the attacks without tunnels, some constraining factors are eliminated, which may increase the security risk of IDS and decrease the attacker's difficulties. To summarize this kind of problem, the concept of “Tunnel Interference” is revealed. And as solutions to this problem, two methods, RA (Record All) and HEH (Hash for Each Header), are presented in this paper which theoretically solve these problems to a great extent. RA records all headers and compares from the outermost to innermost layer. HEH is hash-based and accumulates hash values of each header. Both of them have linear time and space complexity. Experimental results show that RA and HEH will lead to minor space increase and up to 1.2% time increment in each layer compared to the original dual-stack.

  • Exploring Web Partition in DHT-Based Distributed Web Crawling

    Xiao XU  Weizhe ZHANG  Hongli ZHANG  Binxing FANG  

     
    PAPER

      Vol:
    E93-D No:11
      Page(s):
    2907-2921

    The basic requirements of the distributed Web crawling systems are: short download time, low communication overhead and balanced load which largely depends on the systems' Web partition strategies. In this paper, we propose a DHT-based distributed Web crawling system and several DHT-based Web partition methods. First, a new system model based on a DHT method called the Content Addressable Network (CAN) is proposed. Second, based on this model, a network-distance-based Web partition is implemented to reduce the crawler-crawlee network distance in a fully distributed manner. Third, by utilizing the locality on the link space, we propose the concept of link-based Web partition to reduce the communication overhead of the system. This method not only reduces the number of inter-links to be exchanged among the crawlers but also reduces the cost of routing on the DHT overlay. In order to combine the benefits of the above two Web partition methods, we then propose 2 distributed multi-objective Web partition methods. Finally, all the methods we propose in this paper are compared with existing system models in the simulated experiments under different datasets and different system scales. In most cases, the new methods show their superiority.