
Author Search Result

[Author] Huaimin WANG (4 hits)

  • VAWS: Constructing Trusted Open Computing System of MapReduce with Verified Participants Open Access

    Yan DING  Huaimin WANG  Lifeng WEI  Songzheng CHEN  Hongyi FU  Xinhai XU  

     
    PAPER

    Vol: E97-D  No: 4  Page(s): 721-732

    MapReduce is commonly used as a parallel model for massive data processing. When it is deployed as a service over open systems, the computational integrity of the participants becomes an important issue because workers may be untrustworthy. Current duplication-based solutions can effectively defend against non-collusive attacks, yet most of them require a centralized worker to re-compute additional sampled tasks to defend against collusive attacks, which makes that worker a bottleneck. In this paper, we explore a trusted worker scheduling framework, named VAWS, to detect collusive attackers and assure the integrity of data processing without extra re-computation. Based on the historical results of verification, we construct an Integrity Attestation Graph (IAG) in VAWS to identify malicious mappers and remove them from the framework. To further improve the efficiency of identification, a verification-couple selection method guided by the IAG is introduced to detect the potential accomplices of a confirmed malicious worker. We prove, through theoretical analysis, that the proposed method is effective in improving system performance. Intensive experiments show that the accuracy of VAWS is over 97% and that the computation overhead approaches the ideal value of 2 as the number of map tasks increases.

  • Multi Model-Based Distillation for Sound Event Detection Open Access

    Yingwei FU  Kele XU  Haibo MI  Qiuqiang KONG  Dezhi WANG  Huaimin WANG  Tie HONG  

     
    LETTER-Artificial Intelligence, Data Mining

    Publicized: 2019/07/08  Vol: E102-D  No: 10  Page(s): 2055-2058

    Sound event detection aims to identify the sound events in audio recordings and has widespread applications in real life. Recently, convolutional recurrent neural network (CRNN) models have achieved state-of-the-art performance in this task owing to their ability to learn representative features. However, CRNN models are highly complex, with millions of parameters to be trained, which limits their use on mobile and embedded devices with limited computational resources. Model distillation is an effective way to transfer the knowledge of a complex model to a smaller one that can be deployed on devices with limited computational power. In this letter, we propose a novel multi model-based distillation approach for sound event detection that makes use of the knowledge from multiple teacher models that are complementary in detecting sound events. Extensive experimental results demonstrate that our approach achieves a compression ratio of about 50 times. In addition, better performance is obtained on the sound event detection task.

  • Hierarchical Categorization of Open Source Software by Online Profiles

    Tao WANG  Huaimin WANG  Gang YIN  Cheng YANG  Xiang LI  Peng ZOU  

     
    PAPER-Software Engineering

    Vol: E97-D  No: 9  Page(s): 2386-2397

    The large amount of open source software freely available over the Internet is fundamentally changing the traditional paradigms of software development. Efficient categorization of these massive collections of projects, so that relevant software can be retrieved, is of vital importance for Internet-based software development activities such as solution searching and best-practice learning. Many previous works have addressed software categorization by mining source code or byte code, but they were verified only on relatively small collections of projects with coarse-grained categories or clusters. Internet-based software development, however, requires finer-grained, more scalable and language-independent categorization approaches. In this paper, we propose a novel approach to hierarchically categorize software projects based on their online profiles. We design an SVM-based categorization framework and adopt a weighted combination strategy to aggregate different types of profile attributes from multiple repositories. Different basic classification algorithms and feature selection techniques are employed and compared. Extensive experiments are carried out on more than 21,000 projects across five repositories. The results show that our approach achieves significant improvements by using weighted combination. Compared to previous work, our approach presents competitive results with a finer-grained, multi-layered category hierarchy of more than 120 categories. Unlike approaches that use source code or byte code, our approach is more effective for large-scale and language-independent software categorization. In addition, experiments suggest that hierarchical categorization combined with general keyword-based searching improves retrieval efficiency and accuracy.

  • Towards Trusted Result Verification in Mass Data Processing Service

    Yan DING  Huaimin WANG  Peichang SHI  Hongyi FU  Xinhai XU  

     
    PAPER

    Vol: E97-B  No: 1  Page(s): 19-28

    Computation integrity is difficult to verify when mass data processing is outsourced. Current integrity protection mechanisms and policies verify results generated by participating nodes within the computing environment of a service provider (SP), which cannot prevent subjective cheating by the SPs. This paper provides an analysis and model of computation integrity for mass data processing services. A third-party sampling-result verification method, named TS-TRV, is proposed to prevent lazy cheating by SPs. TS-TRV is a general solution for verifying the intermediate results of common MapReduce jobs, and it uses the powerful computing capability of SPs to support verification computing, thus lessening the computing and transmission burdens of the verifier. Theoretical analysis indicates that TS-TRV is effective at detecting incorrect results, with no false positives and almost no false negatives, while ensuring the authenticity of sampling. Intensive experiments show that the cheating detection rate of TS-TRV exceeds 99% with only a few samples needed, that the computation overhead falls mainly on the SP, and that the network transmission overhead of TS-TRV is only O(log N).
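
    As a rough illustration of why sampling-based verification can reach a high detection rate with few samples, the sketch below computes the chance of catching a cheating SP. It assumes each sample is drawn independently and uniformly from the results and that a fixed fraction of results is incorrect; these assumptions and the function names are hypothetical and are not taken from TS-TRV itself.

```python
import math


def detection_probability(cheat_fraction: float, samples: int) -> float:
    """Chance that at least one of `samples` sampled results is incorrect.

    Assumes independent, uniform sampling (an illustrative assumption,
    not the sampling scheme of TS-TRV).
    """
    if not 0.0 <= cheat_fraction <= 1.0:
        raise ValueError("cheat_fraction must be in [0, 1]")
    if samples < 0:
        raise ValueError("samples must be non-negative")
    # All samples miss the incorrect results with probability (1 - f)^k.
    return 1.0 - (1.0 - cheat_fraction) ** samples


def samples_needed(cheat_fraction: float, target: float) -> int:
    """Smallest sample count whose detection probability reaches `target`."""
    if not 0.0 < cheat_fraction <= 1.0:
        raise ValueError("cheat_fraction must be in (0, 1]")
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    if cheat_fraction == 1.0:
        # Every result is wrong, so a single sample always detects cheating.
        return 1 if target > 0.0 else 0
    # Solve 1 - (1 - f)^k >= target for the smallest integer k.
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - cheat_fraction))


if __name__ == "__main__":
    k = samples_needed(0.5, 0.99)
    print(k, detection_probability(0.5, k))
```

    Under these assumptions, an SP that skips half of the work is caught with probability above 99% after seven samples, since the miss probability halves with each additional sample.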