IEICE global.ieice.org Site

Keyword Search Result

[Keyword] source retrieval (2 hits)

A Partial Matching Convolution Neural Network for Source Retrieval of Plagiarism Detection
Leilei KONG Yong HAN Haoliang QI Zhongyuan HAN

LETTER-Natural Language Processing

Publicized: 2021/03/03
Vol: E104-D No: 6
Page(s): 915-918
Source retrieval is the primary task of plagiarism detection. It searches for documents that may be the sources of plagiarism in a suspicious document. State-of-the-art approaches usually rely on classical information retrieval models, such as the probabilistic model or the vector space model, to retrieve the plagiarism sources. However, the goal of source retrieval is to obtain the source documents that contain the plagiarized parts of the suspicious document, rather than to rank documents by their relevance to the whole suspicious document. To model this “partial matching” between documents, this paper proposes a Partial Matching Convolution Neural Network (PMCNN) for source retrieval. Specifically, PMCNN uses a sequential convolution neural network to extract the plagiarism patterns of contiguous text segments. Experimental results on the PAN 2013 and PAN 2014 plagiarism source retrieval corpora show that PMCNN improves source retrieval performance significantly, outperforming other state-of-the-art document models.
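The distinction the abstract draws between whole-document relevance and "partial matching" can be illustrated with a small sketch. This is not PMCNN and not the authors' method: it only contrasts a bag-of-words cosine score over whole documents with the best score over contiguous segments. The function names, the segment size, and the toy token lists are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def segments(tokens, size):
    """Split a token list into contiguous, non-overlapping segments."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def whole_score(suspicious, source):
    """Relevance of the source to the whole suspicious document."""
    return cosine(Counter(suspicious), Counter(source))


def partial_score(suspicious, source, size=5):
    """Best similarity between any pair of contiguous segments (hypothetical scoring rule)."""
    return max(
        (cosine(Counter(s), Counter(t))
         for s in segments(suspicious, size)
         for t in segments(source, size)),
        default=0.0,
    )


# Toy data: the second half of the suspicious text is copied from the source.
suspicious = "a b c d e f g h i j".split()
source = "f g h i j x y z w v".split()
print(whole_score(suspicious, source), partial_score(suspicious, source))
```

In this toy case the segment-level score is much higher than the whole-document score, because only part of the suspicious document overlaps with the source. A learned model such as the one described in the abstract would replace the hand-written cosine with patterns extracted by convolution.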
A Ranking Approach to Source Retrieval of Plagiarism Detection
Leilei KONG Zhimao LU Zhongyuan HAN Haoliang QI

LETTER-Data Engineering, Web Information Systems

Publicized: 2016/09/29
Vol: E100-D No: 1
Page(s): 203-205
This paper addresses the problem of source retrieval in plagiarism detection. The task of source retrieval is to retrieve all plagiarized sources of a suspicious document from a source document corpus while minimizing retrieval costs. Classification-based methods have achieved the best performance in current research on source retrieval. This paper argues that it is more important to cast the problem as ranking and to employ learning-to-rank methods to perform source retrieval. Specifically, it employs RankBoost and Ranking SVM to obtain the candidate plagiarism source documents. Experimental results on the PAN@CLEF 2013 Source Retrieval dataset show that the ranking-based methods significantly outperform the classification-based baseline methods. We argue that treating source retrieval as a ranking problem is better than treating it as a classification problem.
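The idea of casting source retrieval as ranking can be sketched with a pairwise objective. The following is a simplified linear ranker trained with a pairwise hinge loss, in the spirit of Ranking SVM but not the paper's implementation and not RankBoost. The two-dimensional feature vectors, the labels, and all function names and hyperparameters are hypothetical.

```python
def pairwise_differences(features, labels):
    """Build x_i - x_j for every pair where doc i is a source (1) and doc j is not (0)."""
    pairs = []
    for xi, yi in zip(features, labels):
        for xj, yj in zip(features, labels):
            if yi > yj:
                pairs.append([a - b for a, b in zip(xi, xj)])
    return pairs


def train_pairwise_ranker(pairs, dim, epochs=50, lr=0.1, reg=0.01):
    """SGD on a pairwise hinge loss with L2 shrinkage (assumed hyperparameters)."""
    w = [0.0] * dim
    for _ in range(epochs):
        for d in pairs:
            margin = sum(wk * dk for wk, dk in zip(w, d))
            w = [wk * (1.0 - lr * reg) for wk in w]  # L2 regularization step
            if margin < 1.0:  # the pair is mis-ordered or inside the margin
                w = [wk + lr * dk for wk, dk in zip(w, d)]
    return w


def rank(features, w):
    """Return document indices sorted by descending linear score."""
    scores = [sum(wk * xk for wk, xk in zip(w, x)) for x in features]
    return sorted(range(len(features)), key=lambda i: scores[i], reverse=True)


# Hypothetical candidates for one suspicious document: [feature_1, feature_2].
features = [[0.9, 0.1], [0.2, 0.8], [0.8, 0.3], [0.1, 0.5]]
labels = [1, 0, 1, 0]  # 1 = true plagiarism source, 0 = not a source

w = train_pairwise_ranker(pairwise_differences(features, labels), dim=2)
print(rank(features, w))
```

Unlike a classifier, which decides for each candidate independently whether it is a source, this objective only asks that true sources score higher than non-sources for the same suspicious document.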
