Comparing Three Plagiarism Tools (Ferret, Sherlock, and Turnitin)
Pages - 53 - 66     |    Revised - 15-09-2012     |    Published - 25-10-2012
Volume - 3   Issue - 1    |    Publication Date - October 2012
Plagiarism Detection Tool, Turnitin, Clough-Stevenson’s Corpus, Ferret, Sherlock
Abstract An experiment was carried out with three plagiarism detection tools (two free/open source tools, Ferret and Sherlock, and one commercial web-based tool, Turnitin) on Clough-Stevenson’s corpus, which contains documents classified into three types of plagiarism and one type of non-plagiarism. The experiment addressed extrinsic (external) plagiarism detection. The goal was to observe the performance of the tools on the corpus, to analyze, compare, and discuss their outputs, and finally to see whether the tools classify the documents in the same way as Clough and Stevenson did. Ferret and Sherlock produced the same results in most cases; Turnitin, however, reported results that differed greatly from those of the other two tools: it showed a higher percentage of similarity between the documents and the source. The reason was investigated with Ferret and Turnitin only, because Sherlock does not provide a view of the two documents with the overlapping and distinct parts marked. It turned out that Turnitin performs acceptably and that it is Ferret that does not show the expected percentage: it takes the longer text (in this corpus, always the source) as the base, measures how much of that text is overlapped by the shorter text, and reports the result as the percentage of similarity between the two documents, which leads to wrong results. From this it can also be speculated that Sherlock does not report the results properly.
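The effect of the choice of base text described in the abstract can be illustrated with a minimal sketch. It is not Ferret's or Turnitin's actual implementation: the use of word trigrams, the function names `trigrams` and `containment`, and the two sample texts are all assumptions made for illustration. The sketch only shows that the same overlap gives a much lower percentage when it is divided by the size of the longer text than when it is divided by the size of the shorter one.

```python
def trigrams(text):
    """Return the set of word trigrams in a text (illustrative tokenization)."""
    words = text.lower().split()
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}


def containment(base, other):
    """Share of the base text's trigrams that also occur in the other text."""
    base_set = trigrams(base)
    if not base_set:
        return 0.0
    return len(base_set & trigrams(other)) / len(base_set)


# Hypothetical texts: the answer is copied verbatim from the start of the source.
source = "the quick brown fox jumps over the lazy dog near the old river bank"
answer = "the quick brown fox jumps over the lazy dog"

# Longer text (the source) as the base: only part of it is covered by the answer.
share_of_source = containment(source, answer)

# Shorter text (the answer) as the base: every trigram of it occurs in the source.
share_of_answer = containment(answer, source)

print(f"overlap relative to the source: {share_of_source:.0%}")
print(f"overlap relative to the answer: {share_of_answer:.0%}")
```

In this toy case the answer is copied entirely from the source, yet a score computed against the longer source stays well below 100 percent, which matches the kind of underestimate the abstract attributes to Ferret.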
B. Stein and S. Meyer zu Eissen. “Near similarity search and plagiarism analysis,” in From Data and Information Analysis to Knowledge Engineering, M. Spiliopoulou et al., Eds. Springer, 2006, pp. 430-437.
C. Grozea and M. Popescu. “The encoplot similarity measure for automatic detection of plagiarism”: notebook for PAN at CLEF 2011, in Notebook Papers of CLEF 2011 LABs and Workshops, 19-22 Sep., Amsterdam, The Netherlands, 2011.
C. J. Neill and G. Shanmuganthan. “A Web-enabled plagiarism detection tool.” IT Professional, vol. 6 (5), pp. 19-23, 2004.
C. Lyon, R. Barrett and J. Malcolm. “A theoretical basis to the automated detection of copying between texts and its practical implementation in the Ferret plagiarism and collusion detector,” in Proc. The Plagiarism: Prevention, Practice and Policies Conference, 2004.
G. Judge. “Plagiarism: Bringing Economics and Educations Together (With a Little Help from IT).” Computers in Higher Economic Review, vol. 20(1), pp. 21-26, 2008.
G. Oberreuter, G. L'Huillier, S. A. Ríos, and J. D. Velasquez. “Approaches for intrinsic and external plagiarism detection”: notebook for PAN at CLEF 2011, in Notebook Papers of CLEF 2011 LABs and Workshops, 19-22 Sep., Amsterdam, The Netherlands, 2011.
H. Maurer, F. Kappe, and B. Zaka. “Plagiarism – A Survey.” Journal of Universal Computer Science, vol. 12 (8), pp. 1050-1084, 2006.
J. Grman and R. Ravas. “Improved implementation for finding text similarities in large collection of data”: notebook for PAN at CLEF 2011, in Notebook Papers of CLEF 2011 LABs and Workshops, 19-22 Sep., Amsterdam, The Netherlands, 2011.
M. Devlin. “Plagiarism detection software: how effective is it? Assessing Learning in Australian Universities.” Internet: http://www.cshe.unimelb.edu.au/assessinglearning/docs/PlagSoftware.pdf, 2002 [Sep. 23, 2012].
M. Potthast et al. “Overview of the 3rd international competition in plagiarism detection”: notebook for PAN at CLEF 2011, in Notebook Papers of CLEF 2011 LABs and Workshops, 19-22 Sep., Amsterdam, The Netherlands, 2011.
P. Clough and M. Stevenson. “Developing a Corpus of Plagiarized Short Answers, Language Resources and Evaluation: Special Issue on Plagiarism and Authorship Analysis, In Press.” Internet: http://ir.shef.ac.uk/cloughie/resources/plagiarism_corpus.html#Download, Sep. 10, 2009 [Oct. 12, 2011].
R. Pike. “The Sherlock Plagiarism Detector.” Internet: http://www.cs.su.oz.au/~scilect/sherlock, 2007 [Oct. 04, 2011].
T. Lancaster and F. Culwin. “A review of electronic services for plagiarism detection in student submissions.” the Teaching of Computing, Edinburgh, 2000. Internet: http://www.ics.heacademy.ac.uk/events/presentations/317_Culwin.pdf, 2000 [Oct. 01, 2012].
T. Lancaster and F. Culwin. “Classifications of Plagiarism Detection Engines.” ITALICS, vol. 4(2), 2005.
University of Aveiro - Portugal