-- Creative Commons Attribution NonCommercial 4.0 International License
Evaluating Binary n-gram Analysis For Authorship Attribution
Mark Carman, Helen Ashman
Pages - 60 - 91     |    Revised - 31-10-2019     |    Published - 01-12-2019
Published in International Journal of Computational Linguistics (IJCL)
Volume - 10   Issue - 4    |    Publication Date - December 2019
KEYWORDS
Authorship Attribution, Binary n-gram, Stop Word, Cross-domain, Cross-genre.
ABSTRACT
Authorship attribution techniques focus on characters and words. However, the inclusion of words with meaning may complicate authorship attribution. Using only function words provides good authorship attribution with semantic or character n-gram analyses, but it is not yet known whether it improves binary n-gram analyses.

The literature mostly reports on authorship attribution at the word or character level. Binary n-grams instead interpret text as a sequence of bits. Previous work with binary n-grams assessed authorship attribution of full texts only. This paper evaluates binary n-gram authorship attribution over text stripped of content words, as well as over a range of cross-domain scenarios.
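As an illustration of what interpreting text as binary can look like, the following minimal sketch counts overlapping n-bit windows over the encoded bytes of a string. The function name `binary_ngrams`, the UTF-8 encoding, and the one-bit sliding step are assumptions made for this example; the paper's own implementation may differ.

```python
from collections import Counter


def binary_ngrams(text: str, n: int, encoding: str = "utf-8") -> Counter:
    """Count overlapping n-bit windows over the bit string of `text`.

    Assumptions (not taken from the paper): the text is encoded as UTF-8,
    each byte is written as 8 bits, and the window slides one bit at a time.
    """
    # Turn every byte into its 8-character binary representation.
    bits = "".join(f"{byte:08b}" for byte in text.encode(encoding))
    # Slide a window of n bits across the whole bit string.
    return Counter(bits[i:i + n] for i in range(len(bits) - n + 1))


# "ab" encodes to 16 bits, which yields 13 overlapping 4-bit windows.
profile = binary_ngrams("ab", 4)
```

A profile of this kind could then be compared between a disputed text and each candidate author's known writing, using whatever distance measure the analysis calls for.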

This paper reports a sequence of experiments. First, the binary n-gram analysis method is compared directly with character n-grams for authorship attribution. Then it is evaluated over three forms of input text: full text, stop words and function words only, and content words only. Finally, it is tested over cross-domain and cross-genre texts, as well as multiple-author texts.
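The three forms of input text can be sketched as a simple filtering step. The tiny `STOP_WORDS` set, the tokenizing regular expression, and the function name `split_forms` below are hypothetical choices for illustration only; they are not the stop-word list or preprocessing used in the paper.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = frozenset(
    {"the", "a", "an", "of", "and", "to", "in", "on", "it", "was", "is"}
)


def split_forms(text: str) -> dict:
    """Return the full, stop-word-only, and content-word-only forms of `text`."""
    # Lowercase and keep alphabetic tokens (apostrophes allowed inside words).
    tokens = re.findall(r"[a-z']+", text.lower())
    stop_only = [t for t in tokens if t in STOP_WORDS]
    content_only = [t for t in tokens if t not in STOP_WORDS]
    return {
        "full": " ".join(tokens),
        "stop": " ".join(stop_only),
        "content": " ".join(content_only),
    }


forms = split_forms("The cat sat on the mat, and it was happy.")
```

Each of the three resulting strings can be passed through the same n-gram analysis, so that any difference in attribution accuracy reflects the input form rather than the method.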
ABSTRACTING & INDEXING
1 Google Scholar 
2 refSeek 
3 Doc Player 
4 Scribd 
MANUSCRIPT AUTHORS
Mr. Mark Carman
School of Information Technology and Mathematical Sciences, University of South Australia, Adelaide, SA 5095, Australia
carmd006@mymail.unisa.edu.au
Dr. Helen Ashman
School of Information Technology and Mathematical Sciences, University of South Australia, Adelaide, SA 5095, Australia

