This is an Open Access publication published under CSC-OpenAccess Policy.
“C’mon – You Should Read This”: Automatic Identification of Tone from Language Text
Lisa Pearl, Mark Steyvers
Pages: 12-30 | Revised: 01-08-2013 | Published: 30-08-2013
Volume: 4 | Issue: 1 | Publication Date: August 2013
Keywords: Language Text, Mental States, Tone, Game With a Purpose, Information Extraction, Natural Language Processing
Information extraction researchers have recently recognized that linguistic features in text can communicate more subtle information beyond the basic semantic content of a message, such as sentiments, emotions, perspectives, and intentions. One way to describe this information is that it represents something about the generator’s mental state, which is often interpreted as the tone of the message. A current technical barrier to developing a general-purpose tone identification system is the lack of reliable training data in which messages are annotated with their tone. We first describe a method for creating the necessary annotated data using human-based computation, based on interactive games between humans trying to generate and interpret messages conveying different tones. This draws on game-with-a-purpose methods from computer science and wisdom-of-the-crowds methods from cognitive science. We then demonstrate the utility of this kind of database and the advantage of human-based computation by examining the performance of two machine learning classifiers trained on the database, each of which uses only shallow linguistic features. Though we already find near-human levels of performance with one classifier, we also suggest more sophisticated linguistic features and alternate implementations for the database that may improve tone identification results further.
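To make the pipeline in the abstract concrete, here is a minimal sketch of its two steps: combining several players' tone judgments into one label per message, then training a classifier on shallow features. Everything in it is an assumption for illustration, not the authors' implementation: the toy messages, the tone names, majority voting as the aggregation rule, word unigrams as the "shallow" features, and a hand-written multinomial naive Bayes model standing in for the classifiers evaluated in the paper.

```python
import math
from collections import Counter, defaultdict


def majority_label(votes):
    """Return the most frequent tone among player votes (ties: first seen wins)."""
    return Counter(votes).most_common(1)[0][0]


def shallow_features(text):
    """Hypothetical shallow features: lowercased whitespace tokens, with ! and ? split off."""
    return text.lower().replace("!", " ! ").replace("?", " ? ").split()


class NaiveBayesTone:
    """Multinomial naive Bayes with add-one smoothing (an illustrative stand-in model)."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            for tok in shallow_features(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        tokens = shallow_features(text)
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data (invented): each message with the tones that different players assigned to it.
annotated = [
    ("c'mon, you should really try this!", ["persuading", "persuading", "politeness"]),
    ("please, you should read this today", ["persuading", "persuading"]),
    ("thank you so much, that was kind", ["politeness", "politeness", "persuading"]),
    ("thanks, I appreciate your help", ["politeness"]),
]

texts = [message for message, _ in annotated]
labels = [majority_label(votes) for _, votes in annotated]
model = NaiveBayesTone().fit(texts, labels)

print(model.predict("you should try this"))
print(model.predict("thank you for your help"))
```

Majority voting is only the simplest way to pool judgments; a weighted scheme that accounts for annotator reliability would slot into `majority_label` without changing the rest of the sketch.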
CITED BY (1)  
1 Nagarsekar, U., Mhapsekar, A., Kulkarni, P., & Kalbande, D. R. (2013, December). Emotion detection from “the SMS of the internet”. In Intelligent Computational Systems (RAICS), 2013 IEEE Recent Advances in (pp. 316-321). IEEE.
Professor Lisa Pearl
University of California, Irvine - United States of America
Professor Mark Steyvers
University of California, Irvine - United States of America