Language Combinatorics: A Sentence Pattern Extraction Architecture Based on Combinatorial Explosion
Michal Ptaszynski, Rafal Rzepka, Yoshio Momouchi
Pages - 24 - 36     |    Revised - 01-07-2011     |    Published - 05-08-2011
Volume - 2   Issue - 1    |    Publication Date - July / August 2011
Computational Linguistics, Information Retrieval and Extraction, Corpus Linguistics
A "sentence pattern" in modern Natural Language Processing is often taken to be a contiguous string of words (an n-gram). However, in many branches of linguistics, such as Pragmatics or Corpus Linguistics, it has been observed that simple n-gram patterns are not sufficient to reveal the full sophistication of grammar patterns. We present a language-independent architecture for extracting patterns more sophisticated than n-grams from sentences. In this architecture, a "sentence pattern" is defined as an n-element ordered combination of sentence elements. Experiments showed that the method extracts significantly more frequent patterns than the usual n-gram approach.
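The difference between the two notions of a pattern can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the example sentence, whitespace tokenization, and the "*" gap marker are all assumptions made for illustration. It only shows that ordered combinations of sentence elements include every n-gram plus the non-contiguous patterns that n-grams miss, and that their number grows much faster with sentence length.

```python
from itertools import combinations


def ngrams(tokens, n):
    """Contiguous n-element sequences (the usual n-gram notion)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ordered_combinations(tokens, n):
    """All n-element combinations of tokens, keeping sentence order.

    itertools.combinations preserves the input order, so every result
    is an ordered (possibly non-contiguous) selection of elements.
    """
    return list(combinations(tokens, n))


def ordered_combinations_with_gaps(tokens, n, gap="*"):
    """Same as ordered_combinations, but marks skipped elements.

    The gap marker is a hypothetical convention used here only to make
    non-contiguous patterns visible.
    """
    patterns = []
    for idxs in combinations(range(len(tokens)), n):
        parts = [tokens[idxs[0]]]
        for prev, cur in zip(idxs, idxs[1:]):
            if cur - prev > 1:  # at least one element was skipped
                parts.append(gap)
            parts.append(tokens[cur])
        patterns.append(tuple(parts))
    return patterns


if __name__ == "__main__":
    # Hypothetical example sentence, split on whitespace.
    tokens = "John went to school today".split()

    print(len(ngrams(tokens, 2)))                # 4 bigrams
    print(len(ordered_combinations(tokens, 2)))  # 10 two-element combinations

    # Totals over all lengths n = 1..5: 15 n-grams versus 2**5 - 1 = 31 combinations.
    all_n = range(1, len(tokens) + 1)
    print(sum(len(ngrams(tokens, n)) for n in all_n))
    print(sum(len(ordered_combinations(tokens, n)) for n in all_n))

    # A non-contiguous pattern that no bigram captures.
    print(("John", "*", "school") in ordered_combinations_with_gaps(tokens, 2))
```

For a sentence of k elements there are k(k+1)/2 n-grams but 2^k - 1 ordered combinations, which is the combinatorial explosion named in the title.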
