Please use this identifier to cite or link to this item:
https://hdl.handle.net/20.500.14365/3406
Title: | Collocation extraction in Turkish texts using statistical methods | Authors: | Kumova Metin, S.; Karaoğlan, B. |
Keywords: | Collocation; collocation extraction; Frequency of occurrences; Hypothesis tests; Machine translations; Mutual information method; Mutual informations; Natural language processing; Part of speech tagging; Recall and precision; Statistical techniques; Turkish texts; Turkishs; Word Sense Disambiguation; Computational linguistics; Information theory; Speech transmission; Statistical tests; Natural language processing systems |
Abstract: | A collocation is a combination of words that appear together more often than chance would predict. Since collocations are blocks of meaning, they play an important role in natural language processing applications (word sense disambiguation, part-of-speech tagging, machine translation, etc.). In this study, a Turkish corpus is subjected to the following statistical techniques: frequency of occurrence, mutual information, and hypothesis tests. We used both the stemmed and surface forms of the corpus to explore the effect of stemming on collocation extraction. The techniques are evaluated by recall and precision measures. The chi-square hypothesis test and the mutual information method produced better results than the other methods on the Turkish corpus. In addition, we found that a stemmed corpus facilitates discrimination between successful and unsuccessful collocation extraction methods. © 2010 Springer-Verlag Berlin Heidelberg. | Description: | IZETeam;Microsoft Island;Post and Telecom Administration 7th International Conference on NLP, IceTAL 2010 -- 16 August 2010 through 18 August 2010 -- Reykjavik -- 81659 |
URI: | https://doi.org/10.1007/978-3-642-14770-8_27 https://hdl.handle.net/20.500.14365/3406 |
ISBN: | 3642147690 9783642147692 |
ISSN: | 0302-9743 |
Appears in Collections: | Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection |
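The abstract names three families of scoring techniques (frequency of occurrence, mutual information, and a chi-square hypothesis test) and evaluates them by precision and recall. The sketch below illustrates those ideas on adjacent word pairs. It is not the authors' implementation: the function names `bigram_scores` and `precision_recall`, the use of pointwise mutual information with a base-2 logarithm, the 2x2 contingency-table form of the chi-square statistic, and the toy token list are all assumptions made for illustration, since the record gives no formulas, thresholds, or data.

```python
import math
from collections import Counter


def bigram_scores(tokens):
    """Score each adjacent word pair by frequency, PMI, and Pearson chi-square.

    Hypothetical helper: the paper's exact definitions are not given in the
    abstract, so standard textbook formulas are assumed here.
    """
    n = len(tokens) - 1                      # total number of adjacent pairs
    pairs = Counter(zip(tokens, tokens[1:]))
    first = Counter(tokens[:-1])             # word counts in first position
    second = Counter(tokens[1:])             # word counts in second position

    scores = {}
    for (w1, w2), o11 in pairs.items():
        c1, c2 = first[w1], second[w2]
        # Cells of the 2x2 contingency table for the pair (w1, w2).
        o12 = c1 - o11                       # w1 followed by a word other than w2
        o21 = c2 - o11                       # w2 preceded by a word other than w1
        o22 = n - o11 - o12 - o21            # neither w1 first nor w2 second
        # Pointwise mutual information: log2 of observed over expected.
        pmi = math.log2(o11 * n / (c1 * c2))
        # Pearson chi-square statistic for a 2x2 table.
        denom = (o11 + o12) * (o11 + o21) * (o12 + o22) * (o21 + o22)
        chi2 = n * (o11 * o22 - o12 * o21) ** 2 / denom if denom else 0.0
        scores[(w1, w2)] = {"freq": o11, "pmi": pmi, "chi2": chi2}
    return scores


def precision_recall(extracted, gold):
    """Precision and recall of extracted pairs against a gold-standard set."""
    extracted, gold = set(extracted), set(gold)
    hits = len(extracted & gold)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy tokens only; a real run would use a tokenized (and optionally
    # stemmed) corpus.
    toy_tokens = "a b a b c d".split()
    for pair, s in sorted(bigram_scores(toy_tokens).items()):
        print(pair, s)
```

To compare surface and stemmed forms, as the abstract describes, the same functions could be run twice, once on raw tokens and once on the output of a stemmer of the reader's choice, ranking pairs by each score and comparing precision and recall against a hand-built collocation list.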
Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.