Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14365/4562
Title: Identifying collocations in Turkish using statistical methods
Other Titles: Türkiye Türkçesinde Eşdizimlerin İstatistiksel Yöntemlerle Belirlenmesi
Authors: Metin S.K.
Karaoğlan B.
Keywords: Collocation
Corpus
Natural language processing
Turkey Turkish
Publisher: Ahmet Yesevi University
Abstract: A collocation is a combination of words that appear together more often than would be expected by chance and that together form a unit of meaning. Because extracting collocations benefits the automatic processing and translation of Turkish texts as well as the learning of Turkish, it is an important problem in Turkish natural language processing. In this study, several statistical techniques, including occurrence frequency, pointwise mutual information, and hypothesis tests, are applied to a Turkey Turkish corpus to identify collocations automatically. Both stemmed and surface forms of words are used in order to explore the effect of stemming on collocation extraction. The techniques are evaluated using the F-measure. The chi-square hypothesis test and pointwise mutual information produced better results than the other methods. In addition, we observed that when words are stemmed, the methods that can be considered successful at collocation extraction can be more clearly distinguished from the rest. © 2016, Ahmet Yesevi University. All rights reserved.
URI: https://hdl.handle.net/20.500.14365/4562
ISSN: 1301-0549
Appears in Collections: Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection
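
The abstract names three kinds of association score (occurrence frequency, pointwise mutual information, and a chi-square hypothesis test) and an F-measure evaluation. As a rough illustration only, the sketch below computes textbook versions of these quantities for adjacent word pairs. It is not the authors' implementation: the function names `bigram_scores` and `f_measure`, the restriction to adjacent bigrams, and the 2x2 contingency form of chi-square are assumptions made for this example, and the paper's corpus, stemming step, and thresholds are not reproduced here.

```python
import math
from collections import Counter


def bigram_scores(tokens):
    """Return {(w1, w2): (frequency, pmi, chi_square)} for adjacent word pairs.

    Hypothetical helper: uses standard textbook formulas, not the paper's code.
    """
    n = len(tokens) - 1                    # number of adjacent bigram positions
    first = Counter(tokens[:-1])           # how often each word opens a bigram
    second = Counter(tokens[1:])           # how often each word closes a bigram
    pairs = Counter(zip(tokens, tokens[1:]))

    scores = {}
    for (w1, w2), o11 in pairs.items():
        c1, c2 = first[w1], second[w2]
        # PMI = log2( P(w1, w2) / (P(w1) * P(w2)) )
        pmi = math.log2(o11 * n / (c1 * c2))
        # Observed 2x2 contingency table for the pair
        o12 = c1 - o11                     # w1 followed by something other than w2
        o21 = c2 - o11                     # w2 preceded by something other than w1
        o22 = n - o11 - o12 - o21          # bigrams containing neither in position
        denom = (o11 + o12) * (o11 + o21) * (o12 + o22) * (o21 + o22)
        # Pearson chi-square for a 2x2 table; 0.0 when a margin is empty
        chi2 = n * (o11 * o22 - o12 * o21) ** 2 / denom if denom else 0.0
        scores[(w1, w2)] = (o11, pmi, chi2)
    return scores


def f_measure(extracted, gold):
    """Balanced F-measure of an extracted set of pairs against a gold set."""
    extracted, gold = set(extracted), set(gold)
    hits = len(extracted & gold)
    if hits == 0:
        return 0.0
    precision = hits / len(extracted)
    recall = hits / len(gold)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, one would rank the pairs by a chosen score, keep the top candidates, and pass them to `f_measure` together with a hand-labelled set of collocations. Running the same procedure on stemmed and on surface tokens would mirror the comparison the abstract describes.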

Files in This Item:
File: 3624.pdf (Restricted Access)
Size: 930.97 kB
Format: Adobe PDF

Scopus citations: 6 (checked on Nov 20, 2024)
Web of Science citations: 3 (checked on Nov 20, 2024)
Page views: 74 (checked on Nov 18, 2024)
Downloads: 6 (checked on Nov 18, 2024)

Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.