Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14365/3406
Full metadata record
DC FieldValueLanguage
dc.contributor.authorKumova Metin S.-
dc.contributor.authorKaraoğlan B.-
dc.date.accessioned2023-06-16T14:58:03Z-
dc.date.available2023-06-16T14:58:03Z-
dc.date.issued2010-
dc.identifier.isbn3642147690-
dc.identifier.isbn9783642147692-
dc.identifier.issn0302-9743-
dc.identifier.urihttps://doi.org/10.1007/978-3-642-14770-8_27-
dc.identifier.urihttps://hdl.handle.net/20.500.14365/3406-
dc.descriptionIZETeam;Microsoft Island;Post and Telecom Administrationen_US
dc.description7th International Conference on NLP, IceTAL 2010 -- 16 August 2010 through 18 August 2010 -- Reykjavik -- 81659en_US
dc.description.abstractA collocation is a combination of words that appear together more often than would be expected by chance. Because collocations function as units of meaning, they play an important role in natural language processing applications such as word sense disambiguation, part-of-speech tagging, and machine translation. In this study, a Turkish corpus is subjected to the following statistical techniques: frequency of occurrence, mutual information, and hypothesis tests. We used both the stemmed and surface forms of the corpus to explore the effect of stemming on collocation extraction. The techniques are evaluated by recall and precision. The chi-square hypothesis test and mutual information produced better results than the other methods on the Turkish corpus. In addition, we found that a stemmed corpus facilitates discrimination between successful and unsuccessful collocation extraction methods. © 2010 Springer-Verlag Berlin Heidelberg.en_US
dc.language.isoenen_US
dc.relation.ispartofLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)en_US
dc.rightsinfo:eu-repo/semantics/closedAccessen_US
dc.subjectCollocationen_US
dc.subjectcollocation extractionen_US
dc.subjectFrequency of occurrencesen_US
dc.subjectHypothesis testsen_US
dc.subjectMachine translationsen_US
dc.subjectMutual information methoden_US
dc.subjectMutual informationsen_US
dc.subjectNatural language processingen_US
dc.subjectPart of speech taggingen_US
dc.subjectRecall and precisionen_US
dc.subjectStatistical techniquesen_US
dc.subjectTurkish textsen_US
dc.subjectTurkishsen_US
dc.subjectWord Sense Disambiguationen_US
dc.subjectComputational linguisticsen_US
dc.subjectInformation theoryen_US
dc.subjectSpeech transmissionen_US
dc.subjectStatistical testsen_US
dc.subjectNatural language processing systemsen_US
dc.titleCollocation extraction in Turkish texts using statistical methodsen_US
dc.typeConference Objecten_US
dc.identifier.doi10.1007/978-3-642-14770-8_27-
dc.identifier.scopus2-s2.0-77956604153en_US
dc.authorscopusid24471923700-
dc.identifier.volume6233 LNAIen_US
dc.identifier.startpage238en_US
dc.identifier.endpage249en_US
dc.identifier.wosWOS:000289187000027en_US
dc.relation.publicationcategoryConference Item - International - Institutional Academic Staffen_US
dc.identifier.scopusqualityQ3-
dc.identifier.wosqualityN/A-
item.openairetypeConference Object-
item.cerifentitytypePublications-
item.grantfulltextembargo_20300101-
item.openairecristypehttp://purl.org/coar/resource_type/c_18cf-
item.fulltextWith Fulltext-
item.languageiso639-1en-
crisitem.author.dept05.04. Software Engineering-
Appears in Collections:Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection
Files in This Item:
File: 2514.pdf (embargoed until 2030-01-01); Size: 254.63 kB; Format: Adobe PDF

Scopus citations: 14 (checked on Nov 27, 2024)
Web of Science citations: 10 (checked on Nov 27, 2024)
Page views: 74 (checked on Nov 25, 2024)
Downloads: 14 (checked on Nov 25, 2024)

Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.