Collocation Extraction in Turkish Texts Using Statistical Methods

dc.contributor.author Kumova Metin S.
dc.contributor.author Karaoğlan B.
dc.date.accessioned 2023-06-16T14:58:03Z
dc.date.available 2023-06-16T14:58:03Z
dc.date.issued 2010
dc.description IZETeam;Microsoft Island;Post and Telecom Administration en_US
dc.description 7th International Conference on NLP, IceTAL 2010 -- 16 August 2010 through 18 August 2010 -- Reykjavik -- 81659 en_US
dc.description.abstract A collocation is a combination of words that appear together more often than would be expected by chance. Since collocations are blocks of meaning, they play an important role in natural language processing applications (word sense disambiguation, part-of-speech tagging, machine translation, etc.). In this study, a Turkish corpus is subjected to the following statistical techniques: frequency of occurrence, mutual information, and hypothesis tests. We used both the stemmed and surface forms of the corpus to explore the effect of stemming on collocation extraction. The techniques are evaluated by recall and precision. The chi-square hypothesis test and the mutual information method produced better results than the other methods on the Turkish corpus. In addition, we found that a stemmed corpus facilitates discrimination between successful and unsuccessful collocation extraction methods. © 2010 Springer-Verlag Berlin Heidelberg. en_US
dc.identifier.doi 10.1007/978-3-642-14770-8_27
dc.identifier.isbn 3642147690
dc.identifier.isbn 9783642147692
dc.identifier.issn 0302-9743
dc.identifier.scopus 2-s2.0-77956604153
dc.identifier.uri https://doi.org/10.1007/978-3-642-14770-8_27
dc.identifier.uri https://hdl.handle.net/20.500.14365/3406
dc.language.iso en en_US
dc.relation.ispartof Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject Collocation en_US
dc.subject collocation extraction en_US
dc.subject Frequency of occurrences en_US
dc.subject Hypothesis tests en_US
dc.subject Machine translations en_US
dc.subject Mutual information method en_US
dc.subject Mutual informations en_US
dc.subject Natural language processing en_US
dc.subject Part of speech tagging en_US
dc.subject Recall and precision en_US
dc.subject Statistical techniques en_US
dc.subject Turkish texts en_US
dc.subject Turkishs en_US
dc.subject Word Sense Disambiguation en_US
dc.subject Computational linguistics en_US
dc.subject Information theory en_US
dc.subject Speech transmission en_US
dc.subject Statistical tests en_US
dc.subject Natural language processing systems en_US
dc.title Collocation Extraction in Turkish Texts Using Statistical Methods en_US
dc.type Conference Object en_US
dspace.entity.type Publication
gdc.author.scopusid 24471923700
gdc.bip.impulseclass C5
gdc.bip.influenceclass C5
gdc.bip.popularityclass C5
gdc.coar.access metadata only access
gdc.coar.type text::conference output
gdc.collaboration.industrial false
gdc.description.departmenttemp Kumova Metin, S., Engineering and Computer Science Faculty, Izmir University of Economics, Izmir, Turkey; Karaoğlan, B., International Computing Institute, Ege University, Izmir, Turkey en_US
gdc.description.endpage 249 en_US
gdc.description.publicationcategory Conference Item - International - Institutional Faculty Member en_US
gdc.description.scopusquality Q3
gdc.description.startpage 238 en_US
gdc.description.volume 6233 LNAI en_US
gdc.description.wosquality N/A
gdc.identifier.openalex W1509179932
gdc.identifier.wos WOS:000289187000027
gdc.index.type WoS
gdc.index.type Scopus
gdc.oaire.diamondjournal false
gdc.oaire.impulse 0.0
gdc.oaire.influence 3.0536842E-9
gdc.oaire.isgreen false
gdc.oaire.keywords collocation extraction
gdc.oaire.keywords Collocation
gdc.oaire.popularity 2.0605542E-9
gdc.oaire.publicfunded false
gdc.openalex.collaboration National
gdc.openalex.fwci 0.0
gdc.openalex.normalizedpercentile 0.09
gdc.opencitations.count 6
gdc.plumx.crossrefcites 4
gdc.plumx.mendeley 13
gdc.plumx.scopuscites 14
gdc.scopus.citedcount 14
gdc.virtual.author Kumova Metin, Senem
gdc.wos.citedcount 10
relation.isAuthorOfPublication 81d6fcea-c590-42aa-8443-7459c9eab7fa
relation.isAuthorOfPublication.latestForDiscovery 81d6fcea-c590-42aa-8443-7459c9eab7fa
relation.isOrgUnitOfPublication 805c60d5-b806-4645-8214-dd40524c388f
relation.isOrgUnitOfPublication 26a7372c-1a5e-42d9-90b6-a3f7d14cad44
relation.isOrgUnitOfPublication e9e77e3e-bc94-40a7-9b24-b807b2cd0319
relation.isOrgUnitOfPublication.latestForDiscovery 805c60d5-b806-4645-8214-dd40524c388f

Files

Original bundle

Name: 2514.pdf
Size: 254.63 KB
Format: Adobe Portable Document Format