Please use this identifier to cite or link to this item:
https://hdl.handle.net/20.500.14365/3148
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Metin, Senem Kumova | - |
dc.date.accessioned | 2023-06-16T14:55:18Z | - |
dc.date.available | 2023-06-16T14:55:18Z | - |
dc.date.issued | 2016 | - |
dc.identifier.issn | 0267-6192 | - |
dc.identifier.uri | https://hdl.handle.net/20.500.14365/3148 | - |
dc.description.abstract | In all natural languages, strong cohesive ties between composing words cause some recurrent word combinations to form multiword expressions (MWEs). Extracting MWEs from text plays an important role in natural language processing applications and information retrieval. In this study, we introduce an MWE extraction method that ranks candidates by the weakness of the outer ties between a candidate and its neighbouring words in the text. The method measures the weakness of outer ties by the degree of unpredictability of the surrounding words, in order to distinguish MWEs from other recurrent groups of consecutive words. Put simply, if the words preceding and following an MWE candidate are unpredictable because of a relatively large number of different neighbouring words, the candidate is taken to have strong evidence of being a real MWE. The method generates a single normalized unpredictability score, which enables comparison not only of MWE candidates with different occurrence frequencies but also of candidates with different numbers of composing words (for example, two-word candidates against three-word candidates). Comparisons with several groups of well-known methods (statistical measures of association and termhood, vector space models of composition, and supervised learning methods) illustrate the effectiveness of the proposed method on two-word MWE candidates and on the merged set of two- and three-word candidates. | en_US |
dc.language.iso | en | en_US |
dc.publisher | C R L Publishing Ltd | en_US |
dc.relation.ispartof | Computer Systems Science and Engineering | en_US |
dc.rights | info:eu-repo/semantics/closedAccess | en_US |
dc.subject | Multiword expression | en_US |
dc.subject | predictability | en_US |
dc.subject | association measures | en_US |
dc.subject | termhood measures | en_US |
dc.subject | compositionality | en_US |
dc.subject | supervised learning | en_US |
dc.subject | Automatic Extraction | en_US |
dc.title | Neighbour unpredictability measure in multiword expression extraction | en_US |
dc.type | Article | en_US |
dc.identifier.scopus | 2-s2.0-84991725989 | en_US |
dc.department | İzmir Ekonomi Üniversitesi | en_US |
dc.identifier.volume | 31 | en_US |
dc.identifier.issue | 3 | en_US |
dc.identifier.startpage | 209 | en_US |
dc.identifier.endpage | 221 | en_US |
dc.identifier.wos | WOS:000383926100003 | en_US |
dc.relation.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | en_US |
dc.identifier.scopusquality | Q2 | - |
item.grantfulltext | reserved | - |
item.openairetype | Article | - |
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | - |
item.fulltext | With Fulltext | - |
item.languageiso639-1 | en | - |
item.cerifentitytype | Publications | - |
crisitem.author.dept | 05.04. Software Engineering | - |
Appears in Collections: | Scopus Indexed Publications Collection; WoS Indexed Publications Collection |
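The abstract describes ranking MWE candidates by how unpredictable their neighbouring words are. The record does not give the article's actual formula, so the following is only a minimal sketch under stated assumptions: the function name `neighbour_variety`, the boundary markers `<s>` and `</s>`, and the score itself (the share of distinct left and right neighbours among all occurrences) are hypothetical stand-ins, not the published measure.

```python
from collections import Counter


def neighbour_variety(tokens, candidate):
    """Illustrative score in [0, 1] for a candidate word sequence.

    ASSUMPTION: this is NOT the measure from the article. It is a simple
    stand-in: distinct left and right neighbours divided by the number of
    neighbour slots (two per occurrence of the candidate).
    """
    n = len(candidate)
    left, right = Counter(), Counter()
    occurrences = 0
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n]) == tuple(candidate):
            occurrences += 1
            # Text boundaries count as a special neighbour (hypothetical markers)
            # so that every occurrence contributes one left and one right slot.
            left[tokens[i - 1] if i > 0 else "<s>"] += 1
            right[tokens[i + n] if i + n < len(tokens) else "</s>"] += 1
    if occurrences == 0:
        return 0.0
    return (len(left) + len(right)) / (2 * occurrences)


tokens = ("new york city is big . new york city is loud . "
          "new york city has parks .").split()

# The fragment is always preceded by "new", so its left side is predictable.
fragment_score = neighbour_variety(tokens, ("york", "city"))
# The full expression has more varied surroundings in this toy text.
full_score = neighbour_variety(tokens, ("new", "york", "city"))
```

Because the score is divided by the number of occurrences, two-word and three-word candidates land on the same scale, which loosely mirrors the comparison the abstract mentions. With this stand-in, a candidate seen only once trivially scores 1.0, so a minimum-frequency filter would be needed before ranking.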
Files in This Item:
File | Size | Format |
---|---|---|
2277.pdf (Restricted Access) | 513.14 kB | Adobe PDF |
Scopus citations: 6 (checked on Nov 20, 2024)
Web of Science citations: 5 (checked on Nov 20, 2024)
Page view(s): 86 (checked on Nov 18, 2024)
Download(s): 6 (checked on Nov 18, 2024)
Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.