Neighbour Unpredictability Measure in Multiword Expression Extraction

dc.contributor.author Metin, Senem Kumova
dc.date.accessioned 2023-06-16T14:55:18Z
dc.date.available 2023-06-16T14:55:18Z
dc.date.issued 2016
dc.description.abstract In all natural languages, some recurrent combinations of words form multiword expressions (MWEs) because of the strong cohesive ties between their component words. The extraction of MWEs from text plays an important role in natural language processing applications and information retrieval. In this study, we introduce a method of MWE extraction that ranks candidates by the weakness of the outer ties between the candidate and its neighbouring words in the text. The method defines a measure of the weakness of outer ties, based on the degree of unpredictability of the surrounding words, in order to distinguish MWEs from other recurrent groups of consecutive words. Put simply, if the words preceding and following an MWE candidate are unpredictable because a relatively large number of different words occur as its neighbours, this is taken as strong evidence that the candidate is a real MWE. The method generates a single normalized unpredictability score, which enables the comparison not only of MWE candidates with different occurrence frequencies but also of MWE candidates with different numbers of component words (for example, two-word candidates with three-word candidates). Comparisons with several groups of well-known methods (statistical measures of association and termhood, vector space models of composition, and supervised learning methods) illustrate the effectiveness of the proposed method on two-word MWE candidates and on the merged set of two- and three-word candidates. en_US
dc.identifier.issn 0267-6192
dc.identifier.scopus 2-s2.0-84991725989
dc.identifier.uri https://hdl.handle.net/20.500.14365/3148
dc.language.iso en en_US
dc.publisher C R L Publishing Ltd en_US
dc.relation.ispartof Computer Systems Science and Engineering en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject Multiword expression en_US
dc.subject predictability en_US
dc.subject association measures en_US
dc.subject termhood measures en_US
dc.subject compositionality en_US
dc.subject supervised learning en_US
dc.subject Automatic Extraction en_US
dc.title Neighbour Unpredictability Measure in Multiword Expression Extraction en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.coar.access metadata only access
gdc.coar.type text::journal::journal article
gdc.description.department İzmir Ekonomi Üniversitesi en_US
gdc.description.departmenttemp [Metin, Senem Kumova] Izmir Univ Econ, Dept Software Engn, Izmir, Turkey en_US
gdc.description.endpage 221 en_US
gdc.description.issue 3 en_US
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality Q2
gdc.description.startpage 209 en_US
gdc.description.volume 31 en_US
gdc.identifier.wos WOS:000383926100003
gdc.index.type WoS
gdc.index.type Scopus
gdc.scopus.citedcount 6
gdc.virtual.author Kumova Metin, Senem
gdc.wos.citedcount 5
relation.isAuthorOfPublication 81d6fcea-c590-42aa-8443-7459c9eab7fa
relation.isAuthorOfPublication.latestForDiscovery 81d6fcea-c590-42aa-8443-7459c9eab7fa
relation.isOrgUnitOfPublication 805c60d5-b806-4645-8214-dd40524c388f
relation.isOrgUnitOfPublication 26a7372c-1a5e-42d9-90b6-a3f7d14cad44
relation.isOrgUnitOfPublication e9e77e3e-bc94-40a7-9b24-b807b2cd0319
relation.isOrgUnitOfPublication.latestForDiscovery 805c60d5-b806-4645-8214-dd40524c388f

Files

Original bundle

Name: 2277.pdf
Size: 513.14 KB
Format: Adobe Portable Document Format