Feature Selection in Multiword Expression Recognition

dc.contributor.author Metin, Senem Kumova
dc.date.accessioned 2023-06-16T12:59:26Z
dc.date.available 2023-06-16T12:59:26Z
dc.date.issued 2018
dc.description.abstract In multiword expression (MWE) recognition, many studies employ different learning methods to decide whether a given word combination is a multiword expression. The recognition methods commonly utilize a number of features that are extracted from a data source, frequently from the given text. Though the recognition methods and the features are well studied, we believe that to achieve the best possible performance with a learning method, different subsets of features should also be considered and the best performing subset must be selected. In this paper, we propose a procedure that covers the performance comparison of well-known feature selection methods to obtain the best feature subset in MWE recognition. The evaluation tests are performed on a Turkish MWE data set and the performance is measured by precision, recall and F1 values. The highest F1 value (0.731) is obtained by the C4.5 classifier employing either a wrapper or a filtering method in feature selection. In these settings, performance increases by 1.11% compared to the setting where all features are employed in classification. Based on the experimental results, it may be stated that feature selection improves the performance of MWE recognition by eliminating the noisy or non-effective features. Moreover, the proposed feature selection method contributes to the overall MWE recognition system by reducing the measurement and storage requirements due to the lower number of features in classification, providing a faster and more cost-effective learning model. (C) 2017 Elsevier Ltd. All rights reserved. en_US
dc.description.sponsorship TUBITAK - The Scientific and Technological Research Council of Turkey [115E469] en_US
dc.description.sponsorship This work was carried out under a grant from TUBITAK - The Scientific and Technological Research Council of Turkey, Project No: 115E469, Identification of Multi-word Expressions in Turkish Texts. en_US
dc.identifier.doi 10.1016/j.eswa.2017.09.047
dc.identifier.issn 0957-4174
dc.identifier.issn 1873-6793
dc.identifier.scopus 2-s2.0-85029703177
dc.identifier.uri https://doi.org/10.1016/j.eswa.2017.09.047
dc.identifier.uri https://hdl.handle.net/20.500.14365/1218
dc.language.iso en en_US
dc.publisher Pergamon-Elsevier Science Ltd en_US
dc.relation.ispartof Expert Systems With Applications en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject Multiword expression en_US
dc.subject Multiword expression recognition en_US
dc.subject Learning algorithms en_US
dc.subject Feature selection en_US
dc.subject Named Entity Recognition en_US
dc.title Feature Selection in Multiword Expression Recognition en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.scopusid 24471923700
gdc.bip.impulseclass C4
gdc.bip.influenceclass C4
gdc.bip.popularityclass C4
gdc.coar.access metadata only access
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial false
gdc.description.department İzmir Ekonomi Üniversitesi en_US
gdc.description.departmenttemp [Metin, Senem Kumova] Izmir Univ Econ, Fac Engn, TR-35330 Izmir, Turkey en_US
gdc.description.endpage 123 en_US
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality Q1
gdc.description.startpage 106 en_US
gdc.description.volume 92 en_US
gdc.description.wosquality Q1
gdc.identifier.openalex W2754717791
gdc.identifier.wos WOS:000414107100009
gdc.index.type WoS
gdc.index.type Scopus
gdc.oaire.diamondjournal false
gdc.oaire.downloads 2
gdc.oaire.impulse 6.0
gdc.oaire.influence 3.354068E-9
gdc.oaire.isgreen true
gdc.oaire.popularity 6.630462E-9
gdc.oaire.publicfunded false
gdc.oaire.sciencefields 0202 electrical engineering, electronic engineering, information engineering
gdc.oaire.sciencefields 02 engineering and technology
gdc.oaire.views 5
gdc.openalex.collaboration National
gdc.openalex.fwci 1.3652
gdc.openalex.normalizedpercentile 0.86
gdc.opencitations.count 11
gdc.plumx.crossrefcites 11
gdc.plumx.mendeley 39
gdc.plumx.scopuscites 14
gdc.scopus.citedcount 14
gdc.virtual.author Kumova Metin, Senem
gdc.wos.citedcount 14
relation.isAuthorOfPublication 81d6fcea-c590-42aa-8443-7459c9eab7fa
relation.isAuthorOfPublication.latestForDiscovery 81d6fcea-c590-42aa-8443-7459c9eab7fa
relation.isOrgUnitOfPublication 805c60d5-b806-4645-8214-dd40524c388f
relation.isOrgUnitOfPublication 26a7372c-1a5e-42d9-90b6-a3f7d14cad44
relation.isOrgUnitOfPublication e9e77e3e-bc94-40a7-9b24-b807b2cd0319
relation.isOrgUnitOfPublication.latestForDiscovery 805c60d5-b806-4645-8214-dd40524c388f

Files

Original bundle

Name: 241.pdf
Size: 1.87 MB
Format: Adobe Portable Document Format