Enlarging Multiword Expression Dataset by Co-Training
| dc.contributor.author | Kumova Metin, Senem | |
| dc.date.accessioned | 2023-06-16T14:41:20Z | |
| dc.date.available | 2023-06-16T14:41:20Z | |
| dc.date.issued | 2018 | |
| dc.description.abstract | In multiword expressions (MWEs), multiple words unite to form a new unit of language. When MWE identification is treated as a binary classification task, one of the most important factors in performance is training the classifier with a sufficient number of labelled samples. Since manual labelling is time-consuming, the performance of MWE recognition studies is limited by the size of the training sets. In this study, we propose comparison-based and common-decision co-training approaches to enlarge the MWE dataset. In the experiments, the performance of the proposed approaches was compared with that of standard co-training [1] and manual labelling, where statistical and linguistic features are employed as two different views of the MWE dataset [2]. A number of tests with different settings were performed on a Turkish MWE dataset. Ten different classifiers were used in the experiments, and the best-performing classifier pair was observed to be the SMO-SMO pair. The experimental results showed that the common-decision co-training approach is an alternative to hand-labelling of large MWE datasets and that both newly proposed approaches outperform standard co-training [2] when the training set is to be enlarged in MWE classification. | en_US |
| dc.description.sponsorship | Scientific and Technological Research Council of Turkey [115E469] | en_US |
| dc.description.sponsorship | This work was carried out under a grant from The Scientific and Technological Research Council of Turkey (Project No. 115E469, Identification of Multiword Expressions in Turkish Texts). Further information and statistics on the MWE dataset are available on the project web page (http://app.ieu-nlpteam.com:8000). | en_US |
| dc.identifier.doi | 10.3906/elk-1709-185 | |
| dc.identifier.issn | 1300-0632 | |
| dc.identifier.issn | 1303-6203 | |
| dc.identifier.scopus | 2-s2.0-85054525652 | |
| dc.identifier.uri | https://doi.org/10.3906/elk-1709-185 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.14365/2600 | |
| dc.language.iso | en | en_US |
| dc.publisher | Scientific Technical Research Council Turkey-Tubitak | en_US |
| dc.relation.ispartof | Turkish Journal of Electrical Engineering and Computer Sciences | en_US |
| dc.rights | info:eu-repo/semantics/closedAccess | en_US |
| dc.subject | Multiword expression | en_US |
| dc.subject | classification | en_US |
| dc.subject | training set | en_US |
| dc.subject | co-training | en_US |
| dc.title | Enlarging Multiword Expression Dataset by Co-Training | en_US |
| dc.type | Article | en_US |
| dspace.entity.type | Publication | |
| gdc.author.id | Kumova Metin, Senem/0000-0002-9606-3625 | |
| gdc.author.scopusid | 24471923700 | |
| gdc.bip.impulseclass | C5 | |
| gdc.bip.influenceclass | C5 | |
| gdc.bip.popularityclass | C5 | |
| gdc.coar.access | metadata only access | |
| gdc.coar.type | text::journal::journal article | |
| gdc.collaboration.industrial | false | |
| gdc.description.department | İEÜ, Faculty of Engineering, Department of Software Engineering | en_US |
| gdc.description.departmenttemp | [Kumova Metin, Senem] Izmir Univ Econ, Fac Engn, Dept Software Engn, Izmir, Turkey | en_US |
| gdc.description.endpage | 2594 | en_US |
| gdc.description.issue | 5 | en_US |
| gdc.description.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | en_US |
| gdc.description.scopusquality | Q2 | |
| gdc.description.startpage | 2583 | en_US |
| gdc.description.volume | 26 | en_US |
| gdc.description.wosquality | Q3 | |
| gdc.identifier.openalex | W2895395003 | |
| gdc.identifier.trdizinid | 323563 | |
| gdc.identifier.wos | WOS:000448109200034 | |
| gdc.index.type | WoS | |
| gdc.index.type | Scopus | |
| gdc.index.type | TR-Dizin | |
| gdc.oaire.accesstype | GOLD | |
| gdc.oaire.diamondjournal | false | |
| gdc.oaire.downloads | 12 | |
| gdc.oaire.impulse | 0.0 | |
| gdc.oaire.influence | 2.4895952E-9 | |
| gdc.oaire.isgreen | true | |
| gdc.oaire.popularity | 1.0376504E-9 | |
| gdc.oaire.publicfunded | false | |
| gdc.oaire.sciencefields | 0202 electrical engineering, electronic engineering, information engineering | |
| gdc.oaire.sciencefields | 02 engineering and technology | |
| gdc.oaire.views | 54 | |
| gdc.openalex.fwci | 0.0 | |
| gdc.openalex.normalizedpercentile | 0.11 | |
| gdc.opencitations.count | 0 | |
| gdc.plumx.mendeley | 1 | |
| gdc.plumx.scopuscites | 0 | |
| gdc.scopus.citedcount | 0 | |
| gdc.virtual.author | Kumova Metin, Senem | |
| gdc.wos.citedcount | 0 | |
| relation.isAuthorOfPublication | 81d6fcea-c590-42aa-8443-7459c9eab7fa | |
| relation.isAuthorOfPublication.latestForDiscovery | 81d6fcea-c590-42aa-8443-7459c9eab7fa | |
| relation.isOrgUnitOfPublication | 805c60d5-b806-4645-8214-dd40524c388f | |
| relation.isOrgUnitOfPublication | 26a7372c-1a5e-42d9-90b6-a3f7d14cad44 | |
| relation.isOrgUnitOfPublication | e9e77e3e-bc94-40a7-9b24-b807b2cd0319 | |
| relation.isOrgUnitOfPublication.latestForDiscovery | 805c60d5-b806-4645-8214-dd40524c388f |
