Please use this identifier to cite or link to this item:
https://hdl.handle.net/20.500.14365/2600
Title: | Enlarging multiword expression dataset by co-training |
Authors: | Kumova Metin, Senem |
Keywords: | Multiword expression; classification; training set; co-training |
Publisher: | Scientific Technical Research Council Turkey-Tubitak |
Abstract: | In multiword expressions (MWEs), multiple words unite to build a new unit in language. When MWE identification is treated as a binary classification task, one of the most important factors in performance is training the classifier with a sufficient number of labelled samples. Since manual labelling is time-consuming, the performance of MWE recognition studies is limited by the size of the training sets. In this study, we propose the comparison-based and common-decision co-training approaches in order to enlarge the MWE dataset. In the experiments, the performances of the proposed approaches were compared to those of standard co-training [1] and manual labelling, where statistical and linguistic features are employed as two different views of the MWE dataset [2]. A number of tests with different settings were performed on a Turkish MWE dataset. Ten different classifiers were utilized in the experiments, and the best performing classifier pair was observed to be the SMO-SMO pair. The experimental results showed that the common-decision co-training approach is an alternative to hand-labelling of large MWE datasets, and that both newly proposed approaches outperform standard co-training [1] when the training set is to be enlarged in MWE classification. |
URI: | https://doi.org/10.3906/elk-1709-185 ; https://hdl.handle.net/20.500.14365/2600 |
ISSN: | 1300-0632; 1303-6203 |
Appears in Collections: | Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection; TR Dizin İndeksli Yayınlar Koleksiyonu / TR Dizin Indexed Publications Collection; WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection |
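
The abstract refers to standard co-training over two feature views but does not spell out the procedure. The following is only a minimal sketch of a generic standard co-training loop, under stated assumptions: the nearest-centroid learner (a stand-in for SMO or any real classifier), the toy feature values, and the names `CentroidClassifier`, `co_train`, `stat_view`, and `ling_view` are all hypothetical and not taken from the paper. The comparison-based and common-decision variants are not reproduced here, since the record gives no details about them.

```python
from statistics import mean


class CentroidClassifier:
    """Tiny nearest-centroid classifier (illustrative stand-in for a real learner)."""

    def fit(self, X, y):
        # One centroid per class: the column-wise mean of that class's rows.
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    @staticmethod
    def _dist(x, c):
        return sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5

    def predict_with_confidence(self, x):
        # Predict the nearest centroid; confidence grows as the sample gets
        # relatively closer to the winning centroid than to the others.
        dists = {label: self._dist(x, c) for label, c in self.centroids.items()}
        label = min(dists, key=dists.get)
        total = sum(dists.values())
        conf = 1.0 - dists[label] / total if total > 0 else 0.5
        return label, conf


def co_train(view1, view2, labels, rounds=5, per_round=1):
    """Generic co-training: labels holds 0/1 for labelled samples, None otherwise.

    Each round, one classifier per view is trained on the labelled samples and
    labels its `per_round` most confident unlabelled samples for the shared pool.
    Returns a new list; the input list is not modified.
    """
    labels = list(labels)
    for _ in range(rounds):
        labelled = [i for i, t in enumerate(labels) if t is not None]
        unlabelled = [i for i, t in enumerate(labels) if t is None]
        if not unlabelled:
            break
        y = [labels[i] for i in labelled]
        newly = {}
        for view in (view1, view2):
            clf = CentroidClassifier().fit([view[i] for i in labelled], y)
            scored = sorted(
                ((clf.predict_with_confidence(view[i]), i) for i in unlabelled),
                key=lambda s: s[0][1],
                reverse=True,
            )
            for (label, _conf), i in scored[:per_round]:
                newly.setdefault(i, label)  # first view to claim a sample wins
        for i, label in newly.items():
            labels[i] = label
    return labels


# Hypothetical toy data: one statistical and one linguistic feature per candidate.
stat_view = [[0.9], [0.1], [0.8], [0.2], [0.85], [0.15]]
ling_view = [[1.0], [0.0], [0.9], [0.1], [0.8], [0.3]]
seed = [1, 0, None, None, None, None]  # 1 = MWE, 0 = non-MWE, None = unlabelled

enlarged = co_train(stat_view, ling_view, seed)
print(enlarged)
```

In this sketch the two views never see each other's features; they interact only through the shared pool of labels, which is the core idea of co-training.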
Files in This Item:
File | Size | Format | |
---|---|---|---|
2600.pdf (restricted until 2030-01-01) | 506.55 kB | Adobe PDF | |
Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.