Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14365/2600
Title: Enlarging Multiword Expression Dataset by Co-Training
Authors: Kumova Metin, Senem
Keywords: Multiword expression
classification
training set
co-training
Publisher: Scientific Technical Research Council Turkey-Tubitak
Abstract: In multiword expressions (MWEs), multiple words unite to form a new unit in language. When MWE identification is treated as a binary classification task, one of the most important factors in performance is training the classifier with a sufficient number of labelled samples. Since manual labelling is time-consuming, the performance of MWE recognition studies is limited by the size of the training sets. In this study, we propose the comparison-based and common-decision co-training approaches to enlarge the MWE dataset. In the experiments, the performance of the proposed approaches was compared with that of standard co-training [1] and manual labelling, where statistical and linguistic features are employed as two different views of the MWE dataset [2]. A number of tests with different settings were performed on a Turkish MWE dataset. Ten different classifiers were used in the experiments, and the best-performing classifier pair was observed to be the SMO-SMO pair. The experimental results showed that the common-decision co-training approach is an alternative to hand-labelling of large MWE datasets and that both newly proposed approaches outperform standard co-training [1] when the training set is to be enlarged in MWE classification.
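The standard co-training baseline mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual two-view scheme: one classifier per view, each adding its most confident predictions on unlabelled samples to a shared labelled set. It does not reproduce the comparison-based or common-decision variants, which the abstract does not specify. The nearest-centroid classifier is a stand-in for SMO, and every name here (`centroid_fit`, `centroid_predict`, `co_train`, the toy data) is hypothetical.

```python
def centroid_fit(rows, labels):
    """Return per-class mean vectors (a toy stand-in for a real classifier)."""
    sums, counts = {}, {}
    for row, y in zip(rows, labels):
        acc = sums.setdefault(y, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def centroid_predict(model, row):
    """Return (label, confidence); confidence is the squared-distance margin
    between the nearest and second-nearest centroid (needs two classes)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, c)), y) for y, c in model.items()
    )
    (d0, y0), (d1, _) = dists[0], dists[1]
    return y0, d1 - d0


def co_train(view_a, view_b, seed_labels, rounds=5, per_round=1):
    """Grow the labelled set: each round, the classifier of each view labels
    the `per_round` unlabelled samples it is most confident about."""
    labels = dict(seed_labels)
    for _ in range(rounds):
        pool = [i for i in range(len(view_a)) if i not in labels]
        if not pool:
            break
        snapshot = dict(labels)  # both views train on the same set this round
        for view in (view_a, view_b):
            idx = list(snapshot)
            model = centroid_fit([view[i] for i in idx], [snapshot[i] for i in idx])
            scored = [
                (centroid_predict(model, view[i]), i) for i in pool if i not in labels
            ]
            scored.sort(key=lambda s: s[0][1], reverse=True)
            for (label, _), i in scored[:per_round]:
                labels[i] = label
    return labels


# Toy data: view A plays the role of statistical features, view B of
# linguistic features. The values are invented for illustration only.
view_a = [[0.1 * k, 0.2 * k] for k in range(5)] + [
    [5.0 + 0.1 * k, 5.0 - 0.2 * k] for k in range(5)
]
view_b = [[0.1 * k] for k in range(5)] + [[4.0 + 0.1 * k] for k in range(5)]
true_labels = [0] * 5 + [1] * 5

seed = {0: 0, 5: 1}  # one hand-labelled sample per class
result = co_train(view_a, view_b, seed, rounds=5, per_round=1)
print(sorted(result.items()))
```

In this sketch each view contributes at most `per_round` new labels per round, so the labelled set grows slowly and early mistakes have limited reach; a real setup would also hold out a test set to check whether the enlarged training set actually improves classification.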
URI: https://doi.org/10.3906/elk-1709-185
https://hdl.handle.net/20.500.14365/2600
ISSN: 1300-0632
1303-6203
Appears in Collections: Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
TR Dizin İndeksli Yayınlar Koleksiyonu / TR Dizin Indexed Publications Collection
WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection

Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.