Standard Co-Training in Multiword Expression Detection

Author: Metin, Senem Kumova
Record date: 2023-06-16
Published: 2017
ISBN: 978-3-319-72038-8, 978-3-319-72037-1
ISSN: 0302-9743, 1611-3349
DOI: https://doi.org/10.1007/978-3-319-72038-8_14
Handle: https://hdl.handle.net/20.500.14365/823
Conference: 9th International Conference on Intelligent Human Computer Interaction (IHCI), December 11-13, 2017, Evry, France

Abstract: Multiword expressions (MWEs) are units in language where multiple words unite without an obvious or known reason. Since MWEs make up a substantial share of both written and spoken language, identifying them is accepted to be an important task in natural language processing. In this paper, treating MWE detection as a binary classification task, we propose to use a semi-supervised learning algorithm, standard co-training [1]. Co-training is a semi-supervised method that employs two classifiers with two different views to label unlabeled data iteratively, in order to enlarge training sets of limited size. In our experiments, linguistic and statistical features that distinguish MWEs from random word combinations are used as the two views. Two different pairs of classifiers are employed under a group of experimental settings. The tests are performed on a Turkish MWE data set of 3946 positive and 4230 negative MWE candidates. The results show that the classifier trained on the statistical view succeeds in MWE detection when the training set is enlarged by co-training.

Language: en
Access: info:eu-repo/semantics/openAccess
Keywords: Multiword expression; Classification; Co-training
Type: Conference Object
Scopus ID: 2-s2.0-85038215750
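
The co-training loop summarized in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the nearest-centroid learner, the toy numeric features standing in for the linguistic and statistical views, and the names `CentroidClassifier`, `train_pair`, `co_train`, `rounds`, and `per_class` are all assumptions made for illustration. Each round, one classifier per view is trained on the labeled set, each picks its most confident unlabeled candidates per class, and those pseudo-labeled candidates enlarge the training set.

```python
class CentroidClassifier:
    """Nearest-centroid learner; a hypothetical stand-in for any real classifier."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_with_confidence(self, x):
        dist = {
            label: sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5
            for label, c in self.centroids.items()
        }
        label = min(dist, key=dist.get)
        # Confidence = margin between the farthest and the nearest centroid.
        return label, max(dist.values()) - dist[label]


def train_pair(labeled):
    """Train one classifier per view on (view_a, view_b, label) triples."""
    labels = [y for _, _, y in labeled]
    clf_a = CentroidClassifier().fit([a for a, _, _ in labeled], labels)
    clf_b = CentroidClassifier().fit([b for _, b, _ in labeled], labels)
    return clf_a, clf_b


def co_train(labeled, unlabeled, rounds=5, per_class=1):
    """Enlarge `labeled` with pseudo-labeled examples drawn from `unlabeled`."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        chosen = {}  # pool index -> pseudo-label
        for view, clf in enumerate(train_pair(labeled)):
            scored = [
                clf.predict_with_confidence(ex[view]) + (i,)
                for i, ex in enumerate(pool)
            ]
            for target in (0, 1):
                ranked = sorted(
                    ((conf, i) for lab, conf, i in scored if lab == target),
                    reverse=True,
                )
                # Each view contributes its most confident picks per class.
                for _conf, i in ranked[:per_class]:
                    chosen.setdefault(i, target)
        labeled += [(pool[i][0], pool[i][1], lab) for i, lab in chosen.items()]
        pool = [ex for i, ex in enumerate(pool) if i not in chosen]
    # Retrain both views on the enlarged set and return it alongside them.
    return train_pair(labeled) + (labeled,)


# Toy data (assumed): view_a has two features, view_b has one.
positives = [((0.90 + 0.01 * i, 0.8), (5.0 + 0.1 * i,)) for i in range(10)]
negatives = [((0.10 + 0.01 * i, 0.2), (1.0 + 0.1 * i,)) for i in range(10)]
seed = [positives[0] + (1,), negatives[0] + (0,)]
unlabeled = positives[1:] + negatives[1:]

clf_a, clf_b, enlarged = co_train(seed, unlabeled)
print(len(seed), "seed examples grew to", len(enlarged))
```

In the sketch, label 1 plays the role of an MWE and label 0 a random word combination. A real experiment would replace the centroid learner with the chosen classifier pair and the toy vectors with the actual linguistic and statistical feature sets.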