Neighbour Unpredictability Measure in Multiword Expression Extraction

Date

2016

Authors

Metin, Senem Kumova

Publisher

C R L Publishing Ltd

Abstract

In all natural languages, some recurrent combinations of words form multiword expressions (MWEs) because of the strong cohesive ties between their component words. Extracting MWEs from text plays an important role in natural language processing applications and information retrieval. In this study, we introduce an MWE extraction method that ranks candidates by the weakness of the outer ties between a candidate and its neighbouring words in the text. To distinguish MWEs from other recurrent groups of consecutive words, the method measures the weakness of outer ties by the degree of unpredictability of the surrounding words. Put simply, if the words preceding and following an MWE candidate are unpredictable because a relatively large number of different words occur next to it, this is taken as strong evidence that the candidate is a real MWE. The method generates a single normalized unpredictability score, which allows comparison not only of MWE candidates with different occurrence frequencies but also of candidates with different numbers of component words (for example, two-word candidates with three-word candidates). Comparisons with several groups of well-known methods (statistical measures of association and termhood, vector space models of composition, and supervised learning methods) illustrate the effectiveness of the proposed method on two-word MWE candidates and on the merged set of two- and three-word candidates.
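
The idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual formula, so the score used here is an assumption made purely for illustration: for each recurrent candidate, the number of distinct left neighbours and the number of distinct right neighbours are each divided by the candidate's occurrence count, and the two ratios are averaged. The function name `neighbour_unpredictability`, the boundary markers `<s>` and `</s>`, and the minimum-frequency threshold are likewise hypothetical.

```python
from collections import defaultdict


def neighbour_unpredictability(tokens, n_values=(2, 3), min_freq=2):
    """Score recurrent n-grams by how varied their neighbouring words are.

    Illustrative assumption, not the paper's measure:
    score = mean(distinct_left / freq, distinct_right / freq), in (0, 1].
    """
    left = defaultdict(set)    # candidate -> distinct preceding words
    right = defaultdict(set)   # candidate -> distinct following words
    freq = defaultdict(int)    # candidate -> occurrence count

    for n in n_values:
        for i in range(len(tokens) - n + 1):
            cand = tuple(tokens[i:i + n])
            freq[cand] += 1
            # Text boundaries are treated as pseudo-neighbours (assumption).
            left[cand].add(tokens[i - 1] if i > 0 else "<s>")
            right[cand].add(tokens[i + n] if i + n < len(tokens) else "</s>")

    scores = {}
    for cand, f in freq.items():
        if f < min_freq:  # keep only recurrent candidates
            continue
        # Dividing by the frequency normalizes the score, so candidates of
        # different frequency and different length share one scale.
        scores[cand] = 0.5 * (len(left[cand]) / f + len(right[cand]) / f)
    return scores


tokens = ("went in spite of rain stayed in spite of warnings "
          "won in spite of injuries").split()
scores = neighbour_unpredictability(tokens)
for cand, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(" ".join(cand), round(score, 3))
```

In this toy text the three-word candidate "in spite of" has a different neighbour on each side at every occurrence, so it ranks above its two-word fragments "in spite" and "spite of", each of which always has the same word on one side. Because the score is normalized by frequency, two-word and three-word candidates end up in a single ranking, which is the property the abstract emphasizes.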

Keywords

Multiword expression, predictability, association measures, termhood measures, compositionality, supervised learning, automatic extraction

Scopus Q

Q2

Source

Computer Systems Science and Engineering

Volume

31

Issue

3

Start Page

209

End Page

221
SCOPUS™ Citations

6

checked on Mar 15, 2026

Web of Science™ Citations

5

checked on Mar 15, 2026
