Application of Vector Space Models To Detect Semantically Non-Compositional Word Combinations in Turkish

Eren, Levent Tolga

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14365/57

Title:	Application of Vector Space Models To Detect Semantically Non-Compositional Word Combinations in Turkish
Other Titles:	Türkçede Anlamsal Birleşimi Olmayan Kelime Gruplarının Tespitinde Vektör Uzay Modellerinin Uygulanması
Authors:	Eren, Levent Tolga
Advisors:	Metin, Senem Kumova
Keywords:	anlamsal birleşimlilik; vektör uzay modeli; doğal dil işleme; semantic compositionality; vector space model; natural language processing; Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol; Computer Engineering and Computer Science and Control
Publisher:	İzmir Ekonomi Üniversitesi
Abstract:	Semantic compositionality describes the relation between the meaning of a word combination and the meanings of its components. In non-compositional expressions, the words combine to produce a meaning that differs from those of the parts. Identifying non-compositional expressions may support several natural language processing tasks, such as machine translation, word sense disambiguation, and language generation. The objective of this thesis is to explore the performance of vector space models in detecting non-compositional expressions in Turkish. A data set of 2229 two-word combinations, built from six different Turkish corpora, is used. Three sets of five different vector space models are employed in the experiments. The models are evaluated using three metrics: precision, recall, and F-measure. The experimental results show that the model that measures the similarity between the vector of the word combination and the vector of its second component word produced higher average F-scores for all test corpora.
URI:	https://tez.yok.gov.tr/UlusalTezMerkezi/TezGoster?key=OykDDeWBWTL9-Wm52sZBrN1LMzGnDtR5tJFxpH3d6YD_Y7DSyhTmJrphyvG8jER7 https://hdl.handle.net/20.500.14365/57
Appears in Collections:	Lisansüstü Eğitim Enstitüsü Tez Koleksiyonu

Files in This Item:

File	Size	Format
57.pdf	943.51 kB	Adobe PDF

Page view(s):	372 (checked on Oct 27, 2025)
Download(s):	44 (checked on Oct 27, 2025)
