From Words To Sentences: Advancing Turkish Emotion Analysis Through Emotion Enrichment

Aka Uymaz, Hande

Files

845177.pdf (3.53 MB)

Date

2023

Authors

Aka Uymaz, Hande

Publisher

İzmir Ekonomi Üniversitesi

Abstract

Machine understanding of language is difficult in natural language processing because it requires interpreting language accurately, capturing the true meaning within the data source, and distinguishing emotional nuances. Current word vectorization models are successful at extracting semantic information from textual data. However, these models give similar vector representations to words that are often used together. As a result, words with opposite emotions may have similar vectors because they frequently co-occur. To overcome this deficiency in emotion detection, current research focuses on enriching vectors with emotional information. The fundamental goal of vector enrichment is to reproject the vector space so that words with similar semantic and emotional meanings lie closer together. This study applies three emotion enrichment models to Turkish words and sentences, using two semantic (Word2Vec and GloVe) and two contextual (BERT and DistilBERT) vectorization methods. Turkish, an agglutinative language, is expected to produce different results from other languages frequently studied in this context. The results are promising for enrichment at both the word and sentence levels. Enriched sentence representation is proposed for the first time in the literature, for both English and Turkish. In addition, an optimization method is proposed that filters the emotion lexicons and reduces the dimensionality of high-dimensional vectors to identify the parts containing emotional information; it can be applied to any language and vector model. Experimental results indicate that emotionally enriched vector representations yield better results than the original models.
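The general idea of emotion enrichment described in the abstract can be illustrated with a small sketch. This is not one of the three enrichment models used in the thesis, which the abstract does not specify; it is a simple centroid-pulling update chosen only for illustration. The toy vectors, the tiny lexicon, the `enrich` function, and the `alpha` parameter are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def enrich(vectors, lexicon, alpha=0.5):
    """Pull each labelled word toward the centroid of words sharing its emotion.

    Illustrative assumption only: alpha controls how far a vector moves
    (0 = unchanged, 1 = replaced by the centroid of its emotion group).
    """
    groups = {}
    for word, label in lexicon.items():
        if word in vectors:
            groups.setdefault(label, []).append(word)

    enriched = dict(vectors)  # words without an emotion label stay unchanged
    for words in groups.values():
        dim = len(vectors[words[0]])
        centroid = [sum(vectors[w][i] for w in words) / len(words) for i in range(dim)]
        for w in words:
            enriched[w] = [(1 - alpha) * x + alpha * c
                           for x, c in zip(vectors[w], centroid)]
    return enriched


# Toy 3-dimensional vectors (made up): "happy" and "sad" start out close,
# mimicking words that co-occur often despite opposite emotions.
vectors = {
    "happy":  [0.9, 0.8, 0.1],
    "sad":    [0.8, 0.9, -0.1],
    "joyful": [0.1, 0.3, 0.9],
    "gloomy": [0.2, 0.1, -0.9],
}
# Toy emotion lexicon (made up).
lexicon = {"happy": "joy", "joyful": "joy", "sad": "sadness", "gloomy": "sadness"}

enriched = enrich(vectors, lexicon)

before_opposite = cosine(vectors["happy"], vectors["sad"])
after_opposite = cosine(enriched["happy"], enriched["sad"])
before_same = cosine(vectors["happy"], vectors["joyful"])
after_same = cosine(enriched["happy"], enriched["joyful"])

print("happy vs sad:   ", before_opposite, "->", after_opposite)
print("happy vs joyful:", before_same, "->", after_same)
```

With these toy inputs, the similarity between the opposite-emotion pair drops and the similarity between the same-emotion pair rises, which is the reprojection effect the abstract describes. Real enrichment would operate on Word2Vec, GloVe, BERT, or DistilBERT vectors and a full emotion lexicon.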

Keywords

Computer Engineering and Computer Science and Control

Start Page

1

End Page

152

URI

https://tez.yok.gov.tr/UlusalTezMerkezi/TezGoster?key=weFMBHaUra8rsS5wi2bmHEM7JIZZ1iKLePTHraVD1S5WRKtSGTN0DctegmvnkgTr
https://hdl.handle.net/20.500.14365/5190

Collections

Graduate School Thesis Collection
