Data Mining for Emotion Recognition in Speech

Akkurt, Gamze

Files

155-572974.pdf (903.38 KB)

Date

2019

Authors

Akkurt, Gamze

Publisher

İzmir Ekonomi Üniversitesi

Abstract

Popular features for emotion classification in speech signals include fundamental frequency, voice quality, energy, spectral features, and Mel-frequency cepstral coefficients (MFCCs). While most work on speech emotion recognition focuses on these acoustic features, this thesis addresses the problem using features obtained from emotional patterns. In our approach, we transform the speech signal into a discretized signal and extract distinctive patterns that can distinguish between different emotions. A set of feature vectors is then built from the extracted patterns and fed to a classifier. Experimental results indicate that the proposed approach learns the emotional state of speech effectively, both from pattern-based features alone and from acoustic features supplemented with pattern features. With two classifiers, pattern-based features yield a 35% improvement in accuracy over state-of-the-art acoustic features. Moreover, when all acoustic features are combined with pattern-based features, classification accuracy in emotion recognition shows an improvement of over 80%.
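The pipeline outlined in the abstract (discretize the signal, extract distinctive patterns, build feature vectors for a classifier) can be illustrated with a minimal sketch. This record does not say how the thesis discretizes the signal or mines patterns, so everything here is an assumption made for illustration: equal-width binning into letters, fixed-length n-grams as "patterns", a simple frequency-difference score for distinctiveness, and the hypothetical function names `discretize`, `ngrams`, `distinctive_patterns`, and `pattern_features`.

```python
from collections import Counter


def discretize(signal, n_bins=4):
    """Map each sample to a letter ('a', 'b', ...) using equal-width bins.

    Assumption: the thesis may use a different discretization scheme.
    """
    lo, hi = min(signal), max(signal)
    width = (hi - lo) / n_bins or 1.0  # avoid division by zero on flat signals
    symbols = []
    for x in signal:
        idx = min(int((x - lo) / width), n_bins - 1)
        symbols.append(chr(ord("a") + idx))
    return "".join(symbols)


def ngrams(seq, n=3):
    """Return all overlapping substrings of length n."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def distinctive_patterns(sequences, labels, n=3, top_k=5):
    """Pick, per class, the n-grams that are relatively more frequent in
    that class than in any other class (an illustrative criterion only)."""
    counts = {}
    for seq, lab in zip(sequences, labels):
        counts.setdefault(lab, Counter()).update(ngrams(seq, n))
    totals = {lab: sum(c.values()) or 1 for lab, c in counts.items()}

    selected = []
    for lab, c in counts.items():
        def score(p):
            own = c[p] / totals[lab]
            other = max(
                (counts[o][p] / totals[o] for o in counts if o != lab),
                default=0.0,
            )
            return own - other

        ranked = sorted(c, key=lambda p: (-score(p), p))
        for p in ranked[:top_k]:
            if score(p) > 0 and p not in selected:
                selected.append(p)
    return selected


def pattern_features(seq, patterns):
    """Count overlapping occurrences of each pattern in a symbol sequence."""
    return [
        sum(1 for i in range(len(seq) - len(p) + 1) if seq[i:i + len(p)] == p)
        for p in patterns
    ]


if __name__ == "__main__":
    # Toy stand-ins for per-utterance contours (not real speech data).
    signals = [[0, 1, 2, 3, 0, 1, 2, 3], [3, 3, 2, 3, 3, 2, 3, 3]]
    labels = ["emotion_A", "emotion_B"]
    seqs = [discretize(s) for s in signals]
    pats = distinctive_patterns(seqs, labels, n=2, top_k=2)
    print(pats, [pattern_features(s, pats) for s in seqs])
```

The resulting count vectors could be passed to any standard classifier, on their own or concatenated with acoustic features, which corresponds to the two settings the abstract compares.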

Keywords

Computer Engineering and Computer Science and Control, Emotion recognition, Speech processing

WoS Q

N/A

Scopus Q

N/A

Start Page

1

End Page

65

URI

https://tez.yok.gov.tr/UlusalTezMerkezi/TezGoster?key=jNRDC1RLfVd4_T7x7ZXmmW_Mpz0vZqk_6SmnsacG5dI-40OBIOgj0E2iWotHusLO
https://hdl.handle.net/20.500.14365/114

Collections

Master's Theses
