A Comparison of Neural Networks for Real-Time Emotion Recognition From Speech Signals

Ünlütürk, Mehmet Süleyman; Oguz K.; Atay C.

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14365/4618

Title:	A Comparison of Neural Networks for Real-Time Emotion Recognition From Speech Signals
Authors:	Ünlütürk, Mehmet Süleyman; Oguz K.; Atay C.
Keywords:	Back propagation learning algorithm; Bayes optimal decision rule; Emotion; Fast-fourier transform (FFT); Neural network; Power spectrum; Speech; Emotion recognition; Hidden neurons; Input node; Optimality; Recognition performance; Software applications; Speech signals; Testing sets; Training sets; Voice recognition; Voice signals; Backpropagation; Backpropagation algorithms; Computer applications; Face recognition; Fast Fourier transforms; Human computer interaction; Interfaces (computer); Learning algorithms; Learning systems; Neural networks; Neurons; Speech recognition
Abstract:	Speech and emotion recognition improve the quality of human-computer interaction and allow easier-to-use interfaces for every level of user in software applications. In this study, we developed two neural networks, the emotion recognition neural network (ERNN) and the Gram-Charlier emotion recognition neural network (GERNN), to classify voice signals for emotion recognition. The ERNN has 128 input nodes, 20 hidden neurons, and three summing output nodes. A set of 97920 training sets is used to train the ERNN, and a new set of 24480 testing sets is used to test its performance. The samples tested for voice recognition are taken from the movies "Anger Management" and "Pick of Destiny". The ERNN achieves an average recognition performance of 100%. This high level of recognition suggests that the ERNN is a promising method for emotion recognition in computer applications. The GERNN has four input nodes, 20 hidden neurons, and three output nodes, and achieves an average recognition performance of 33%. This indicates that Gram-Charlier coefficients cannot be used to discriminate emotion signals. In addition, Hinton diagrams were used to display the optimality of the ERNN weights.
URI:	https://hdl.handle.net/20.500.14365/4618
ISSN:	1790-5022
Appears in Collections:	Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
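
Illustrative sketch (not from the paper): the abstract gives the ERNN layer sizes (128 input nodes, 20 hidden neurons, three summing output nodes) and the keywords mention an FFT power spectrum, but the record does not say how the 128 inputs are computed or which activation functions are used. The Python sketch below shows one plausible reading under stated assumptions: a 256-sample frame, a normalised power spectrum as input, sigmoid hidden units, and outputs treated as plain weighted sums. The function names, the frame length, the normalisation, and the random (untrained) weights are all hypothetical.

```python
import numpy as np

N_INPUT, N_HIDDEN, N_OUTPUT = 128, 20, 3  # layer sizes stated in the abstract


def power_spectrum(frame):
    """Return 128 power-spectrum values from a 256-sample frame (assumed length)."""
    frame = np.asarray(frame, dtype=float)
    if frame.shape != (2 * N_INPUT,):
        raise ValueError("expected a frame of 256 samples")
    spectrum = np.fft.rfft(frame)              # 129 complex bins for 256 real samples
    power = np.abs(spectrum[:N_INPUT]) ** 2    # drop the Nyquist bin, keep 128 values
    total = power.sum()
    return power / total if total > 0 else power  # normalisation is an assumption


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def forward(features, w1, b1, w2, b2):
    """One forward pass: sigmoid hidden layer (assumed), summing output nodes."""
    hidden = sigmoid(features @ w1 + b1)
    return hidden @ w2 + b2                    # outputs are plain weighted sums


# Random, untrained weights; the paper trains its weights with backpropagation.
rng = np.random.default_rng(0)
w1 = rng.normal(scale=0.1, size=(N_INPUT, N_HIDDEN))
b1 = np.zeros(N_HIDDEN)
w2 = rng.normal(scale=0.1, size=(N_HIDDEN, N_OUTPUT))
b2 = np.zeros(N_OUTPUT)

# Synthetic test frame: a sine wave with exactly 10 cycles per frame.
t = np.arange(2 * N_INPUT)
frame = np.sin(2 * np.pi * 10 * t / (2 * N_INPUT))

features = power_spectrum(frame)
scores = forward(features, w1, b1, w2, b2)
predicted_class = int(np.argmax(scores))       # index of the largest output node
```

With untrained weights the predicted class is meaningless; the sketch only shows how data would flow through a 128-20-3 network. The training and testing counts given in the abstract (97920 and 24480) are in an 80:20 ratio.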

Files in This Item:

File	Size	Format
3659.pdf (Restricted Access)	930.29 kB	Adobe PDF


SCOPUS Citations:	2 (checked on Oct 15, 2025)
Page view(s):	220 (checked on Oct 13, 2025)
Download(s):	8 (checked on Oct 13, 2025)
