Audio-Visual Speech Recognition Using 3D Convolutional Neural Networks

Belhan, Ceren; Fikirdanis, Damla; Cimen, Ovgu; Pasinli, Pelin; Akgun, Zeynep; Yayci, Zeynep Ovgu; Türkan, Mehmet

dc.contributor.author	Belhan C.
dc.contributor.author	Fikirdanis D.
dc.contributor.author	Cimen O.
dc.contributor.author	Pasinli P.
dc.contributor.author	Akgun Z.
dc.contributor.author	Yayci Z.O.
dc.contributor.author	Türkan, Mehmet
dc.contributor.author	Cimen, Ovgu
dc.contributor.author	Belhan, Ceren
dc.contributor.author	Yayci, Zeynep Ovgu
dc.contributor.author	Fikirdanis, Damla
dc.contributor.author	Akgun, Zeynep
dc.contributor.author	Pasinli, Pelin
dc.date.accessioned	2023-06-16T14:59:33Z
dc.date.available	2023-06-16T14:59:33Z
dc.date.issued	2021
dc.description	IEEE SMC Society;IEEE Turkey Section	en_US
dc.description	2021 Innovations in Intelligent Systems and Applications Conference, ASYU 2021 -- 6 October 2021 through 8 October 2021 -- 174400	en_US
dc.description.abstract	Lip reading, the extraction of speech information from observable movements of the face, particularly the jaw, lips, tongue and teeth, is a very challenging task. It is a beneficial skill that helps people comprehend and interpret the content of other people's speech when audio or facial expression alone is not sufficient. Even experts require a certain level of experience and an understanding of visual expressions to interpret spoken words, and this may still not be efficient enough. Nowadays, lip sequences can be converted into meaningful words and phrases with the aid of computers, and the use of neural networks (NNs) has therefore increased rapidly in this field. The main contribution of this study is to use Short-Time Fourier Transformed (STFT) audio data as an extra image input and to employ 3D Convolutional NNs (CNNs) for feature extraction. This generates features that account for the change across consecutive frames and makes use of visual and auditory data together with the attributes from the image. After testing several experimental scenarios, the proposed method shows strong promise for further development in this research domain. © 2021 IEEE.	en_US
dc.description.sponsorship	IEEE SMC Society; IEEE Turkey Section
dc.identifier.doi	10.1109/ASYU52992.2021.9599016
dc.identifier.isbn	9781665434058
dc.identifier.scopus	2-s2.0-85123175238
dc.identifier.uri	https://doi.org/10.1109/ASYU52992.2021.9599016
dc.identifier.uri	https://hdl.handle.net/20.500.14365/3510
dc.language.iso	en	en_US
dc.publisher	Institute of Electrical and Electronics Engineers Inc.	en_US
dc.relation.ispartof	Proceedings - 2021 Innovations in Intelligent Systems and Applications Conference, ASYU 2021	en_US
dc.rights	info:eu-repo/semantics/closedAccess	en_US
dc.subject	3D convolutional neural network	en_US
dc.subject	audio-visual speech recognition	en_US
dc.subject	automatic speech recognition	en_US
dc.subject	Lip reading	en_US
dc.subject	short-time Fourier Transform	en_US
dc.subject	Convolution	en_US
dc.subject	Speech	en_US
dc.subject	Speech recognition	en_US
dc.subject	3d convolutional neural network	en_US
dc.subject	Audiovisual speech recognition	en_US
dc.subject	Automatic speech recognition	en_US
dc.subject	Convolutional neural network	en_US
dc.subject	Fourier	en_US
dc.subject	Lip reading	en_US
dc.subject	Neural-networks	en_US
dc.subject	Short time Fourier transforms	en_US
dc.subject	Speech data	en_US
dc.subject	Spoken words	en_US
dc.subject	Convolutional neural networks	en_US
dc.title	Audio-Visual Speech Recognition Using 3D Convolutional Neural Networks	en_US
dc.type	Conference Object	en_US
dspace.entity.type	Publication
gdc.author.scopusid	57224918896
gdc.author.scopusid	57419656500
gdc.author.scopusid	57420178400
gdc.author.scopusid	57420002100
gdc.author.scopusid	57419831800
gdc.author.scopusid	14069326000
gdc.author.scopusid	57419304400
gdc.author.scopusid	57219464962
gdc.bip.impulseclass	C5
gdc.bip.influenceclass	C5
gdc.bip.popularityclass	C5
gdc.coar.access	metadata only access
gdc.coar.type	text::conference output
gdc.collaboration.industrial	false
gdc.description.department	İzmir University of Economics
gdc.description.departmenttemp	Belhan, C., Izmir University of Economics, Izmir, Turkey; Fikirdanis, D., Izmir University of Economics, Izmir, Turkey; Cimen, O., Izmir University of Economics, Izmir, Turkey; Pasinli, P., Izmir University of Economics, Izmir, Turkey; Akgun, Z., Izmir University of Economics, Izmir, Turkey; Yayci, Z.O., Izmir University of Economics, Izmir, Turkey; Turkan, M., Izmir University of Economics, Izmir, Turkey	en_US
gdc.description.endpage	5
gdc.description.publicationcategory	Conference Item - International - Institutional Faculty Member	en_US
gdc.description.scopusquality	N/A
gdc.description.startpage	1
gdc.description.wosquality	N/A
gdc.identifier.openalex	W3217103950
gdc.index.type	Scopus
gdc.oaire.diamondjournal	false
gdc.oaire.impulse	1.0
gdc.oaire.influence	2.7836007E-9
gdc.oaire.isgreen	false
gdc.oaire.popularity	3.3216752E-9
gdc.oaire.publicfunded	false
gdc.openalex.collaboration	National
gdc.openalex.fwci	0.3441
gdc.openalex.normalizedpercentile	0.54
gdc.opencitations.count	2
gdc.plumx.crossrefcites	1
gdc.plumx.mendeley	5
gdc.plumx.scopuscites	4
gdc.scopus.citedcount	4
gdc.virtual.author	Türkan, Mehmet
gdc.virtual.author	Yaycı, Zeynep Övgü
local.message.claim	2025-04-17T13:24:29.221+0300\|\|\|rp00186\|\|\|submit_approve\|\|\|dc_contributor_author\|\|\|None	*
relation.isAuthorOfPublication	7a969b6f-8dc6-4730-a7b1-c1dba8089d68
relation.isAuthorOfPublication	76946aef-c81f-4033-be60-a1c814aec77d
relation.isAuthorOfPublication	a845b296-0fea-4bd2-8ca9-6526e72f73c2
relation.isAuthorOfPublication.latestForDiscovery	7a969b6f-8dc6-4730-a7b1-c1dba8089d68
relation.isOrgUnitOfPublication	b02722f0-7082-4d8a-8189-31f0230f0e2f
relation.isOrgUnitOfPublication	26a7372c-1a5e-42d9-90b6-a3f7d14cad44
relation.isOrgUnitOfPublication	e9e77e3e-bc94-40a7-9b24-b807b2cd0319
relation.isOrgUnitOfPublication	a80bcb24-4ed9-4f67-9bd1-0fb405ff202d
relation.isOrgUnitOfPublication	b4714bc5-c5ae-478f-b962-b7204c948b70
relation.isOrgUnitOfPublication.latestForDiscovery	b02722f0-7082-4d8a-8189-31f0230f0e2f

Files

Original bundle

Name: 2605.pdf
Size: 5.73 MB
Format: Adobe Portable Document Format

Collections

Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection