Audio-Visual Speech Recognition Using 3D Convolutional Neural Networks

dc.contributor.author Belhan C.
dc.contributor.author Fikirdanis D.
dc.contributor.author Cimen O.
dc.contributor.author Pasinli P.
dc.contributor.author Akgun Z.
dc.contributor.author Yayci Z.O.
dc.contributor.author Türkan, Mehmet
dc.date.accessioned 2023-06-16T14:59:33Z
dc.date.available 2023-06-16T14:59:33Z
dc.date.issued 2021
dc.description IEEE SMC Society;IEEE Turkey Section en_US
dc.description 2021 Innovations in Intelligent Systems and Applications Conference, ASYU 2021 -- 6 October 2021 through 8 October 2021 -- 174400 en_US
dc.description.abstract Lip reading, the extraction of speech information from visible movements of the face, particularly the jaws, lips, tongue and teeth, is a very challenging task. It is a valuable skill that helps people comprehend and interpret the content of other people's speech when audio or facial expression alone is not sufficient. Even experts need a certain level of experience and an understanding of visual expressions to interpret spoken words, and human interpretation alone may not be efficient enough. Nowadays, lip sequences can be converted into meaningful words and phrases with the aid of computers, and the use of neural networks (NNs) in this field has therefore increased rapidly. The main contribution of this study is to use Short-Time Fourier Transformed (STFT) audio data as an extra image input and to employ 3D Convolutional NNs (CNNs) for feature extraction. This generates features that account for the change across consecutive frames and makes use of visual and auditory data together with the attributes from the image. Tests across several experimental scenarios indicate that the proposed method shows strong promise for further development in this research domain. © 2021 IEEE. en_US
dc.identifier.doi 10.1109/ASYU52992.2021.9599016
dc.identifier.isbn 9.78E+12
dc.identifier.scopus 2-s2.0-85123175238
dc.identifier.uri https://doi.org/10.1109/ASYU52992.2021.9599016
dc.identifier.uri https://hdl.handle.net/20.500.14365/3510
dc.language.iso en en_US
dc.publisher Institute of Electrical and Electronics Engineers Inc. en_US
dc.relation.ispartof Proceedings - 2021 Innovations in Intelligent Systems and Applications Conference, ASYU 2021 en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject 3D convolutional neural network en_US
dc.subject audio-visual speech recognition en_US
dc.subject automatic speech recognition en_US
dc.subject Lip reading en_US
dc.subject short-time Fourier Transform en_US
dc.subject Convolution en_US
dc.subject Speech en_US
dc.subject Speech recognition en_US
dc.subject 3d convolutional neural network en_US
dc.subject Audiovisual speech recognition en_US
dc.subject Automatic speech recognition en_US
dc.subject Convolutional neural network en_US
dc.subject Fourier en_US
dc.subject Neural-networks en_US
dc.subject Short time Fourier transforms en_US
dc.subject Speech data en_US
dc.subject Spoken words en_US
dc.subject Convolutional neural networks en_US
dc.title Audio-Visual Speech Recognition Using 3D Convolutional Neural Networks en_US
dc.type Conference Object en_US
dspace.entity.type Publication
gdc.author.scopusid 57224918896
gdc.author.scopusid 57419656500
gdc.author.scopusid 57420178400
gdc.author.scopusid 57420002100
gdc.author.scopusid 57419831800
gdc.author.scopusid 14069326000
gdc.bip.impulseclass C5
gdc.bip.influenceclass C5
gdc.bip.popularityclass C5
gdc.coar.access metadata only access
gdc.coar.type text::conference output
gdc.collaboration.industrial false
gdc.description.departmenttemp Belhan, C., Izmir University of Economics, Izmir, Turkey; Fikirdanis, D., Izmir University of Economics, Izmir, Turkey; Cimen, O., Izmir University of Economics, Izmir, Turkey; Pasinli, P., Izmir University of Economics, Izmir, Turkey; Akgun, Z., Izmir University of Economics, Izmir, Turkey; Yayci, Z.O., Izmir University of Economics, Izmir, Turkey; Turkan, M., Izmir University of Economics, Izmir, Turkey en_US
gdc.description.endpage 5
gdc.description.publicationcategory Conference Item - International - Institutional Faculty Member en_US
gdc.description.scopusquality N/A
gdc.description.startpage 1
gdc.description.wosquality N/A
gdc.identifier.openalex W3217103950
gdc.index.type Scopus
gdc.oaire.diamondjournal false
gdc.oaire.impulse 1.0
gdc.oaire.influence 2.7836007E-9
gdc.oaire.isgreen false
gdc.oaire.popularity 3.3216752E-9
gdc.oaire.publicfunded false
gdc.openalex.collaboration National
gdc.openalex.fwci 0.3441
gdc.openalex.normalizedpercentile 0.54
gdc.opencitations.count 2
gdc.plumx.crossrefcites 1
gdc.plumx.mendeley 5
gdc.plumx.scopuscites 4
gdc.scopus.citedcount 4
gdc.virtual.author Türkan, Mehmet
gdc.virtual.author Yaycı, Zeynep Övgü
local.message.claim 2025-04-17T13:24:29.221+0300|||rp00186|||submit_approve|||dc_contributor_author|||None *
relation.isAuthorOfPublication 7a969b6f-8dc6-4730-a7b1-c1dba8089d68
relation.isAuthorOfPublication 76946aef-c81f-4033-be60-a1c814aec77d
relation.isAuthorOfPublication a845b296-0fea-4bd2-8ca9-6526e72f73c2
relation.isAuthorOfPublication.latestForDiscovery 7a969b6f-8dc6-4730-a7b1-c1dba8089d68
relation.isOrgUnitOfPublication b02722f0-7082-4d8a-8189-31f0230f0e2f
relation.isOrgUnitOfPublication 26a7372c-1a5e-42d9-90b6-a3f7d14cad44
relation.isOrgUnitOfPublication e9e77e3e-bc94-40a7-9b24-b807b2cd0319
relation.isOrgUnitOfPublication a80bcb24-4ed9-4f67-9bd1-0fb405ff202d
relation.isOrgUnitOfPublication b4714bc5-c5ae-478f-b962-b7204c948b70
relation.isOrgUnitOfPublication.latestForDiscovery b02722f0-7082-4d8a-8189-31f0230f0e2f
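
The abstract describes treating STFT audio data as an extra image input and extracting features with 3D convolutions over consecutive frames. The following is only a minimal NumPy sketch of that general idea, not the authors' implementation: the frame sizes, the synthetic tone, the choice to append the spectrogram as one additional frame, and the helper names `stft_magnitude` and `conv3d_valid` are all illustrative assumptions.

```python
import numpy as np


def stft_magnitude(signal, frame_len=64, hop=32):
    """Magnitude STFT; rows are frequency bins, columns are time frames."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft yields frame_len // 2 + 1 frequency bins per frame
    return np.abs(np.fft.rfft(frames, axis=1)).T


def conv3d_valid(volume, kernel):
    """Naive 'valid' 3D cross-correlation over (time, height, width)."""
    kt, kh, kw = kernel.shape
    t, h, w = volume.shape
    out = np.zeros((t - kt + 1, h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                patch = volume[i:i + kt, j:j + kh, k:k + kw]
                out[i, j, k] = np.sum(patch * kernel)
    return out


rng = np.random.default_rng(0)

# Hypothetical inputs: 8 grayscale mouth crops of 33x33 pixels,
# and a 440 Hz tone sampled at 8 kHz standing in for speech audio.
lip_frames = rng.random((8, 33, 33))
audio = np.sin(2 * np.pi * 440 * np.arange(1088) / 8000)

# With the defaults above the spectrogram is 33x33, matching the crops.
spec = stft_magnitude(audio)
spec = spec / spec.max()  # scale to [0, 1] like the image frames

# Assumption: the spectrogram is appended as one extra "image" frame.
volume = np.concatenate([lip_frames, spec[None]], axis=0)

# A single averaging 3x3x3 kernel; a trained network would learn many.
kernel = np.ones((3, 3, 3)) / 27.0
features = conv3d_valid(volume, kernel)
print(volume.shape, features.shape)
```

Because the kernel spans three consecutive frames, each output value mixes information across time as well as space, which is the property the abstract attributes to 3D CNN features. How the paper actually fuses the spectrogram with the video frames is not stated in this record.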

Files

Original bundle

Name: 2605.pdf
Size: 5.73 MB
Format: Adobe Portable Document Format