Audio-Visual Speech Recognition Using 3d Convolutional Neural Networks
| dc.contributor.author | Belhan C. | |
| dc.contributor.author | Fikirdanis D. | |
| dc.contributor.author | Cimen O. | |
| dc.contributor.author | Pasinli P. | |
| dc.contributor.author | Akgun Z. | |
| dc.contributor.author | Yayci Z.O. | |
| dc.contributor.author | Türkan, Mehmet | |
| dc.date.accessioned | 2023-06-16T14:59:33Z | |
| dc.date.available | 2023-06-16T14:59:33Z | |
| dc.date.issued | 2021 | |
| dc.description | IEEE SMC Society;IEEE Turkey Section | en_US |
| dc.description | 2021 Innovations in Intelligent Systems and Applications Conference, ASYU 2021 -- 6 October 2021 through 8 October 2021 -- 174400 | en_US |
| dc.description.abstract | Lip reading, the extraction of speech information from visible movements of the face, particularly the jaw, lips, tongue and teeth, is a very challenging task. It is a valuable skill that helps people comprehend and interpret what others are saying when audio or expression alone is not sufficient. Even experts need considerable experience and an understanding of visual expressions to interpret spoken words, and human lip reading may still not be efficient enough. Lip sequences can now be converted into meaningful words and phrases with the aid of computers, and the use of neural networks (NNs) in this field has increased rapidly. The main contribution of this study is to use Short-Time Fourier Transformed (STFT) audio data as an extra image input and to employ 3D Convolutional NNs (CNNs) for feature extraction. This generates features that account for changes across consecutive frames and uses visual and auditory data together with the attributes from the image. After testing several experimental scenarios, the proposed method shows strong promise for further development in this research domain. © 2021 IEEE. | en_US |
| dc.identifier.doi | 10.1109/ASYU52992.2021.9599016 | |
| dc.identifier.isbn | 9.78E+12 | |
| dc.identifier.scopus | 2-s2.0-85123175238 | |
| dc.identifier.uri | https://doi.org/10.1109/ASYU52992.2021.9599016 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.14365/3510 | |
| dc.language.iso | en | en_US |
| dc.publisher | Institute of Electrical and Electronics Engineers Inc. | en_US |
| dc.relation.ispartof | Proceedings - 2021 Innovations in Intelligent Systems and Applications Conference, ASYU 2021 | en_US |
| dc.rights | info:eu-repo/semantics/closedAccess | en_US |
| dc.subject | 3D convolutional neural network | en_US |
| dc.subject | audio-visual speech recognition | en_US |
| dc.subject | automatic speech recognition | en_US |
| dc.subject | Lip reading | en_US |
| dc.subject | short-time Fourier Transform | en_US |
| dc.subject | Convolution | en_US |
| dc.subject | Speech | en_US |
| dc.subject | Speech recognition | en_US |
| dc.subject | 3d convolutional neural network | en_US |
| dc.subject | Audiovisual speech recognition | en_US |
| dc.subject | Automatic speech recognition | en_US |
| dc.subject | Convolutional neural network | en_US |
| dc.subject | Fourier | en_US |
| dc.subject | Lip reading | en_US |
| dc.subject | Neural-networks | en_US |
| dc.subject | Short time Fourier transforms | en_US |
| dc.subject | Speech data | en_US |
| dc.subject | Spoken words | en_US |
| dc.subject | Convolutional neural networks | en_US |
| dc.title | Audio-Visual Speech Recognition Using 3d Convolutional Neural Networks | en_US |
| dc.type | Conference Object | en_US |
| dspace.entity.type | Publication | |
| gdc.author.scopusid | 57224918896 | |
| gdc.author.scopusid | 57419656500 | |
| gdc.author.scopusid | 57420178400 | |
| gdc.author.scopusid | 57420002100 | |
| gdc.author.scopusid | 57419831800 | |
| gdc.author.scopusid | 14069326000 | |
| gdc.bip.impulseclass | C5 | |
| gdc.bip.influenceclass | C5 | |
| gdc.bip.popularityclass | C5 | |
| gdc.coar.access | metadata only access | |
| gdc.coar.type | text::conference output | |
| gdc.collaboration.industrial | false | |
| gdc.description.departmenttemp | Belhan, C., Izmir University of Economics, Izmir, Turkey; Fikirdanis, D., Izmir University of Economics, Izmir, Turkey; Cimen, O., Izmir University of Economics, Izmir, Turkey; Pasinli, P., Izmir University of Economics, Izmir, Turkey; Akgun, Z., Izmir University of Economics, Izmir, Turkey; Yayci, Z.O., Izmir University of Economics, Izmir, Turkey; Turkan, M., Izmir University of Economics, Izmir, Turkey | en_US |
| gdc.description.endpage | 5 | |
| gdc.description.publicationcategory | Conference Item - International - Institutional Faculty Member | en_US |
| gdc.description.scopusquality | N/A | |
| gdc.description.startpage | 1 | |
| gdc.description.wosquality | N/A | |
| gdc.identifier.openalex | W3217103950 | |
| gdc.index.type | Scopus | |
| gdc.oaire.diamondjournal | false | |
| gdc.oaire.impulse | 1.0 | |
| gdc.oaire.influence | 2.7836007E-9 | |
| gdc.oaire.isgreen | false | |
| gdc.oaire.popularity | 3.3216752E-9 | |
| gdc.oaire.publicfunded | false | |
| gdc.openalex.collaboration | National | |
| gdc.openalex.fwci | 0.3441 | |
| gdc.openalex.normalizedpercentile | 0.54 | |
| gdc.opencitations.count | 2 | |
| gdc.plumx.crossrefcites | 1 | |
| gdc.plumx.mendeley | 5 | |
| gdc.plumx.scopuscites | 4 | |
| gdc.scopus.citedcount | 4 | |
| gdc.virtual.author | Türkan, Mehmet | |
| gdc.virtual.author | Türkan, Mehmet | |
| gdc.virtual.author | Yaycı, Zeynep Övgü | |
| local.message.claim | 2025-04-17T13:24:29.221+0300|||rp00186|||submit_approve|||dc_contributor_author|||None | * |
| relation.isAuthorOfPublication | 7a969b6f-8dc6-4730-a7b1-c1dba8089d68 | |
| relation.isAuthorOfPublication | 76946aef-c81f-4033-be60-a1c814aec77d | |
| relation.isAuthorOfPublication | a845b296-0fea-4bd2-8ca9-6526e72f73c2 | |
| relation.isAuthorOfPublication.latestForDiscovery | 7a969b6f-8dc6-4730-a7b1-c1dba8089d68 | |
| relation.isOrgUnitOfPublication | b02722f0-7082-4d8a-8189-31f0230f0e2f | |
| relation.isOrgUnitOfPublication | 26a7372c-1a5e-42d9-90b6-a3f7d14cad44 | |
| relation.isOrgUnitOfPublication | e9e77e3e-bc94-40a7-9b24-b807b2cd0319 | |
| relation.isOrgUnitOfPublication | a80bcb24-4ed9-4f67-9bd1-0fb405ff202d | |
| relation.isOrgUnitOfPublication | b4714bc5-c5ae-478f-b962-b7204c948b70 | |
| relation.isOrgUnitOfPublication.latestForDiscovery | b02722f0-7082-4d8a-8189-31f0230f0e2f |
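
The abstract describes feeding Short-Time Fourier Transformed audio to the network as an extra image input. The record does not give the authors' implementation, so the following is only a minimal sketch of that one idea, assuming NumPy, a Hann window, and hypothetical parameters (`frame_len`, `hop`, a 16 kHz sample rate, and a synthetic 1 kHz test tone); the function names are illustrative and not taken from the paper.

```python
import numpy as np


def stft_magnitude(signal, frame_len=256, hop=128):
    """Return a (frames, bins) magnitude spectrogram using a Hann window.

    frame_len and hop are illustrative values, not the paper's settings.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # One-sided FFT of each frame gives frame_len // 2 + 1 frequency bins.
    return np.abs(np.fft.rfft(frames, axis=1))


def spectrogram_to_image(spec):
    """Log-compress the magnitudes and min-max scale them to [0, 1]."""
    log_spec = np.log1p(spec)
    lo, hi = log_spec.min(), log_spec.max()
    return (log_spec - lo) / (hi - lo + 1e-12)


# Hypothetical input: one second of a 1 kHz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

image = spectrogram_to_image(stft_magnitude(tone))
print(image.shape)
```

The resulting 2D array can be treated like a grayscale image. How it is aligned with the video frames before the 3D convolutional layers is not specified in this record and is left out of the sketch.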