Model-Based Feature Selection Using Structural Equation Modeling for Enhanced Classification Performance in High-Dimensional Datasets

dc.contributor.author Albayrak, Muammer
dc.contributor.author Turhan, Kemal
dc.date.accessioned 2025-11-03T17:01:05Z
dc.date.available 2025-11-03T17:01:05Z
dc.date.issued 2025
dc.description.abstract Feature selection is increasingly important in machine learning and data mining. For high-dimensional datasets in particular, irrelevant and unnecessary features must be filtered out to mitigate overfitting and the problems of high dimensionality. We hypothesized that effective feature selection can be achieved with a model-based approach using Structural Equation Modeling (SEM). The dataset consists of 2969 samples and 117 features. First, a measurement model was created and tested with confirmatory factor analysis (CFA), and the number of features was reduced to 58 by removing the statistically insignificant features. In the SEM analysis, sub-feature sets of 55, 52, 41 and 35 features were obtained by removing the variables whose relationship fell below the threshold values set for the standardized regression coefficient (SRC). The resulting sub-feature sets were tested with a multilayer perceptron (MLP) and their effect on performance was examined. Results were compared with random forest feature importance as a baseline method. SEM and random forest generally performed comparably. The sub-feature sets created with random forest produced better results in two-class classification, whereas the sub-feature sets created with the proposed SEM-based method performed better in three-class and five-class classification. These results show that effective feature selection can be achieved with the proposed model-based approach using SEM. By including expert knowledge in the modeling process, this approach makes it possible to obtain sub-feature sets that form a model that is statistically significant and consistent with field knowledge. en_US
dc.identifier.doi 10.35378/gujs.1507978
dc.identifier.issn 2147-1762
dc.identifier.scopus 2-s2.0-105018453663
dc.identifier.uri https://doi.org/10.35378/gujs.1507978
dc.identifier.uri https://hdl.handle.net/20.500.14365/6535
dc.identifier.uri https://search.trdizin.gov.tr/en/yayin/detay/1351593/model-based-feature-selection-using-structural-equation-modeling-for-enhanced-classification-performance-in-high-dimensional-datasets
dc.language.iso en en_US
dc.publisher Gazi University en_US
dc.relation.ispartof Gazi University Journal of Science en_US
dc.rights info:eu-repo/semantics/openAccess en_US
dc.subject Artificial Neural Networks en_US
dc.subject Feature Importance en_US
dc.subject Feature Selection en_US
dc.subject Structural Equation Modeling en_US
dc.title Model-Based Feature Selection Using Structural Equation Modeling for Enhanced Classification Performance in High-Dimensional Datasets en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.institutional Turhan, Kutsal
gdc.bip.impulseclass C5
gdc.bip.influenceclass C5
gdc.bip.popularityclass C5
gdc.coar.access open access
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial false
gdc.description.department İzmir Ekonomi Üniversitesi en_US
gdc.description.departmenttemp Karadeniz Teknik Üniversitesi,İzmir Ekonomi Üniversitesi en_US
gdc.description.endpage 1260 en_US
gdc.description.issue 3 en_US
gdc.description.publicationcategory Makale - Ulusal Hakemli Dergi - Kurum Öğretim Elemanı en_US
gdc.description.scopusquality Q3
gdc.description.startpage 1247 en_US
gdc.description.volume 38 en_US
gdc.description.woscitationindex Emerging Sources Citation Index
gdc.description.wosquality Q3
gdc.identifier.openalex W4412758337
gdc.identifier.trdizinid 1351593
gdc.identifier.wos WOS:001576896900006
gdc.index.type WoS
gdc.index.type Scopus
gdc.index.type TR-Dizin
gdc.oaire.accesstype GOLD
gdc.oaire.diamondjournal false
gdc.oaire.impulse 0.0
gdc.oaire.influence 2.4895952E-9
gdc.oaire.isgreen false
gdc.oaire.popularity 2.7494755E-9
gdc.oaire.publicfunded false
gdc.openalex.collaboration National
gdc.openalex.fwci 0.0
gdc.openalex.normalizedpercentile 0.22
gdc.openalex.toppercent TOP 10%
gdc.opencitations.count 0
gdc.plumx.scopuscites 0
gdc.scopus.citedcount 0
gdc.virtual.author Turhan, Kutsal
gdc.virtual.author Turhan, Kemal
gdc.wos.citedcount 0
relation.isAuthorOfPublication 8d56352f-325d-4037-8180-9eafadecc821
relation.isAuthorOfPublication 0af17f77-3168-4aaf-a45e-2cbf889758a7
relation.isAuthorOfPublication.latestForDiscovery 8d56352f-325d-4037-8180-9eafadecc821
relation.isOrgUnitOfPublication fb5b4042-739a-4880-8e29-a9adb71d6492
relation.isOrgUnitOfPublication fbc53f3e-d1d3-4168-afd8-e42cd20bddd9
relation.isOrgUnitOfPublication e9e77e3e-bc94-40a7-9b24-b807b2cd0319
relation.isOrgUnitOfPublication 4cbb0a74-ee1a-438b-b714-b8ef253df94b
relation.isOrgUnitOfPublication.latestForDiscovery fb5b4042-739a-4880-8e29-a9adb71d6492
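
The abstract describes obtaining progressively smaller sub-feature sets by dropping variables whose standardized regression coefficient (SRC) falls below a threshold. The following is a minimal sketch of that thresholding step only, not the authors' implementation. The feature names, coefficient values, and thresholds are hypothetical, and comparing on the absolute value of the SRC is an assumption, since the record does not state how negative coefficients were handled.

```python
from typing import Dict, List


def select_by_src(src: Dict[str, float], threshold: float) -> List[str]:
    """Keep features whose absolute standardized coefficient meets the threshold.

    Using abs() is an assumption made for this sketch.
    """
    return [name for name, coef in src.items() if abs(coef) >= threshold]


# Hypothetical standardized regression coefficients (not taken from the paper).
src_values = {
    "f1": 0.62,
    "f2": -0.48,
    "f3": 0.15,
    "f4": 0.09,
    "f5": -0.31,
    "f6": 0.27,
}

# Hypothetical thresholds; a higher threshold yields a smaller subset.
thresholds = [0.10, 0.20, 0.30]

# One sub-feature set per threshold, in the original feature order.
subsets = {t: select_by_src(src_values, t) for t in thresholds}

for t, kept in subsets.items():
    print(f"threshold={t:.2f}: {len(kept)} features -> {kept}")
```

Each resulting subset could then be passed to a classifier such as an MLP to compare performance across thresholds, as the abstract describes.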
