Browsing by Author "Zincir-Heywood N."


Citation - Scopus: 1
Binary Text Representation for Feature Selection
(Springer Science and Business Media Deutschland GmbH, 2021) Lang N.; Zincir I.; Zincir-Heywood N.
In many real-world applications, a large number of words can introduce noisy and redundant information, which can degrade the overall performance of text classification tasks. Feature selection techniques that aim to eliminate uninformative words have been actively studied. In several information-theoretic approaches, such features are conventionally obtained by maximizing relevance to the class while minimizing the redundancy among the selected features. This is an NP-hard problem and remains a challenge. In this work, we propose an alternative feature selection strategy on binary representation data, with the purpose of providing a theoretical lower bound for finding a near-optimal solution based on the Maximum Relevance-Minimum Redundancy criterion. In doing so, the proposed strategy can achieve a theoretical approximation ratio of 1/2 by a naive greedy search. The proposed strategy is validated by empirical experiments on five publicly available datasets, namely Cora, Citeseer, WebKB, SMS Spam and Spambase. Its effectiveness is shown for binary text classification tasks when compared with well-known filter feature selection methods and mutual information-based methods. © 2021, Springer Nature Switzerland AG.
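For readers unfamiliar with the Maximum Relevance-Minimum Redundancy criterion mentioned in this abstract, the following is a minimal sketch of the conventional greedy mRMR search on binary features. It is not the strategy proposed in the paper and says nothing about its approximation guarantee. The function names `mutual_information` and `greedy_mrmr`, the "relevance minus mean redundancy" score, and the lowest-index tie-breaking are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import List, Sequence


def mutual_information(x: Sequence[int], y: Sequence[int]) -> float:
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(x)
    joint = Counter(zip(x, y))
    px = Counter(x)
    py = Counter(y)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[a] * py[b]))
    return mi


def greedy_mrmr(features: List[Sequence[int]],
                labels: Sequence[int],
                k: int) -> List[int]:
    """Pick up to k feature indices greedily (hypothetical helper).

    Each step adds the feature with the highest score, where
    score = relevance to the labels - mean redundancy with features
    already selected. Ties go to the lowest index.
    """
    relevance = [mutual_information(f, labels) for f in features]
    selected: List[int] = []
    remaining = set(range(len(features)))

    def score(j: int) -> float:
        if not selected:
            return relevance[j]
        redundancy = sum(
            mutual_information(features[j], features[s]) for s in selected
        ) / len(selected)
        return relevance[j] - redundancy

    while remaining and len(selected) < k:
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: feature 1 duplicates feature 0, feature 2 is independent of it,
# and the label is the logical AND of features 0 and 2.
f0 = [0, 0, 0, 0, 1, 1, 1, 1]
f1 = list(f0)
f2 = [0, 0, 1, 1, 0, 0, 1, 1]
labels = [a & b for a, b in zip(f0, f2)]
picked = greedy_mrmr([f0, f1, f2], labels, k=2)
```

In this toy case the duplicate feature is penalised for its redundancy with the first pick, so the search prefers the independent feature as the second choice.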
Citation - Scopus: 1
Can We Detect Malicious Behaviours in Encrypted DNS Tunnels Using Network Flow Entropy?
(River Publishers, 2022) Khodjaeva Y.; Zincir-Heywood N.; Zincir I.
This paper explores the concept of entropy of a flow to augment flow statistical features for encrypted DNS tunnelling detection, specifically in DNS over HTTPS traffic. To achieve this, the use of flow exporters, namely Argus, DoHlyzer and Tranalyzer2, is studied. Statistical flow features automatically generated by these tools are then augmented with the flow entropy. In this work, flow entropy is calculated using three different techniques: (i) entropy over all packets of a flow, (ii) entropy over the first 96 bytes of a flow, and (iii) entropy over the first n packets of a flow. These features are provided as input to ML classifiers to detect malicious behaviours over four publicly available datasets. The model is optimized using the TPOT-AutoML system, where the Random Forest classifier provided the best performance, achieving an average F-measure of 98% over all testing datasets employed. © 2022 River Publishers.
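The three entropy variants listed in this abstract can be sketched as follows. This is an illustration only: it assumes that "entropy" means Shannon entropy over the byte-value distribution and that a flow is available as an ordered list of raw packet byte strings. The abstract does not state either detail, and all function names are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte; 0.0 for empty input."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_all_packets(packets: Sequence[bytes]) -> float:
    """(i) Entropy over the bytes of every packet in the flow."""
    return byte_entropy(b"".join(packets))


def entropy_first_bytes(packets: Sequence[bytes], n_bytes: int = 96) -> float:
    """(ii) Entropy over the first n_bytes bytes of the flow."""
    return byte_entropy(b"".join(packets)[:n_bytes])


def entropy_first_n_packets(packets: Sequence[bytes], n: int) -> float:
    """(iii) Entropy over the bytes of the first n packets of the flow."""
    return byte_entropy(b"".join(packets[:n]))


# Toy flow: a constant packet, a two-symbol packet, and a uniform packet.
flow = [b"aaaa", b"abab", bytes(range(256))]
features = {
    "entropy_all": entropy_all_packets(flow),
    "entropy_first_96_bytes": entropy_first_bytes(flow),
    "entropy_first_2_packets": entropy_first_n_packets(flow, 2),
}
```

Values like these would be appended to the statistical features produced by a flow exporter before classification. How the paper aligns entropy values with exported flow records is not described in the abstract.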
Citation - Scopus: 3
Exploring an Artificial Arms Race for Malware Detection
(Association for Computing Machinery, Inc, 2020) Wilkins Z.; Zincir I.; Zincir-Heywood N.
The Android platform commands a dramatic majority of the mobile market, and this popularity makes it an appealing target for malicious actors. Android malware is especially dangerous because of the versatility in distribution and acquisition of software on the platform. In this paper, we continue to investigate evolutionary Android malware detection systems, implementing new features in an artificial arms race, and comparing different systems' performances on three new datasets. Our evaluations show that the artificial arms race based system achieves the overall best performance on these very challenging datasets. © 2020 ACM.