Android-IoT Malware Classification and Detection Approach Using Deep URL Features Analysis

Farhan Ullah, Xiaochun Cheng, Leonardo Mostarda, Sohail Jabbar

Source Title: Journal of Database Management (JDM) 34(2)

DOI: 10.4018/JDM.318414

Abstract

Malware attacks currently pose a serious threat to the security of Android-IoT apps. These threats can steal critical information, causing economic, social, and financial harm. Because they are constantly connected to the network, Android apps are easily attacked through URL-based traffic. In this paper, an Android malware classification and detection approach using deep and broad URL feature mining is proposed. The study develops a novel traffic data preprocessing and transformation method that detects malicious apps through network traffic analysis. The encrypted URL-based traffic is mined to decrypt the transmitted data. The N-gram analysis method is used to extract sequenced features, after which singular value decomposition (SVD) reduces the features while preserving their actual semantics. The latent features are extracted using latent semantic analysis. Finally, CNN-LSTM, a multi-view deep learning approach, is designed for effective malware classification and detection.

1. Introduction

Malware can easily infect Android apps for malicious purposes and compromise their security (Lu & Da Xu, 2018). Mobile network expansion has increased the number of portable devices, and as a result financial malware apps now threaten mobile users. Despite massive prevention and mitigation efforts, malware remains a major cyber security threat. For example, Symantec discovered 357,019,453 new malware variants in 2016, 669,974,865 in 2017, and 246,002,762 in 2018. Ever more malware variants attempt to bypass anti-virus tools and evade malware detection systems.

The rapid expansion of mobile interaction places a significant burden on smartphone security management. According to a recent study^1, the number of apps in the Google Play Store increased from 16K in Dec. 2009 to over 2 million in Feb. 2016. As a result, mobile traffic has topped 3.7 exabytes. Malicious apps seriously compromise the growth of the mobile ecosystem. Mobile malware has increased massively, especially malware targeting Android devices, and devastating digital payment thefts and other attacks threaten mobile security. Despite the security measures of the Android platform and mobile antivirus tools, sophisticated mobile malware continues to infiltrate mobile systems. The widespread use of mobile devices also exposes users to multiple risks. Android-based mobile malware detection systems are therefore urgently needed (McLaughlin et al., 2017). A malware family is a group of malicious apps that share code. Many malware samples are built on a family's codebase, and all samples that share it are grouped into the same family.

Faruki et al. (2014) explore the characteristics of a large collection of malware and categorize existing mobile malware detection methods as static, dynamic, or traffic-based. Static analysis has been used in several previous studies to discover data leakage, malware, and security breaches in Android apps (Zhu et al., 2018). Nevertheless, static analysis of malware is challenging because of code polymorphism and obfuscation, which are used to produce malware variants that avoid detection. Many dynamic analysis techniques modify the device's operating system to monitor access to confidential information at runtime (Bader, Lichy, Hajaj, Dubin, & Dvir, 2022; Ucci, Aniello, & Baldoni, 2019). Such methods are helpful, but they require a massive number of executions to cover all of an app's behavioral patterns (Ahmed, Lin, & Srivastava, 2021).

1.1 Motivation

Many malware detection techniques concentrate on the network traffic generated by mobile apps and identify malware by its abnormal network behavior patterns. This type of detection system has the potential to be effective because the majority of Android malware performs its malicious functions via network traffic (Zhou & Jiang, 2012). The malware must communicate with a remote server over the Internet to carry out malicious tasks, and the traces it leaves can be used to identify and track down specific malware. Furthermore, malware detection strategies based on network characteristics are more straightforward to design and implement than static or dynamic analytic approaches. For example, methods based on traffic detection can be installed at an access point or gateway. These methods rely solely on user-generated network traffic data, ensuring that users do not lose access to their mobile resources. They also require no user actions aside from granting licenses to the detection service (W. Li, Bao, Zhang, & Li, 2022; S. Wang et al., 2020). The goal of network traffic-based approaches is to find distinguishing features that can be used to classify malware more effectively. Selecting efficient features, however, is a difficult task. We concentrate our investigation on malware samples that use the HTTP/HTTPS protocol to send data. We chose HTTP for our research because it accounts for 70% of the network traffic generated by Android apps (Dai, Tongaonkar, Wang, Nucci, & Song, 2013). However, because much of this traffic is generated in encrypted form, extracting useful information from it is extremely difficult.
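To make the idea of URL-derived features concrete, the following is a minimal sketch of N-gram extraction, the first feature step named in the abstract. It is an illustration only, not the authors' implementation: the choice of character-level N-grams, the value n = 3, the function names, and the example URLs are all assumptions made here. The SVD reduction and the CNN-LSTM classifier described in the abstract are not shown.

```python
from collections import Counter


def url_ngrams(url, n=3):
    """Return the overlapping character n-grams of a URL string."""
    if n <= 0:
        raise ValueError("n must be positive")
    return [url[i:i + n] for i in range(len(url) - n + 1)]


def ngram_counts(urls, n=3):
    """Build one n-gram frequency table (a Counter) per URL."""
    return [Counter(url_ngrams(u, n)) for u in urls]


def to_matrix(counts):
    """Align the per-URL counters on a shared, sorted vocabulary.

    Returns the vocabulary and a list of rows, one row per URL,
    where each entry is the count of the corresponding n-gram.
    """
    vocab = sorted(set().union(*counts))
    rows = [[c[g] for g in vocab] for c in counts]
    return vocab, rows


# Hypothetical URLs, invented for illustration only.
urls = [
    "http://example.com/a?id=1",
    "http://example.com/b?id=2",
]
vocab, rows = to_matrix(ngram_counts(urls, n=3))
```

Each row of `rows` is a sparse count vector over the shared vocabulary. A matrix of this kind is the sort of input a dimensionality reduction step such as SVD would then compress.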
