Mitigating webshell attacks through machine learning techniques

You Guo, Hector Marco-Gisbert*, Paul Keir

*Corresponding author for this work

Research output: Contribution to journal › Article

Abstract

A webshell is a command execution environment in the form of a web page. Attackers often use it as a backdoor for operating on web servers, so detecting webshells accurately is important for web server protection. Most security products detect webshells by feature matching, that is, by comparing input scripts against pre-built collections of malicious code. Feature matching has a low detection rate for obfuscated webshells, whereas machine learning algorithms can detect webshells more efficiently and accurately. In this paper, we propose a new PHP webshell detection model, the NB-Opcode (naïve Bayes and opcode sequence) model, which combines naïve Bayes classifiers with opcode sequences. Experiments on a large number of samples show that the proposed method can effectively detect a range of webshells. Compared with traditional webshell detection methods, it improves both the efficiency and the accuracy of webshell detection.
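
The general idea of pairing a naïve Bayes classifier with opcode sequences can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the n-gram size, the smoothing, or how opcodes are extracted from PHP scripts, so the bigram features, the Laplace smoothing, the class name `OpcodeNaiveBayes`, and the tiny hand-written training sequences below are all assumptions made for illustration. The opcode names only imitate the style of PHP opcodes.

```python
import math
from collections import Counter


def ngrams(opcodes, n=2):
    """Return the n-grams (as tuples) of an opcode sequence."""
    return [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]


class OpcodeNaiveBayes:
    """Multinomial naive Bayes over opcode n-gram counts (hypothetical sketch)."""

    def __init__(self, n=2, alpha=1.0):
        self.n = n          # n-gram length (assumed, not taken from the paper)
        self.alpha = alpha  # Laplace smoothing constant (assumed)

    def fit(self, sequences, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.feature_counts = {c: Counter() for c in self.classes}
        for seq, label in zip(sequences, labels):
            self.feature_counts[label].update(ngrams(seq, self.n))
        self.vocab = set()
        for counts in self.feature_counts.values():
            self.vocab.update(counts)
        return self

    def log_scores(self, seq):
        """Unnormalised log posterior of each class for one opcode sequence."""
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            counts = self.feature_counts[c]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(self.doc_counts[c] / total_docs)  # class prior
            for gram in ngrams(seq, self.n):
                if gram in self.vocab:  # n-grams unseen in training are ignored
                    score += math.log((counts[gram] + self.alpha) / denom)
            scores[c] = score
        return scores

    def predict(self, seq):
        scores = self.log_scores(seq)
        return max(scores, key=scores.get)


# Invented toy data: real training would use opcode dumps of labelled PHP files.
train_sequences = [
    ["FETCH_R", "INCLUDE_OR_EVAL", "RETURN"],
    ["FETCH_R", "SEND_VAR", "DO_FCALL", "RETURN"],
    ["ASSIGN", "CONCAT", "ECHO", "RETURN"],
    ["ASSIGN", "ASSIGN", "ECHO", "RETURN"],
]
train_labels = ["webshell", "webshell", "benign", "benign"]

model = OpcodeNaiveBayes(n=2).fit(train_sequences, train_labels)
print(model.predict(["FETCH_R", "INCLUDE_OR_EVAL", "RETURN"]))
print(model.predict(["ASSIGN", "CONCAT", "ECHO", "RETURN"]))
```

Working on opcodes rather than source text is what would make such a classifier less sensitive to surface-level obfuscation, since differently written scripts can compile to similar opcode sequences. The toy data here is far too small to say anything about accuracy.
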
Original language: English
Article number: 12
Number of pages: 16
Journal: Future Internet
Volume: 12
Issue number: 1
DOI: 10.3390/fi12010012
Publication status: Published - 14 Jan 2020

Fingerprint

Learning systems
Servers
Learning algorithms
Websites
Classifiers
Experiments

Keywords

  • Webshell attacks
  • Machine learning
  • Naive Bayes
  • Opcode sequence

Cite this

@article{186fce1fd31946ce96b24aa156c26406,
title = "Mitigating webshell attacks through machine learning techniques",
abstract = "A webshell is a command execution environment in the form of web pages. It is often used by attackers as a backdoor tool for web server operations. Accurately detecting webshells is of great significance to web server protection. Most security products detect webshells based on feature-matching methods—matching input scripts against pre-built malicious code collections. The feature-matching method has a low detection rate for obfuscated webshells. However, with the help of machine learning algorithms, webshells can be detected more efficiently and accurately. In this paper, we propose a new PHP webshell detection model, the NB-Opcode (na{\"i}ve Bayes and opcode sequence) model, which is a combination of na{\"i}ve Bayes classifiers and opcode sequences. Through experiments and analysis on a large number of samples, the experimental results show that the proposed method could effectively detect a range of webshells. Compared with the traditional webshell detection methods, this method improves the efficiency and accuracy of webshell detection.",
keywords = "Webshell attacks, Machine learning, Naive Bayes, Opcode sequence",
author = "You Guo and Hector Marco-Gisbert and Paul Keir",
year = "2020",
month = "1",
day = "14",
doi = "10.3390/fi12010012",
language = "English",
volume = "12",
journal = "Future Internet",
issn = "1999-5903",
publisher = "Multidisciplinary Digital Publishing Institute",
number = "1",
}

Mitigating webshell attacks through machine learning techniques. / Guo, You; Marco-Gisbert, Hector; Keir, Paul.

In: Future Internet, Vol. 12, No. 1, 12, 14.01.2020.

TY - JOUR

T1 - Mitigating webshell attacks through machine learning techniques

AU - Guo, You

AU - Marco-Gisbert, Hector

AU - Keir, Paul

PY - 2020/1/14

Y1 - 2020/1/14

AB - A webshell is a command execution environment in the form of web pages. It is often used by attackers as a backdoor tool for web server operations. Accurately detecting webshells is of great significance to web server protection. Most security products detect webshells based on feature-matching methods—matching input scripts against pre-built malicious code collections. The feature-matching method has a low detection rate for obfuscated webshells. However, with the help of machine learning algorithms, webshells can be detected more efficiently and accurately. In this paper, we propose a new PHP webshell detection model, the NB-Opcode (naïve Bayes and opcode sequence) model, which is a combination of naïve Bayes classifiers and opcode sequences. Through experiments and analysis on a large number of samples, the experimental results show that the proposed method could effectively detect a range of webshells. Compared with the traditional webshell detection methods, this method improves the efficiency and accuracy of webshell detection.

KW - Webshell attacks

KW - Machine learning

KW - Naive Bayes

KW - Opcode sequence

U2 - 10.3390/fi12010012

DO - 10.3390/fi12010012

M3 - Article

VL - 12

JO - Future Internet

JF - Future Internet

SN - 1999-5903

IS - 1

M1 - 12

ER -