Catania

Detecting DNS Threats: A Deep Learning Model to Rule Them All

Domain Name Service is a central part of Internet regular operation. Such importance has made it a common target of different malicious behaviors such as the application of Domain Generation Algorithms (DGA) for command and control a group of infected computers or Tunneling techniques for bypassing system administrator restrictions. A common detection approach is based on training different models detecting DGA and Tunneling capable of performing a lexicographic discrimination of the domain names. However, since both DGA and Tunneling showed domain names with observable lexicographical differences with normal domains, it is reasonable to apply the same detection approach to both threats. In the present work, we propose a multi-class convolutional network (MC-CNN) capable of detecting both DNS threats. The resulting MC-CNN is able to detect correctly 99% of normal domains, 97% of DGA and 92% of Tunneling, with a False Positive Rate of 2.8%, 0.7% and 0.0015% respectively and the advantage of having 44% fewer trainable parameters than similar models applied to DNS threats detection.

Deep Convolutional Neural Networks for DGA Detection

A Domain Generation Algorithm (DGA) is an algorithm to generate domain names in a deterministic but seemly random way. Malware use DGAs to generate the next domain to access the Command & Control (C&C) communication server. Given the simplicity of the generation process and speed at which the domains are generated, a fast and accurate detection method is required. Convolutional neural network (CNN) are well known for performing real-time detection in fields like image and video recognition. Therefore, they seemed suitable for DGA detection. The present work provides an analysis and comparison of the detection performance of a CNN for DGA detection. A CNN with a minimal architecture complexity was evaluated on a dataset with 51 DGA malware families and normal domains. Despite its simple architecture, the resulting CNN model correctly detected more than 97% of total DGA domains with a false positive rate close to 0.7%.

An Analysis of Convolutional Neural Networks for detecting DGA

A Domain Generation Algorithm (DGA) is an algorithm to generate domain names in a deterministic but seemly random way. Malware use DGAs to generate the next domain to access the Command Control (C&C) communication channel. Given the simplicity and velocity associated to the domain generation process, machine learning detection methods emerged as suitable detection solution. However, since the periodical retraining becomes mandatory, a fast and accurate detection method is needed. Convolutional neural network (CNN) are well known for performing real-time detection in fields like image and video recognition. Therefore, they seem suitable for DGA detection. The present work is a preliminary analysis of the detection performance of CNN for DGA detection. A CNN with a minimal architecture complexity was evaluated on a dataset with 51 DGA malware families as well as normal domains. Despite its simple architecture, the resulting CNN model correctly detected more than 97% of total DGA domains with a false positive rate close to 0.7%.

Detecting DGA malware traffic through behavioral models

Some botnets use special algorithms to generate the domain names they need to connect to their command and control servers. They are refereed as Domain Generation Algorithms. Domain Generation Algorithms generate domain names and tries to resolve their IP addresses. If the domain has an IP address, it is used to connect to that command and control server. Otherwise, the DGA generates a new domain and keeps trying to connect. In both cases it is possible to capture and analyze the special behavior shown by those DNS packets in the network. The behavior of Domain Generation Algorithms is difficult to automatically detect because each domain is usually randomly generated and therefore unpredictable. Hence, it is challenging to separate the DNS traffic generated by malware from the DNS traffic generated by normal computers. In this work we analyze the use of behavioral detection approaches based on Markov Models to differentiate Domain Generation Algorithms traffic from normal DNS traffic. The evaluation methodology of our detection models has focused on a real-time approach based on the use of time windows for reporting the alerts. All the detection models have shown a clear differentiation between normal and malicious DNS traffic and most have also shown a good detection rate. We believe this work is a further step in using behavioral models for network detection and we hope to facilitate the development of more general and better behavioral detection methods of malware traffic.