This blog post was originally published on 01 March, 2014, by Sebastian Garcia, at https://mcfp.weebly.com/analysis/analisis-ofctu-malware-capture-1-zbotoowo.
This capture was made between Thu Sep 5 15:40:07 CEST 2013 and Tue Oct 1 13:38:29 CEST 2013, a total of 25 days and 21 hours. It corresponds to a binary with the MD5 46b3df3eaf1312f80788abd43343a9d2 that was classified by Kaspersky in VirusTotal as Trojan-Spy.Win32.Zbot.oowo. However, we are not sure of the name.
The next image shows the complete graph of the traffic during the 25 days. It includes:
- The number of UDP packets per minute.
- The number of TCP packets per minute.
- The number of DNS packets per minute. (UDP and dst port 53)
- The number of SPAM packets per minute. (TCP and dst port 25)
- The number of SSH packets per minute. (TCP and dst port 22)
- The number of WEB packets per minute. (TCP and dst port 80)
- The number of SSL packets per minute. (TCP and dst port 443)
- The number of IPV6 packets per minute.
It was generated using the argus flows and RRD.
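The categories in the list above can be illustrated with a small sketch. This is not the argus and RRD pipeline used for the graph; it only shows how the protocol and destination-port rules map a packet to one or more counters. The input tuple layout `(timestamp, proto, dport, is_ipv6)` and the function names are assumptions made for the example.

```python
from collections import Counter, defaultdict

# Destination ports that define the TCP sub-categories in the list above.
TCP_PORT_NAMES = {25: "SPAM", 22: "SSH", 80: "WEB", 443: "SSL"}


def categories(proto, dport, is_ipv6=False):
    """Return every counter a packet would increment (hypothetical helper)."""
    cats = []
    if is_ipv6:
        cats.append("IPV6")
    if proto == "udp":
        cats.append("UDP")
        if dport == 53:
            cats.append("DNS")
    elif proto == "tcp":
        cats.append("TCP")
        name = TCP_PORT_NAMES.get(dport)
        if name is not None:
            cats.append(name)
    return cats


def per_minute(packets):
    """Count packets per category in one-minute buckets.

    packets: iterable of (epoch_seconds, proto, dport, is_ipv6) tuples.
    This tuple layout is assumed for the example, not an argus format.
    """
    series = defaultdict(Counter)
    for ts, proto, dport, is_ipv6 in packets:
        minute = int(ts // 60) * 60  # start of the minute bucket
        for cat in categories(proto, dport, is_ipv6):
            series[minute][cat] += 1
    return series
```

Note that a single packet can increment two counters: a DNS query counts as both UDP and DNS, and an SMTP packet counts as both TCP and SPAM.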
The image is large enough to zoom in on and analyze. The original files of this capture can be found in CTU-Malware-Capture-Botnet-1.
This capture consists of two pcap files, each containing the traffic of one bot.
- 2013-10-01_capture-win12.pcap (5.8G)
- 2013-10-01_capture-win8.pcap (5.6G)
The first action of the botnet was to send UDP packets to several IP addresses. There was no DNS resolution before them, so this group of IP addresses was hard-coded inside the binary. These UDP connections look like part of a P2P protocol because they were made in groups, they seem to be encrypted (their content is statistically random) and they are repeated every X minutes.
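One common way to check whether a payload looks statistically random is to compute the Shannon entropy of its bytes. The post does not say which test was used, so the following is only a minimal sketch of that general technique; the function name is an assumption.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

Values close to 8 bits per byte are consistent with encrypted or compressed content, while plain-text protocols score much lower. Short payloads cannot reach 8 bits per byte, so it is safer to compare a payload against random data of the same length than against a fixed threshold.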
One of these UDP connections (to IP 18.104.22.168) was answered (we call such connections established) and some information was downloaded. As soon as this UDP connection was established, a TCP connection was made to the same IP, also probably encrypted.
A few seconds later the bot started to connect to https://www.google.com (using TLS). The Google connections continued during the whole capture. We are not sure what these connections are used for.
After some more established UDP connections and Google searches, the bot downloaded some binary files. The next image shows these packets.
The first binary is downloaded from iwvsales[.]com/bc.exe and the second from www.solutics[.]ch/oKnUAf.exe
Eight minutes after the binary downloads, the bot started to resolve host names. It connected to some SMTP servers to try to send SPAM and contacted some web servers using POST requests. The next image shows an SMTP connection to smtp.live.com followed by the POST requests.
As can be seen in the images, we added a label to every flow. This was done manually for each capture using the ralabel tool. The biargus file in the dataset contains the labels, so you can use them with the ra* client tools. The histogram of labels, so far, is:
One of the main goals of the MCFP is to analyze the behavior of the malware. In this case we will analyze the periodicity of flows using our own behavioral model. This model uses a Markov Chain to represent the changes in the states of each connection.
An example of a behavioral pattern that we can analyze is the set of all flows sent by the bot to the IP address 22.214.171.124 and destination port 8033 using the UDP protocol. We can see that the bot connects to it every 30 minutes. The following is the basic information about this pattern:
The pattern shows that the bot connects to this IP address with a median period of exactly 30 minutes. Thus, this connection is highly periodic.
The 153 flows of this connection were sent over 3 days and were very short, with a median duration of 0.17s. They were also small, with a median size of 386 bytes. The label only represents the basic characteristics of the flow and was automatically assigned because the manual analysis is still in progress.
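The periodicity of a connection can be estimated as the median gap between consecutive flow start times. The sketch below illustrates that calculation on synthetic timestamps (not taken from the capture); the function name is an assumption. It also shows that 153 flows spaced 30 minutes apart span a little over 3 days, which is consistent with the figures above.

```python
from statistics import median


def median_period(start_times):
    """Median gap in seconds between consecutive flow start times."""
    ts = sorted(start_times)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    return median(gaps) if gaps else None


# Synthetic example: 153 flows, one every 1800 seconds (30 minutes).
synthetic = [i * 1800 for i in range(153)]
period_seconds = median_period(synthetic)
span_days = (synthetic[-1] - synthetic[0]) / 86400
```

The median is used instead of the mean so that a few long gaps (for example, when the bot is offline) do not distort the estimate.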
The State property uses letters to show the changes in the Markov Chain. All the 'e' and 'E' letters represent periodic connections between three flows. All the 'v' states represent a loss of periodicity. It is interesting how the flows changed over time (change of letters), getting more bytes or getting shorter. This indicates that no connection is perfectly stable. More information about this model will be published later.
The next plot is an analysis of these UDP connections in groups over several days. The X axis corresponds to hours and the Y axis to the number of flows every 30 minutes. It can be seen that there are a lot of attempted UDP flows at first, followed by some established UDP connections. However, the number of established UDP connections keeps decreasing until the bot receives a new group of UDP addresses to try (red lines). This behavior is very similar to that of P2P connections.
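Since the details of the model are not given here, the following is only a generic sketch of how a first-order Markov chain can be estimated from a string of state letters. It is not the MCFP model itself, and the example string is invented for illustration.

```python
from collections import Counter, defaultdict


def transition_matrix(states: str):
    """Estimate first-order transition probabilities from a state string."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(states, states[1:]):
        counts[cur][nxt] += 1  # one observed transition cur -> nxt
    return {
        cur: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for cur, row in counts.items()
    }


# Hypothetical state string, NOT taken from the capture.
example = "eeeEeevee"
matrix = transition_matrix(example)
```

In such a matrix, a highly periodic connection would show most of the probability mass on transitions between the periodic states, with only occasional transitions into 'v'.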