This blog post was authored by Kamila Babayeva (@_kamifai_), Lisandro Ubiedo (@_lubiedo), and Sebastian Garcia (@eldracote)

The RAT analysis research is part of the Civilsphere Project (https://www.civilsphereproject.org/), which aims to protect the civil society at risk by understanding how the attacks work and how we can stop them. Check the webpage for more information.

This is the first blog post of a series analyzing the network traffic of Android RATs from our Android Mischief Dataset [more information here], a dataset of network traffic from Android phones infected with Remote Access Trojans (RAT). In this blog post we provide an analysis of the network traffic of the RAT01-Android Tester v6.4.6 [download here].

RAT Details and Execution Setup

The goal of each of our RAT experiments is to use the software ourselves and to execute every possible action while capturing all the traffic and storing all the logs. The captures therefore come from functional RATs that were used in real attacks.

The Android Tester v.6.4.6 RAT is a software package that contains the controller software, which receives the connections from the victims, and the builder software, which builds the APKs used for infection. The package was executed on a Windows 7 virtual machine pre-configured with all the needed libraries. The Android Application Package (APK) built by the RAT builder was installed in a Genymotion Android virtual emulator with Android version 8.

While performing different actions on the RAT controller (e.g., upload file, get GPS location, monitor files), we captured the network traffic on the Android virtual emulator. The details about the network traffic capture are:

The controller IP address: 147.32.83.234
The phone IP address: 10.8.0.61
UTC time of the infection in the capture: 2020-08-07 09:01:59 UTC

Initial Communication and Infection

Once the APK is installed on the phone, it immediately tries to establish a TCP connection with the C&C server, which is the RAT controller running in our Windows 7 virtual machine on VirtualBox. To connect, the phone uses the IP address and the port that we set in the controller when building the APK. In our case, the IP address is 147.32.83.234 and the port is 1337.

Figure 1. Initial 3-way handshake from the RAT in the phone to the Command and Control server to establish a TCP connection with the controller.

After a 3-way handshake was performed and the connection was established, the phone sends the following data:

Figure 2. Data sent by the phone in the Command and Control channel after establishing the TCP connection with the controller. At the beginning it may seem that there is no structure, but the first number 2969 may be a good indication of meaning.

In this communication with the C&C server, the phone sends data that may appear to have no clear structure, as shown in Figure 2. However, at the beginning of this connection there is the number 2969 (32 39 36 39 in hexadecimal) followed by a NULL (00) byte. Each packet exchanged between the controller and the phone starts with a number in printable ASCII, followed by the byte 00.

Data Decoding and Gzip

After a careful analysis we discovered that this number indicates the length of the data sent, excluding the bytes taken to represent the number and the byte 00. Another example of this encoding in another packet sent by the controller is shown in Figure 3. Thus each packet sent from both the controller and the phone has the following format:

{data length}{delimiter}{data}

Figure 3. Another example of the encoding mechanism used in a packet sent from the controller to the phone together with its format.
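The framing described above can be sketched in a few lines of Python. This is only an illustration of the format as we understand it, not code taken from the RAT or from our tools; the function name parse_frame is our own, and the sketch assumes the whole frame is already in one buffer (in real traffic a frame can span several TCP segments).

```python
from typing import Tuple


def parse_frame(buf: bytes) -> Tuple[bytes, bytes]:
    """Split one '{data length}{delimiter}{data}' frame off the front of buf.

    Returns (data, rest). Raises ValueError if the frame is malformed
    or if the buffer does not yet hold the complete frame.
    """
    sep = buf.find(b"\x00")  # the 00 byte ends the ASCII length field
    if sep <= 0:
        raise ValueError("no length prefix found")
    length_field = buf[:sep]
    if not length_field.isdigit():
        raise ValueError("length prefix is not made of ASCII digits")
    length = int(length_field)  # the length excludes the prefix and the 00 byte
    start = sep + 1
    end = start + length
    if end > len(buf):
        raise ValueError("frame is incomplete")
    return buf[start:end], buf[end:]


if __name__ == "__main__":
    # Two synthetic frames back to back (made-up data, not from the capture).
    data, rest = parse_frame(b"5\x00hello3\x00abc")
    print(data, rest)
```

Calling parse_frame repeatedly on the returned rest walks through all the frames of a reassembled stream.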

This discovery allowed us to examine the data part more carefully and to find that, right after the number and the delimiter, the data starts with the bytes 1F 8B, which are the gzip file signature (magic number). Figure 4 shows a detail of those bytes. This means that the data sent by the device is compressed beforehand with gzip, which uses the DEFLATE algorithm.

Figure 4. The bytes 1F and 8B are the magic number of the gzip file format, which uses DEFLATE compression.

So the format of the packets sent from the phone can now be updated to:

{data length}{delimiter}{gzip compressed data}

Figure 5. The form of the packet sent from the phone to the C&C.
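As a rough sketch of how such a packet could be decoded programmatically, the following Python snippet strips the length and the delimiter, checks for the gzip magic number and decompresses the rest. The function name decode_phone_packet is hypothetical, and the example packet is synthetic data built in the snippet itself, not bytes copied from the capture.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def decode_phone_packet(packet: bytes) -> bytes:
    """Decode '{data length}{delimiter}{gzip compressed data}' into plain bytes."""
    sep = packet.index(b"\x00")          # raises ValueError if there is no delimiter
    length = int(packet[:sep])           # ASCII digits before the 00 byte
    body = packet[sep + 1 : sep + 1 + length]
    if not body.startswith(GZIP_MAGIC):
        raise ValueError("data does not start with the gzip magic number")
    return gzip.decompress(body)


if __name__ == "__main__":
    # Build a synthetic packet in the same format and decode it again.
    compressed = gzip.compress(b"pump10248null")
    packet = str(len(compressed)).encode("ascii") + b"\x00" + compressed
    print(decode_phone_packet(packet))
```

Note that the delimiter search only looks for the first 00 byte, so any 00 bytes inside the compressed data do not interfere.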

Extracting Files from the Traffic

The discovery of the compression header allows us to investigate the traffic and to try to decompress it. This can easily be done with the CyberChef tool, applied here to the data from the first packet sent after the connection was established.

Figure 6. Decompression of the data sent from the infected phone to the C&C controller (without the number for data length and delimiter). The packet was decompressed using CyberChef recipe ‘From Hex’ and ‘Gunzip’.

In Figure 6 we can see some readable text at the beginning of the output:

1025310249null1024988&false10249w410249510249null & null10249,

and there is also readable data at the end of the output:

10249John10249HMD Global Nokia 6.11024910 & 2910249db004d9769eaadb9102491024910248null.

It was interesting to find that the text “10249” seems to be used as a delimiter. If we delete the string “10249” and combine the text, we get:

10253null 88&false w4 5 null & null John HMD Global Nokia 6.1 10 & 29 db004d9769eaadb9 10248null.
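The same splitting can be done in Python. This is a minimal sketch that treats “10249” as a plain separator string, which is our reading of the output and not a confirmed part of the protocol; a field that happened to contain “10249” itself would be split incorrectly. The input is the readable text from the end of the output shown above, without the final period.

```python
FIELD_SEP = "10249"  # separator string observed in the decompressed output


def split_fields(text: str) -> list:
    """Split decompressed text on the separator and drop empty fields."""
    return [field for field in text.split(FIELD_SEP) if field]


if __name__ == "__main__":
    tail = (
        "10249John10249HMD Global Nokia 6.11024910 & 29"
        "10249db004d9769eaadb9102491024910248null"
    )
    print(split_fields(tail))
    # ['John', 'HMD Global Nokia 6.1', '10 & 29', 'db004d9769eaadb9', '10248null']
```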

This data seems to be used to initialize the phone parameters (client name, phone model, Android version, etc.) when the phone first connects to the controller. Figure 7 shows a screenshot of the controller at the moment the phone connects, which confirms this suspicion.

Figure 7. The screenshot from the controller when the phone connects to it.

Besides these parameters, the phone also sends its background image. In Figure 6 it can be seen that, after the readable text, the output contains the string /9j/4A, which is the Base64 encoding of the magic number of a JPEG (JFIF) file. If we delete the readable text from the output in Figure 6 and decode the remaining Base64 data to binary, as done in Figure 8, we get the image shown in Figure 9.

Figure 8. Using CyberChef to render images from Base64.

Figure 9. Rendered background image sent from the phone to the controller.
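The image extraction can also be approximated in Python instead of CyberChef. The sketch below looks for the Base64 JPEG prefix in the decompressed text and decodes what follows. It assumes that the image field ends at the next “10249” separator (or at the end of the text), which we have not verified for every packet; the function name extract_jpeg and the tiny synthetic header used in the example are ours.

```python
import base64

FIELD_SEP = "10249"         # separator string seen in the decompressed output
JPEG_B64_PREFIX = "/9j/4A"  # Base64 form of the JPEG magic bytes FF D8 FF E0


def extract_jpeg(text: str) -> bytes:
    """Find a Base64 encoded JPEG in the decompressed text and decode it."""
    start = text.find(JPEG_B64_PREFIX)
    if start == -1:
        raise ValueError("no Base64 JPEG header found")
    # Assumption: the image field runs until the next separator, if any.
    end = text.find(FIELD_SEP, start)
    chunk = text[start:] if end == -1 else text[start:end]
    chunk += "=" * (-len(chunk) % 4)  # restore Base64 padding if it is missing
    image = base64.b64decode(chunk)
    if not image.startswith(b"\xff\xd8\xff"):
        raise ValueError("decoded data is not a JPEG")
    return image


if __name__ == "__main__":
    # Synthetic example: only the first 12 bytes of a JFIF header, not a real image.
    header = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01"
    text = "10253" + FIELD_SEP + base64.b64encode(header).decode("ascii") + FIELD_SEP + "John"
    print(extract_jpeg(text))
```

The returned bytes can be written to a file with a .jpg extension and opened in any image viewer.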

Heartbeat and Long Connections

After sending the first packets with the phone initialization parameters, the phone sends several more packets with the background image and the parameters again. Afterwards, it waits for commands from the controller.

While the phone waits for commands, the controller and the phone exchange packets to check that the other side is still alive. This heartbeat is similar to the PING/PONG seen in IRC (Figure 10).

Figure 10. Heartbeat between the controller and the phone.

Knowing the format of the messages, we can now see that the commands sent from the controller are all in plain text, as no compression seems to be necessary (the C&C sends no large data). An example is the controller command GetExternalStorage:

33.1026110249GetExternalStorage10249

Here 33 is the length of the data, and the dot stands for the non-printable delimiter byte 00. So the packets sent from the controller have the form:

{data length}{delimiter}{data in plain text}

Figure 11. The format of the packet sent from the C&C to the phone.
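The GetExternalStorage example above can be rebuilt in Python to check that the length prefix matches. This sketch assumes that the dot in the printed example represents the 00 delimiter byte; the helper name build_frame is ours, and we make no claim about the meaning of the leading value 10261.

```python
def build_frame(data: bytes) -> bytes:
    """Wrap data as '{data length}{delimiter}{data}' with an ASCII length and a 00 byte."""
    return str(len(data)).encode("ascii") + b"\x00" + data


if __name__ == "__main__":
    body = b"1026110249GetExternalStorage10249"
    packet = build_frame(body)
    print(len(body))  # 33, the number at the start of the example
    print(packet)     # b'33\x001026110249GetExternalStorage10249'
    # Splitting on the separator string exposes the command name.
    print(body.decode("ascii").split("10249"))  # ['10261', 'GetExternalStorage', '']
```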

An expected property of the C&C channel connections was their long duration. If we open the Conversations statistics in Wireshark, as shown in Figure 12, several connections between the phone and the controller can be seen, probably because the phone disconnected from the C&C from time to time. Some of the connections are long, e.g. 2362.4706 seconds (approximately 39 minutes) or 1831.5294 seconds (approximately 31 minutes).

Figure 12. Length of connections between the phone and the controller from Wireshark - Statistics - Conversations. It is clear that some connections are long (+40mins)

Extractor

Extractor [download here] is a tool written in C, created by Lisandro Ubiedo for Android Tester v6.4.6, that extracts the messages sent from the phone to the C&C into separate files so that their data can be easily decompressed. Each message is extracted into a separate file called ‘stream_xxx.gz’. The GZ file contains the compressed data sent from the phone, without the data length number and the delimiter (Figure 5). Each occurrence of the phone sending data to the C&C is called a “stream”, and what follows (xxx) is an incremental number that identifies the specific stream.

The stream files can be easily decompressed using gzip or gunzip command line tools and then analyzed as seen previously.

How to use extractor?
After cloning or downloading extractor.c and the Makefile to your machine, run make to compile:

$ make

The extractor takes the following parameters as input:
-r pcap to read
-h source IP address
-p source port
-H destination IP address
-P destination port

For the packet capture RAT01.pcap of Android Tester v6.4.6 in the Android Mischief dataset, the command will be:

$ ./extractor -r RAT01.pcap -h 10.8.0.61 -p 37451 -H 147.32.83.234 -P 1337

After executing the command above, the folder out is created containing all the streams:

Figure 13. The content of the output folder ‘out’ after executing the extractor on RAT01.pcap of Android Tester v6.4.6 from the Android Mischief Dataset.

Example usage of zcat to decompress stream_73.gz:

$ zcat stream_73.gz

pump10248null
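To go through all the extracted streams at once, a short Python script can do the same job as running zcat on each file. This is a sketch that only assumes the ‘out’ folder and the ‘stream_xxx.gz’ naming described above; the function name read_streams is ours and is not part of the extractor.

```python
import gzip
import zlib
from pathlib import Path


def read_streams(out_dir="out"):
    """Return {file name: decompressed bytes} for every stream_*.gz in out_dir."""
    streams = {}
    # Files are visited in name order, so stream_10.gz sorts before stream_2.gz.
    for path in sorted(Path(out_dir).glob("stream_*.gz")):
        try:
            streams[path.name] = gzip.decompress(path.read_bytes())
        except (OSError, EOFError, zlib.error) as exc:
            # Skip streams that are truncated or are not valid gzip data.
            print(f"skipping {path.name}: {exc}")
    return streams


if __name__ == "__main__":
    for name, data in read_streams().items():
        print(name, data[:60])  # show only the first bytes of each stream
```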

Conclusion

In this blog post we have analyzed the network traffic from a phone infected with the Android Tester v.6.4.6 RAT. We executed the RAT in our own environment and performed several actions. We were able to understand and decode its communication and to extract the files transferred by the RAT. It was also clear that the RAT has some distinctive features, such as long-lived connections, a heartbeat, and the use of uncommon ports.

To summarize, these are the details found in the network traffic of this RAT:

The phone connects directly to the IP address and port specified in the APK.
Connections between the phone and the controller are long; some last more than 30 minutes.
Packets sent from the phone have the format {data length}{delimiter}{gzip compressed data}.
Packets sent from the controller have the format {data length}{delimiter}{data in plain text}.
There is a heartbeat between the controller and the phone.
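The two packet formats in this summary can be turned into a simple payload check. The following Python sketch is only a heuristic of our own for illustration; it is not the YARA rules mentioned below, the function name classify_payload is hypothetical, and it only recognizes TCP payloads that start at the beginning of a frame.

```python
import re

# ASCII digits followed by the 00 delimiter at the very start of the payload.
LENGTH_PREFIX = re.compile(rb"^\d{1,10}\x00")
GZIP_MAGIC = b"\x1f\x8b"


def classify_payload(payload: bytes):
    """Guess the direction of a payload from its format, or return None."""
    match = LENGTH_PREFIX.match(payload)
    if match is None:
        return None
    body = payload[match.end():]
    if body.startswith(GZIP_MAGIC):
        return "phone-to-controller"   # gzip compressed data
    if body and all(32 <= byte < 127 for byte in body):
        return "controller-to-phone"   # printable plain text
    return None


if __name__ == "__main__":
    print(classify_payload(b"33\x001026110249GetExternalStorage10249"))
    print(classify_payload(b"2969\x00\x1f\x8b\x08"))
    print(classify_payload(b"GET / HTTP/1.1\r\n"))
```

A check like this would produce false positives on its own, so it makes more sense combined with the other traits, such as the long connections and the heartbeat.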

YARA rules for Android Tester v6.4.6 created by Lisandro Ubiedo can be found here.

Biographies

Kamila Babayeva

Kamila Babayeva is a bachelor's student in the Computer Science and Electrical Engineering program at the Czech Technical University in Prague. She is a researcher in the Civilsphere project, a project dedicated to protecting civil organizations and individuals from targeted attacks. Her research focuses on helping people and protecting their digital rights by developing free software based on machine learning. Currently, Kamila leads the development of the Stratosphere Linux Intrusion Prevention System (Slips), which is used to protect the civil society in the Civilsphere lab.

Lisandro Ubiedo

Lisandro Ubiedo is a malware researcher, programmer and DevOps consultant. He is a collaborator for the Aposemat Project and focuses on reverse engineering and both network and binary malware analysis.

Sebastian Garcia

Sebastian Garcia is a malware researcher and security teacher with experience in applied machine learning on network traffic. He founded the Stratosphere Lab, aiming to do impactful security research to help others using machine learning. He believes that free software and machine learning tools can help better protect users from abuse of our digital rights. He researches machine learning for security, honeypots, malware traffic detection, social networks security detection, distributed scanning (dnmap), keystroke dynamics, fake news, Bluetooth analysis, privacy protection, intruder detection, and microphone detection with SDR (Salamandra). He co-founded the MatesLab hackspace in Argentina and co-founded the Independent Fund for Women in Tech. @eldracote. https://www.researchgate.net/profile/Sebastian_Garcia6

Dissecting a RAT. Android Tester Trojan Analysis and Decoding.