Writing a SLIPS Module

This blog post was authored by Dita Hollmannová on 2020/01/31

What is SLIPS and why are modules useful

The Stratosphere Linux Intrusion Prevention System (SLIPS) is a machine learning based Intrusion Prevention System for Linux and Mac, developed at Stratosphere Labs. It collects data from several types of log files and saves it in a database. SLIPS uses modules that process the data further, perform additional checks and store more data for other modules to use. Writing a new module to extend SLIPS is easy, because modules are Python files that are mostly independent of the complex SLIPS structure.

In this blogpost, we will walk through the process of developing a new SLIPS module: the VirusTotal (VT) module. This module will listen for new IP addresses and check them against the VirusTotal API. VirusTotal returns detailed information on each IP, and the module will process this information and save it to the shared database.

Installing and running SLIPS

To run SLIPS, you will need Python 3.7 with libraries (watchdog, redis and maxminddb), Zeek and a running Redis database. Clone the SLIPS repository and try to run SLIPS.

SLIPS can work with existing capture files:

./slips.py -c slips.conf -r capture.pcap

or you can run it in realtime on one of your interfaces (this might require superuser privileges):

./slips.py -c slips.conf -i eth0

It is important to always specify a configuration file, and there is a sample configuration file ready.

Structure of SLIPS

Core

When SLIPS is run, it immediately spawns several child processes: it creates the I/O processes, looks into the ./modules directory and loads all the modules it can find. It also connects to the database.

Input and output

There are two I/O processes in SLIPS: the input process and the output process. The input process reads from files or from an interface and parses the data, and the output process collects outputs from all other processes and handles them. When writing a module, be sure to pass all outputs to output-process instead of printing them to stdout directly. This will ensure that everything is printed correctly according to current verbosity/debug levels.

Database

The Redis database is where processes can share data. Apart from the typical read and write operations, SLIPS takes advantage of Redis channels. Channels are socket-like structures, where some processes may publish data to certain channels, and others can then subscribe to the channels and be notified when a new message arrives. The VT module relies heavily on the ‘new_ip’ channel, as it listens for all new IP addresses that come through SLIPS.

Profiles and time windows

For each IP address that appeared in the communication, SLIPS creates a profile. A profile is a complete behavior of the IP in the traffic. Each profile is divided into time windows. Each time window is 1 hour long by default, and it contains features computed for all connections that start in that time window.
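As a rough illustration of how a connection maps to a time window, the sketch below computes a window index from the connection's start time. The function name, its parameters and the choice to count windows from the start of the capture are assumptions made for this example; the actual SLIPS logic may differ.

```python
def time_window_index(flow_start, first_window_start, width_seconds=3600):
    """Return the 0-based index of the time window in which a flow starts.

    Hypothetical helper: SLIPS may anchor and number its windows differently.
    """
    if flow_start < first_window_start:
        raise ValueError("flow starts before the first time window")
    return int((flow_start - first_window_start) // width_seconds)


# A flow that starts 90 minutes in falls into the second window (index 1).
index = time_window_index(flow_start=5400.0, first_window_start=0.0)
print(index)
```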

To output data in a comprehensive log structure, SLIPS creates a new folder for each time it is run. The folder appears in the SLIPS directory by default, and contains folders for each profile. In the folder of a profile, there are files for each time window of the profile.

The log files are updated as soon as new information arrives. However, some inputs that SLIPS uses wait for the communication flow to finish before they report it. For long flows, this can mean that information about them arrives several hours “late”. One can never be sure that a time window is fully processed.

Handling of different data flows and time windows inside SLIPS is very complex. Fortunately, it is only needed when working with dynamic information about recent connections, and it does not need to be taken into account when the goal of the module is to collect information on an IP address. 

Developing a module

Getting started

To start writing the module, create a fork of the SLIPS repository and a new branch with a descriptive name. The current development branch is develop, so be sure to base your branch on it. To make module creation easy, a template is present in the repository. Create a new folder for the new module, in our case it will be:

modules/virustotal/

Once created, copy the contents of ./modules/template into this new folder. There is an empty __init__.py file, without which the module would not be found, and a template.py file, which should be renamed to match the folder name: virustotal.py. In the template, change the class name and fill in the fields with basic information: name of the module, brief description and authors.

There are three functions in the template:

  • __init__(self, outputqueue, config):
    • Starts a separate process, and subscribes to database channels

  • print(self, text, verbose=1, debug=0): 
    • Sends print statements to output-process. This function must be used instead of printing directly

  • run(self):
    • Main loop, the function waits for a message to arrive and processes it

The run function in the template contains sample code that prints the number of profiles in the database.

# Main loop function
while True:
    message = self.c1.get_message(timeout=-1)
    if message['channel'] == 'new_ip':
        # Example: print the number of profiles in the database whenever a new IP arrives
        data = len(__database__.getProfiles())
        self.print('Amount of profiles: {}'.format(data))

This is the loop that reads new messages from the channel with new IPs and processes them. Notice that there is a try/except clause around the loop: any error that is not caught and handled inside the loop will jump out of it and end the module. SLIPS will not restart crashed modules, so taking care of all errors is important.
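One way to keep a single bad message from ending the module is to handle errors per message, inside the loop. The sketch below is not the SLIPS template: FakePubSub, run_loop, handle_ip and the idle-poll stop condition are made up so that the example runs without a Redis server.

```python
class FakePubSub:
    """Stand-in for a Redis pub/sub object (hypothetical, for the demo only)."""

    def __init__(self, messages):
        self._messages = list(messages)

    def get_message(self, timeout=None):
        # Return the next queued message, or None when nothing is waiting.
        return self._messages.pop(0) if self._messages else None


def run_loop(pubsub, handle_ip, log, max_idle_polls=1):
    """Process 'new_ip' messages; log per-message failures instead of crashing."""
    idle = 0
    while idle < max_idle_polls:  # a real module would loop indefinitely
        message = pubsub.get_message(timeout=1)
        if message is None:
            idle += 1
            continue
        idle = 0
        if message.get("type") != "message" or message.get("channel") != "new_ip":
            continue  # skip e.g. the subscribe confirmation
        try:
            handle_ip(message["data"])
        except Exception as error:  # keep the loop alive on bad input
            log("Error while processing {!r}: {}".format(message["data"], error))


handled, errors = [], []


def handle_ip(ip):
    if ip == "bad":
        raise ValueError("not an IP address")
    handled.append(ip)


messages = [
    {"type": "subscribe", "pattern": None, "channel": "new_ip", "data": 1},
    {"type": "message", "pattern": None, "channel": "new_ip", "data": "bad"},
    {"type": "message", "pattern": None, "channel": "new_ip", "data": "10.0.2.22"},
]
run_loop(FakePubSub(messages), handle_ip, errors.append)
```

The failing message is logged and the loop moves on to the next one, so only the valid IP ends up in handled.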

In the VT module, the code inside the loop is replaced by the following lines:

ip = message["data"]
ip_score = self.check_ip(ip)

The message received from the database is actually a dictionary, and the data field contains the IP address:

{'type': 'message', 'pattern': None, 'channel': 'new_ip', 'data': '10.0.2.22'}
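A defensive way to pull the IP out of such a dictionary might look like the sketch below. The helper name extract_ip is hypothetical, and validating the address with Python's ipaddress module is an addition for this example rather than something the VT module is known to do.

```python
import ipaddress


def extract_ip(message):
    """Return the IP carried by a 'new_ip' message, or None if it is unusable."""
    if not message or message.get("type") != "message":
        return None  # e.g. a subscribe confirmation, whose data is a number
    if message.get("channel") != "new_ip":
        return None
    data = message.get("data")
    if not isinstance(data, str):
        return None
    try:
        return str(ipaddress.ip_address(data))
    except ValueError:
        return None  # the payload is not a valid IPv4 or IPv6 address


good = extract_ip(
    {"type": "message", "pattern": None, "channel": "new_ip", "data": "10.0.2.22"}
)
confirmation = extract_ip(
    {"type": "subscribe", "pattern": None, "channel": "new_ip", "data": 1}
)
```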

The IP address is extracted and processed in a function, returning a tuple of values from the VT API. The module continues by saving the values into the database (more on that later).

__database__.set_virustotal_score(ip, ip_score)

The process of getting the values from the VT API will not be described here, as this blogpost focuses on writing general SLIPS modules. 

Reading values from the configuration file

To make calls to the VT API, each user must register and obtain their own API key, which is a 64-character hexadecimal string. Without the API key, all calls will be rejected. The module needs to read the key from a file, and the best approach is to let users specify the file themselves. This calls for using a configuration file and saving the path there.

The config file used by SLIPS is in the .conf format, and is parsed by Python's configparser library (https://docs.python.org/3/library/configparser.html). It contains comments (#), sections ([mysection]) and value declarations (key = value). Settings for the VT module are added at the end of the config file:

[virustotal]
# This is the path to the API key. The file should contain the key at the start of the first line, and nothing more.
# If no key is found, VT module will not be started.
api_key_file = modules/virustotal/api_key_slow
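Reading this setting could be sketched as follows. The helper name read_api_key, its base_dir parameter and the validation step are assumptions made for this example; only the section name, the option name and the 64-character hex format come from the text above.

```python
import configparser
import os
import tempfile


def read_api_key(config, base_dir="."):
    """Return the key named by [virustotal] api_key_file, or None if unusable."""
    try:
        key_path = config.get("virustotal", "api_key_file")
    except (configparser.NoSectionError, configparser.NoOptionError):
        return None
    try:
        with open(os.path.join(base_dir, key_path)) as key_file:
            first_line = key_file.readline().strip()
    except OSError:
        return None
    # The key is expected to be a 64-character hexadecimal string.
    is_hex = all(c in "0123456789abcdefABCDEF" for c in first_line)
    return first_line if len(first_line) == 64 and is_hex else None


# Demonstration with a throwaway directory and a dummy (not real) key.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "api_key"), "w") as f:
        f.write("ab" * 32 + "\n")
    config = configparser.ConfigParser()
    config.read_string("[virustotal]\napi_key_file = api_key\n")
    key = read_api_key(config, base_dir=tmp)
```

Returning None for every failure mode lets the caller decide not to start the module, which matches the behaviour described in the config comment.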

Testing the module

There are three ways to run the module while developing it. The obvious one is to run SLIPS as a whole, but then other modules are running in the background, the console fills with output, and you cannot control what data is sent to the module - there will be traffic from web pages and IM clients. Running SLIPS on network captures gives more control over the data, but it is not very flexible.

An easy way to test with controlled data is to call the core function directly. This is what the testing file for the VT module is doing. To make this possible, the code of the module must be slightly adjusted: there is a special testing mode in the module, and when enabled, a different print function is used (see line 28 in the module). Switching the print function is necessary, since if SLIPS print() was used all the time, it would crash in testing mode (there is no output process), and if Python print() was used all the time, the module would have to be rewritten before deploying. Once the module is ready, it can be initialized as a separate class and the core function, in this case check_ip(), can be called with the desired input values.
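The idea of switching the print function can be sketched like this. The class name Module, the testing flag and the layout of the string put on the queue are assumptions for illustration, not a copy of the VT module.

```python
import queue


class Module:
    """Hypothetical module skeleton with a switchable print function."""

    name = "virustotal"

    def __init__(self, outputqueue=None, testing=False):
        self.outputqueue = outputqueue
        # Choose the print implementation once, at start-up.
        self.print = self._print_testing if testing else self._print_slips

    def _print_slips(self, text, verbose=1, debug=0):
        # Assumed message layout: "<levels>|<module name>|[<module name>] <text>"
        levels = str(int(verbose) * 10 + int(debug))
        self.outputqueue.put(
            "{}|{}|[{}] {}".format(levels, self.name, self.name, text)
        )

    def _print_testing(self, text, verbose=1, debug=0):
        # No output process exists in testing mode, so print directly.
        print("[{}] {}".format(self.name, text))


out = queue.Queue()
Module(outputqueue=out).print("hello")  # goes to the output queue
queued = out.get_nowait()
Module(testing=True).print("hello")  # goes straight to stdout
```

Because the rest of the module only ever calls self.print, no other code has to change between testing and deployment.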

The best way to test is to run all the things the module needs in SLIPS, but without SLIPS itself. This includes starting the module as a separate process, starting the output process, reading the configuration file and then sending data to the module through the database using Redis channels, so the data is received just like in SLIPS. This method is implemented in another module, the WhoisIP module, where testing integration in SLIPS was necessary. The demonstration is seen in the testing file on line 128.

Caching data and saving to the database

Before diving into the details, it should be stressed that all database-related actions should be performed in the ./slips/core/database.py file.

Caching

Sometimes, the module will need to use a cache or save some internal information, and the Redis database should be used to do this. The VT module will use caching - once it gets a score for some IP, it will cache it, and when the IP is used again, data may be read from the cache.

__database__.put_ip_to_virustotal_cache(ip, scores)

Note that this is a demonstration of what caching can be used for. In SLIPS it is always guaranteed that an IP from the ‘new_ip’ channel will be unique, so it is not possible that anything that comes from the channel will be found in the cache.

A convenient way to cache data in Redis is to use hashsets (Redis hashes). Each module can create one or more hashsets and store data there as strings. The VT module has one hashset cache named "virustotal-module-ip-cache".

The values are saved (from within the database class) using self.r.hset(hashset_name:string, key:string, stringvalue:string), and similarly read using self.r.hget(hashset_name:string, key:string). The hget function returns the stored string if the key exists in the hashset, and None otherwise. If for some reason you wish to get the entire cache, hgetall(hashset_name) can be used.
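A minimal sketch of such a cache follows. It uses a small in-memory stand-in that offers the same hset, hget and hgetall calls, so the example runs without a Redis server. The helper names and the use of JSON to turn the scores into a string are assumptions; the scores are dummy values.

```python
import json

CACHE_NAME = "virustotal-module-ip-cache"


class InMemoryHashes:
    """Stand-in exposing hset/hget/hgetall (hypothetical, for the demo only)."""

    def __init__(self):
        self._hashes = {}

    def hset(self, name, key, value):
        self._hashes.setdefault(name, {})[key] = value

    def hget(self, name, key):
        return self._hashes.get(name, {}).get(key)

    def hgetall(self, name):
        return dict(self._hashes.get(name, {}))


def put_ip_to_cache(r, ip, scores):
    # Values must be strings, so the scores are serialized first.
    r.hset(CACHE_NAME, ip, json.dumps(scores))


def get_ip_from_cache(r, ip):
    raw = r.hget(CACHE_NAME, ip)
    return None if raw is None else json.loads(raw)


r = InMemoryHashes()
put_ip_to_cache(r, "10.0.2.22", [0.5, 0.0])  # dummy scores
hit = get_ip_from_cache(r, "10.0.2.22")
miss = get_ip_from_cache(r, "10.0.2.23")
```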

Saving

The VT module gets new data for each IP, and this data should be reachable in the database for other modules to use. This is what the Database.set_virustotal_score function is doing. The data is moved to a new dictionary with a descriptive name (“VirusTotal”) and it is appended to previous information that was saved about that IP. Information about all IPs is saved in a hashset named 'IPsInfo'.
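The read, extend and write-back pattern described above might be sketched as below. This is not the actual Database.set_virustotal_score implementation: the function name save_virustotal_data, the JSON serialization, the in-memory stand-in and the pre-existing "other_module" entry are all made up for the example.

```python
import json


class DictHashes:
    """Tiny stand-in offering hset/hget (hypothetical, for the demo only)."""

    def __init__(self):
        self._hashes = {}

    def hset(self, name, key, value):
        self._hashes.setdefault(name, {})[key] = value

    def hget(self, name, key):
        return self._hashes.get(name, {}).get(key)


def save_virustotal_data(r, ip, scores):
    """Add VT data under a 'VirusTotal' key without losing earlier info."""
    raw = r.hget("IPsInfo", ip)
    info = json.loads(raw) if raw else {}
    info["VirusTotal"] = scores
    r.hset("IPsInfo", ip, json.dumps(info))
    return info


r = DictHashes()
# Pretend another module already stored something about this IP.
r.hset("IPsInfo", "10.0.2.22", json.dumps({"other_module": "data"}))
save_virustotal_data(r, "10.0.2.22", [0.5, 0.0])  # dummy scores
stored = json.loads(r.hget("IPsInfo", "10.0.2.22"))
```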

Redis channels

As mentioned earlier, SLIPS uses channels to send updates from one process to another. Processes may create new channels and publish to them, and similarly, they can subscribe to channels and listen for new messages - which is what the VT module is doing. For more detail on Redis channels, see the overview on the Redis website: https://redis.io/topics/pubsub.

For a module developer, it is important to know how the messages are being delivered. The module waits for a message to come, and then starts processing the data. This may take some time - for example, if the VT API rejects a request because of API limits, the module must wait one minute (more in rare cases) until the request can be run again. The module will return to waiting for new messages on the channel only after processing of the previous message is finished. Fortunately, messages are not dropped while the module is busy: as long as the subscriber stays connected, they are buffered and delivered in FIFO order (more on that in this StackOverflow thread: https://stackoverflow.com/questions/27745842/redis-pubsub-and-message-queueing). The capacity of this buffer is limited, and Redis may disconnect a subscriber that falls too far behind, but no messages were lost during the development and testing of the VT module.
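The waiting behaviour on API limits can be sketched as a simple retry helper. RateLimited, call_with_retry and the attempt limit are hypothetical names and values; the sleep function is injectable so the example runs instantly.

```python
import time


class RateLimited(Exception):
    """Raised by a (hypothetical) API wrapper when the quota is exhausted."""


def call_with_retry(request, wait_seconds=60, max_attempts=5, sleep=time.sleep):
    """Call request(); on RateLimited, wait and try again up to max_attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except RateLimited:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
            sleep(wait_seconds)


# Demonstration: the first two calls are rejected, the third succeeds.
waits = []
attempts = {"count": 0}


def flaky_request():
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise RateLimited()
    return "ok"


result = call_with_retry(flaky_request, sleep=waits.append)
```

While such a helper is sleeping, the module is not reading from the channel, which is why the buffering described above matters.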

Troubleshooting

Most errors occur when running the module inside SLIPS. These errors are hard to resolve, because warnings and debug messages may be hidden under extensive outputs from other modules.

If the module does not start at all, make sure it is not disabled in the slips.conf file. If that is not the case, check that the __init__.py file is present in the module directory, and read the outputs - if there were any errors (e.g. import errors), they would prevent the module from starting. To make sure that the module is properly detected, check and debug the slips.py file at line 180, where modules are being started.
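The file checks can be automated with a few lines of Python. The helper check_module_dir is hypothetical and only covers the layout described in this post: an __init__.py file plus a Python file named after the folder.

```python
import os
import tempfile


def check_module_dir(path):
    """Return a list of layout problems that could keep a module from loading."""
    if not os.path.isdir(path):
        return ["directory {} does not exist".format(path)]
    problems = []
    name = os.path.basename(os.path.normpath(path))
    if not os.path.isfile(os.path.join(path, "__init__.py")):
        problems.append("missing __init__.py")
    if not os.path.isfile(os.path.join(path, name + ".py")):
        problems.append("missing {}.py".format(name))
    return problems


# Demonstration: a module folder where __init__.py was forgotten.
with tempfile.TemporaryDirectory() as tmp:
    module_dir = os.path.join(tmp, "modules", "virustotal")
    os.makedirs(module_dir)
    open(os.path.join(module_dir, "virustotal.py"), "w").close()
    problems = check_module_dir(module_dir)
```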

If the module starts but does not receive any messages from the channel, make sure that:

  • The channel is properly subscribed to

  • Messages are being sent (in case of new IPs, visit a new website)

  • Other modules subscribed to the channel get the message

  • Module is started in time (this should not be an issue in new SLIPS releases)

Conclusion

Adding a new feature to SLIPS is an easy task. The template is ready for everyone to use and there is not much to learn about SLIPS to be able to write a module.

If you wish to add a new module to the SLIPS repository, open a pull request and wait for a review. The VT module is now merged and can be accessed in the main repository: https://github.com/stratosphereips/StratosphereLinuxIPS