The New and Improved Attacker IP Prioritizer


Introduction

In this post, we describe the major updates that the Aposemat team has made to the AIP Tool. Because the AIP Blacklists we publish are generated with the AIP Tool, these updates affect them as well.

For a full explanation of the old method used by AIP for sorting and scoring, read this blog post[1].

What is AIP?

The Attacker IP Prioritization (AIP) project by the Aposemat team uses the AIP Tool [3], which we have been developing, to generate IPv4 Blacklists[4] from the attacks collected on the honeypots in our IoT lab[9]. The tool generates three separate blacklists that dynamically add and forget the IPs that attack our honeypots, updating its database with new data every 24 hours. The AIP Tool uses a statistical sorting method that places IPs with more network traffic higher in the list than others. The three output AIP Blacklists are designed to be small and easy to process, which makes them especially suitable for IoT devices with limited CPU power and little storage space. They list only the IPs with the most attacks associated with them, so that the majority of attacks are blocked while resource usage stays low. The three blacklists can be found in the Aposemat public datasets [8].

The contents of the three blacklists are as follows:

  1. AIP_historical_blacklist_prioritized_by_repeated_attackers: This blacklist is designed to prioritize the consistent and aggressive IPs in the data we collect. The AIP Tool maintains a data-set that is updated every day with the data collected in the last 24 hours, and then uses a dedicated algorithm to generate a blacklist from it. The algorithm is designed to prioritize consistency: an IP that attacks every day, and therefore has higher average network statistics, remains on the blacklist longer.

  2. AIP_historical_blacklist_prioritized_by_newest_attackers: The AIP Tool generates this blacklist from the same large data-set as the first, but uses a different algorithm. The algorithm prioritizes new and aggressive IPs over consistent ones. In the first blacklist, as long as an IP attacks every day, its score increases over time. In this blacklist, the older an IP gets, whether it attacks consistently or not, the more its score decreases, in order to make room for the daily IPs that generate large amounts of traffic for a short time.

  3. AIP_blacklist_for_IPs_seen_last_24_hours: This blacklist is designed to include only the new IPs that have been seen in the last 24 hours, and sort them according to how much traffic they produce, quantified by the number of packets, bytes, events, and length of connections. So while the first two blacklists can contain older IPs, this one only contains the newest ones.

AIP Tool Version 2.0 - What's New!

We now describe the major changes made to the AIP Tool, most of which are in the algorithms that sort the IPs using the network data. Since the AIP Blacklists we publish are generated with this tool, they are affected by these changes.

Aging - Instantaneous to Gradual Score Regain

The first major update concerns the aging methodology. By aging, we mean that if an IP stops attacking, the AIP Tool incrementally decreases its score as time goes on. In the previous AIP Tool version, after an IP was assigned a score by one of the scoring algorithms mentioned above, the time since its last attack was checked. If that time was greater than a certain amount, the score was aged, and the amount by which it was aged increased as more days passed since the last attack. However, as soon as the IP attacked again, the score was immediately restored to the value it had before the aging started. This proved to be problematic because an IP that had been inactive for a long time could jump back to the top of the list by attacking just once.

We added aging tracking to solve this issue. There is now a separate database that keeps track of how much the score of each IP has decreased due to inactivity. When the IP becomes active again, its aging value continues to be subtracted from the calculated score, so the IP must attack about as much as it did before in order to regain its former position.
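The idea of a persistent aging value can be illustrated with a minimal sketch. This is not the AIP Tool's actual code: the class name `AgingTracker`, the constant `AGING_STEP`, and the fixed per-day penalty are assumptions made only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical penalty added for each inactive day (not taken from the AIP Tool).
AGING_STEP = 0.1


@dataclass
class AgingTracker:
    """Keeps a per-IP aging penalty that survives periods of renewed activity."""

    penalties: dict[str, float] = field(default_factory=dict)

    def update(self, ip: str, raw_score: float, active_today: bool) -> float:
        """Return the aged score for one IP after one daily update."""
        penalty = self.penalties.get(ip, 0.0)
        if not active_today:
            # The penalty only grows while the IP is inactive.
            penalty += AGING_STEP
            self.penalties[ip] = penalty
        # The stored penalty is never reset when the IP returns, so the raw
        # score has to grow again before the IP regains its old position.
        return max(raw_score - penalty, 0.0)
```

In this sketch, an IP with a raw score of 1.0 that is inactive for three days and then attacks once still gets an aged score of about 0.7, instead of jumping straight back to 1.0.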

Scoring - Linear Combination to Normalization

Another issue with Version 1.0 of the AIP Tool was the method implemented for calculating the score in the scoring algorithms. In this post [1] there is a more in-depth explanation regarding the old method. Basically, the scoring method used in version 1.0 was a linear combination of the data types. This approach caused problems because certain data types were orders of magnitude larger than others, thus making the smaller data values play an insignificant role in the scoring.

In version 2.0, we fixed this by normalizing each feature across the entire database. For example, suppose IP-A has an average of 50 events per day in the database. Instead of using 50 directly to calculate the score, AIP now normalizes it to a value between 0 and 1 by comparing it to the highest and lowest average event values among the IPs in the database. In this way, each data type plays an equal role in the final score.
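A minimal sketch of this kind of min-max normalization follows. The function names, the example values for IP-B and IP-C, and the equal-weight average used to combine features are assumptions for illustration, not the exact formula used by the AIP Tool.

```python
def min_max_normalize(values: dict[str, float]) -> dict[str, float]:
    """Scale one feature to the range [0, 1] across all IPs."""
    lowest, highest = min(values.values()), max(values.values())
    if highest == lowest:
        # All IPs share the same value, so the feature carries no information.
        return {ip: 0.0 for ip in values}
    return {ip: (v - lowest) / (highest - lowest) for ip, v in values.items()}


def combined_score(features: dict[str, dict[str, float]]) -> dict[str, float]:
    """Average the normalized features (equal weights are an assumption)."""
    normalized = [min_max_normalize(column) for column in features.values()]
    ips = normalized[0].keys()
    return {ip: sum(col[ip] for col in normalized) / len(normalized) for ip in ips}


# IP-A averages 50 events per day; the other values are made up.
events = {"IP-A": 50, "IP-B": 10, "IP-C": 210}
normalized_events = min_max_normalize(events)  # IP-A maps to (50 - 10) / (210 - 10)
```

Because every feature ends up on the same 0 to 1 scale, a feature measured in millions of bytes no longer drowns out one measured in tens of events.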

Whitelisting - IPs that will never be blocked

We added a module to the AIP Tool that allows users to customize which IPs must never be blocked by the blacklist. For the blacklists published by the lab, this includes monitoring services and some web service platforms that are known to be harmless[5][10]. These blacklists are generated from data collected from the incoming connections to the honeypots in the Aposemat IoT lab, which makes most other data filtering unnecessary.
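A whitelist check of this kind can be sketched with Python's standard `ipaddress` module. The function names and the addresses below (taken from the reserved documentation ranges) are illustrative assumptions and do not reflect the AIP Tool's real whitelist or file format.

```python
import ipaddress


def load_whitelist(entries: list[str]) -> list:
    """Parse single IPs and CIDR ranges into network objects."""
    # A bare address such as "198.51.100.7" becomes a /32 network.
    return [ipaddress.ip_network(entry, strict=False) for entry in entries]


def filter_whitelisted(candidates: list[str], whitelist: list) -> list[str]:
    """Return the candidate IPs that are not covered by any whitelist entry."""
    kept = []
    for ip in candidates:
        address = ipaddress.ip_address(ip)
        if not any(address in network for network in whitelist):
            kept.append(ip)
    return kept


# Hypothetical whitelist: one range and one single address.
whitelist = load_whitelist(["192.0.2.0/24", "198.51.100.7"])
blocked = filter_whitelisted(["192.0.2.10", "203.0.113.5", "198.51.100.7"], whitelist)
```

Here only `203.0.113.5` remains a blacklist candidate, because the other two addresses match a whitelist entry.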

Dynamic Size

The last major update was to the size of the blacklist. In version 1.0, the size of the blacklist was set to a static value of 25,000 IPs. This was done to simplify the analysis. In version 2.0, the size is regulated by a lowest score threshold: an IP is blacklisted only if its threat score is higher than a certain amount. This was done to further the goal of making the blacklist as small as possible. Often there are IP addresses that generate an extremely low amount of traffic and should not be on the blacklist, but end up there because their scores are just high enough to make it into the top 25,000. The lowest score threshold removes this problem.
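The difference between the two selection rules can be shown with a short sketch. The function names, the scores, and the threshold value are hypothetical; the actual threshold used by the AIP Tool is not stated here.

```python
def select_top_n(scores: dict[str, float], n: int) -> list[str]:
    """Version 1.0 style: always keep the n highest-scoring IPs."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n]


def select_above_threshold(scores: dict[str, float], threshold: float) -> list[str]:
    """Version 2.0 style: keep only IPs whose score exceeds the threshold."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [ip for ip in ranked if scores[ip] > threshold]


# Made-up scores: "c" produces almost no traffic.
scores = {"a": 0.9, "b": 0.4, "c": 0.01}
fixed_size = select_top_n(scores, 3)               # still includes "c"
dynamic_size = select_above_threshold(scores, 0.05)  # drops "c"
```

With a fixed size, a low-traffic IP is listed whenever there is room for it; with a threshold, the list shrinks on days when few IPs score high enough.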

All of the AIP data-sets have been regenerated using the new version 2.0 and are available here[2]. If you wish to access the old data-sets, they can be found here [6].  

Conclusion

We have made several important improvements to how AIP runs. AIP now has improved aging, normalized data for a more balanced score, a whitelist module to prevent mistakes in data filtering, and a dynamic size that helps keep the blacklists small. Feel free to check out its open-source code on GitHub[7], and download the three daily blacklists[8].

Acknowledgment

This research was done as part of our ongoing collaboration with Avast Software in the Aposemat project. The Aposemat project is funded by Avast Software.

 
 

Contact

If you have further questions, don’t hesitate to contact us!

aposemat@aic.fel.cvut.cz

References

  1. Attacker IP Prioritizer Program. Thomas O’Hara, November 14, 2019. https://www.stratosphereips.org/blog/2019/11/5/attacker-ip-prioritizer-program

  2. AIP Blacklist Public Datasets https://mcfp.felk.cvut.cz/publicDatasets/CTU-AIPP-BlackList/

  3. AIP Tool Webpage https://www.stratosphereips.org/aip-tool

  4. AIP Blacklist Webpage https://www.stratosphereips.org/attacker-ip-prioritization-blacklist

  5. Uptime Robot IPs https://uptimerobot.com/help/locations/

  6. Version 1.0.3 Blacklist Database https://mcfp.felk.cvut.cz/publicDatasets/CTU-AIPP-BlackList/V1.0.3-Depreciated/

  7. AIP Tool GitHub Page https://github.com/stratosphereips/AIP-Blacklist-Algorithm

  8. Today's AIP Blacklists https://mcfp.felk.cvut.cz/publicDatasets/CTU-AIPP-BlackList/Todays-Blacklists/

  9. Aposemat IoT Lab Webpage https://www.stratosphereips.org/aposemat/

  10. Official Google Ranges https://md5calc.com/google/ip