Malware Capture Facility Project
The Stratosphere IPS Project has a sister project, the Malware Capture Facility Project, which is responsible for making the long-term captures. This project continually obtains malware and normal data to feed the Stratosphere IPS.
Why do we capture Malware, Normal, and Mixed traffic?
Machine learning algorithms need to be verified to find out their precise performance on real data. Good datasets are especially important in computer network security, because network data is unbounded, changing, varied, and subject to high concept drift. These issues force us to obtain good datasets to train, verify, and test the algorithms.
To make a good verification we need three types of traffic: Malware, Normal, and Background. The Malware traffic includes everything we want to detect, especially C&C (Command and Control) connections. The Normal traffic is very important for finding out the real performance of our algorithms by computing the False Positives and True Negatives. The Background traffic is necessary to saturate the algorithms, verify their memory and speed performance, and test whether the algorithms get confused by the data.
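As a minimal sketch of how Normal traffic feeds into the False Positive and True Negative counts, the following Python snippet scores a detector on labeled flows. The function name, the label strings "Malware" and "Normal", and the sample data are assumptions made for illustration; they are not part of the Stratosphere IPS code.

```python
def normal_traffic_metrics(true_labels, predicted_labels, positive="Malware"):
    """Count False Positives and True Negatives over the non-malware flows.

    The label names are illustrative assumptions, not a project convention.
    """
    fp = tn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == positive:
            continue  # Malware flows contribute to TP/FN, not to FP/TN
        if predicted == positive:
            fp += 1  # a normal flow wrongly flagged as malware
        else:
            tn += 1  # a normal flow correctly left alone
    total = fp + tn
    # False Positive Rate = FP / (FP + TN); defined as 0.0 when no normal flows exist
    fpr = fp / total if total else 0.0
    return {"FP": fp, "TN": tn, "FPR": fpr}


# Hypothetical example: four normal flows, one of them misclassified
truth = ["Normal", "Normal", "Malware", "Normal", "Normal"]
predicted = ["Normal", "Malware", "Malware", "Normal", "Normal"]
result = normal_traffic_metrics(truth, predicted)
print(result)
```

Without labeled Normal traffic neither count can be computed, which is why a dataset containing only malware cannot show how often a detector raises false alarms.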
Our datasets are composed of long-term malware captures, manual attacks, normal captures, and mixed captures. These are the normal captures we performed. In each folder there is a description of the behavior captured.
Normal capture of DNS traffic from a real computer. This capture was reduced to include only DNS traffic for privacy reasons.
Normal capture of a normal user (User A) on a real computer in a university network. The user has a Linux operating system and worked for several hours. Only binetflow traffic is included for privacy reasons.
Normal capture of a normal user (User B) on a real computer in a university. The user has a Linux operating system and worked for several hours. There are several files with weblogs and other information, such as Bro logs. The pcap file is filtered to have only DNS packets for privacy reasons.
Normal P2P capture from a normal user on a Linux notebook at home. There are several files, such as pcap, weblogs and Bro logs.
Normal capture from a normal user on a Windows computer, executing a file. There are several files, such as pcap, weblogs and Bro logs. (MD5 10a57a1ed06a26988a55d587662acf64)
Normal capture, especially of P2P traffic, from a normal Linux notebook on an xDSL network. There are some web page accesses.
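One of the captures above lists an MD5 hash. Assuming that hash identifies the file that was executed, a local copy can be checked against it with a short Python sketch like the one below. The helper names and the file path are hypothetical; only the hash value comes from the list.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, published_md5):
    """Compare a file's MD5 with a published value, ignoring case and whitespace."""
    return md5_of_file(path) == published_md5.strip().lower()


# Hypothetical usage; "executed_file.bin" is a placeholder path:
# matches_published_md5("executed_file.bin", "10a57a1ed06a26988a55d587662acf64")
```

MD5 is used here only because it is the digest the capture description provides. It identifies the file but is not collision-resistant, so it should not be relied on as a security guarantee.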