Network Flow Analysis
... using open source tools
Overview
Network flow information, collected under the Provider Exception of the Pen Register and Trap and Trace statutes, has proven to be an invaluable resource. In some instances, it can be used to discover system compromises that did not alert the IDS. In other cases, network flow data has provided one of the few sources of reliable information for determining if "personal information was, or is reasonably believed to have been, acquired by an unauthorized person" during a compromise.
As tools become more sophisticated, network flow-based anomaly detection will be a valuable complement to primarily signature-based IDS.
Argus
Argus was the first tool used by our office for network flow analysis. Although it is capable of extracting much more granular information (such as inter-packet arrival times), we are using it simply to estimate bandwidth usage.
The estimate is created by ignoring traffic that uses protocols other than TCP or UDP and then discarding "small" flows. Notably, this technique will not help in detecting low-and-slow or covert network flows (e.g., Stacheldraht). Rather, as implemented, it lets us more easily identify loud-and-obnoxious hosts, such as warez servers and spam bots, with modest resources.
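The filtering step described above can be sketched in a few lines of Python. This is only an illustration, not the office's actual implementation: the `Flow` record, the `is_candidate` helper, and the 100,000-byte threshold are all hypothetical, since the cutoff for a "small" flow is not stated here.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    """Hypothetical, simplified flow record for illustration."""
    proto: str
    src: str
    dst: str
    src_bytes: int
    dst_bytes: int


# Assumed cutoff for a "small" flow; the real value is not given in the text.
MIN_BYTES = 100_000


def is_candidate(flow, min_bytes=MIN_BYTES):
    """Keep only TCP/UDP flows whose total byte count reaches the threshold."""
    if flow.proto not in ("tcp", "udp"):
        return False
    return flow.src_bytes + flow.dst_bytes >= min_bytes


flows = [
    Flow("tcp", "10.10.15.251", "67.43.175.202", 68271, 678434),
    Flow("icmp", "10.10.20.7", "198.188.128.12", 294, 294),
    Flow("udp", "10.19.30.141", "164.67.194.225", 156, 0),
    Flow("tcp", "10.10.15.14", "70.86.209.146", 6543, 4279),
]

kept = [f for f in flows if is_candidate(f)]
for f in kept:
    print(f.proto, f.src, "->", f.dst, f.src_bytes + f.dst_bytes)
```

With this threshold only the first flow survives, which also shows why a low-volume covert channel would slip through unnoticed.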
Back-end processing
Every four hours we...
- Rotate Argus data
- Process data through six flow models
- Reformat data into custom .analysis files
$ ragator -f $FMODEL -r "$DATA" -c -n -g -p1 -s bytes $FILTER
27 Mar 06 11:33:25.9 116.4 tcp 10.10.15.251.* -> 67.43.175.202.80 409 550 68271 678434 RST
27 Mar 06 11:29:23.2 359.9 icmp 10.10.20.7 <-> 198.188.128.12 3 3 294 294 ECO
27 Mar 06 11:30:03.0 319.8 icmp 199.77.193.9 -> 10.10.30.141 7 0 490 0 TXD
27 Mar 06 11:29:50.0 379.0 udp 10.19.30.141 -> 164.67.194.225.2122 2 0 156 0 TIM
27 Mar 06 11:29:47.0 359.6 udp 128.112.139.80 <-> 10.10.30.141.8089 10 6 3348 2928 CON
27 Mar 06 11:34:48.9 33.9 tcp 10.10.15.14.* -> 70.86.209.146.80 12 13 6543 4279 FIN

$ ragator -f $FMODEL -r "$DATA" -c -n -g -p1 -s bytes $FILTER | ./argus-analysis.pl
27 Mar 06 11:33:25.9 116.4s tcp 10.19.15.251:* -> 67.43.175.202:80 4.6kbps 45.5kbps
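The source of argus-analysis.pl is not shown, so the following Python sketch is only a guess at the kind of conversion it performs. It assumes the four numeric columns are source packets, destination packets, source bytes, and destination bytes, and that a rate is computed as bytes * 8 / duration / 1024. The 1024 divisor is inferred from the sample figures (a divisor of 1000 would give 4.7 and 46.6 instead); the function names are hypothetical.

```python
def split_endpoint(addr):
    """Turn 'a.b.c.d.port' into 'a.b.c.d:port'; leave bare IPs untouched."""
    if addr.count(".") == 4:
        host, port = addr.rsplit(".", 1)
        return f"{host}:{port}"
    return addr


def kbps(byte_count, seconds):
    """Average rate in kbps, assuming 1 kbps = 1024 bits per second."""
    if seconds <= 0:
        return 0.0
    return byte_count * 8 / seconds / 1024


def summarize(line):
    """Reformat one ragator record into the bandwidth-estimate layout."""
    f = line.split()
    stamp = " ".join(f[0:4])            # e.g. '27 Mar 06 11:33:25.9'
    duration = float(f[4])
    proto, src, direction, dst = f[5], f[6], f[7], f[8]
    src_bytes, dst_bytes = int(f[11]), int(f[12])   # assumed column order
    return (
        f"{stamp} {f[4]}s {proto} {split_endpoint(src)} {direction} "
        f"{split_endpoint(dst)} {kbps(src_bytes, duration):.1f}kbps "
        f"{kbps(dst_bytes, duration):.1f}kbps"
    )


record = ("27 Mar 06 11:33:25.9 116.4 tcp 10.10.15.251.* -> "
          "67.43.175.202.80 409 550 68271 678434 RST")
print(summarize(record))
```

Applied to the first record in the sample output, this yields 4.6kbps and 45.5kbps, matching the figures the Perl script reports for that flow.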
Although the Argus binary files are compact in comparison to, say, a default tcpdump capture, they can grow to several gigabytes in just a few hours, and processing that data can take hours.
Front-end processing
Through a series of custom batch scripts and CGIs, the bandwidth usage data has
been made accessible through a web interface. Among its features, this
web front end plots the estimated Internet bandwidth usage of a host against
the estimated bandwidth usage of the entire campus. In the example, we can see
the top-most host producing nearly one-third of the campus outbound traffic.
IPAudit
Our office began using IPAudit in late 2005. In addition to
identifying top bandwidth users or "busiest hosts", the software also provides
a graphical representation of network host sweeps, provides metrics like host
and connection counts and does some heuristic (port-based) protocol and server
profiling.
We attempted to use the MySQL reporting feature in the hope of developing some queries for heuristic anomaly detection. However, we found that, among other issues, the insert batches began to overlap. Maintaining 20 days of flow data amounted to about 57 million rows in the connections table and kept a load average of 2.3 on the database server.
$ gzcat 2006-03-27-08\:00.txt.gz | tail
010.010.040.058 082.054.137.189 6 3897 4662 867794 19677218 8633 15657 08:00:02.7605 08:30:01.1104 2 2
010.010.030.141 128.111.052.062 17 5850 5850 110937 635829 1101 1074 08:00:02.7605 08:28:08.7060 2 2
010.010.040.058 082.135.201.065 6 3906 4662 3753509 19716358 12035 17235 08:00:02.7610 08:30:01.0297 2 2
010.010.020.136 165.254.012.202 6 1813 80 9446 1163 9 7 08:00:02.7616 08:00:03.1129 2 2
010.010.040.224 204.102.114.056 6 1123 80 2615794 74941 1778 1134 08:00:02.7618 08:02:07.0041 2 1
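Flat files like the one above are easy to post-process without a database. The Python sketch below ranks "busiest hosts" by total bytes. It assumes the columns are local IP, remote IP, protocol, local port, remote port, bytes in each direction, packets in each direction, first and last packet times, and two talker flags; that layout should be checked against the IPAudit documentation. The `busiest_hosts` helper is hypothetical and not part of IPAudit.

```python
from collections import Counter

# The five sample records shown above, one flow per line.
SAMPLE = """\
010.010.040.058 082.054.137.189 6 3897 4662 867794 19677218 8633 15657 08:00:02.7605 08:30:01.1104 2 2
010.010.030.141 128.111.052.062 17 5850 5850 110937 635829 1101 1074 08:00:02.7605 08:28:08.7060 2 2
010.010.040.058 082.135.201.065 6 3906 4662 3753509 19716358 12035 17235 08:00:02.7610 08:30:01.0297 2 2
010.010.020.136 165.254.012.202 6 1813 80 9446 1163 9 7 08:00:02.7616 08:00:03.1129 2 2
010.010.040.224 204.102.114.056 6 1123 80 2615794 74941 1778 1134 08:00:02.7618 08:02:07.0041 2 1
"""


def busiest_hosts(lines, top=3):
    """Sum bytes in both directions per local IP and return the top talkers."""
    totals = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 13:        # skip blank or truncated records
            continue
        local_ip = fields[0]
        totals[local_ip] += int(fields[5]) + int(fields[6])
    return totals.most_common(top)


ranked = busiest_hosts(SAMPLE.splitlines())
for ip, total in ranked:
    print(f"{ip} {total} bytes")
```

On these five records, 010.010.040.058 ranks first because its two flows together account for roughly 44 MB.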
Currently, we're using the out-of-the-box flat-file configuration of IPAudit-Web and have found it to be, so far, much more manageable.
ntop
The last time our office experimented with using ntop to monitor network flow at the campus gateway (mid to late 2003), the software quickly became overwhelmed. However, the software promises to report much more information than either Argus or IPAudit. It is actively being developed and, according to its change logs, performance and resource issues have been addressed. We plan to reevaluate ntop starting Summer 2006.
