
Network Flow Analysis

... using open source tools

Overview

Network flow information, collected under the Provider Exception of the Pen Register and Trap and Trace statutes, has proven to be an invaluable resource. In some instances, it can be used to discover system compromises that did not trigger an IDS alert. In other cases, network flow data has provided one of the few reliable sources of information for determining if "personal information was, or is reasonably believed to have been, acquired by an unauthorized person" during a compromise.

As tools become more sophisticated, anomaly detection based on network flows should become a strong complement to IDS that rely primarily on signatures.

Argus

Argus Homepage (QoSient)

Argus was the first tool used by our office for network flow analysis. Although it is capable of extracting much more granular information (such as inter-packet arrival times), we use it simply to estimate bandwidth usage.

The estimate is created by ignoring traffic that uses protocols other than tcp or udp and then discarding "small" flows. Notably, this technique will not help in detecting low-and-slow or covert network flows (e.g. Stacheldraht). Rather, as implemented, it lets us identify loud-and-obnoxious hosts, such as warez servers and spam bots, more easily and with modest resources.
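
The filtering step described above can be sketched in a few lines of Python. This is only an illustration, not the scripts our office runs: the `Flow` record, the `is_candidate` helper and the 10,000-byte cutoff are all hypothetical, since the text does not say how "small" is defined.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    """One aggregated flow record (hypothetical structure for illustration)."""
    proto: str
    src: str
    dst: str
    src_bytes: int
    dst_bytes: int


# Assumed cutoff for a "small" flow; the real threshold is not documented here.
MIN_TOTAL_BYTES = 10_000


def is_candidate(flow, min_bytes=MIN_TOTAL_BYTES):
    """Keep only tcp/udp flows whose combined byte count reaches the cutoff."""
    if flow.proto not in ("tcp", "udp"):
        return False
    return flow.src_bytes + flow.dst_bytes >= min_bytes


flows = [
    Flow("tcp", "10.10.15.251", "67.43.175.202", 68271, 678434),
    Flow("icmp", "10.10.20.7", "198.188.128.12", 294, 294),
    Flow("udp", "128.112.139.80", "10.10.30.141", 3348, 2928),
]

# Only the large tcp flow survives: icmp is ignored, the udp flow is too small.
kept = [f for f in flows if is_candidate(f)]
for f in kept:
    print(f.proto, f.src, "->", f.dst, f.src_bytes + f.dst_bytes)
```

Because small flows are dropped before any totals are computed, a host that moves data slowly over many tiny flows never shows up, which is why this approach misses low-and-slow traffic.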

Back-end processing

Every four hours we...

  1. Rotate Argus data
  2. Process data through six flow models
  3. Reformat data into custom .analysis files
    $ ragator -f $FMODEL -r "$DATA" -c -n -g -p1 -s bytes $FILTER
    27 Mar 06 11:33:25.9      116.4   tcp    10.10.15.251.*      ->     67.43.175.202.80    409      550       68271        678434      RST
    27 Mar 06 11:29:23.2      359.9  icmp      10.10.20.7       <->    198.188.128.12       3        3         294          294         ECO
    27 Mar 06 11:30:03.0      319.8  icmp    199.77.193.9        ->      10.10.30.141       7        0         490          0           TXD
    27 Mar 06 11:29:50.0      379.0   udp    10.19.30.141        ->    164.67.194.225.2122  2        0         156          0           TIM
    27 Mar 06 11:29:47.0      359.6   udp  128.112.139.80       <->      10.10.30.141.8089  10       6         3348         2928        CON
    27 Mar 06 11:34:48.9       33.9   tcp     10.10.15.14.*      ->     70.86.209.146.80    12       13        6543         4279        FIN
    
    $ ragator -f $FMODEL -r "$DATA" -c -n -g -p1 -s bytes $FILTER | ./argus-analysis.pl
    27 Mar 06 11:33:25.9   116.4s tcp    10.10.15.251:*      ->   67.43.175.202:80        4.6kbps    45.5kbps
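
The conversion from byte counts to bandwidth estimates can be approximated as follows. The argus-analysis.pl script itself is not shown here, so this Python sketch is a reconstruction under assumptions: the field positions are read off the sample ragator output above, and the divisor of 1024 is inferred from the fact that it reproduces the 4.6kbps and 45.5kbps figures. The function names are hypothetical.

```python
def rate_kbps(byte_count, seconds):
    """Average rate in kbps, assuming 1 kbps = 1024 bits per second."""
    if seconds <= 0:
        return 0.0
    return byte_count * 8 / seconds / 1024


def format_endpoint(endpoint):
    """Rewrite 'a.b.c.d.port' as 'a.b.c.d:port'; leave bare IPv4 addresses alone."""
    if endpoint.count(".") == 4:
        host, port = endpoint.rsplit(".", 1)
        return f"{host}:{port}"
    return endpoint


def summarize(line):
    """Turn one ragator output line into a per-flow bandwidth summary."""
    f = line.split()
    timestamp = " ".join(f[0:4])          # e.g. "27 Mar 06 11:33:25.9"
    duration = float(f[4])                # flow duration in seconds
    proto, src, direction, dst = f[5], f[6], f[7], f[8]
    src_bytes, dst_bytes = int(f[11]), int(f[12])
    return (
        f"{timestamp} {duration:.1f}s {proto} "
        f"{format_endpoint(src)} {direction} {format_endpoint(dst)} "
        f"{rate_kbps(src_bytes, duration):.1f}kbps "
        f"{rate_kbps(dst_bytes, duration):.1f}kbps"
    )


SAMPLE = ("27 Mar 06 11:33:25.9      116.4   tcp    10.10.15.251.*      ->"
          "     67.43.175.202.80    409      550       68271        678434      RST")

print(summarize(SAMPLE))
# 27 Mar 06 11:33:25.9 116.4s tcp 10.10.15.251:* -> 67.43.175.202:80 4.6kbps 45.5kbps
```

These are averages over the life of the flow, so a short burst inside a long-lived flow is flattened out.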
    

 

Although the Argus binary files are compact in comparison to, say, a default tcpdump capture, they can grow to several gigabytes in just a few hours, and processing that much data can itself take hours.

Front-end processing

[Figure: Graph of Argus data (tn-argus-top-local.png)]

Through a series of custom batch scripts and CGIs, the bandwidth usage data has been made accessible through a web interface. Among its features, this web front end plots the estimated Internet bandwidth usage of a host against the estimated bandwidth usage of the entire campus. In the example, we can see the top-most host producing nearly one-third of the campus outbound traffic.

 

IPAudit

IPAudit Homepage

[Figures: Busiest Hosts from IPAudit; Incoming Scans from IPAudit]

Our office began using IPAudit in late 2005. In addition to identifying top bandwidth users or "busiest hosts", the software provides a graphical representation of network host sweeps, reports metrics such as host and connection counts, and does some heuristic (port-based) protocol and server profiling.

We attempted to use the MySQL reporting feature in the hope of developing some queries for heuristic anomaly detection. However, we found that, among other issues, the insert batches began to overlap. Maintaining 20 days of flow data amounted to about 57 million rows in the connections table and kept the load average on the database server at 2.3.

$ gzcat 2006-03-27-08\:00.txt.gz | tail
010.010.040.058 082.054.137.189 6 3897 4662 867794 19677218 8633 15657 08:00:02.7605 08:30:01.1104 2 2
010.010.030.141 128.111.052.062 17 5850 5850 110937 635829 1101 1074 08:00:02.7605 08:28:08.7060 2 2
010.010.040.058 082.135.201.065 6 3906 4662 3753509 19716358 12035 17235 08:00:02.7610 08:30:01.0297 2 2
010.010.020.136 165.254.012.202 6 1813 80 9446 1163 9 7 08:00:02.7616 08:00:03.1129 2 2
010.010.040.224 204.102.114.056 6 1123 80 2615794 74941 1778 1134 08:00:02.7618 08:02:07.0041 2 1
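
A "busiest hosts" ranking can be derived from flat-file records like those above with a short script. The Python sketch below is not how IPAudit-Web does it; it assumes that the first column is the local host and that columns 6 and 7 are the byte counts in the two directions, and it simply sums both. The `busiest_hosts` function name is hypothetical.

```python
from collections import defaultdict

RECORDS = """\
010.010.040.058 082.054.137.189 6 3897 4662 867794 19677218 8633 15657 08:00:02.7605 08:30:01.1104 2 2
010.010.030.141 128.111.052.062 17 5850 5850 110937 635829 1101 1074 08:00:02.7605 08:28:08.7060 2 2
010.010.040.058 082.135.201.065 6 3906 4662 3753509 19716358 12035 17235 08:00:02.7610 08:30:01.0297 2 2
010.010.020.136 165.254.012.202 6 1813 80 9446 1163 9 7 08:00:02.7616 08:00:03.1129 2 2
010.010.040.224 204.102.114.056 6 1123 80 2615794 74941 1778 1134 08:00:02.7618 08:02:07.0041 2 1
"""


def normalize_ip(padded):
    """Strip the zero padding: '010.010.040.058' becomes '10.10.40.58'."""
    return ".".join(str(int(octet)) for octet in padded.split("."))


def busiest_hosts(lines, top=3):
    """Rank local hosts by total bytes (both directions) across all records."""
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        if len(fields) < 13:
            continue  # skip blank or truncated lines
        # Assumption: fields[5] and fields[6] are the two byte counters.
        totals[normalize_ip(fields[0])] += int(fields[5]) + int(fields[6])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:top]


for host, total in busiest_hosts(RECORDS.splitlines()):
    print(f"{host:15s} {total:>12d} bytes")
```

Reading the gzipped half-hour files with Python's `gzip.open(path, "rt")` and feeding the lines to the same function would extend this to a full day without needing a database.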

Currently, we're using the out-of-the-box flat-file configuration of IPAudit-Web and have found it to be, so far, much more manageable.

ntop

ntop Homepage

The last time our office experimented with using ntop to monitor network flow at the campus gateway (mid to late 2003), the software quickly became overwhelmed. However, it promises to report much more information than either Argus or IPAudit. It is under active development and, according to its change logs, the performance and resource issues have been addressed. We plan to reevaluate ntop starting in Summer 2006.
