Packet Capture, part 3: Analysis Tools

by O'Reilly Press

In this segment from the O'Reilly book Network Troubleshooting Tools, you will learn how to use analysis tools for sanitizing, reformatting, presenting, and analyzing captured data.

Network Troubleshooting Tools
by Joseph D. Sloan

Packet Capture -- Part 3

1.5. Analysis Tools

As previously noted, one reason for using tcpdump is the wide variety of support tools that are available for use with tcpdump or files created with tcpdump. There are tools for sanitizing the data, tools for reformatting the data, and tools for presenting and analyzing the data.

1.5.1. sanitize

If you are particularly sensitive to privacy or security concerns, you may want to consider sanitize, a collection of five Bourne shell scripts that reduce or condense tcpdump trace files and eliminate confidential information. The scripts renumber host entries and select classes of packets, eliminating all others. This has two primary uses. First, it reduces the size of the files you must deal with, hopefully focusing your attention on a subset of the original traffic that still contains the traffic of interest. Second, it gives you data that can be distributed or made public (for debugging or network analysis) without compromising individual privacy or revealing too much specific information about your network. Clearly, these scripts won't be useful for everyone. But if internal policies constrain what you can reveal, these scripts are worth looking into.

The five scripts included in sanitize are sanitize-tcp, sanitize-syn-fin, sanitize-udp, sanitize-encap, and sanitize-other. Each script filters out inappropriate traffic and reduces the remaining traffic. For example, sanitize-tcp removes all non-TCP packets and reduces the remaining TCP traffic to six fields: an unformatted timestamp, a renumbered source address, a renumbered destination address, the source port, the destination port, and the number of data bytes in the packet. Thus, the packet

 934303014.772066 > . ack 3259091394 win 8647 (DF)
                          4500 0028 b30c 4000 8006 2d84 cd99 3f1e
                          cd99 3fee 0496 0017 00ff f9b3 c241 c9c2
                          5010 21c7 e869 0000 0000 0000 0000

would be reduced to 934303014.772066 1 2 1174 23 0. Notice that the source and destination IP addresses have been replaced with 1 and 2, respectively. The renumbering is applied consistently across packets, so you can still compare addresses within a single trace. The actual data reported varies from script to script. Here is an example of the syntax:

bsd1# sanitize-tcp tracefile

This runs sanitize-tcp over the tcpdump trace file tracefile. Apart from the name of the trace file, the script takes no arguments.
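To see where the six fields come from, the hex dump in the example can be decoded with a few lines of code. The following Python sketch is only an illustration and is not how the sanitize shell scripts are implemented; the names reduce_tcp_packet and host_ids are hypothetical, and the sketch assumes the dump begins at the IP header, as it does in the example.

```python
import struct

# Hex dump of the example packet, starting at the IP header.
HEX_DUMP = """
4500 0028 b30c 4000 8006 2d84 cd99 3f1e
cd99 3fee 0496 0017 00ff f9b3 c241 c9c2
5010 21c7 e869 0000 0000 0000 0000
"""

def reduce_tcp_packet(timestamp, hex_dump, host_ids):
    """Return the six-field summary of one TCP packet, or None if not TCP.

    host_ids maps raw addresses to small integers and is shared across
    calls so that a host keeps the same number throughout a trace.
    """
    raw = bytes.fromhex("".join(hex_dump.split()))
    ip_header_len = (raw[0] & 0x0F) * 4            # IHL is in 32-bit words
    total_len = struct.unpack("!H", raw[2:4])[0]   # IP total length
    if raw[9] != 6:                                # protocol 6 is TCP
        return None
    src, dst = raw[12:16], raw[16:20]
    sport, dport = struct.unpack("!HH", raw[ip_header_len:ip_header_len + 4])
    tcp_header_len = (raw[ip_header_len + 12] >> 4) * 4
    data_len = total_len - ip_header_len - tcp_header_len
    # Hand out the next unused number the first time a host is seen.
    src_id = host_ids.setdefault(src, len(host_ids) + 1)
    dst_id = host_ids.setdefault(dst, len(host_ids) + 1)
    return f"{timestamp} {src_id} {dst_id} {sport} {dport} {data_len}"

print(reduce_tcp_packet("934303014.772066", HEX_DUMP, {}))
# 934303014.772066 1 2 1174 23 0
```

Passing the same host_ids dictionary for every packet in a trace is what keeps the renumbering consistent from one packet to the next.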

1.5.2. tcpdpriv

The program tcpdpriv is another tool for removing sensitive information from tcpdump files. There are several major differences between tcpdpriv and sanitize. First, because sanitize is a set of shell scripts, it should run on almost any Unix system. The same is not true of tcpdpriv, which is a compiled program. On the other hand, tcpdpriv supports the direct capture of data as well as the analysis of existing files. The captured packets are written as a tcpdump file, which can be processed subsequently.

Also, tcpdpriv gives you some degree of control over how much of the original data is removed or scrambled. For example, it is possible to have an IP address scrambled but retain its class designation. If the -C4 option is chosen, each IP address is replaced with a scrambled address of the same class. Address classes are preserved: a class C address is replaced with a class C address.
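The idea of scrambling an address while keeping its class can be sketched in a few lines of Python. This is not tcpdpriv's algorithm, which is not described here; it is a minimal illustration that assumes a keyed hash is an acceptable way to replace the bits that follow the leading class bits. The names address_class and scramble_keep_class, the key, and the sample address are all hypothetical.

```python
import hashlib
import hmac
import ipaddress

def address_class(addr):
    """Return the classful designation (A through E) of an IPv4 address."""
    first_octet = int(ipaddress.IPv4Address(addr)) >> 24
    if first_octet < 128:
        return "A"
    if first_octet < 192:
        return "B"
    if first_octet < 224:
        return "C"
    if first_octet < 240:
        return "D"
    return "E"

# Number of leading bits that determine each class.
_CLASS_BITS = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 4}

def scramble_keep_class(addr, key):
    """Replace every bit after the class bits with keyed-hash output."""
    value = int(ipaddress.IPv4Address(addr))
    low_bits = 32 - _CLASS_BITS[address_class(addr)]
    digest = hmac.new(key, value.to_bytes(4, "big"), hashlib.sha256).digest()
    noise = int.from_bytes(digest[:4], "big") & ((1 << low_bits) - 1)
    prefix = (value >> low_bits) << low_bits   # keep only the class bits
    return str(ipaddress.IPv4Address(prefix | noise))

scrambled = scramble_keep_class("192.0.2.10", b"example-key")
print(address_class("192.0.2.10"), address_class(scrambled))
```

The same input and key always give the same output, so addresses remain comparable within a trace. Unlike a real anonymizer, this sketch makes no attempt to guarantee that two different addresses never map to the same scrambled value.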

A variety of command-line options control how data is rewritten, and several of them are mandatory. Many of the options will look familiar to tcpdump users. The program does not allow output to be written to a terminal, so it must be written directly to a file or redirected. Although the program is useful, the number of required command-line options can be annoying. There is some concern that if the options are not selected properly, it may be possible to reconstruct the original data from the scrambled data. In practice, this should be a minor concern.

As an example of using tcpdpriv, the following command will scramble the file tracefile:

bsd1# tcpdpriv -P99 -C4 -M20 -r tracefile -w outfile

The -P99 option preserves (doesn't scramble) the port numbers, -C4 preserves the class identity of the IP addresses, and -M20 preserves multicast addresses. If you want the data output to your terminal, you can pipe the output to tcpdump:

bsd1# tcpdpriv -P99 -C4 -M20 -r tracefile -w- | tcpdump -r-

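The last options look a little strange, but they will work. The hyphen after -w sends the output to standard output, and the hyphen after -r tells tcpdump to read from standard input.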

For normal text-based operations, tcptrace offers an overwhelming number of options and possibilities. One of the most useful is the -l option, which produces a long listing of summary statistics on a connection-by-connection basis. What follows is an example of the information provided for a single brief Telnet connection:

 TCP connection 2:
         host c:        sloan.lander.edu:1230
         host d:
         complete conn: yes
         first packet:  Wed Aug 11 11:23:25.151274 1999
         last packet:   Wed Aug 11 11:23:53.638124 1999
         elapsed time:  0:00:28.486850
         total packets: 160
         filename:      telnet.trace
    c->d:                              d->c:
      total packets:            96           total packets:            64
      ack pkts sent:            95           ack pkts sent:            64
      pure acks sent:           39           pure acks sent:           10
      unique bytes sent:       119           unique bytes sent:      1197
      actual data pkts:         55           actual data pkts:         52
      actual data bytes:       119           actual data bytes:      1197
      rexmt data pkts:           0           rexmt data pkts:           0
      rexmt data bytes:          0           rexmt data bytes:          0
      outoforder pkts:           0           outoforder pkts:           0
      pushed data pkts:         55           pushed data pkts:         52
      SYN/FIN pkts sent:       1/1           SYN/FIN pkts sent:       1/1
      mss requested:          1460 bytes     mss requested:          1460 bytes
      max segm size:            15 bytes     max segm size:           959 bytes
      min segm size:             1 bytes     min segm size:             1 bytes
      avg segm size:             2 bytes     avg segm size:            23 bytes
      max win adv:            8760 bytes     max win adv:           17520 bytes
      min win adv:            7563 bytes     min win adv:           17505 bytes
      zero win adv:              0 times     zero win adv:              0 times
      avg win adv:            7953 bytes     avg win adv:           17519 bytes
      initial window:           15 bytes     initial window:            3 bytes
      initial window:            1 pkts      initial window:            1 pkts
      ttl stream length:       119 bytes     ttl stream length:      1197 bytes
      missed data:               0 bytes     missed data:               0 bytes
      truncated data:            1 bytes     truncated data:         1013 bytes
      truncated packets:         1 pkts      truncated packets:         7 pkts
      data xmit time:       28.479 secs      data xmit time:       27.446 secs
      idletime max:         6508.6 ms        idletime max:         6709.0 ms
      throughput:                4 Bps       throughput:               42 Bps

This was produced by using tcpdump to capture all traffic into the file telnet.trace and then executing tcptrace to process the data. Here is the syntax required to produce this output:

bsd1# tcptrace -l telnet.trace

Similar output is produced for each TCP connection recorded in the trace file. Obviously, a protocol (like HTTP) that uses many different sessions may overwhelm you with output.
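Several of the figures in the listing can be cross-checked against one another. The short Python sketch below recomputes the elapsed time, the throughput, and the average segment size from the other values in the Telnet example. It assumes that throughput is unique bytes divided by elapsed time and that results are truncated to whole numbers; both assumptions reproduce the listing, but they are inferred from the numbers rather than taken from the tcptrace documentation. The function names are hypothetical.

```python
from datetime import datetime

# Timestamp layout used on the "first packet" and "last packet" lines.
STAMP_FORMAT = "%a %b %d %H:%M:%S.%f %Y"

def elapsed_seconds(first_packet, last_packet):
    """Seconds between the first and last packet of a connection."""
    first = datetime.strptime(first_packet, STAMP_FORMAT)
    last = datetime.strptime(last_packet, STAMP_FORMAT)
    return (last - first).total_seconds()

def whole_number_rate(total, divisor):
    """Divide and truncate, which matches the figures in the listing."""
    return int(total / divisor)

elapsed = elapsed_seconds("Wed Aug 11 11:23:25.151274 1999",
                          "Wed Aug 11 11:23:53.638124 1999")
print(f"elapsed time: {elapsed:.6f} secs")

# (direction, unique bytes sent, actual data pkts) from the listing.
for label, unique_bytes, data_pkts in (("c->d", 119, 55), ("d->c", 1197, 52)):
    print(label,
          "throughput:", whole_number_rate(unique_bytes, elapsed), "Bps,",
          "avg segm size:", whole_number_rate(unique_bytes, data_pkts), "bytes")
```

Note that dividing the d->c byte count by its data xmit time of 27.446 seconds would give roughly 43 Bps rather than the 42 Bps shown, which is why the sketch uses the elapsed time.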

There is a lot more to this program than is covered in this brief discussion. If your primary goal is analysis of network performance and related problems rather than individual packet analysis, this is a very useful tool.


The next segment from Network Troubleshooting Tools will cover packet analyzers.

This article was originally published on Tuesday Nov 20th 2001