November 2017 – weicode

Basic syntax :
==============

Filtering hosts :
-----------------

- Match any traffic involving 192.168.1.1 as destination or source
# tcpdump -i eth1 host 192.168.1.1

- As source only
# tcpdump -i eth1 src host 192.168.1.1

- As destination only
# tcpdump -i eth1 dst host 192.168.1.1


Filtering ports :
-----------------

- Match any traffic involving port 25 as source or destination
# tcpdump -i eth1 port 25

- Source
# tcpdump -i eth1 src port 25

- Destination
# tcpdump -i eth1 dst port 25


Network filtering :
-------------------

# tcpdump -i eth1 net 192.168
# tcpdump -i eth1 src net 192.168
# tcpdump -i eth1 dst net 192.168


Protocol filtering :
--------------------

# tcpdump -i eth1 arp
# tcpdump -i eth1 ip

# tcpdump -i eth1 tcp
# tcpdump -i eth1 udp
# tcpdump -i eth1 icmp


Let's combine expressions :
---------------------------

Negation    : ! or "not" (without the quotes)
Concatenate : && or "and"
Alternate   : || or "or"

- This rule will match any TCP traffic on port 80 (web) with 192.168.1.254 or 192.168.1.200 as destination host
# tcpdump -i eth1 '((tcp) and (port 80) and ((dst host 192.168.1.254) or (dst host 192.168.1.200)))'

- Will match any ICMP traffic involving the destination with physical/MAC address 00:01:02:03:04:05
# tcpdump -i eth1 '((icmp) and ((ether dst host 00:01:02:03:04:05)))'

- Will match any TCP traffic for the destination network 192.168 except destination host 192.168.1.200
# tcpdump -i eth1 '((tcp) and ((dst net 192.168) and (not dst host 192.168.1.200)))'
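Long combined expressions are easy to get wrong by hand. As a minimal sketch (the helper names all_of, any_of and negate are made up for this example and are not part of tcpdump), the same filters can be assembled from small pieces, with every term parenthesised so that precedence never surprises you:

```python
def all_of(*terms):
    """Join terms with 'and', wrapping each term in parentheses."""
    return " and ".join(f"({t})" for t in terms)


def any_of(*terms):
    """Join terms with 'or', wrapping each term in parentheses."""
    return " or ".join(f"({t})" for t in terms)


def negate(term):
    """Negate a single term."""
    return f"not ({term})"


# TCP on port 80 towards either of two hosts.
web_filter = all_of(
    "tcp",
    "port 80",
    any_of("dst host 192.168.1.254", "dst host 192.168.1.200"),
)

# TCP towards network 192.168, except one host.
net_filter = all_of("tcp", "dst net 192.168", negate("dst host 192.168.1.200"))

print(web_filter)
print(net_filter)
```

The resulting string is meant to be passed to tcpdump as a single quoted argument.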

tcpdump has no dedicated pattern for matching inside GRE, but that doesn’t mean you can’t use ip[xx] to inspect the encapsulated packet. In a normal GREv0 packet (20-byte outer IP header, 4-byte GRE header): ip[36:4] is the inner source IP address, ip[40:4] is the inner destination and ip[33] is the inner protocol.

So, to match all TCP packets inside GRE, write tcpdump 'ip[33] = 0x06'. To match anything sent from 127.0.0.1 inside GRE, write tcpdump 'ip[36:4] = 2130706433' (or tcpdump 'ip[36] = 127 and ip[37] = 0 and ip[38] = 0 and ip[39] = 1').
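The decimal value and the byte-by-byte form are two spellings of the same four bytes. A small sketch (function names are illustrative only) that derives both from a dotted address:

```python
def ip_to_int(addr):
    """Interpret the four octets as one big-endian (network order) integer."""
    return int.from_bytes(bytes(int(o) for o in addr.split(".")), "big")


def per_byte_filter(addr, offset):
    """Build the 'ip[o] = a and ip[o+1] = b ...' form starting at offset."""
    octets = addr.split(".")
    return " and ".join(f"ip[{offset + i}] = {o}" for i, o in enumerate(octets))


print(ip_to_int("127.0.0.1"))            # 2130706433
print(per_byte_filter("127.0.0.1", 36))
```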

Or, on the other hand, just use wireshark.

If you have ever debugged a network node carrying GRE, you know the painful feeling of reading tcpdump output: thousands of IP addresses, and you cannot tell them apart because they are all inside GRE.

And there are no filters for IP addresses inside GRE. You cannot say ‘tcpdump -ni eth0 proto gre and host 192.168.0.1’. Well, you can, but ‘host’ will only filter on the source or destination of the GRE packets themselves, not of the encapsulated IP packet.

Unfortunately there is no nice syntax. Fortunately, there is some, at least.

You’ll need to convert the IP address to an integer in network byte order. To do this, convert every octet to two hex digits (zero-padded) and join them together ‘as is’. 100.64.6.7 becomes 0x64400607.

For Python: there is the ipaddress module, but it is not available in a default Python 2 installation (it is part of the standard library from Python 3.3 on). So we’ll do it manually with minimal code:

>>> "0x%02x%02x%02x%02x" % tuple(map(int, '192.168.0.1'.split('.')))
'0xc0a80001'

(sorry for mad code, but I wanted to keep it short).

The result of that code is a ‘number’ representing the IP address (in the example above, 192.168.0.1). Note the %02x: without zero padding, the octets 0 and 1 would each collapse to a single digit and the value would be wrong.

Now we can run tcpdump:

tcpdump -ni eth1 'proto gre and (ip[54:4]=0xc0a80001 or ip[58:4]=0xc0a80001)'
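On Python 3 the conversion can also lean on the standard-library ipaddress module, which validates the address as a side effect. A minimal sketch (the function name ip_to_hex is just for this example):

```python
import ipaddress


def ip_to_hex(addr):
    """Return the IPv4 address as a zero-padded 32-bit hex literal."""
    return "0x%08x" % int(ipaddress.IPv4Address(addr))


print(ip_to_hex("192.168.0.1"))  # 0xc0a80001
print(ip_to_hex("100.64.6.7"))   # 0x64400607
```

Formatting the whole 32-bit value with %08x keeps every octet at two hex digits, including octets below 16.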

The numbers in the square brackets after ‘ip’ are the offset and size of the field. An IPv4 address is 4 bytes long. Because GRE adds 42 bytes of overhead here (20 bytes for the outer IP header, 8 bytes for the GRE header, 14 bytes for the encapsulated Ethernet header), we take the normal IP source/destination offsets (12 and 16 bytes into the IPv4 header) and add 42 to them.

With a 4-byte GRE header and no encapsulated Ethernet header (24 bytes of overhead):

ip[36:4] is the source IP
ip[40:4] is the destination IP
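The offsets are plain arithmetic, so they can be recomputed for whatever encapsulation you are facing. The sketch below assumes an outer IPv4 header without options (20 bytes). The function names are illustrative, and the GRE header length has to be supplied by you, since it depends on which optional GRE fields are present:

```python
OUTER_IP_LEN = 20                          # outer IPv4 header, no options (assumption)
PROTO_OFF, SRC_OFF, DST_OFF = 9, 12, 16    # field offsets inside an IPv4 header


def inner_offsets(gre_len, inner_eth_len=0):
    """Offsets of the inner protocol/src/dst fields, relative to the outer IP header."""
    base = OUTER_IP_LEN + gre_len + inner_eth_len
    return {"proto": base + PROTO_OFF, "src": base + SRC_OFF, "dst": base + DST_OFF}


def gre_inner_host_filter(hex_addr, gre_len, inner_eth_len=0):
    """Build a filter matching hex_addr as inner source or destination."""
    off = inner_offsets(gre_len, inner_eth_len)
    return (f"proto gre and (ip[{off['src']}:4]={hex_addr} "
            f"or ip[{off['dst']}:4]={hex_addr})")


print(inner_offsets(4))        # {'proto': 33, 'src': 36, 'dst': 40}
print(inner_offsets(8, 14))    # {'proto': 51, 'src': 54, 'dst': 58}
print(gre_inner_host_filter("0xc0a80001", 8, 14))
```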

We can check whether incoming traffic carries VLAN tags by running tcpdump with the -e option and the vlan filter.
This will show the details of the VLAN header.

To watch the traffic live:
# tcpdump -i bond0 -nn -e vlan

To write the capture to a file:
# tcpdump -i eno1 -nn -e vlan -w /tmp/vlan.pcap

If a filtered tcpdump shows no traffic, whereas running tcpdump unfiltered does show traffic, then the problem may be due to an extra Ethernet header being added, which is typically an 802.1Q VLAN header. Use the tcpdump -e option to see this extra header information, which should look like the following:

. . . ethertype 802.1Q, length 64: vlan 128, p 0, ethertype IPv4,
IP 192.168.128.42.8001 > 192.168.128.90.20700:

Port Filtering

The symptom: trying to filter using tcpdump fails, for example when filtering on a known port number such as the following:

tcpdump -ni eth2 port 8001

If tcpdump is unable to provide a filtered output, then the passive capture software is not able to do so either.

If it is VLAN-type traffic, use the vlan expression operator as part of the filter expression:

tcpdump -ni eth2 vlan and port 8001

Other examples of filtering with VLAN packets:

tcpdump -nr tst.dmp 'ether[12:2] = 0x8100'
tcpdump -nr tst.dmp vlan and ip and port 8001

To show both types of traffic:

tcpdump -nr tst.dmp ip or vlan
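What 'ether[12:2] = 0x8100' tests is the EtherType field at byte 12 of the frame. When it holds 0x8100, the next two bytes carry the priority and VLAN ID, and the real EtherType follows after them. A minimal sketch of that layout, using synthetic frames with zeroed MAC addresses rather than captured data:

```python
import struct

TPID_8021Q = 0x8100


def parse_vlan(frame):
    """Return (vlan_id, priority, inner_ethertype), or None if the frame is untagged."""
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != TPID_8021Q:
        return None
    tci, inner_ethertype = struct.unpack_from("!HH", frame, 14)
    return tci & 0x0FFF, tci >> 13, inner_ethertype


# 12 zero bytes stand in for the destination and source MAC addresses.
tagged = bytes(12) + struct.pack("!HHH", TPID_8021Q, 128, 0x0800)
untagged = bytes(12) + struct.pack("!H", 0x0800)

print(parse_vlan(tagged))    # (128, 0, 2048): vlan 128, p 0, ethertype IPv4
print(parse_vlan(untagged))  # None
```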

HowTo – tell tcpdump to filter mixed tagged and untagged VLAN (IEEE 802.1Q) traffic


This article explains why applying tcpdump/libpcap BPF filters to mixed tagged-VLAN and untagged Ethernet traffic requires great caution. The reason is that the magic ‘vlan’ keyword shifts all subsequent filters by 4 bytes to the right.

A week ago I needed to filter VLAN traffic with tcpdump. Everything went well as long as *only* tagged or *only* untagged traffic was given as input. However, when trying to filter, say, UDP packets out of traffic that contains both tagged and untagged packets, tcpdump messed up my filters. As I think this situation may happen to more people, here is some input for nerds struggling with the same issue in the future.

Example doomed to fail with mixed traffic:

tcpdump -nn -v udp

This simple BPF filter should basically deliver all UDP packets, regardless of whether the traffic carries a VLAN tag or not. But it doesn’t. The issue is that tagging inserts four more bytes (the 802.1Q tag, which carries the VLAN ID) into the Ethernet (or more precisely IEEE 802.1Q) header. Unless the BPF filter specifically asks for VLAN traffic, all traffic is parsed as untagged. Thus, the specified filter delivers only untagged UDP packets (i.e., their frames) and drops all tagged traffic.

Now watch out: similar things happen if you specify the mysterious ‘vlan’ keyword in the tcpdump filter. After specifying the ‘vlan’ keyword, the *subsequent* filters are matched against traffic shifted by 4 bytes to the right. Note that this is also true if you specify ‘not vlan’ as filter. The internals of how tcpdump translates the BPF filter are exposed when calling tcpdump with the -d option:

BPF translation of filter ‘not vlan and udp’:

[root@vm-fedora ~]# tcpdump -nn -d not vlan and udp
(000) ldh      [12]
(001) jeq      #0x8100          jt 10   jf 2
(002) ldh      [16]
(003) jeq      #0x86dd          jt 4    jf 6
(004) ldb      [24]
(005) jeq      #0x11            jt 9    jf 10
(006) jeq      #0x800           jt 7    jf 10
(007) ldb      [27]
(008) jeq      #0x11            jt 9    jf 10
(009) ret      #96
(010) ret      #0

What do we see here? Although we explicitly asked for untagged traffic, our filter fails: it matches UDP traffic that has no VLAN tag but is shifted by 4 bytes to the right (i.e., it matches nothing). Our fault was to specify the ‘vlan’ keyword first, such that all subsequent filters (‘udp’) are matched against shifted traffic. To cope with this issue, one should be careful about the order in which the filter is put together. If we want to match both tagged and untagged UDP traffic, we have to specify the following filter:

Filter UDP traffic, both VLAN tagged and untagged:

[root@vm-fedora ~]# tcpdump -nn -d "udp or (vlan and udp)"

Or, the generic solution:

Generic filter expression that matches VLAN tagged and untagged traffic:

[root@vm-fedora ~]# tcpdump -nn -d "<filter> or (vlan and <filter>)"
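To make the 4-byte shift concrete, here is a rough Python model of what the compiled filters test for IPv4. It is only a sketch: the real BPF program also handles IPv6, the frames are synthetic, and all function names are invented for this example. The offsets 12/23 (unshifted) and 16/27 (shifted) are the ones visible in the -d output above.

```python
import struct


def is_vlan(frame):
    """'vlan': EtherType at byte 12 is 0x8100."""
    return struct.unpack_from("!H", frame, 12)[0] == 0x8100


def is_ipv4_udp(frame, after_vlan=False):
    """'udp' for IPv4; once 'vlan' has been seen, every offset moves by 4."""
    shift = 4 if after_vlan else 0
    (ethertype,) = struct.unpack_from("!H", frame, 12 + shift)
    return ethertype == 0x0800 and frame[23 + shift] == 0x11


def udp_or_vlan_udp(frame):
    """Model of 'udp or (vlan and udp)'."""
    return is_ipv4_udp(frame) or (is_vlan(frame) and is_ipv4_udp(frame, after_vlan=True))


def not_vlan_and_udp(frame):
    """Model of 'not vlan and udp': the udp test is already shifted."""
    return (not is_vlan(frame)) and is_ipv4_udp(frame, after_vlan=True)


def ipv4_header(proto):
    """Minimal 20-byte IPv4 header with only version/IHL and protocol filled in."""
    hdr = bytearray(20)
    hdr[0] = 0x45
    hdr[9] = proto
    return bytes(hdr)


untagged = bytes(12) + struct.pack("!H", 0x0800) + ipv4_header(17)
tagged = bytes(12) + struct.pack("!HHH", 0x8100, 128, 0x0800) + ipv4_header(17)

for name, frame in (("untagged", untagged), ("tagged", tagged)):
    print(name, udp_or_vlan_udp(frame), not_vlan_and_udp(frame))
```

Both frames pass the combined filter, while the 'not vlan and udp' model rejects both, which is the "matches nothing" behaviour described above.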

If you want to filter only untagged traffic, specify the following:

Generic filter to match only untagged traffic:

[root@vm-fedora ~]# tcpdump -nn -d "<filter> and not vlan"

Long story short: When using tcpdump (or libpcap), be careful where to put the ‘vlan’ keyword in your expression. In general, it’s a very bad idea to specify the keyword twice, unless you pack VLAN traffic into VLAN traffic. Maybe these examples are more explanative than the quote below taken from the tcpdump manpage: “Note that the first vlan keyword encountered in expression changes the decoding offsets for the remainder of expression on the assumption that the packet is a VLAN packet.” Recall this (admittedly sometimes strange) behavior is not a bug…

Thanks go to Nuno Paiva, who sent me an example of how to solve matching mixed traffic. Thanks to Dan Cox, who spotted missing quotes. Thanks to Max Lukoshkov for spotting a language issue (subsequent vs. preceding).

tcpdump Flags:

TCP Flag	tcpdump Flag	Meaning
SYN	S	Syn packet, a session establishment request.
ACK	.	Ack packet, acknowledges the sender’s data (printed as a dot).
FIN	F	Finish flag, indication of termination.
RESET	R	Reset, indication of immediate abort of the connection.
PUSH	P	Push, immediate push of data from sender.
URGENT	U	Urgent, takes precedence over other data.
NONE	none	No flags set.
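As an illustration of how these letters combine, the sketch below renders a raw TCP flags byte the way the tcpdump man page describes its output, with ACK printed as a dot and "none" when no flag is set. The function name is made up, and the ECE/CWR bits are left out for brevity:

```python
# (bit mask, character) pairs in the order they are printed.
TCP_FLAG_CHARS = [
    (0x01, "F"),  # FIN
    (0x02, "S"),  # SYN
    (0x04, "R"),  # RST
    (0x08, "P"),  # PSH
    (0x10, "."),  # ACK
    (0x20, "U"),  # URG
]


def tcpdump_flags(bits):
    """Render the low six TCP flag bits as a tcpdump-style string."""
    text = "".join(ch for mask, ch in TCP_FLAG_CHARS if bits & mask)
    return text or "none"


print(tcpdump_flags(0x02))  # S   (SYN)
print(tcpdump_flags(0x12))  # S.  (SYN + ACK)
print(tcpdump_flags(0x18))  # P.  (PSH + ACK)
print(tcpdump_flags(0x00))  # none
```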


For the simple case of iterating over the lines of a file you can do:

open(my $fh, '<', 'foobar.txt')
    || die "Could not open file: $!";
while (<$fh>) 
{ # each line is stored in $_, with terminating newline
  # chomp, short for chomp($_), removes the terminating newline
    chomp; 
    process($_);
}
close $fh;

File encoding can be specified like:

open(my $fh, '< :encoding(UTF-8)', 'foobar.txt')
    || die "Could not open file: $!";

The angle bracket operator < > reads a filehandle line by line. (With a pattern inside the brackets instead of a filehandle, as in <*.txt>, the same operator acts as a glob and returns the names of the files that match the pattern.)

Without specifying the variable that each line should be put into, it automatically puts it into $_, which is also conveniently the default argument for many Perl functions. If you wanted to use your own variable, you can do something like this:

open(my $fh, '<', 'foobar.txt')
    || die "Could not open file: $!";
while (my $line = <$fh>) 
{
    chomp $line;
    process($line);
}
close $fh;

The special use of the angle bracket operator with nothing inside will read from all files whose names were specified on the command line:

while (<>) {
    chomp;
    process($_);
}

As noted in perlop.pod under “I/O Operators”, <> opens with the 2-arg open() and so can read from a piped command. This can be convenient but is also very insecure: a user could supply a “file name” like

perl myscript.pl 'rm -rf / |'

or any other arbitrary command, which will be executed when perl attempts to open a pipe for it. As such, this feature is best reserved for one-liners and is bad practice to use in production code. The same is true for the open(FILEHANDLE, EXPR) form of open as opposed to open(FILEHANDLE, MODE, EXPR). (See perlfunc.pod on the open() function.)

The ARGV::readonly module can defang @ARGV by modifying the names to ensure they are treated only as files by the open().

The readline function can be used instead of < >:

open(my $fh, '<', 'foobar.txt') or die "$!";
while (readline $fh)
{ ... }

# or, with a named variable:
while (my $line = readline $fh)
{ ... }
close $fh;

The readline function is the internal function used to implement < >, but can be used directly and is useful for conveying programmer intent in certain situations.

The problem is this:

my @array=(<FILE>); ## slurp a whole Terabyte into RAM

Do not read a file like this unless it is very small or your memory is huge. Instead use:

while (<FILE>) { # read each line, then forget about it
    chomp;
    my @fields = split /\t/;
    ....
}

In addition your code is extremely inefficient:

You are iterating over all entries of a huge array for each line of a huge file. Hashes in Perl provide constant time access to elements given the hash key. Use a hash of hashes to store your filter table.

You seem to filter a large file by a small file, therefore:

- process the small file first (using the while construct), parsing it into a hash that uses the location as key of a nested hash
- process the large file as above and look up each of its entries in the hash created before

As a result you will only need as much memory as is required to store the small file.
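The same two-pass idea, sketched in Python for comparison. The file layout is an assumption made for this example (tab-separated, location in the first column, a name in the second), as are the function names. The point is only that the small file becomes a nested dictionary and the large file is streamed one line at a time:

```python
import io


def load_filter(small_lines):
    """Index the small file as {location: {name: True}}."""
    table = {}
    for line in small_lines:
        location, name = line.rstrip("\n").split("\t")[:2]
        table.setdefault(location, {})[name] = True
    return table


def filter_large(large_lines, table):
    """Yield lines of the large file whose (location, name) is in the table."""
    for line in large_lines:
        location, name = line.rstrip("\n").split("\t")[:2]
        if name in table.get(location, {}):
            yield line.rstrip("\n")


# In-memory stand-ins for the two files (hypothetical data).
small = io.StringIO("loc1\tfoo\nloc2\tbar\n")
large = io.StringIO("loc1\tfoo\t10\nloc1\tbaz\t20\nloc2\tbar\t30\n")

kept = list(filter_large(large, load_filter(small)))
print(kept)
```

Each lookup is a constant-time dictionary access, so the cost grows with the size of the large file instead of with the product of both sizes.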

In the special case of deleting lines at the end of a file, you can use tell() and truncate(). The following code snippet deletes the last line of a file without making a copy or reading the whole file into memory:

        open(my $fh, '+<', $file) or die "Could not open $file: $!";
        my $addr = 0;
        while ( <$fh> ) { $addr = tell($fh) unless eof($fh) }
        truncate($fh, $addr) or die "Could not truncate $file: $!";
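For readers more at home in Python, the same tell/truncate technique might look like the sketch below. The function name and the demo file contents are invented for this example. The file is opened in binary mode so that tell() returns plain byte offsets:

```python
import os
import tempfile


def drop_last_line(path):
    """Remove the final line of a file in place, reading one line at a time."""
    with open(path, "rb+") as fh:
        last_start = 0
        while True:
            pos = fh.tell()          # offset where the next line begins
            if not fh.readline():    # empty bytes means end of file
                break
            last_start = pos
        fh.truncate(last_start)


# Demo on a throwaway file.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"one\ntwo\nthree\n")
drop_last_line(tmp.name)
with open(tmp.name, "rb") as fh:
    remaining = fh.read()
os.remove(tmp.name)
print(remaining)  # b'one\ntwo\n'
```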
