Introduction
The maximum transmission unit (MTU) is the largest number of network-layer bytes you can send and receive on an interface. ‘Network layer’ implies that the count does not include any link-layer framing, such as ethernet headers or SDH segment overhead, but does include network-layer headers, such as IPv4 or IPv6 packet headers.
The standard ethernet MTU is 1500 bytes. In practice all ethernet interfaces can send slightly more than this, to allow for link-layer extension headers such as VLAN or MPLS tags.
Many gigabit ethernet interfaces can send much larger packets: typically, but not always, a little over 9000 bytes. These ‘jumbo frames’ are not standardised by the IEEE, as in their view an ethernet should interoperate with every other ethernet. The academic and research networks discussed jumbo frames at an Internet2 Joint Techs Meeting and declared that their networks would pass jumbo frames of 9000 bytes. This decision was adopted by APAN and TERENA, and also seems to have become the commercial practice.
That meeting made one other decision: the aim of jumbo frames is to present a 9000 byte path from host interface to host interface. This means the engineering practice with jumbo frames is slightly different to that with standard frames — when running tunnels you engineer the network to present 9000 bytes to traffic passing through the tunnel.
So how can you tell if you have a 9000 byte clean path? Use an ICMP Echo Request to send a jumbo frame, setting IP's Do Not Fragment bit. If you get an ICMP Echo Reply then the path cannot have fragmented your jumbo frame.
That should be simple enough. Unfortunately ping programs differ as to what their ‘size’ parameters indicate. Network engineers want the size to be the entire network layer, but it is simpler to implement ‘size’ as the number of bytes in the ICMP Echo Request payload.
An IPv4 header without options is 20 bytes and an ICMP Echo header is 8 bytes. An IPv6 header without extension headers is 40 bytes; an ICMPv6 header is 4 bytes, and the Echo identifier and sequence number that follow it add another 4 bytes.
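The header sizes above turn into a ping payload size with a little arithmetic. The following is a minimal sketch in POSIX shell; the function name icmp_payload_size is made up for illustration, and it assumes an IPv4 header without options and an IPv6 header without extension headers.

```shell
# icmp_payload_size MTU FAMILY
# Print the ICMP Echo payload size that exactly fills one MTU-sized packet.
# FAMILY is 4 (IPv4) or 6 (IPv6).
# Assumption: no IPv4 options and no IPv6 extension headers.
icmp_payload_size() {
    case "$2" in
        4) echo $(( $1 - 20 - 8 )) ;;   # IPv4 header + ICMP Echo header
        6) echo $(( $1 - 40 - 8 )) ;;   # IPv6 header + ICMPv6 header and Echo fields
        *) echo "family must be 4 or 6" >&2; return 1 ;;
    esac
}
```

For example, `icmp_payload_size 9000 4` prints 8972 and `icmp_payload_size 9000 6` prints 8952.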
Linux, configuring MTU
From the command line:
$ ip link set dev eth0 mtu 9000
In Debian, add an mtu line to the interface's iface stanza in /etc/network/interfaces (the stanza's other options are not shown):
iface eth0 inet static
    mtu 9000
In Red Hat edit /etc/sysconfig/network-scripts/ifcfg-eth0:
MTU=9000
In NetworkManager edit /etc/NetworkManager/system-connections/eth0:
[802-3-ethernet]
mtu=9000
There is a DHCP option, Interface MTU (option 26), which many Linux distributions implement. Hosts using IPv6 autoconfiguration will automatically set the interface's MTU for IPv6 traffic from the MTU option in Router Advertisements, when the router sends one. In practice Linux device drivers do not implement differing MTUs for IPv4 and IPv6 traffic.
To check the MTU in use:
$ ip link show dev eth0
1: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT qlen 1000
link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
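When checking many interfaces it can be handy to pull out just the MTU figure. The sketch below reads `ip link show` output on standard input and prints the number that follows the word mtu. The function name mtu_from_ip_link is illustrative and not part of iproute2.

```shell
# mtu_from_ip_link
# Read `ip link show` output on stdin and print the first MTU value found,
# that is, the field immediately after the literal word "mtu".
mtu_from_ip_link() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "mtu") { print $(i + 1); exit }
    }'
}
```

Typical use would be `ip link show dev eth0 | mtu_from_ip_link`. On Linux the same value can also be read directly from /sys/class/net/eth0/mtu.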
Linux ping, IPv4
size = mtu − icmpv4_header − ipv4_header
size = mtu − 8 − 20
size = mtu − 28
To check a 1500 MTU:
$ ping -c 1 -M do -s 1472 remote.example.edu.au
To check a 9000 MTU:
$ ping -c 1 -M do -s 8972 remote.example.edu.au
Linux ping6, IPv6
size = mtu − icmpv6_echo_header − icmpv6_header − ipv6_header
size = mtu − 4 − 4 − 40
size = mtu − 48
To check a 1500 MTU:
$ ping6 -c 1 -M do -s 1452 remote.example.edu.au
To check a 9000 MTU:
$ ping6 -c 1 -M do -s 8952 remote.example.edu.au
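If the 9000 byte test fails, it can be useful to find out how large a packet the path does carry. The sketch below does a binary search using the same ping and ping6 invocations as above. The function names probe and largest_clean_mtu are invented for this example, and it assumes that the LOW value you pass already works. A single lost packet counts as a failure, so treat the result as a lower bound and repeat the test if it looks surprising.

```shell
# probe SIZE HOST FAMILY
# Succeed if one Echo Request with SIZE payload bytes, sent with
# fragmentation prohibited, is answered by HOST.
probe() {
    if [ "$3" = 6 ]; then
        ping6 -c 1 -M do -s "$1" "$2" >/dev/null 2>&1
    else
        ping -c 1 -M do -s "$1" "$2" >/dev/null 2>&1
    fi
}

# largest_clean_mtu HOST FAMILY LOW HIGH
# Binary search between MTU LOW (assumed to work) and MTU HIGH for the
# largest MTU whose full-sized Echo Request is answered. Prints that MTU.
largest_clean_mtu() {
    host=$1 family=$2 low=$3 high=$4
    # Bytes of header in front of the payload: 28 for IPv4, 48 for IPv6.
    if [ "$family" = 6 ]; then overhead=48; else overhead=28; fi
    while [ "$low" -lt "$high" ]; do
        mid=$(( (low + high + 1) / 2 ))
        if probe $(( mid - overhead )) "$host" "$family"; then
            low=$mid            # mid works, so the answer is mid or larger
        else
            high=$(( mid - 1 )) # mid fails, so the answer is smaller
        fi
    done
    echo "$low"
}
```

For example, `largest_clean_mtu remote.example.edu.au 4 1500 9000` searches the range between the standard and jumbo sizes. Each failed probe waits for ping to time out, so the search can take a minute or so.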
Cisco, configuring MTU
Cisco IOS allows MTU to be set for an interface, which changes the default for all network layer packets which use that interface:
interface Ethernet0
mtu 9000
It also allows the MTU to be specified for each network-layer protocol, in the rare cases where that is desirable:
interface Ethernet0
ip mtu 9000
ipv6 mtu 9000
To check the MTU in use:
Router> show interfaces Ethernet 0 | include MTU
MTU 9000 bytes, BW 10000 Kbit, DLY 1 msec,
Router> show ip interface Ethernet 0 | include MTU
MTU is 9000 bytes
Router> show ipv6 interface Ethernet 0 | include MTU
MTU is 9000 bytes
Cisco ping, IPv4
datagram_size = mtu
To check a 1500 MTU:
Router> ping ip
Target IP address: remote.example.edu.au
Repeat count [5]: 1
Datagram size [100]: 1500
Timeout in seconds [2]:
Extended commands [n]: y
Source address or interface:
Type of service [0]:
Set DF bit in IP header? [no]: y
Validate reply data? [no]:
Data pattern [0xABCD]:
Loose, Strict, Record, Timestamp, Verbose[none]:
Sweep range of sizes [n]:
Type escape sequence to abort.
Sending 1, 1500-byte ICMP Echos to 192.0.2.1, timeout is 2 seconds:
Packet sent with the DF bit set
!
Success rate is 100 percent (1/1), round-trip min/avg/max = 1/1/1 ms
To check a 9000 MTU:
Router> ping ip
Target IP address: remote.example.edu.au
Repeat count [5]: 1
Datagram size [100]: 9000
Timeout in seconds [2]:
Extended commands [n]: y
Source address or interface:
Type of service [0]:
Set DF bit in IP header? [no]: y
Validate reply data? [no]:
Data pattern [0xABCD]:
Loose, Strict, Record, Timestamp, Verbose[none]:
Sweep range of sizes [n]:
Type escape sequence to abort.
Sending 1, 9000-byte ICMP Echos to 192.0.2.1, timeout is 2 seconds:
Packet sent with the DF bit set
!
Success rate is 100 percent (1/1), round-trip min/avg/max = 1/1/1 ms
Cisco ping, IPv6
datagram_size = mtu
Note that Cisco IOS fragments the ICMP packets it transmits. If you send an ICMP Echo Request of 9000 bytes over a 1500 byte egress interface then the Echo Request will appear on the wire as multiple packets of at most 1500 bytes. If you send an ICMP Echo Request of 9000 bytes over a 9000 byte egress interface then it will appear as one packet. In this case, any downstream device with an MTU of 1500 will fail to forward the packet, and thus can be discovered. In short, check the routing and the MTU on the egress interface before sending ICMP Echos:
Router> show ipv6 route 2001:DB8:0001:0001::1
IPv6 Routing Table - 1 entries
Codes: C - Connected, L - Local, S - Static, R - RIP, B - BGP
U - Per-user Static route
I1 - ISIS L1, I2 - ISIS L2, IA - ISIS interarea, IS - ISIS summary
O - OSPF intra, OI - OSPF inter, OE1 - OSPF ext 1, OE2 - OSPF ext 2
ON1 - OSPF NSSA ext 1, ON2 - OSPF NSSA ext 2
C 2001:DB8:0001:0001::/64 [0/0]
via ::, Ethernet0
Router> show ipv6 interface Ethernet 0 | include MTU
MTU is 9000 bytes
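To get a feel for the fragmentation described above, the number of fragments can be estimated from the packet size and the egress MTU. This is a rough sketch based on the general IPv4 and IPv6 fragmentation rules (fragment data is carried in multiples of 8 bytes, and IPv6 adds an 8 byte Fragment header), not on anything specific to IOS. The function name fragment_count is made up, and the sketch assumes no IPv4 options and no other IPv6 extension headers.

```shell
# fragment_count TOTAL MTU FAMILY
# Print how many packets a TOTAL-byte datagram becomes when the sender
# fragments it to fit an egress MTU. FAMILY is 4 (IPv4) or 6 (IPv6).
fragment_count() {
    total=$1 mtu=$2 family=$3
    case "$family" in
        4) hdr=20; frag_hdr=0 ;;   # IPv4 carries fragment info in its own header
        6) hdr=40; frag_hdr=8 ;;   # IPv6 adds a separate Fragment header
        *) echo "family must be 4 or 6" >&2; return 1 ;;
    esac
    # A datagram that already fits is sent as a single packet.
    if [ "$total" -le "$mtu" ]; then echo 1; return 0; fi
    payload=$(( total - hdr ))
    # Data bytes per fragment, rounded down to a multiple of 8.
    per_frag=$(( (mtu - hdr - frag_hdr) / 8 * 8 ))
    [ "$per_frag" -gt 0 ] || { echo "MTU too small" >&2; return 1; }
    # Ceiling division: the last fragment may be short.
    echo $(( (payload + per_frag - 1) / per_frag ))
}
```

For example, `fragment_count 9000 1500 6` prints 7, while `fragment_count 9000 9000 6` prints 1.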
To check a 1500 MTU:
Router> ping ipv6
Target IPv6 address: remote.example.edu.au
Repeat count [5]: 1
Datagram size [100]: 1500
Timeout in seconds [2]:
Extended commands? [no]:
Sweep range of sizes? [no]:
Type escape sequence to abort.
Sending 1, 1500-byte ICMP Echos to 2001:DB8:0001:0001::1, timeout is 2 seconds:
!
Success rate is 100 percent (1/1), round-trip min/avg/max = 1/1/1 ms
To check a 9000 MTU:
Router> ping ipv6
Target IPv6 address: remote.example.edu.au
Repeat count [5]: 1
Datagram size [100]: 9000
Timeout in seconds [2]:
Extended commands? [no]:
Sweep range of sizes? [no]:
Type escape sequence to abort.
Sending 1, 9000-byte ICMP Echos to 2001:DB8:0001:0001::1, timeout is 2 seconds:
!
Success rate is 100 percent (1/1), round-trip min/avg/max = 1/1/1 ms
Juniper, IPv4
size = mtu − icmpv4_header − ipv4_header
size = mtu − 8 − 20
size = mtu − 28
JUNOS can globally set the source IP address for packets generated by the router to that of the lo0.0 interface. This saves considerable messing about.
system {
default-address-selection;
}
However, when testing a link we want to use the addressing from the link, not from the router's control plane. That way, even if the remote router's forwarding table isn't populated, it still has a routable address for the ICMP Echo Reply.
Amnesiac> ping bypass-routing interface et-0/0/0.0 …