From: Rahul on 11 Feb 2010 18:01

I usually use "jumbo frames" on a computational cluster of machines, since all the connected adapters are large-MTU capable. This is a private VLAN with its own associated subnet.

Recently we decided to have a border server straddle the two networks. It has two adapters: the one on the private VLAN can do large MTUs, and the public-internet adapter does normal MTUs. We are now planning to run NAT (via iptables and masquerade) on the border server.

Is this a problem? I know that jumbo frames are problematic unless all hardware end-to-end supports large MTUs. Unfortunately, I don't know exactly how NAT affects this. If an interior node tries to communicate with the wider internet (via NAT), will it still use a large MTU and cause problems? Does NAT operate at the network layer, so that this will be a problem? Or not? Are there ways of getting around this?

--
Rahul
From: Rick Jones on 11 Feb 2010 20:38

Ignoring NAT for a moment, when a JF host tries to establish a TCP connection to a non-JF host, 99 times out of 10, the MSS options in the SYNchronize segments will mean that the JF host will actually use a non-JF MSS for that connection. The real problem arises with UDP communications - there is no MSS exchange there, so when a JF host sends a 9K-ish UDP datagram to the non-JF host, it will hit a point where the MTU goes non-JF and will likely be dropped as a giant frame or somesuch.

Now, the above was for a single broadcast domain. If there is a router between the JF and non-JF networks, the same TCP stuff applies. The UDP datagram will be received by the router, and then one of two things happens when the router tries to forward it out the non-JF interface. If DF (don't fragment) was *not* set in the IP header, the router will simply fragment the IP datagram carrying the UDP datagram. If DF is set in the IP header (I'm assuming IPv4 here), then the router will drop the IP datagram and may send back an ICMP Datagram Too Big message.

rick jones
--
The computing industry isn't as much a game of "Follow The Leader" as it is one of "Ring Around the Rosy" or perhaps "Duck Duck Goose." - Rick Jones
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
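The two behaviours described in the post above can be sketched in a few lines of Python. This is only an illustrative model, not how any particular stack or router is implemented: the function names are made up for this example, headers are assumed to be 20 bytes each (IPv4 and TCP without options), and 9000/1500 are just typical jumbo and standard MTU values.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def mss_for_mtu(mtu: int) -> int:
    """MSS a host would typically advertise in its SYN for a given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def effective_mss(local_mtu: int, peer_mtu: int) -> int:
    """Neither side sends segments larger than the MSS the other side
    advertised, so the smaller of the two values limits the connection."""
    return min(mss_for_mtu(local_mtu), mss_for_mtu(peer_mtu))


def forward(datagram_len: int, df: bool, egress_mtu: int) -> str:
    """Model of an IPv4 router's choice when forwarding a datagram."""
    if datagram_len <= egress_mtu:
        return "forward"
    if df:
        # DF set: the datagram is dropped and an ICMP error may be sent back
        return "drop + ICMP"
    # DF clear: the router may fragment
    return "fragment"


print(effective_mss(9000, 1500))                 # TCP between JF and non-JF host
print(forward(9000, df=False, egress_mtu=1500))  # jumbo UDP datagram, DF clear
print(forward(9000, df=True, egress_mtu=1500))   # jumbo UDP datagram, DF set
```

The model shows why TCP usually copes on its own: the smaller advertised MSS wins, so the jumbo-frame host never sends segments the other end cannot take.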
From: Rahul on 11 Feb 2010 23:49

Rick Jones <rick.jones2(a)hp.com> wrote in news:hl2beh$3s0$1 @usenet01.boi.hp.com:

Thanks Rick! That explanation is very helpful.

> Ignoring NAT for a moment, when a JF host tries to establish a TCP
> connection to a non-JF host, 99 times out of 10, the MSS options in
> the SYNchronize segments will mean that the JF host will actually use
> a non-JF MSS for that connection. The real problem arises with UDP
> communications - there is no MSS exchange there, so when a JF host
> sends the 9Kish UDP datagram to the non-JF host it will hit a point
> where the MTU goes non-JF and likely be dropped as a giant frame or
> somesuch.

Luckily, all my UDP communication is on the private VLAN. I don't expect any UDP to be NAT'ed, so I should be safe. All hosts on the private VLAN are jumbo-frame capable.

> Now, the above was for a single broadcast domain. If there is a
> router between the JF and non-JF networks, the same TCP stuff applies,

If I do NAT plus masquerade via iptables, then that is my "router", I assume? I'm just making sure that iptables NAT does not pack any nasty surprises compared with a "physical" router.

> the UDP datagram will be received by the router and then one of two
> things happens when the router tries to forward it out the non-JF
> interface. Either DF was *not* set in the IP header, in which case
> the router will simply fragment the IP datagram carrying the UDP
> datagram. If DF (don't fragment) is set in the IP header (I'm
> assuming IPv4 here) then the router will drop the IP datagram and
> may send back an ICMP Datagram Too Big message.

And, out of curiosity, is there a way to tell routers (e.g. iptables in NAT mode): "Even if DF was set for a UDP (or any other) datagram AND the datagram is larger than a certain MTU, ignore the DF bit, fragment the datagram, and send it out"? Or am I dabbling in fantasy here (or worse)?

Disobeying the application layer's DF request seems a lesser evil than dropping the datagram entirely because it was a JF. Or not?

[Of course, a worse option for the router would be to just send a JF datagram with the DF bit set out into the larger world and then have some other non-JF hardware drop it silently. Glad the router is smarter than that!]

--
Rahul
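For a sense of what fragmenting a jumbo datagram would actually produce, here is a small worked sketch. It assumes a 20-byte IPv4 header without options and uses hypothetical sizes (a 9000-byte datagram forwarded onto a 1500-byte MTU link); `fragment_sizes` is a name invented for this illustration. The one rule it relies on is that every fragment except the last must carry a payload that is a multiple of 8 bytes, because the IPv4 fragment offset is counted in 8-byte units.

```python
def fragment_sizes(total_len: int, mtu: int, header: int = 20) -> list[int]:
    """Return the total length (header + payload) of each IPv4 fragment."""
    if total_len <= mtu:
        return [total_len]  # fits as-is, no fragmentation needed

    payload = total_len - header
    # Largest payload per fragment, rounded down to a multiple of 8 bytes
    per_fragment = (mtu - header) // 8 * 8

    sizes = []
    while payload > 0:
        chunk = min(per_fragment, payload)
        sizes.append(chunk + header)  # each fragment gets its own IP header
        payload -= chunk
    return sizes


sizes = fragment_sizes(9000, 1500)
print(len(sizes), sizes)
```

Under these assumptions, one jumbo datagram becomes seven packets on the wire, and the loss of any one of them means the whole datagram has to be resent.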
From: Rick Jones on 12 Feb 2010 18:38
In comp.os.linux.networking Rahul <nospam(a)nospam.invalid> wrote:

> And, out of curiosity, is there a way to tell routers
> (e.g. iptables-NAT mode) that: "Even if DF was set for a UDP (or any
> other) datagram AND Datagram is larger than a certain MTU; ignore
> the DF bit and please fragment the datagram and send out". Or am I
> dabbling in fantasy here? (or worse).

I have heard that, in the past, various pieces of broken kit could be configured to behave that way. I don't recall which, but I would not touch any of it with a 10 meter pole.

> Disobeying the application layer's DF request seems the lesser evil
> than dropping the datagram entirely because it was a JF. Or not?

NOT! Dropping the datagram *and* sending the ICMP message about it back to the source IP is the correct thing to do - it is how Path MTU discovery works. It also happens to be what the specs for IPv4 say should be done :)

rick jones
--
denial, anger, bargaining, depression, acceptance, rebirth...
where do you want to be today?
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
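The drop-and-report behaviour is what lets the sender converge on a usable packet size. The toy simulation below sketches that loop under some loudly stated assumptions: every router reports its next-hop MTU in the ICMP message (as RFC 1191 describes), no ICMP messages are filtered along the way, and the hop MTUs are invented for the example. `discover_path_mtu` is a hypothetical name, not a real API.

```python
def discover_path_mtu(local_mtu: int, hop_mtus: list[int]) -> tuple[int, int]:
    """Return (discovered path MTU, number of packets dropped on the way).

    The sender starts at its own interface MTU and sends with DF set.
    Each time a hop cannot forward the packet, that hop drops it and
    reports its own MTU, and the sender retries at the reported size.
    """
    estimate = local_mtu
    dropped = 0
    while True:
        # First hop along the path that is too small for the current size
        blocker = next((mtu for mtu in hop_mtus if mtu < estimate), None)
        if blocker is None:
            return estimate, dropped  # packet made it end to end
        dropped += 1        # router dropped the DF packet ...
        estimate = blocker  # ... and its ICMP message carried the next-hop MTU


# Jumbo-frame host behind a 1500-byte border link and a 1492-byte hop
print(discover_path_mtu(9000, [9000, 1500, 1492]))
```

If a router silently fragmented or silently dropped instead, the sender would never learn the smaller size, which is why overriding DF works against this mechanism.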