Frank McConnell wrote:
Michael B. Brutman wrote:
Does anybody have a good technique for setting up a simple network
that will result in IP fragments of TCP?
Interpose a router with two interfaces, one configured with a smaller
MTU. I expect that's what you're using the Linux box to do. It's either
that or fiddle with the sending IP stack so that IP knows the network
has a small MTU but TCP doesn't find out.
That is exactly what the Linux box in the middle is supposed to be
doing. One interface is set up for an MTU of 1500, and the other is
set up for an MTU of 576.
If the sending host is setting the DF bit expecting to get ICMP
messages back for path MTU discovery, hmm, that will make this tricky:
you could get your router to drop those ICMP messages before they go
out, but then your router will give the appearance of silently
dropping those DF'd datagrams, and the sending host's path MTU
discovery may discover the path MTU slowly through timeouts and
backoff, or it may not discover it at all, with the result that the
send just fails.
All of the modern TCP/IP implementations seem to use a combination of
these techniques. The Windows XP machine I'm using for the FTP server
definitely is, and I really don't want to mess with the registry.
Overall this is a good thing for end users, but a pain for those of us
trying to implement.
What you really want to do in that case is have your router clear the
DF bit, adjust the IP header checksum, and then go on to fragment the
datagram before sending it out the smaller-MTU interface. This isn't
RFC-compliant behavior, but you're wanting to test stuff, right?
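A minimal sketch of the clear-DF-then-fragment step described above, working on raw IPv4 datagram bytes. It is not the small C program mentioned in this thread; the names ip_checksum and fragment_ignoring_df are made up for illustration, and it assumes a header without IP options (options would need per-fragment copy rules that are not handled here).

```python
import struct


def ip_checksum(header: bytes) -> int:
    """Ones'-complement checksum over an IPv4 header.

    Call it with the checksum field zeroed to compute a new checksum;
    over a header with a valid checksum in place it returns 0.
    """
    words = struct.unpack("!%dH" % (len(header) // 2), header)
    total = sum(words)
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def fragment_ignoring_df(datagram: bytes, mtu: int) -> list:
    """Split one IPv4 datagram into fragments that fit in `mtu` bytes.

    Deliberately non-compliant: the DF bit is dropped instead of
    being honored. Assumes no IP options in the header.
    """
    ihl = (datagram[0] & 0x0F) * 4
    header = bytearray(datagram[:ihl])
    payload = datagram[ihl:]

    flags_off = struct.unpack("!H", header[6:8])[0]
    base_off = flags_off & 0x1FFF   # nonzero if this is already a fragment
    orig_mf = flags_off & 0x2000    # keep MF on the last piece if it was set

    # Every fragment except the last must carry a multiple of 8 bytes.
    chunk = (mtu - ihl) // 8 * 8
    if chunk <= 0:
        raise ValueError("MTU too small for this header")

    frags = []
    for pos in range(0, max(len(payload), 1), chunk):
        piece = payload[pos:pos + chunk]
        last = pos + chunk >= len(payload)
        mf = orig_mf if last else 0x2000
        struct.pack_into("!H", header, 2, ihl + len(piece))          # total length
        struct.pack_into("!H", header, 6, mf | (base_off + pos // 8))  # DF left clear
        struct.pack_into("!H", header, 10, 0)                        # zero, then recompute
        struct.pack_into("!H", header, 10, ip_checksum(bytes(header)))
        frags.append(bytes(header) + piece)
    return frags
```

With a 1500-byte datagram and an MTU of 576, each non-final fragment carries 552 payload bytes, since (576 - 20) rounded down to a multiple of 8 is 552.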
And since you're wanting to test stuff, the next things will be to
check that you're assembling the datagram correctly when you've got
all the fragments, and to check that you give up when you don't get
all the fragments after some time, and to check that you don't leak
memory or packet buffers either when you assemble the datagram or when
you give up on the datagram.
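The three checks above (correct assembly, giving up after a timeout, no leaked buffers) can be exercised against something like the following sketch. The Reassembler class, its 30-second default timeout, and the max_datagrams limit are all assumptions for illustration, not anything from an existing stack. Offsets are in bytes, and overlapping fragments are not handled.

```python
class Reassembler:
    """Toy IP fragment reassembly with a timeout and a buffer limit."""

    def __init__(self, timeout=30.0, max_datagrams=4):
        self.timeout = timeout
        self.max_datagrams = max_datagrams
        # key (src, dst, protocol, ident) -> partially assembled datagram
        self.pending = {}

    def expire(self, now):
        """Give up on datagrams that have waited too long; return the count."""
        stale = [k for k, e in self.pending.items()
                 if now - e["first_seen"] > self.timeout]
        for k in stale:
            del self.pending[k]
        return len(stale)

    def add(self, key, offset, data, more_fragments, now):
        """Store one fragment; return the whole payload once complete."""
        self.expire(now)
        entry = self.pending.get(key)
        if entry is None:
            if len(self.pending) >= self.max_datagrams:
                return None  # no free buffer: drop the fragment
            entry = {"pieces": {}, "total": None, "first_seen": now}
            self.pending[key] = entry

        entry["pieces"][offset] = data
        if not more_fragments:
            # The last fragment tells us the full payload length.
            entry["total"] = offset + len(data)
        if entry["total"] is None:
            return None

        # Walk from offset 0; any hole means we are not done yet.
        out = bytearray()
        while len(out) < entry["total"]:
            piece = entry["pieces"].get(len(out))
            if not piece:
                return None
            out += piece

        del self.pending[key]  # release the buffer on success
        return bytes(out[:entry["total"]])
```

Passing the clock in as `now` keeps the timeout path testable without sleeping; after both a successful assembly and an expiry, `pending` should be empty, which is the leak check.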
I wouldn't use Linux for this, but that's because I've done things
like this before, using FreeBSD, ipfw, dummynet, netgraph, and
a small C program to do the DF-clearing stuff. They're the tools
I'm comfortable with for this sort of thing. You may have a learning
experience either way, and you've already got a penguin handy, you
probably know how to work it better than I do.
I'm hesitant to start writing my own router test code because it is
error-prone too and I'll wind up making my code match the test. But
after giving it some more thought, I would get much better error
injection capability than I have now if I started mucking with the
packets myself.
On a related note, is this even worth it? I don't know of anything
that needs to send fragments except for NFS over UDP. There might be
other applications that send big packets over UDP but those would be
the only class of applications that absolutely require fragment
support. With TCP it is nice, but a user should be able to get around
any problem by setting the local MTU to 576.
It's not that things send fragments, it's that a network link in the
middle has a smaller MTU than the networks on the ends and that
routers having more knowledge of this than the hosts on the ends
fragment large packets as they pass through.
Correct, but actually in the case of large UDP packets the source
machine sends the fragments. TCP is too dang smart and tries all of the
path MTU discovery tricks, but UDP doesn't have that luxury. Which is
great because it does give me an easy way to test - but only with UDP.
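As a rough illustration of the UDP case, the sketch below works out how a sending host would split one large UDP datagram, and includes a helper that sends one. The function names, the 4000-byte default, and the host and port arguments are placeholders, and a 20-byte IP header with no options is assumed.

```python
import socket

IP_HEADER = 20   # assumes no IP options
UDP_HEADER = 8


def udp_fragment_sizes(payload_len, mtu):
    """IP payload bytes carried by each fragment of one UDP datagram."""
    remaining = payload_len + UDP_HEADER   # UDP header travels in fragment 0
    chunk = (mtu - IP_HEADER) // 8 * 8     # non-final fragments: multiple of 8
    sizes = []
    while remaining > 0:
        sizes.append(min(chunk, remaining))
        remaining -= sizes[-1]
    return sizes


def send_big_udp(host, port, payload_len=4000):
    """Send one UDP datagram larger than the link MTU to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(b"\xA5" * payload_len, (host, port))
```

For a 4000-byte payload on a 1500-byte MTU link this gives three fragments carrying 1480, 1480, and 1048 bytes; on a 576-byte MTU link it gives eight.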
I put the gateway in the middle of the machines to try to force TCP
fragments. I may have to get more perverse and use a SLIP connection to
the DOS machine, which has a much smaller MTU. But I suspect that the
TCP path MTU discovery will get in the way again. (I'm really going
to have to try to turn all of that off.)
The user first has to recognize the problem as one that can be got
around in this way. That's a learned response, and I'm not sure how
people learn it these days. And yeah, you may be doing this for a
PCjr and not supporting a web browser or NFS, but FTP can trip over
this pretty easily, and Telnet can too if the phase of the moon is
just right.
Right. And that's why I want to add fragmentation support. I get an
occasional report of a program not working, and if it just looks like a
failed connection or a packet loss problem there isn't much I can do. I
set the DF bit on, but ICMP gets filtered too often so I can't rely on
it to tell me that fragmentation is needed. For those users I tell them
to lower the MTU, but I don't know how many people are giving up without
talking to me first. (And I have all of this in the README, but nobody
reads those.)
What I'm really saying here is, if you should decide that fragment
reassembly isn't worth doing, think instead about having your TCP
stack figure out that it's going to be sending to a non-local host,
and adjust its MSS downward for those connections. And giving the
user a way to turn that behavior off if he's sure it's safe and he
finds the switch.
-Frank McConnell
That's not a bad idea! Local subnet = large MTU is probably safe,
anything else fall back to 576.
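A small sketch of that rule, assuming the stack knows its own address and netmask. The pick_mss name and the force_large switch are hypothetical; the 40 bytes subtracted are the IP and TCP headers without options, so a 576-byte MTU gives an MSS of 536.

```python
import ipaddress

HEADERS = 40  # 20-byte IP header + 20-byte TCP header, no options


def pick_mss(local_if, remote, local_mtu=1500, force_large=False):
    """Choose the MSS to advertise for one connection.

    local_if is our address with prefix, e.g. "192.168.1.10/24".
    force_large is the user's override for off-subnet hosts.
    """
    network = ipaddress.ip_interface(local_if).network
    on_link = ipaddress.ip_address(remote) in network
    if on_link or force_large:
        mtu = local_mtu
    else:
        mtu = min(local_mtu, 576)  # conservative value for routed paths
    return mtu - HEADERS
```

For example, pick_mss("192.168.1.10/24", "192.168.1.20") gives 1460, while a remote host outside the subnet gives 536 unless the override is set.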
The additional expense of the fragmentation reassembly code is 3KB of
code, and anywhere from 4KB to 32KB of data depending on the
configuration. I was pre-allocating the memory for the packet
reassembly; seeing how rare it really is I might go to malloc and deal
with the performance penalty. Once you start having to reassemble
packets your performance has gone to hell anyway.
Thanks - this has been helpful.
Regards,
Mike