TCP/IP MTU WTFery

Back in July, our home internet service provider decided to muck with some network settings which ended up "breaking the internet" for us. Since I work from home, entirely online, a non-functioning internet connection is a bit of a problem. So, we kicked into gear to try to get things working again. I learned some interesting stuff about TCP/IP networking that I didn't know before, so I thought I'd write it up in case anyone else is interested.

Reboot

First step: reboot ALL THE THINGS! We rebooted the ISP's router, and our wireless access point/router, just in case something had gone wrong in either of those, but the network behavior didn't seem to improve.

Strangely, DNS seemed to be working. I could ping several servers. But if I tried to open web pages like Twitter or Facebook, the browser would hang, and eventually complain that the connection had been closed. Interestingly, Google loaded, and could respond to searches.

Then I tried connecting to my Linux server. SSH would connect, but then tmux would automatically start and try to send a window full of text. At that point, the connection would hang, and eventually get dropped. If I tried to SSH and run a command that didn't output too much text, the problem wouldn't happen. (At least, not immediately.)

"Aha! It's failing when it tries to send me lots of data. It sounds like the MTU is too big."

So, I lowered the MTU on the router from its default 1500 down to a conservative 1400, which (unfortunately) required a reboot. When the router came back online, suddenly everything worked. Web sites loaded properly, and my SSH connection would successfully render a page full of text without dropping its connection.

Wait ... how did that fix things?

But then I realized that I wasn't entirely sure how that fixed things. And in my experience, that's a great opportunity to learn something new. First, let me recap my knowledge of networking at that point:

I knew that the MTU was the "Maximum Transmission Unit", the size of the largest data packet that a network can support. Packets larger than that might not work correctly on a network, so should be broken up into smaller units. You can do that in a couple ways: either you just make smaller packets, by putting less data into each one, or you use packet fragmentation to break up an existing large packet into several smaller packets that the other end of the connection will have to reassemble.

Which left me with two pieces that I didn't understand:

  1. I knew that IP routers can perform fragmentation. If they receive packets from one network that are too large for another network's MTU, they can break them up into fragments small enough for the other network. Why wasn't that happening for my network connections?
  2. I also knew that the MTU was the Maximum Transmission Unit. The MTU I set in my router tells the router the maximum packet size it's allowed to send to my ISP. But the problem seemed to be when I was receiving a packet too big. How did lowering my MTU stop me from receiving a packet that was too big?

To Wireshark!

If you're not familiar, Wireshark is an awesome tool that lets you inspect network traffic that your computer is sending and receiving, all the way down to the level of the bits in each packet. So, I reverted my router's MTU back to the "broken" 1500, fired up Wireshark, and tried connecting to my Linux server again.

When I looked at the packets in Wireshark, it helpfully automatically color-coded some of the packets with red text on a black background. That seemed like a good place to start. Here's a summary of what I saw:

  • Packet #57, sent from my server to my laptop, had a sequence ID (tcp.seq) of 3232, and a tcp.len of 1016.
  • Packet #58, also coming from my server, had a tcp.seq=5696, and tcp.len=592

Both had a TCP header length (tcp.hdr_len, in Wireshark) of 32 bytes. But #58 was marked in red, and its "info" line read: "[TCP Previous segment not captured]."

It turns out that a TCP connection can figure out that it's lost a packet by using the TCP packet's sequence ID and length. Since packet #57 had a SEQ of 3232, and a length of 1016, you would expect the next packet to have a sequence ID of 4248 (3232+1016). But instead, the next packet received had SEQ=5696, a full 1448 bytes too far ahead. If you add 32 (the TCP header size that the server seemed to be using), and 20 bytes of IP header, you get a packet size of exactly 1500 bytes. That seemed to correlate with the MTU setting that was failing.
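To make that arithmetic concrete, here's a tiny Python sketch of the check. The function names are mine (they're not anything from Wireshark or a real TCP stack), and the numbers are just the ones from my capture.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def expected_next_seq(seq, length):
    """Sequence number the next in-order segment should carry."""
    return (seq + length) % SEQ_SPACE


def missing_bytes(prev_seq, prev_len, next_seq):
    """Bytes skipped between two consecutively captured segments."""
    return (next_seq - expected_next_seq(prev_seq, prev_len)) % SEQ_SPACE


# Packets #57 and #58 from the "broken" capture.
expected = expected_next_seq(3232, 1016)
gap = missing_bytes(3232, 1016, 5696)
print("expected next seq:", expected)
print("bytes that never arrived:", gap)
```

A gap of zero means the segments were contiguous. Anything else means something in between went missing, or just hasn't shown up yet.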

But I still didn't know how lowering my MTU was reducing the size of the packets that I was receiving, so I lowered the MTU (this time to 1492), started a new Wireshark capture, and connected to my server again. Here's what I saw this time:

  • The connection was surprisingly deterministic; the packets even got the same numbers/sizes!
  • Packet #57 again had SEQ=3232, len=1016.
  • Packet #58 had SEQ=4248, len=1440.

1440 + 32 (TCP header length in use) + 20 (IP header) = 1492, the new MTU. My server started sending me smaller packets!
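Here's the same sum as a small Python sketch, run against both captures. The helper names are made up for illustration, and the header sizes are the ones I saw in Wireshark (32 bytes of TCP header, and I'm assuming a plain 20-byte IPv4 header with no options).

```python
TCP_HEADER_LEN = 32  # tcp.hdr_len seen in my captures
IP_HEADER_LEN = 20   # assumption: IPv4 header without options


def ip_packet_size(tcp_payload_len):
    """Total IP packet size for a TCP segment carrying this much data."""
    return tcp_payload_len + TCP_HEADER_LEN + IP_HEADER_LEN


def fits(tcp_payload_len, mtu):
    """True if the packet can cross a link with this MTU unfragmented."""
    return ip_packet_size(tcp_payload_len) <= mtu


# "Before" capture: the missing segment carried 1448 bytes.
print(ip_packet_size(1448), fits(1448, 1492))
# "After" capture: packet #58 carried 1440 bytes.
print(ip_packet_size(1440), fits(1440, 1492))
```

So the full-size packets from the first capture are 8 bytes too big for a 1492-byte link, and the ones from the second capture fit exactly.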

But how did a server, on the other side of the internet, know that my home router's MTU had changed!? Surely there's some information exchange going on here.

I started inspecting the way that the TCP/IP connections were opened, wondering if they have some way of communicating MTUs to each other. Sure enough, there's quite a bit of information transmitted when you open up a new TCP connection.

IP Fragmentation Considered Harmful

The first thing that I noticed was that my connection had the IP flag "Don't fragment" set on packets coming from both the client and server. Why would anyone disable that!? ... it turns out that the general consensus is that fragmentation is a thing to be avoided. IPv6 doesn't even support IP fragmentation performed by routers!

Well, that would explain why IP fragmentation wasn't saving me. Can't rely on that. Next!

TCP MSS

Then I dug into the TCP packets that set up the connection between hosts. There I found a promising-looking MSS ("Maximum Segment Size") option. It defines the maximum amount of data that the other end may send you in each TCP packet for this connection. There are lots of hand-wavy details about this which get ironed out in RFC 879, but the (very) short version is that, nowadays at least, MSS = MTU - 40 (20 bytes of IP header plus 20 bytes of TCP header, not counting options).

OK! So that's how the MTU is getting communicated with the other end! Or, so I thought. But when I compared my "before" and "after" network captures, both tried to set up connections with MSS=1460 (MTU=1500). At this point, I got frustrated and gave up for the day.

Helpful Router Gremlins

But the next day, I couldn't stop thinking about it. The MTU setting for my router was the MTU for the ISP's network, on the outside port of the router. Inside my router, my private network's MTU would be completely different. So of course my laptop couldn't know about the MTU of the outside network. And yet, obviously somehow that data was getting communicated.

I got the idea to capture the network connection from the other side of the network (on the server). And that's when I was surprised to see... the TCP SYN packet that it received had had its MSS lowered! Though my laptop was always advertising an MSS based on the MTU that it knew about (1500), my home router was mangling my outbound SYN packets to let remote servers know that they shouldn't send packets bigger than the MTU that it knew about! This seems like a total hack and I can't tell if it's disgusting or beautiful.
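Here's roughly what I imagine the router is doing to each outbound SYN. I haven't seen my router's firmware, so treat this as a sketch under assumptions: clamp_mss and its parameters are hypothetical names, it only looks at the TCP options bytes, and a real implementation would also have to fix up the TCP checksum, which I skip entirely.

```python
import struct

OPT_END = 0   # end of option list
OPT_NOP = 1   # one-byte padding
OPT_MSS = 2   # kind=2, len=4, 16-bit MSS value
IP_PLUS_TCP_HEADERS = 40  # assumption: 20 bytes IPv4 + 20 bytes TCP


def clamp_mss(options, outbound_mtu):
    """Return a copy of the TCP options with the MSS lowered if needed."""
    limit = outbound_mtu - IP_PLUS_TCP_HEADERS
    out = bytearray(options)
    i = 0
    while i < len(out):
        kind = out[i]
        if kind == OPT_END:
            break
        if kind == OPT_NOP:
            i += 1
            continue
        if i + 1 >= len(out):
            break  # truncated option, leave it alone
        length = out[i + 1]
        if length < 2 or i + length > len(out):
            break  # malformed option, leave it alone
        if kind == OPT_MSS and length == 4:
            (mss,) = struct.unpack_from("!H", out, i + 2)
            if mss > limit:
                struct.pack_into("!H", out, i + 2, limit)
        i += length
    return bytes(out)


# Options from a made-up SYN: MSS=1460, SACK permitted, NOP, window scale 7.
syn_options = b"\x02\x04\x05\xb4" + b"\x04\x02" + b"\x01" + b"\x03\x03\x07"
clamped = clamp_mss(syn_options, 1492)
print(struct.unpack_from("!H", clamped, 2)[0])
```

With an outside MTU of 1492, the advertised MSS drops from 1460 to 1452, and everything else in the options is left untouched. The MSS is only ever lowered, never raised.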

I tested another consumer router and found similar behavior, though it seemed to advertise a much more conservative MTU of 1300, so apparently this isn't all that uncommon. I wonder -- if my ISP changed the MTU for their network, shouldn't they have done the same sort of thing?

How Does Anything Ever Work?

Once all this new info was in my head, the state of TCP started looking pretty crappy. How does any of this ever work?

There's a really informative blog post by CloudFlare that filled me in on some things:

  • When "Do Not Fragment" is set, a router that can't forward a packet because it's too big should respond with an ICMP message saying so.
  • Buuut, you can't always rely on that ICMP message getting back to you, because so many routers drop them. (And I'd imagine when you throw NAT into the mix, the likelihood of ICMPs getting lost is even higher.)
  • There's a work-around to have your TCP stack automatically probe the network to find a workable MTU, but it's not enabled by default on Linux. (D'oh!)
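
To get a feel for what "probing" means, here's a toy Python sketch. This is not how the Linux kernel does it; it's just the general idea as a binary search. find_workable_mtu and the probe callable are hypothetical names, and probe stands in for "send a packet of this size and see whether it gets acknowledged".

```python
def find_workable_mtu(probe, floor, ceiling):
    """Largest size in [floor, ceiling] for which probe(size) succeeds.

    Assumes probe(floor) succeeds, and that if a size works,
    every smaller size works too.
    """
    lo, hi = floor, ceiling
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            lo = mid       # mid got through; try something bigger
        else:
            hi = mid - 1   # mid vanished; the limit is below it
    return lo


# Simulated path that silently drops anything bigger than 1492 bytes.
def path_allows(size):
    return size <= 1492


print(find_workable_mtu(path_allows, 576, 1500))
```

The nice part is that this never depends on an ICMP message coming back. A lost probe is itself the signal.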

Summary

  • TCP/IP is complicated.
  • If many of your internet connections seem to hang and eventually drop, but you can load Google.com, you might be able to fix it by lowering your router's MTU.
  • If you run a server, and you really want to make sure your customers can access your services, use a conservative MTU or enable MTU probing as detailed in the blog post I mentioned above. (That's why Google worked when other popular sites failed!)

Top 5% of Tech Talent!

A couple weeks ago this ad showed up in my Facebook[1] feed. It made me roll my eyes so hard, I took a quick screen grab so I could rant about it later:

[Image: screenshot of the ad, taken 2015-01-15]

This ad is a mess. First: Who is the target audience? The call to action, "Software Engineers [...] Sign Up", might make you think it's aimed at software engineers. But then the bottom claims to be the "Tech Talent Marketplace For The Top 5%". OK. Isn't that like saying that 95% of software engineers should buzz off then? Isn't the current problem in software that we don't have enough people?

Then there's the graphic: Sad guy with lots in his '80s-style paper "To do" inbox. Happy guy with all of his tasks completed.

I don't know about you, but if I'm the employee and all of my tasks are completed and I have no more lined up, I get nervous. I start to worry that my employer is going to realize they don't need me anymore. Or I get bored and look for a new job.

So, obviously, this ad isn't targeted at me. It's targeted at companies who are trying to hire tech talent. And it's perpetuating this meme of the "Golden" (a.k.a.: "10X", "5%", "rockstar", "ninja") developer that I find really annoying. "Hire our developers, and they will get things done SO FAST!"[2]

Look, if you want a guy to sit at a keyboard and bang through a list of tasks very quickly, you do not need someone from the "top 5% of tech talent". But if you want a software engineer, who is going to collect requirements and design and implement an extensible, maintainable, stable piece of software, you're likely going to have such big/complicated tasks before him that the inbox metaphor is pretty terrible.

I realize I'm being ridiculously over-sensitive about a silly ad. I guess my point is -- if you want top tech talent, maybe seek it directly, or via an agency that takes it a bit more seriously than "Hired".


  1. Yes, I still use Facebook. Yes, it's an awful walled garden. But it's where people I know post content, so... *shrug* 

  2. I feel like there's an entire blog post to be written about "fast" development. See: "Fast, Cheap, Good -- pick two". It has been my experience in the past that code written very quickly ends up being the hardest to maintain. And yet, those who churn out code/features the most quickly are often the ones seen as "rockstar" or "ninja" developers. Meanwhile, Slow Cody gets to dig through the code, add tests, add documentation, refactor, fix bugs. Nothing nearly as glamorous. Not... that I'm bitter. 

Blogginess

Everyone who has a blog will tell you -- the point of having a blog is talking about having a blog! Or, in my case, the point of having a blog is to rewrite your entire blog back-end (seemingly) without changing anything on the front-end so nobody will notice. Because, what else are you going to do with your weekends?

So, yay! New blog hotness is here! Same as the old hotness! Only, now, instead of being a tangle of PHP I wrote myself, which required SSHing or git push to make a post, I'm using Python 3 and Django which gives me a webby interface for writing posts. And you will never see it. 😜 Oh, and I use Django AllAuth because passwords are so passé.

You might, however, see my new RSS feed. Or you might notice that I added HTTPS to my site. Only I'm a cheap bastard, so you'll just have to accept my self-signed cert.

Why I Want to Write More D

D is my new favorite programming language. I keep bringing it up at work, saying things like "Oh, but in D this would be so much simpler/nicer." In an effort to get a bit of my fanboyishness out of the way, I decided to compile my favorite features into this blog post.

Elevator Pitch

If another programmer asked me to quickly explain why I like D, or why I think more people should write D, here's what I'd say:

  • D has compile-time type safety.
  • D has support for generic programming.
  • D creates machine code for whichever platform you're using.
  • BUT! ...
  • D has garbage collection built-in.
  • D has type inference which makes working with code feel like using Python or Ruby.
  • D has tools that let you write and run D code as if it were a scripting language.

Scope Blocks

Scope blocks in D allow you to greatly simplify error handling and resource management. What would otherwise be a tangle of try/catch blocks can be written much more simply. For comparison, let's write a sample Java program that XORs the contents of two files and writes the result to a third.

import java.io.FileInputStream;
import java.io.FileOutputStream;

public class XorFiles 
{
    /** XOR file inputs 1 and 2 and write the output to a third file.
     * @throws Exception if there was an error.
     */
    public static void xorFiles(String input1, String input2, String output)
    throws Exception
    {
        FileInputStream in1 = new FileInputStream(input1); 
        try
        {
            FileInputStream in2 = new FileInputStream(input2);
            try
            {
                FileOutputStream out = new FileOutputStream(output);
                try
                {
                    int byte1, byte2;
                    // read() returns -1 at the end of the stream.
                    final int EOF = -1;
                    while (true)
                    {
                        byte1 = in1.read();
                        byte2 = in2.read();
                        // Stop as soon as either input runs out.
                        if (byte1 == EOF || byte2 == EOF) break;
                        out.write(byte1 ^ byte2);
                    }
                }
                finally { out.close(); }    
            }
            finally { in2.close(); }
        }
        finally { in1.close(); }    
    }

    public static void main(String[] args) throws Exception
    {
        xorFiles(args[0], args[1], args[2]);
    }
}

I don't know about you, but I think that's an awful lot of indentation and noise to do such a simple thing. In my experience, the more indentation in code, the harder it is to read, understand, and maintain. See: Flattening "Arrow Code"

In contrast, here's what the D implementation of the above would be:

#!/usr/bin/env rdmd

import std.stdio;

void xorFiles(string input1, string input2, string output)
{
    auto in1 = File(input1);
    scope(exit) in1.close();

    auto in2 = File(input2);
    scope(exit) in2.close();

    auto outFile = File(output, "wb");
    scope(exit) outFile.close();

    char[1] buffer1, buffer2, outBuff;
    while (true)
    {
        auto byte1 = in1.rawRead(buffer1);
        auto byte2 = in2.rawRead(buffer2);
        if (byte1.length == 0 || byte2.length == 0) break;
        outBuff[0] = byte1[0] ^ byte2[0];
        outFile.rawWrite(outBuff);
    }
}

void main(string[] args)
{
    xorFiles(args[1], args[2], args[3]);
}

I hope you'll agree that that's much easier to read. We open all of the files needed to do our work and handle their cleanup at the beginning of the function, without requiring any extra indentation. Then we're left to write the meat of our function without having to think about that again. It's the try/catch equivalent of a guard clause.

You can think of each scope(exit) ... block as saying "Here is my finally clause.", and the rest of the code is implicitly inside of a try block that precedes the finally. (In fact, that's how this feature is implemented behind the scenes.) There's also a scope(failure) which is analogous to a catch block, in that it only runs if there's an exception, and a scope(success) for completeness, which only runs when there hasn't been an exception.

Note: I learned something while implementing the above: File handles in D are reference counted, and always automatically closed at the end of the scope. So... you don't really need the scope blocks in this case. But you would if you were using some other resource that didn't have reference counting implemented for you, so I'll leave the above code as-is as an illustration of the general case. I could also have used std.range.zip and a foreach to clean the while loop up a bit, but didn't want to diverge tooo far from the Java example.

Compile-time Code Generation

D supports generics via templates. The word "templates" might make you cringe if you've ever had to deal with C++ templates, but it seems to be quite a bit nicer in D. As in C++, templates are used at compile-time to generate code depending on template parameters. So, where in C++ you can have a vector<int>, or in Java you have a List<Integer>, in D you can use an Array!(int). (Yes, the syntax looks weird. foo!(...) in D is equivalent to foo<...> in C++.)

But D templates have access to compile-time type information about the arguments they're passed, which they can use to implement things which you would have to resort to doing at runtime in other languages.

For example: You can write a templated function which returns a string of D code. You can then pass that string to the mixin keyword (a "string mixin") to compile your code as if that generated code were present in-line.

Here's an example:

#!/usr/bin/env rdmd

import std.stdio;
import std.algorithm;

class Doer
{
    void doThis() { writeln("Called Doer.doThis()"); }
    void doThat() { writeln("Called Doer.doThat()"); }
    void doSomethingElse() { writeln("Wheeeeee!"); }

    mixin(makeDoAll!(Doer));
}

void main()
{
    auto d = new Doer();
    d.doAll();
}


// Generate a doAll() method that calls all do*() methods.
// Not production ready, just an example:
string makeDoAll(T)()
{
    string code = "void doAll() {";

    foreach(member; [ __traits(allMembers, T)])
    {
        if (!member.startsWith("do")) continue;
        code ~= member ~ "();";
    }
    code ~= "}";

    return code;
}

If you've worked with Java annotations before, you're probably familiar with using runtime reflection to read those annotations and do different things depending on that annotation data. D has annotations (called "user-defined attributes"), but you can access them at compile-time to do code generation that saves you all the overhead of runtime reflection. I'll leave that as an exercise for the reader. :)

Sane Monkey-Patching

Dynamic languages like Ruby and JavaScript allow developers to "monkey patch" a class that has already been defined to add or modify functionality.

That's usually not a great idea.

For one, it's not very scalable. If you use two libraries that try to monkey patch a class in incompatible ways, what happens? Who's to blame? How do you fix it? Or worse, what if you use some library that modifies an existing method to do something slightly different? What if it breaks the documented functionality of that method? How will you even know which library did that?

OK, I'll upgrade monkey patching from "not a great idea" to "a pretty terrible idea". But I'll give it one thing -- sometimes it would be really handy to add a useful method to a type that you use a lot.

D lets you do this in a way that's safe and scalable, via Universal Function Call Syntax. If you call a method on an object, and that object does not have such a method defined, D will look in your local scope to see if there's a function available that can satisfy the method call. For example:

import std.stdio;
import std.string;

void main()
{
    "World".greet();
    "Cody".greet;

}

void greet(string name)
{
    "Hello, %s!".format(name).writeln();
}

String, of course, does not have a built-in greet(). So D delegates to the local greet(string), passing in the object as the first argument.

Looking at the code, you might think that string has a format() method and a writeln() method, but it doesn't. D's string isn't even a class! (It's a type alias to an array of immutable chars.) However, the std.string module includes some handy functions that operate on strings (including format()), which you can use as if they were methods. And the std.stdio module has a writeln() function that takes a string (and other types), so you can use that as if it were a method too.

This type of extension is safe for a few reasons:

  • If the method exists on an object, it takes precedence. You can't redefine an object's methods.
  • UFCS only applies functions from your local scope (including functions that have been imported) to objects. It's not really a global monkey patch.
  • If you happen to import two functions which cause call ambiguity, it's a compile-time error, so you can't accidentally call the wrong function.

I really like this safe extensibility. The std.algorithm module has tons more methods that can be applied to a variety of types without having to re-implement them. Now I want to do this in every language I use. Dangit, D!

Talks

I learned about 2 of the things I go into detail about (scope blocks, code generation) via Andrei Alexandrescu's talk, Three Unlikely Successful Features of D. He touches on scope blocks again in his talk Three Cool Things About D. I'm not quite sure how I made it through the tutorial without realizing how nice of a feature that is. In my opinion, a good programming language allows (or helps!) the programmer to Do The Right Thing without too much pain, and this feature really helps with that.

If you'd like a higher level overview, I'd recommend the aptly named The D Programming Language. Or the website or the tutorial. People in the #D IRC channel on Freenode seem to be pretty friendly as well.

Oh! And both dmd and rdmd are packaged up and available via Homebrew on OSX, so you can brew install dmd and start playing with the language in seconds! (Hint, hint.)

Married

Today Heiðar and I got married. We had a small civil ceremony in Reykjavík, attended by local friends and family. Afterwards, we headed down to a fancy burger joint called Roadhouse for burgers, champagne, and quesadillas. Then we came home to devour a delicious Irish coffee cake.

While the ceremony was simple, my feelings are not. I feel incredibly lucky to have found the man I want to spend the rest of my life with. I also feel lucky to be living in a time where such a marriage is recognized by both our governments, which will allow us to live together in either country.

And I feel excited and anxious about my plans to move to Iceland. Heiðar has gone back to school and is working on completing a degree in Computer Science. I'll be moving to Iceland for at least the remainder of his studies. There's a lot to do in the next few months to make that happen.

But, mostly, I'm feeling happy. Stupid, head-over-heels-in-love happy.

You can see more of the wedding photos on Flickr.