Why I Want to Write More D

D is my new favorite programming language. I keep bringing it up at work, saying things like "Oh, but in D this would be so much simpler/nicer." In an effort to get a bit of my fanboyishness out of the way, I decided to compile my favorite features into this blog post.

Elevator Pitch

If another programmer asked me to quickly explain why I like D, or why I think more people should write D, here's what I'd say:

  • D has compile-time type safety.
  • D has support for generic programming.
  • D creates machine code for whichever platform you're using.
  • BUT! ...
  • D has garbage collection built-in.
  • D has type inference, which makes working with code feel like using Python or Ruby.
  • D has tools that let you write and run D code as if it were a scripting language.
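
To make the last two points a bit more concrete, here's a small sketch of a script that leans on auto and can be run directly with rdmd. The shebang line assumes rdmd is on your PATH, and wordLengths is a made-up helper just for this example:

```d
#!/usr/bin/env rdmd

import std.algorithm : map;
import std.array : array;
import std.stdio : writeln;

// Hypothetical helper: the return type is inferred by the compiler.
auto wordLengths(string[] words)
{
    return words.map!(w => w.length).array;
}

void main()
{
    auto words = ["scope", "mixin", "ufcs"];   // inferred as string[]
    auto lengths = wordLengths(words);         // inferred as size_t[]

    // Nothing is dynamically typed: these checks happen at compile time.
    static assert(is(typeof(words) == string[]));
    static assert(is(typeof(lengths) == size_t[]));

    writeln(lengths);   // prints [5, 5, 4]
}
```

Save it as a file, mark it executable, and it should run like any other script, with the compile step hidden behind rdmd.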

Scope Blocks

Scope blocks in D allow you to greatly simplify error handling and resource management. What would otherwise be a tangle of try/catch blocks can be written much more simply. For comparison, let's write a sample Java program that XORs the contents of two files and writes the result to a third.

import java.io.FileInputStream;
import java.io.FileOutputStream;

public class XorFiles 
{
    /** XOR file inputs 1 and 2 and write the output to a third file.
     * @throws Exception if there was an error.
     */
    public static void xorFiles(String input1, String input2, String output)
    throws Exception
    {
        FileInputStream in1 = new FileInputStream(input1); 
        try
        {
            FileInputStream in2 = new FileInputStream(input2);
            try
            {
                FileOutputStream out = new FileOutputStream(output);
                try
                {
                    int byte1, byte2;
                    // read() returns -1 at end of stream; it is not an error code.
                    final int EOF = -1;
                    while (true)
                    {
                        byte1 = in1.read();
                        byte2 = in2.read();
                        if (byte1 == EOF || byte2 == EOF) break;
                        out.write(byte1 ^ byte2);
                    }
                }
                finally { out.close(); }    
            }
            finally { in2.close(); }
        }
        finally { in1.close(); }    
    }

    public static void main(String[] args) throws Exception
    {
        xorFiles(args[0], args[1], args[2]);
    }
}

I don't know about you, but I think that's an awful lot of indentation and noise to do such a simple thing. In my experience, the more indentation in code, the harder it is to read, understand, and maintain. See: Flattening "Arrow Code"

In contrast, here's what the D implementation of the above would be:

#!/usr/bin/env rdmd

import std.stdio;

void xorFiles(string input1, string input2, string output)
{
    auto in1 = File(input1);
    scope(exit) in1.close();

    auto in2 = File(input2);
    scope(exit) in2.close();

    auto outFile = File(output, "wb");
    scope(exit) outFile.close();

    char[1] buffer1, buffer2, outBuff;
    while (true)
    {
        auto byte1 = in1.rawRead(buffer1);
        auto byte2 = in2.rawRead(buffer2);
        if (byte1.length == 0 || byte2.length == 0) break;
        outBuff[0] = byte1[0] ^ byte2[0];
        outFile.rawWrite(outBuff);
    }
}

void main(string[] args)
{
    xorFiles(args[1], args[2], args[3]);
}

I hope you'll agree that that's much easier to read. We open all of the files needed to do our work and handle their cleanup at the beginning of the function, without requiring any extra indentation. Then we're left to write the meat of our function without having to think about that again. It's the try/finally equivalent of a guard clause.

You can think of each scope(exit) statement as saying "here is my finally clause"; the rest of the enclosing block is implicitly inside the try that precedes it. (In fact, that's how this feature is implemented behind the scenes.) When a block has several guards, they run in the reverse of the order in which they were declared, so resources are released last-opened-first. There's also scope(failure), which is analogous to a catch block in that it only runs if there's an exception (although it doesn't swallow the exception), and scope(success) for completeness, which only runs when there hasn't been an exception.
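
If you'd like to see the three kinds of guard side by side, here's a minimal sketch. runGuards is a hypothetical helper that just records which guards fire, and in what order, with and without an exception:

```d
import std.stdio : writeln;

// Hypothetical helper that records the order in which the guards fire.
string[] runGuards(bool fail)
{
    string[] log;
    try
    {
        scope(exit)    log ~= "exit";     // always runs
        scope(success) log ~= "success";  // runs only if nothing was thrown
        scope(failure) log ~= "failure";  // runs only if something was thrown

        log ~= "body";
        if (fail) throw new Exception("boom");
    }
    catch (Exception e)
    {
        // scope(failure) does not swallow the exception, so we still land here.
        log ~= "caught";
    }
    return log;
}

void main()
{
    writeln(runGuards(false)); // ["body", "success", "exit"]
    writeln(runGuards(true));  // ["body", "failure", "exit", "caught"]
}
```

Note that the guards that apply run in reverse order of declaration, which is why "exit" comes last even though it was declared first.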

Note: I learned something while implementing the above: File handles in D are reference counted, and the underlying file is closed automatically when the last reference to it goes out of scope. So... you don't really need the scope blocks in this case. But you would if you were using some other resource that didn't have reference counting implemented for you, so I'll leave the above code as-is as an illustration of the general case. I could also have used std.range.zip and a foreach to clean the while loop up a bit, but didn't want to diverge too far from the Java example.
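
For the curious, here's roughly what the zip idea could look like. This is only a sketch under a simplifying assumption: it works on byte arrays that are already in memory rather than on File handles, and xorBytes is a name I made up for it:

```d
import std.algorithm : map;
import std.array : array;
import std.range : zip;
import std.stdio : writeln;

// Hypothetical in-memory variant: XOR two byte arrays pairwise.
// zip stops at the shorter input by default, like the loops above.
ubyte[] xorBytes(const(ubyte)[] a, const(ubyte)[] b)
{
    return zip(a, b)
        .map!(pair => cast(ubyte)(pair[0] ^ pair[1]))
        .array;
}

void main()
{
    ubyte[] left  = [0x0F, 0xAA, 0x01];
    ubyte[] right = [0xFF, 0x55];

    // The third byte of `left` has no partner, so it is dropped.
    writeln(xorBytes(left, right)); // [240, 255]
}
```

To use something like this on real files you'd have to load both of them first (for example with std.file.read), which trades the constant memory use of the loop above for brevity.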

Compile-time Code Generation

D supports generics via templates. The word "templates" might make you cringe if you've ever had to deal with C++ templates, but they seem to be quite a bit nicer in D. As in C++, templates are used at compile time to generate code depending on template parameters. So, where in C++ you can have a vector<int>, or in Java a List<Integer>, in D you can use an Array!(int). (Yes, the syntax looks weird. foo!(...) in D is equivalent to foo<...> in C++, and the parentheses can be dropped when there is a single simple argument, as in Array!int.)

But D templates have access to compile-time type information about the arguments they're passed, which they can use to implement things that you would have to resort to doing at runtime in other languages.
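
As a small taste of that, here's a sketch of a templated function that picks its implementation with static if. describe is a hypothetical function; isIntegral and isSomeString come from std.traits:

```d
import std.conv : to;
import std.traits : isIntegral, isSomeString;

// Hypothetical function: the branch is chosen at compile time, so each
// instantiation only contains the code for its own type.
string describe(T)(T value)
{
    static if (isIntegral!T)
        return "integer " ~ value.to!string;
    else static if (isSomeString!T)
        return "string of length " ~ value.length.to!string;
    else
        return "something else: " ~ T.stringof;
}

void main()
{
    assert(describe!(int)(42) == "integer 42");          // explicit template argument
    assert(describe("D") == "string of length 1");       // T inferred as string
    assert(describe(1.5) == "something else: double");   // falls through to the last branch
}
```

In a language without this, the equivalent would typically be a chain of runtime type checks.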

For example: you can write a templated function which returns a string, and D can evaluate it at compile time. You can then pass the result to the mixin keyword (this form is called a "string mixin") to compile your code as if that generated code were present in-line.

Here's an example:

#!/usr/bin/env rdmd

import std.stdio;
import std.algorithm;

class Doer
{
    void doThis() { writeln("Called Doer.doThis()"); }
    void doThat() { writeln("Called Doer.doThat()"); }
    void doSomethingElse() { writeln("Wheeeeee!"); }

    mixin(makeDoAll!(Doer));
}

void main()
{
    auto d = new Doer();
    d.doAll();
}


// Generate a doAll() method that calls all do*() methods.
// Not production ready, just an example:
string makeDoAll(T)()
{
    string code = "void doAll() {";

    foreach(member; [ __traits(allMembers, T)])
    {
        if (!member.startsWith("do")) continue;
        code ~= member ~ "();";
    }
    code ~= "}";

    return code;
}

If you've worked with Java annotations before, you're probably familiar with using runtime reflection to read those annotations and do different things depending on that annotation data. D has annotations (called "user-defined attributes"), but you can access them at compile-time to do code generation that saves you all the overhead of runtime reflection. I'll leave that as an exercise for the reader. :)
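
If you'd rather peek at one possible answer, here's a minimal sketch. Important, Config, and importantFields are all names I invented for the example; hasUDA is from std.traits:

```d
import std.traits : hasUDA;

// Hypothetical attribute: any type can be used as a user-defined attribute.
struct Important {}

// Hypothetical type with one annotated field.
struct Config
{
    @Important int port;
    int retries;
}

// Collect the names of the fields of T that carry @Important.
// The loop over allMembers is unrolled by the compiler, so `name`
// is a compile-time constant inside it.
string[] importantFields(T)()
{
    string[] names;
    foreach (name; __traits(allMembers, T))
    {
        static if (hasUDA!(__traits(getMember, T, name), Important))
            names ~= name;
    }
    return names;
}

// Evaluated entirely at compile time; no runtime reflection involved.
enum fields = importantFields!Config();
static assert(fields == ["port"]);

void main() {}
```

The resulting list could just as easily be turned into a string and fed to mixin, as in the doAll() example above.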

Sane Monkey-Patching

Dynamic languages like Ruby and JavaScript allow developers to "monkey patch" a class that has already been defined to add or modify functionality.

That's usually not a great idea.

For one, it's not very scalable. If you use two libraries that try to monkey patch a class in incompatible ways, what happens? Who's to blame? How do you fix it? Or worse, what if you use some library that modifies an existing method to do something slightly different? What if it breaks the documented functionality of that method? How will you even know which library did that?

OK, I'll upgrade monkey patching from "not a great idea" to "a pretty terrible idea". But I'll give it one thing -- sometimes it would be really handy to add a useful method to a type that you use a lot.

D lets you do this in a way that's safe and scalable, via Universal Function Call Syntax (UFCS). If you call a method on a value, and its type does not have such a method defined, D will look for a free function visible at module scope (including imported ones) that accepts that value as its first argument. For example:

import std.stdio;
import std.string;

void main()
{
    "World".greet();
    "Cody".greet;

}

void greet(string name)
{
    "Hello, %s!".format(name).writeln();
}

string, of course, does not have a built-in greet(). So D rewrites the call as greet("World"), passing the string in as the first argument.

Looking at the code, you might think that string has a format() method and a writeln() method, but it doesn't. D's string isn't even a class! (It's a type alias to an array of immutable chars.) However, the std.string module includes some handy functions that operate on strings (including format()), which you can use as if they were methods. And the std.stdio module has a writeln() function that takes a string (and other types), so you can use that as if it were a method too.

This type of extension is safe for a few reasons:

  • If the method exists on an object, it takes precedence. You can't redefine an object's methods.
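  • UFCS only applies functions that are visible at module scope where the call is made (including functions that have been imported). Functions nested inside another function aren't considered, and nothing leaks into other modules. It's not really a global monkey patch.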
  • If you happen to import two functions which cause call ambiguity, it's a compile-time error, so you can't accidentally call the wrong function.
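
The first point is easy to check for yourself. In this sketch, Greeter and shout are hypothetical names, and the free function greet deliberately collides with the member of the same name:

```d
import std.stdio : writeln;

// Hypothetical type with a real member function.
struct Greeter
{
    string greet() { return "member"; }
}

// A free function with the same name. UFCS never picks it for g.greet(),
// because the member always wins.
string greet(Greeter g) { return "free function"; }

// No member called shout exists, so UFCS can find this one.
string shout(Greeter g) { return "shouted via UFCS"; }

void main()
{
    Greeter g;
    writeln(g.greet());  // member
    writeln(greet(g));   // free function
    writeln(g.shout());  // shouted via UFCS
}
```

The free greet is still there if you call it the ordinary way; it just can't hijack the method call.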

I really like this safe extensibility. The std.algorithm module has tons more functions that can be applied to a variety of types without having to re-implement them. Now I want to do this in every language I use. Dangit, D!

Talks

I learned about two of the things I go into detail about (scope blocks, code generation) via Andrei Alexandrescu's talk, Three Unlikely Successful Features of D. He touches on scope blocks again in his talk Three Cool Things About D. I'm not quite sure how I made it through the tutorial without realizing how nice a feature that is. In my opinion, a good programming language allows (or helps!) the programmer to Do The Right Thing without too much pain, and this feature really helps with that.

If you'd like a higher level overview, I'd recommend the aptly named The D Programming Language. Or the website or the tutorial. People in the #D IRC channel on Freenode seem to be pretty friendly as well.

Oh! And both dmd and rdmd are packaged up and available via Homebrew on OS X, so you can brew install dmd and start playing with the language in seconds! (Hint, hint.)
