Compiling Templates

June 27, 2009

Templates can be a complex and intimidating feature. I know I’m never comfortable with a feature until I know how it compiles and works down to the bottom level. Any complex process can be broken down into simpler, lower level steps that can be mastered and then understanding built up from there. Rather than get into a long discussion of the arcana of templates, I’ll show the step—by—step way they’re compiled. Often, much of the strangeness of the arcana becomes clearer once the compilation process makes sense.

Given the simple template program that computes a factorial at compile time:

template class factorial
{
  public:
    enum
    {
      result =
         n * factorial::result
    }; 
};

template<> class factorial<1>
{
  public:
    enum { result = 1 };
};

int foo()
{
    return factorial<10>::result;
}

The source code is preprocessed and lexed (converted to a stream of tokens). We’ll skip past that as it’s standard compiler technology and not peculiar to templates.
The two factorial templates are parsed. Parsing means a data structure is created that represents the templates. Our two factorial templates are installed into the symbol table, and since they have the same name, they are often stored in a list hanging off the ‘factorial’ name.
The bodies (the stuff between the { }’s) of the templates are stored as ASTs Abstract Syntax Trees. Sometimes, as in the Digital Mars C++ compiler, they are stored as lists of tokens.
The function foo() is semantically analyzed, meaning that the AST is evaluated, typechecked, symbols defined, etc. The factorial<10> is recognized as an instantiation of a template named factorial with an argument that is a literal integer 10.
factorial is looked up in the symbol table, and two factorial template declarations are found. The compiler must determine which of those templates to instantiate.
The template instantiation arguments, in this case the literal 10, is applied to each template declaration. The list of template declarations that match are called the ‘candidate list’. In this case, the 10 matches the int n parameter of the first factorial template, and does not match the 1 parameter of the second. So there is only one candidate, and that is selected.
If there are more than one candidate templates, the compiler must determine the so—called best match. This is a lot like function overloading. The technique used for finding the best match for both C++ and the D programming language is called partial ordering.
Now we have the specific template declaration that needs to be instantiated and the actual arguments to instantiate it with. But there is a rule in C++ and D that a template with a specific set of arguments can only have one instantiation for the whole program. This means that if the template has already been instantiated with the same arguments, we can simply point to that instantiation and we’re done.
Finding an existing instantiation is done by “mangling” the template name, namespace, and arguments into a string. For factorial<10>, this mangled name looks like (for g++) 9factorialILi10E. (This can be read as a “9 character name factorial with a literal integer 10 argument followed by the end of the arguments.”) None of the template body is needed for this. This string will be unique in the program.

Hung off of the template declaration in the symbol table is another table with those mangled template instantiation names as indices to the corresponding template instantiations.

If the template instantiation isn’t found, then it needs to be instantiated.
A fresh symbol table scope is created. The template declaration parameters are declared in it as symbols initialized to the actual argument values. Type parameters are declared as typedefs (aliases in D), integer parameters are declared as constants. The template declaration body is then parsed (if not already an AST), and semantically analyzed by the regular other bits of the compiler that deal with functions, classes, and declarations.
A reference to the new instantiation is also installed into the template declaration’s table of instantiations.
The declarations created by the template instantiation are now treated as regular types and symbols by the rest of the compiler, which doesn’t know or care they came from templates.
The astute reader will wonder what happens when separate compilation units generate the same template instantiations with the same arguments. Wasn’t there a rule that says only one unique instantiation per program? You’re quite right, and the solution is to be found in that the template instantiations are identified in the object file as “COMDATs”, short for common data blocks, a delightful term which instructs the linker to remove the duplicates. (Early C++ compilers which didn’t have the luxury of COMDAT—supporting linkers generally had a kludgily awful time with this problem.)

I hope this has shed at least a little light on how templates work. If you want to learn more about how real compilers work, I am hosting a seminar in the fall on compiler construction.

Acknowledgements

Thanks to Eric Niebler, Andrei Alexandrescu and Jason House for review a draft of this.

Articles

Compiling Templates

Acknowledgements