written by Walter Bright
February 14, 2009
The short version: it's done and out!
I had thought retargeting the D programming language to the Mac was just an object file format change: a week, maybe two. It turned out to be six weeks. I feel like Yosemite Sam walking through a minefield, managing to step on every single mine in it.
So let’s pick up where I left off at the end of my last entry, where I had just gotten the library to compile and a couple sample pieces of code to link to it and run. Now it was time to run the test suite.
It’s essentially impossible to develop a compiler without some sort of test suite. The one I use is an ad-hoc collection of every fixed bug plus an assortment of other things. The beauty of putting in every fixed bug is that the bugs stay fixed, and the quality of the compiler steadily ratchets forward. Over time, this makes for a formidable test suite. If the compiler passes the suite, I know it’s at least as good as the previous version, and if it turns out not to be, that failure gets added to the suite for next time. (I’ve discovered that just throwing volumes of code at the compiler is fairly useless as a test suite. The test code has to be crafted to test specific things and verify correct results.)
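To make the idea concrete, here is a minimal sketch in C++ of a suite where each entry targets one specific thing and verifies a result. This is not the actual dmd test suite; RegressionTest and runSuite are hypothetical names invented for illustration.

```cpp
#include <functional>
#include <string>
#include <vector>

// One entry per fixed bug (hypothetical structure, not dmd's real harness).
struct RegressionTest {
    std::string name;             // label for the bug this test pins down
    std::function<bool()> check;  // returns true when the result is correct
};

// Run every test and return the names of the ones that failed.
// An empty result means "at least as good as the previous version".
std::vector<std::string> runSuite(const std::vector<RegressionTest>& suite) {
    std::vector<std::string> failures;
    for (const auto& test : suite) {
        if (!test.check()) {
            failures.push_back(test.name);
        }
    }
    return failures;
}
```

Each newly fixed bug would be appended as another entry, so the suite only ever grows.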
The first thing that failed was exception handling. But surprisingly, it only took me about an hour to get that to work. Exception handling is complicated and hard to understand, and I expected a tough slog. The EH design in the back end dates back to when it was a 16-bit compiler (Digital Mars C++ was the only C++ compiler to ever implement exception handling on DOS). The support code moved to 32-bit DOS extenders, then Linux, and now OSX with very little change. The OSX port needed the assembler bits tweaked to keep the stack 16-byte aligned.
(Exception handling for Win32 is completely different, using Microsoft’s Structured Exception Handling scheme.)
The dmd exception handling system is completely independent of g++’s. The two do not interact with each other. D is binary compatible with C++ name mangling and single inheritance, and g++ on OSX uses the same protocol for this as on Linux, which dmd is already compatible with, so that was easy.
My next problem was that floating point failed miserably. Some investigation showed that, uniquely, OSX aligns the CPU’s 10-byte reals on 16-byte boundaries. That is 6 bytes of padding for each one. I don’t know the reason for this; Linux uses 12 bytes and Win32 uses 10. The practical cost is that large arrays of reals chew up 60% more space than they would at 10 bytes apiece. Oh well, it was easy to account for in the code generation.
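The arithmetic behind those numbers can be sketched in a few lines of C++. The slot sizes are the ones quoted above; the Target enum and the function names are made up for this illustration and are not taken from the dmd source.

```cpp
#include <cstddef>

// Size of the value itself: the 80-bit (10-byte) x87 real.
constexpr std::size_t kRealValueBytes = 10;

// Hypothetical target list, for illustration only.
enum class Target { OSX, Linux, Win32 };

// Bytes each real occupies in memory on a given target (figures from the text).
constexpr std::size_t realSlotBytes(Target t) {
    return t == Target::OSX ? 16 : t == Target::Linux ? 12 : 10;
}

// Padding bytes added after each 10-byte value.
constexpr std::size_t padBytes(Target t) {
    return realSlotBytes(t) - kRealValueBytes;
}

// Total storage for an array of `count` reals.
constexpr std::size_t arrayBytes(Target t, std::size_t count) {
    return count * realSlotBytes(t);
}

// Extra space relative to the unpadded 10-byte layout, in whole percent.
constexpr std::size_t overheadPercent(Target t) {
    return padBytes(t) * 100 / kRealValueBytes;
}
```

For OSX this gives 6 bytes of padding per real and a 60% overhead, so an array of 1000 reals takes 16000 bytes instead of 10000.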
The worst problem I had was my own fault. The front end is only a few years old. But the back end code generator is about 25 years old (it may be the oldest code generator still in professional use!). Although it is well debugged (sporting a thorough test suite), fast, and generates great code, it uses a lot of global variables to communicate. Problem after problem was traced back to the use of global variables. Over time I’d eliminated a lot of them, but there are a lot left. It’s hard to change how a function works if there’s a back channel of globals passing state around. Globals break encapsulation, making code difficult to understand.
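Here is a small C++ sketch of the kind of back channel I mean, and of the usual cure. It is not code from the actual back end; g_stackOffset, FrameState, and both functions are hypothetical names chosen for illustration.

```cpp
// Before: state is shared through a global, so every caller silently
// depends on whatever any other caller did earlier.
static int g_stackOffset = 0;  // hypothetical global, not from dmd

int allocLocalGlobal(int size) {
    int offset = g_stackOffset;
    g_stackOffset += size;  // hidden side effect on shared state
    return offset;
}

// After: the same state is carried in an explicit object, so the
// dependency is visible in the signature and each frame is independent.
struct FrameState {
    int stackOffset = 0;
};

int allocLocal(FrameState& frame, int size) {
    int offset = frame.stackOffset;
    frame.stackOffset += size;
    return offset;
}
```

With the explicit version, two frames can be laid out side by side without disturbing each other, which is exactly what the global version makes hard.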
Of course I wouldn’t write it that way today; hopefully I’ve learned something in the last 25 years. You might ask “why not just rewrite it?”, and certainly that thought comes to mind. The problem is that a code generator is about a zillion special cases, most of which interact with each other. Getting all of that adjusted, tweaked and working right across the broad spectrum of code that it must generate is years of work. It’s not something you throw away and rewrite lightly. But there is some hope: the Mach-O generating part is a lot nicer than the ELF generating part, which itself is much nicer than the very old OMF generator. And if I’m doing open heart surgery on a particular section, I’ll refactor it and rely on the test suite to make sure the patient recovers.
Thanks to Sean Kelly for his invaluable help with the more complex OSX system library work. Thanks to Jason House, Andrei Alexandrescu, Sean Kelly and Cristian Vlasceanu for reviewing this.