written by Walter Bright
January 28, 2009
Several compilers exist for the D programming language: the Digital Mars compiler (dmd), the GNU compiler (gdc), the upcoming ldc compiler, and even a .NET D compiler that is in the works. They are all based on the same open source front end code.
dmd has an X86 code generator. It currently targets Windows and Linux. Now that the Mac runs on X86 machines, a straightforward implementation of dmd on the Mac looks possible. The only way to find out how feasible it is is to get a Mac and get to work on it.
Over Christmas I ordered a Mac Mini from Amazon. Compiler development doesn’t require a powerful machine, or lots of memory or disk space, so a low-end machine will do nicely. I’ve never used a Mac before, so I was also curious how the machine would work. It didn’t take long to get it up and running. I keep spare keyboards and mice around (because they break often), and plugged them in along with an old monitor. The monitor and mouse worked perfectly, but the Mac had some trouble with the keyboard and needed to be configured for it. The machine comes with the GNU dev tools, though they needed to be installed separately. I then figured out how to remotely connect to the Mac over the LAN, and (the Mac people will hate me for this) put the Mac in the basement, where I operate it remotely with a text window.
From a text window, the Mac is just another Unix machine. I put all the source code to the compiler on it and set about trying to compile it. Most of that consisted of finding all the conditional compilation:
#if linux
and changing it to:
#if linux || __APPLE__
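As a rough sketch of that kind of change, the fragment below selects one code path for both Linux and the Mac. The names TARGET_POSIX and target_family are made up for illustration and are not from the dmd source. It also tests `__linux__` with `defined()` rather than the bare `linux` macro shown above, since gcc does not predefine `linux` in its strict standard modes.

```c
#include <string.h>

/* Hypothetical switch, not from the dmd source: treat Linux and the Mac
   as one family for the purposes of conditional compilation. */
#if defined(__linux__) || defined(__APPLE__)
#define TARGET_POSIX 1
#else
#define TARGET_POSIX 0
#endif

/* Reports which branch the preprocessor selected. */
static const char *target_family(void)
{
#if TARGET_POSIX
    return "posix";
#else
    return "other";
#endif
}
```

Calling target_family() on either Linux or the Mac would take the first branch, which is the point of widening the condition.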
Remarkably, there were very few API differences to account for, plus a couple of gcc differences, and then the compiler was running. I thought I’d start by assuming that the C ABI on the Mac was the same as for Linux, and configured dmd appropriately.
The first big problem is that dmd can generate both Intel OMF and ELF object file formats, but the Mac uses the Mach-O format. The best way to learn a file format is to write a dumper for it (called dumpobj for object files). Object file specs are usually wrong in some detail or other, and the Mach-O spec is no exception. The Mac comes with an object file dumper called otool. Unfortunately, otool doesn’t give a clue as to the structure of the object file, nor will it disassemble any code that is not in a __text segment. So I also got the Digital Mars disassembler, obj2asm, converted to work with Mach-O files.
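To give a flavor of what the first step of such a dumper might look like, here is a minimal sketch that decodes the header at the start of a 32-bit Mach-O file. It is not the dumpobj source. The struct and function names are invented for illustration, and the field layout and constants follow the published 32-bit Mach-O header as I understand it (seven 32-bit fields, little-endian on X86).

```c
#include <stddef.h>
#include <stdint.h>

#define MACHO_MAGIC32   0xfeedfaceu  /* magic number of a 32-bit Mach-O file */
#define MACHO_CPU_X86   7u           /* cputype value for 32-bit X86 */
#define MACHO_FT_OBJECT 1u           /* filetype value for a relocatable object */

/* Illustrative mirror of the 32-bit Mach-O file header. */
struct macho_header32 {
    uint32_t magic;
    uint32_t cputype;
    uint32_t cpusubtype;
    uint32_t filetype;
    uint32_t ncmds;       /* number of load commands that follow */
    uint32_t sizeofcmds;  /* total size of those load commands in bytes */
    uint32_t flags;
};

/* Decode a little-endian 32-bit value regardless of host byte order. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 and fills *out on success, -1 if the buffer is too short
   or does not start with the 32-bit Mach-O magic number. */
static int parse_macho_header32(const unsigned char *buf, size_t len,
                                struct macho_header32 *out)
{
    if (len < 28 || read_le32(buf) != MACHO_MAGIC32)
        return -1;
    out->magic      = read_le32(buf);
    out->cputype    = read_le32(buf + 4);
    out->cpusubtype = read_le32(buf + 8);
    out->filetype   = read_le32(buf + 12);
    out->ncmds      = read_le32(buf + 16);
    out->sizeofcmds = read_le32(buf + 20);
    out->flags      = read_le32(buf + 24);
    return 0;
}
```

A real dumper would go on to walk the ncmds load commands that follow the header, which is where the segments, sections and symbol table are described.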
Fortunately, the Mac uses Dwarf for its symbolic debug information format, and dmd has a Dwarf generator, so that should be good to go. But when I first looked at the debug output of gcc, there was a “macinfo” section. Uh-oh, some undocumented Macintosh enhancement. Googling (how indispensable Google has become) “macinfo” revealed my mistake: I had forgotten that Dwarf has a special section for information on C preprocessing macros called “macinfo”. I forgot about it because D doesn’t have a text macro preprocessor.
Object files on the Mac are all generated as PIC (Position Independent Code), which is necessary for shared libraries to work. On Linux, PIC is an option. PIC is done completely differently on the Mac, so this is where most of the work has gone so far.
Some other differences:
- Names get an '_' prepended to them, although when using gdb you have to leave off the '_'.
- There’s no thread local storage mechanism in the object file format. This is a serious shortcoming, and I’ll have to figure out a solution.
- There are special sections for C strings and read-only literals that the linker can compress redundancy out of.
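The first difference in the list, the leading underscore, is simple enough to sketch. The helper below is hypothetical (dmd's own symbol handling is not shown in this article). It just builds the name as it would appear in the object file from the name as written in the source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes '_' followed by name into out.
   Returns 0 on success, -1 if out cannot hold the result
   (strlen(name) + 1 for the underscore + 1 for the terminator). */
static int macho_symbol_name(const char *name, char *out, size_t out_size)
{
    size_t len = strlen(name);
    if (out_size < len + 2)
        return -1;
    out[0] = '_';
    memcpy(out + 1, name, len + 1);  /* copies the terminating '\0' too */
    return 0;
}
```

So a function written as main shows up in the object file as _main, while gdb still wants to be told main.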
I finally got to the point where dmd would compile “hello world”. Using dumpobj to compare its object file with the one produced by gcc for the C “hello world”, it looked like it should work. The gcc one worked fine, and the dmd generated one crashed with a segmentation fault. I was really pulling out what few strands of hair I had left over that one, as I could not find anything wrong with the fixups or object layout.
Eventually, I noticed that the gcc version was putting some unneeded space on the stack, and I suspected something was up. I put out the same extra stack space, and it started working. Googling around some more, I discovered that code on the Mac that calls any library functions must align the stack to 16 bytes. After tweaking the code generator to do this, I had “hello world” working in D on the Mac.
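The arithmetic behind that “unneeded” stack space can be sketched as follows. This is not the dmd code generator. The function names are invented, and the sketch assumes the generator knows how many bytes it has placed on the stack since the last point where the stack was known to be 16-byte aligned.

```c
#include <stddef.h>

#define STACK_ALIGN 16u  /* alignment required at the point of a call */

/* Hypothetical helper: rounds bytes up to the next multiple of 16. */
static size_t align_up_16(size_t bytes)
{
    return (bytes + (STACK_ALIGN - 1)) & ~(size_t)(STACK_ALIGN - 1);
}

/* Hypothetical helper: extra bytes to reserve so that bytes_in_use plus
   the padding is a multiple of 16. bytes_in_use is assumed to count
   everything pushed since the stack was last 16-byte aligned. */
static size_t stack_padding(size_t bytes_in_use)
{
    return align_up_16(bytes_in_use) - bytes_in_use;
}
```

For example, a call that pushes a single 4-byte argument would need 12 bytes of padding, which looks exactly like wasted space in a disassembly unless you know about the alignment rule.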
More to come...
Thanks to Jason House for reviewing this.