digitalmars.D - Converting Optlink from Assembler to C

digitalmars.D - Converting Optlink from Assembler to C - Reboot

Walter Bright (2/2) Nov 30 2009 http://www.reddit.com/r/programming/comments/a9mse/walter_bright_from_as...

bearophile (33/40) Nov 30 2009 Translating a largish program from a language to a different language is...

bearophile (5/6) Dec 01 2009 To improve this situation, when the translation to C is done, the progra...

Walter Bright <newshound1 digitalmars.com> writes:

http://www.reddit.com/r/programming/comments/a9mse/walter_bright_from_assembly_to_c/

Hopefully a working link this time.

Nov 30 2009

bearophile <bearophileHUGS lycos.com> writes:

A nice article, thank you for writing it for us for free (I used to pay to read
similar texts).

I had the job at the time of converting a huge (and very successful) electronic
schematic editor, DASH, from assembler into C. In C it could then be recompiled
for 32 bits, and even ported to other platforms like the Sun workstations. The
conversion took months, but was a big success.<

Sounds like a so much bug-prone job that the probability of having a final
working product seem small. I guess you have translated it part-by-part,
keeping it functional all the time, as you are doing with optlink.


9. Optlink has no unit tests. Writing them would require understanding what the
various functions do, and I won't know precisely what they are supposed to do
until most of the program is converted.<

Translating a largish program from a language to a different language is a good
way to convince most programmers that unit tests are a Good Thing. Unit tests
make this work quite simpler. And many good programs eventually need to be
translated, it's not an uncommon event.


Although I have not run any speed tests, I expect the performance of the
non-I/O bound code to be about 30% slower. Since a linker tends to be I/O
bound, the actual performance loss probably will be about 10%, which I can live
with.<

We'll see. Modern C compilers, like the Intel one, LLVM or GCC 4.4 sometimes
give surprises :-) Modern C compilers also use CPU instructions that were not
available in the past (you have to use compilation flags for Pentium4/Core2 or
similar CPUs of course).


The difference in object code size is primarily due to the assembler having
done register assignments that cross over multiple function calls. There's just
a lot less pushing and popping of parameters.<

In GCC there are annotations (that LLVM doesn't support yet) that allow to nail
variables into registers. One Haskell compiler (GHC) uses this feature a lot,
it's explained here:
http://www.cse.unsw.edu.au/~pls/thesis/davidt-thesis.pdf

Registered mode is implemented by both the C and NCG back-ends and involves
primarily

two optimisations which significantly increase the performance of generated
code. The first
optimisation is known as TABLES_NEXT_TO_CODE and optimises the data layout of
the
code GHC produces, this is further explained in section 2.5. The second
optimisation though
is far more significant, it deals with the register allocation for the
generated code. As part of
the execution model that GHC uses for Haskell, it defines a variety of virtual
registers which
mirror many of the registers typically found on a CPU, such as a stack pointer
register. These
are examined in more detail in section 2.4.2 but their purpose is to create a
virtual machine
architecture which can be explicitly managed instead of using the C stack for
example. In
unregistered mode these virtual registers are all stored on the heap, needing
memory access for
any reads and writes. Given the frequency of access of these virtual registers,
this creates a
significant amount of memory traffic which becomes a limiting factor on the
performance of
the generated code. In registered mode many of these virtual registers are
instead permanently
assigned to a specific hardware registers (register pinning), greatly improving
performance. The
runtime of code compiled in registered mode as opposed to unregistered mode is
on average 55%
shorter. More details of this can be seen in section 4.<

Eventually this GCC feature can be added to D too.
D may also use a function annotation that tells a function how to manage
certain registers.


There are also a lot of functions with multiple entry points. I don't know of
any current compiler that is able to enregister across a flow graph of function
calls, or is able to tail merge multiple functions.<

Probably no common compiler.


After a few false starts, I remembered the lessons from converting DASH, which
are:<

How I translate from a higher level language to another, for example Java to D
or Python to C:
1) I make sure to have a certain amount of correct input-output pairs for the
original code, and sometimes I even write few unittests.
2) I clean & reformat the original code, because a well formatted code is more
readable. I use the formatting style that I am able to read better. I sometimes
remove excessive comments, and all the parts of the code that don't need to be
translated. I keep testing that the functional tests and unit test keep working.
3) I slowly modify the original code, lowering its style, or just changing it,
to make the code look and work as closer as possible to the destination
language. Often this is a lowering of semantics, to use more basic constructs
that are shared between the two languages. Every change is of course followed
by running the tests. This is the intermediate version.
4) And only now I jump to the destination language, fix things, and hope.
Sometimes I see problems, so I go back to the intermediate version and fix it
so its semantics is closer to the destination language.

Sometimes I even use an intermediate language, for example to translate Python
to C I first use D with my dlibs that allow me to write very Python-like D
code. Then I lower that D version to normal D-style code. And then I slowly
lower the D version to C-style code. Running tests every few minutes. At the
end I convert it to C, and I try to see if it works. D1 language is closer to C
compared to D2 so this is simpler. Some bugs are common here when I translate D
to C, like forgetting to initialize C variables, etc. To solve this problem I
slowly assign "= void" to *all* variables and structs in the D code, so I can
see it crash before going to C.

As you, I have seen that doing such things in the most tidy and regular way
possible saves me time. Even using an intermediate language (like D1 to
translate Python => C) looks like a slow thing, but it allows me to avoid many
bugs along the way, so even if it's boring and it looks like a waste of time,
in the end I save time. D1 is useful to me to perform such translations too (D2
will be less fit for this purpose, so I'll probably keep using D1 for this).

Bye,
bearophile

Nov 30 2009

bearophile <bearophileHUGS lycos.com> writes:

I have forgotten something:

Walter:
Although I have not run any speed tests, I expect the performance of the
non-I/O bound code to be about 30% slower. Since a linker tends to be I/O
bound, the actual performance loss probably will be about 10%, which I can live
with.<

To improve this situation, when the translation to C is done, the program can
be profiled, and the performance critical parts can be modified from spaghetti
code with gotos to structured code that I think C optimizers can digest better.

Bye,
bearophile

Dec 01 2009

D Programming

C/C++ Programming

Other

digitalmars.D - Converting Optlink from Assembler to C - Reboot