www.digitalmars.com         C & C++   DMDScript  

D - Typesafe Assembly Language

reply Mark Evans <Mark_member pathlink.com> writes:
One of the Cyclone people (Greg Morrisett) also works on a "typesafe assembly"
project, "TAL."  His work should be quite interesting to D, because it centers
on implementing high-level (functional) language features in a strongly typed
and compiler-optimized fashion -- all the way down to the back end nitty-gritty
for Intel chips.  We are not talking about virtual machines or JIT here, this is
real compilation.  Functional power is possible in D.  -Mark

Greg Morrisett
http://www.cs.cornell.edu/Info/People/jgm/home.html

TILT Compiler Project
http://www.cs.cornell.edu/home/jgm/tilt.html

TAL
http://www.cs.cornell.edu/talc/

"The goal of the TAL project is to extend the paradigm of type-directed
compilation to its limit. We compile high-level languages that include features
such as higher-order polymorphic functions, datatypes, modules, objects, and
subtyping into a series of typed intermediate languages and finally into a Typed
Assembly Language (TAL). Unlike any other compiler, we not only use typed
intermediate languages but also typed target languages." 

"The assembly language type system is powerful enough to express high-level
abstractions such as closures and abstract datatypes and yet flexible enough to
permit traditional low-level optimizations such as loop invariant removal,
strength reduction, and redundant move elimination. Furthermore, just as the
more familiar typed source and intermediate languages catch programmer errors,
so does our typed assembly language. It helps to ensure that complicated backend
transformations such as the optimizations above as well as register allocation
and more general instruction selection and scheduling are performed correctly." 
Feb 22 2003
next sibling parent reply Ilya Minkov <midiclub 8ung.at> writes:
It being "functional" was a JOKE(!!!) connected to them participating in 
fuctional language contest ICFP99. And no, it's not optimised. It's 
simply an assembler with type annotations. That's really all to it.

All optimisations are made in front-end compilers, like cyclone and 
scheme--, which output TAL.
Yes, they were considering it as a suitable back-end for ML-like 
languages. Just point me out a case where assembler is not. Ocaml is 
100% compiled as well, remember? And whether it's compiled for a VM or 
into machine code, is unimportant, as it doesn't change the compiled 
representation once it's established.

Its use is at most to simplify the internal compiler structures, so that 
compilers can build units without regard to the others, and let 
assembler and linker handle type stuff.
A "real" intermediate language was their intended next step.

BTW, a pity ICFP doesn't take place any longer. I bet D would have had 
good chances to win it.

-i.
Feb 22 2003
next sibling parent reply "Walter" <walter digitalmars.com> writes:
"Ilya Minkov" <midiclub 8ung.at> wrote in message
news:b38sgq$1v2o$1 digitaldaemon.com...
 BTW, a pity ICFP doesn't take place any longer. I bet D would have had
 good chances to win it.
What's ICFP?
Feb 22 2003
parent reply Ilya Minkov <midiclub 8ung.at> writes:
International Conference on Functional Programming - their programming 
contest.

I was been wrong - it *is* being held.
http://icfpcontest.cse.ogi.edu/
The adress has changed. And we've missed it in September. Among 
participating are many functional language entries (Haskell, OCaml), but 
also many Java, C and C++ teams. Popular scripting languages are 
represented as well. The teams with fastest, best, tersest programs win. 
The teams have 3 days if they participate in a "main" contest, and 1 day 
in a "lightning" contest. The task is the same, and is usually easy to 
get started with, but has enough depth and fine detail to work upon for 
years. Some example was an optimiser, for some data and/or expressions.

The contest reports are quite funny to read, partly because the judges 
*must* include funny-sounding phrases about winning people and languages 
like "Xxxx is an extremely cool hacker!". :)

D is in a great advantage over Perl, C and C++ groups, and should be not 
less suited than Python and Ruby.

-i.


Walter wrote:
 "Ilya Minkov" <midiclub 8ung.at> wrote in message
 news:b38sgq$1v2o$1 digitaldaemon.com...
 
BTW, a pity ICFP doesn't take place any longer. I bet D would have had
good chances to win it.
What's ICFP?
Feb 23 2003
parent reply "Walter" <walter digitalmars.com> writes:
"Ilya Minkov" <midiclub 8ung.at> wrote in message
news:b3b91q$s7n$1 digitaldaemon.com...
 D is in a great advantage over Perl, C and C++ groups, and should be not
 less suited than Python and Ruby.
D has some great advantages over C/C++ in reducing lines of code and complexity. What I need to do is start writing articles about it.
Feb 23 2003
parent reply Ilya Minkov <midiclub 8ung.at> writes:
Articles!

New HUGI is (finally) coming out in a couple of weeks. It's the major 
DiskMag for demo c0ders, graphicians, musicians, and all. The auditory 
among programmers might reach a thousand of creative, open-minded 
people, awareness among readers would be 100% since no HUGI article ever 
gets ignored. :)

www.hugi.de

I was going to write some articles, among them something short about D, 
  but i haven't gotten to do it.

The chief editor (Claus D. Volko, a.k.a. adok^hugi), still accepts 
articles. I have met him today at IRCnet #coders. A short note might be 
better than nothing now. You can reach him under:

cdvolko gmx.net

-i.

Walter wrote:
 What I need to do is start writing articles about it.
Feb 23 2003
parent reply "Walter" <walter digitalmars.com> writes:
It does look like a great mag. But I'm looking for a wider audience. The
article on D in DDJ was very effective. I'd like to place another one in DDJ
or CUJ. -Walter

"Ilya Minkov" <midiclub 8ung.at> wrote in message
news:b3bqp5$1bf8$1 digitaldaemon.com...
 Articles!

 New HUGI is (finally) coming out in a couple of weeks. It's the major
 DiskMag for demo c0ders, graphicians, musicians, and all. The auditory
 among programmers might reach a thousand of creative, open-minded
 people, awareness among readers would be 100% since no HUGI article ever
 gets ignored. :)

 www.hugi.de

 I was going to write some articles, among them something short about D,
   but i haven't gotten to do it.

 The chief editor (Claus D. Volko, a.k.a. adok^hugi), still accepts
 articles. I have met him today at IRCnet #coders. A short note might be
 better than nothing now. You can reach him under:

 cdvolko gmx.net

 -i.

 Walter wrote:
 What I need to do is start writing articles about it.
Feb 24 2003
parent reply Mark T <Mark_member pathlink.com> writes:
It does look like a great mag. But I'm looking for a wider audience. The
article on D in DDJ was very effective. I'd like to place another one in DDJ
or CUJ. -Walter
I suggest CUJ since you have been a "cover item" in DDJ already, I subscribe to both
Feb 28 2003
parent "Walter" <walter digitalmars.com> writes:
"Mark T" <Mark_member pathlink.com> wrote in message
news:b3nqsd$2p1q$1 digitaldaemon.com...
It does look like a great mag. But I'm looking for a wider audience. The
article on D in DDJ was very effective. I'd like to place another one in
DDJ
or CUJ. -Walter
I suggest CUJ since you have been a "cover item" in DDJ already, I
subscribe to
 both
I think you're right.
Mar 02 2003
prev sibling parent reply Mark Evans <Mark_member pathlink.com> writes:
Ilya dismissed TAL while forking the discussion into a contest promotion.  Ilya,
tell those Ph.D. computer scientists, and their funding sources at AFOSR and
NSF, that they are just spinning wheels.  Or, do more homework.  Citations,

http://www.cs.cornell.edu/talc/papers/tal_scale.pdf
http://www.cs.cornell.edu/talc/papers/tal-toplas.pdf
http://www.cs.cornell.edu/talc/papers/talx86-wcsss.pdf
http://www.cs.cornell.edu/talc/papers/compile_rtcg.pdf



All optimisations are made in front-end compilers, like cyclone and 
scheme--, which output TAL.
TAL enables more and better optimizations at the source level plus faster and provably correct compilation. That's the whole point. To answer the false critique more directly, TAL also facilitates "conventional low-level optimizations such as global register allocation, copy propagation, constant folding, and dead-code elimination. Except for a small number of atomic code patterns, TAL also supports code motion optimizations such as instruction scheduling, common-subexpression elimination, and loop-invariant removal."
Ocaml is 100% compiled as well, remember?
In the world according to Ilya: I can dig ditches with my bare hands, so why rent a diesel-powered backhoe? ML compilers do things that are equivalent to digging with bare hands ("20 passes"): http://www.cs.cornell.edu/talc/papers/tal-toplas.pdf "Compilers that manipulate statically typed intermediate languages have compelling advantages over traditional compilers that manipulate untyped program representations. An optimizing compiler for a high-level language such as ML may make as many as 20 passes over a single program, performing sophisticated analyses and transformations such as CPS conversion, closure conversion, unboxing, subsumption elimination, or region inference. Many of these optimizations require type information in order to succeed, and even those that do not, often do benefit from the additional structure supplied by a typing discipline. Moreover, the essence of many of these program transformations can be specified by the corresponding type translation. Types provide concise and yet precise documentation of the compilation process that can be automatically verified by a type checker. In practice, this technique has been invaluable for debugging new transformations and optimizations [Tarditi et al. 1996; Morrisett et al. 1996]. "...ML['s] compiler preserves type information through approximately 80% of compilation, but the remaining 20% is untyped. We show how to recode the untyped portions of a compiler to maintain type information through all phases of compilation and, in so doing, extend the paradigm of compiling with typed intermediate languages to compiling with typed target languages."
And whether it's compiled for a VM or 
into machine code, is unimportant, as it doesn't change the compiled 
representation once it's established.
The target representation is entirely the point. One of their research papers reports that "Semantic errors have been uncovered in the JVML verifier and its English specification." Correctness is a problem. Another is target language generality. It's hard to compile non-Java languages to the Java VM, non-OCaml languages to the OCaml VM, etc. There are also issues of speed, security, JIT flaws. Why not read the research reports for yourself?
A "real" intermediate language was their intended next step.
No, they describe their next steps as runtime code generation and other exotica. TAL is partly an answer to the failings of intermediate languages.
Its use is at most to simplify the internal compiler structures,
so that  compilers can build units without regard to the others,
and let assembler and linker handle type stuff.
No, TAL does not replace the compiler's type system, but extends it all the way through to the target. It's a diesel-powered backhoe. The correctness guarantees alone highly recommend TAL for the D project, let alone its other advantages. D is an alpha compiler tracking an evolving language. TAL uses MASM syntax by the way. "The target-level annotations make it possible to prove easily that the machine code is type-safe, independent of the source code or compiler. To be useful across a range of source languages and compilers, the target-language type system should provide powerful type constructors for encoding higher-level invariants. Unfortunately, it is difficult to engineer such type systems so that annotation sizes are small and verification times are fast. "We quantify the effectiveness of these techniques...which include common-subexpression elimination of types, higher-order type abbreviations, and selective reverification, [and which] can dramatically change certificate size and verification time." "Support for arrays in TALx86 is perhaps the most complicated feature in the language. The critical issue is that array sizes and array indices cannot always be determined statically, yet to preserve type-safety, we must ensure that any index lies between 0 and the physical size of the array. TALx86 provides a very flexible mechanism for tracking the size of an array without requiring that the size be placed in a pre-determined position."
Feb 26 2003
parent Ilya Minkov <midiclub 8ung.at> writes:
Okay okay, you win :)
Thanks for being around and correcting my stupidness :)

Mark Evans wrote:
 Ilya dismissed TAL while forking the discussion into a contest promotion. 
Ilya,
 tell those Ph.D. computer scientists, and their funding sources at AFOSR and
 NSF, that they are just spinning wheels.  Or, do more homework.  Citations,
Feb 27 2003
prev sibling parent Bill Cox <bill viasic.com> writes:
Hi, Mark.

 We are not talking about virtual machines or JIT here, this is
 real compilation.  Functional power is possible in D.  -Mark
I doubt anyone would dispute that. However, in languages like ML, isn't just about everything a function, value, or variable? With the ability to have functions that build functions, you effectively can generate any code you like. Being able to make new functions in D might be cool, but what about classes, enums, structs, modules, etc? If you can't also build them, you don't really get the same power of functional languages. This probably isn't a good idea for D, but in a toy language I've been working on for fun, I would like to do something kind of radical: allow the user to write code that gets compiled into the compiler. I'd convert the language in the language itself (which I can do mostly automatically), and then open the intermeadiate data structures to the user. He could write code that runs off these structures and do anything at all. He could also extend or replace any of the code in the base compiler (this toy language supports redefinition of code). Another use would be to write code generators. The code generator construct would look a lot like templates, but they could be called in while loops, or anywhere else. They're basically templates on steriods without all the compiler complexity. The syntax of the language would also be defined in the language itself, in a yacc-like format, which user's could add to. In this toy language, anything can be extended after the fact just by adding a new definitions. The user could even add new data types to the compiler. Got a new expression type? No problem. Matrix multiplication? Strings? Go nuts. Tired of building new types but not getting any of that cool automatic promotion? Go fix it. The compiler would then compile an aplication specific compiler. The generated compiler would then be used to compile the rest of the user's code. During compilation of user code, the generators would occasionally create new code, which the generated compiler would also compile. It sounds really complicated, but I don't think it's much more complex than writing the compiler using the language the compiler compiles. The biggest addition would be adding a yacc-like capability, but hey, parsing is an important part of most real programs. Building it into the language would encourage people to use it, instead of the fscanf hacks I often see. Basically, I think this is a way of getting the awesome extensibility of languages like Forth or Lisp, but in a modern high-level language. I think it may cover the power you're looking for with first-class functions, but with less compiler complexity (but more brain spanking). Bill
Feb 22 2003