digitalmars.D - language support for arrays

reply quetzal <quetzal_member pathlink.com> writes:
Is there any rationale for making the language support dynamic arrays natively? We've got
a standard library for that. IMHO C++ has got it almost right with std::vector.
Seriously, there's nothing that prevents one from coding a portable and reliable
implementation of a dynamic array class. Why do we need direct support from the
language here?

RTTI, reflection, a GOOD compile-time language (not like C++ template
metaprogramming) are the areas where language support is absolutely needed. No
need to add extra complexity to the language.

P.S. God, give us that stacktrace feature someday.
Jun 29 2004
next sibling parent reply Andy Friesen <andy ikagames.com> writes:
quetzal wrote:
 Is there any rationale to make language support dynamic arrays natively? We got
 standard library for that. IMHO C++ has got it almost right with std::vector.
 Seriously, there's nothing that prevents one from coding a portable and
reliable
 implementation of dynamic array class. Why do we need a direct support from
 language here?
I think so. By making dynamic arrays first class citizens of the language, they work a bit smoother.

Also, D arrays can do a number of things std::vector couldn't hope to, like vector operations.

     int[] a, b;
     a[] += b[]; // increment each element in a by the corresponding element in b

There are also issues like std::string's inescapable second-classness. You can't concatenate two string literals with the + operator because, in the end, they're char*s, not strings.
 RTTI, reflection, GOOD compile-time language (not like c++ template
 metaprogramming) are the areas where language support is absolutely needed. No
 need to add extra complexity into language.
RTTI - almost there. All we need is TypeInfo for other metatypes, and, ideally, more details. (associative arrays, function pointer types)

Reflection - at runtime? It'd be nice, but I could live without.

Compile-time language - A thousand times *YES*. It wouldn't even be all that hard. (macros compile to DLLs. Compiler links with these DLLs at compile-time, calls into them when macros are invoked) All that's needed is an API. (some shortcut syntax for composing and decomposing syntax trees would be extremely helpful, though)

A good compile-time language would set D far and away as being more powerful than just about any other language of its type. (then we could start laughing at the C++ people in earnest, like the Lisp people have been doing for the past 30 years now)

 -- andy
Jun 29 2004
next sibling parent reply pragma <EricAnderton at yahoo dot com> <pragma_member pathlink.com> writes:
In article <cbs5ot$281r$1 digitaldaemon.com>, Andy Friesen says...
Compile-time language - A thousand times *YES*.  It wouldn't even be all 
that hard.  (macros compile to DLLs.  Compiler links with these DLLs at 
compile-time, calls into them when macros are invoked)  All that's 
needed is an API. (some shortcut syntax for composing and decomposing 
syntax trees would be extremely helpful, though)

A good compile-time language would set D far and away as being more 
powerful than just about any other language of its type. (then we could 
start laughing at the C++ people in earnest, like the Lisp people have 
been doing for the past 30 years now)
Actually, since I've started work on DSP (http://www.dsource.org/projects/dsp/) something along these lines became apparent to me. DSP performs something similar to what you're proposing. I suppose you could call it "late compilation and binding", where code is generated, compiled into a dll, bound and executed all within the same running context.

It sounds to me that what you're suggesting is more like a kind of preprocessor that performs these exact same steps. If it were to keep the D grammar in mind, then I guess it would look like template code, only one step earlier in the compilation process?

Then again, perhaps a more generic reflection API akin to whatever). #pragma(compileTime){ }");

.. and one could just as easily throw in hooks for 'pragma(runTime)' for late compilation of code, provided that DMD's location is somehow provided.

Of course this "example" assumes that D is more or less self-hosting, which it's not yet. That and it may be *too* powerful for the purposes that you had in mind. A more concise (and far more restrictive, so as to avoid disrupting D's design) grammar is probably what is needed here.

- Pragma
Jun 29 2004
parent reply Andy Friesen <andy ikagames.com> writes:
pragma <EricAnderton at yahoo dot com> wrote:
A good compile-time language would set D far and away as being more 
powerful than just about any other language of its type. (then we could 
start laughing at the C++ people in earnest, like the Lisp people have 
been doing for the past 30 years now)
Actually, since I've started work on DSP (http://www.dsource.org/projects/dsp/) something along these lines became apparent to me. DSP performs something similar to what you're proposing. I suppose you could call it "late compilation and binding", where code is generated, compiled into a dll, bound and executed all within the same running context. It sounds to me that what you're suggesting is more like a kind of preprocessor that performs these exact same steps. If it were to keep the D grammar in mind, then I guess it would look like template code, only one step earlier in the compilation process? Then again, perhaps a more generic reflection API akin to whatever).
Sort of, except that the preprocessor is also D. :) Here's some ideas I was scribbling earlier: <http://andy.tadan.us/d/macro_example.d.html> (warning, not close to being fully baked) -- andy
Jun 29 2004
parent reply pragma <EricAnderton at yahoo dot com> <pragma_member pathlink.com> writes:
In article <cbsdjs$2k4n$1 digitaldaemon.com>, Andy Friesen says...
Here's some ideas I was scribbling earlier: 
<http://andy.tadan.us/d/macro_example.d.html> (warning, not close to 
being fully baked)
Gotcha. I like the examples. Really got me thinking.

I like your approach, but the syntax still felt too much like perl to me. :) IMO, it didn't really feel like D once you entered the 'meta{}' space (but what a neat example). So I tried meshing our ideas together to see what we get. Andy, feel free to abuse this post. ;)

The result is a meta syntax that lets you generate code as string data, which then is wrapped by the compiler to create an extension. The extension is then invoked to add the appropriate handles to the D parser. When a meta symbol is parsed, its handle is invoked which in turn generates a substitute expression, method or whatever.

The compiler would expand the meta statement into the following to generate an extension .dll. (it could also make a first pass to collect all meta statements into a single dll if need be).

Now the compiler has a chunk of code, in a library, that meshes with the compiler, and modifies the standard parse tree. This means that any call to 'format' will cause an internal expansion to the macro code.

- Pragma
Jun 29 2004
parent reply Andy Friesen <andy ikagames.com> writes:
pragma <EricAnderton at yahoo dot com> wrote:
 In article <cbsdjs$2k4n$1 digitaldaemon.com>, Andy Friesen says...
 
Here's some ideas I was scribbling earlier: 
<http://andy.tadan.us/d/macro_example.d.html> (warning, not close to 
being fully baked)
Gotcha. I like the examples. Really got me thinking. I like your approach, but the syntax still felt too much like perl to me. :) IMO, It didn't really feel like D once you entered the 'meta{}' space (but what a neat example). So I tried meshing our ideas together to see what we get.
I suspect the $ symbol is to blame for that. It's funny how a single language can sour someone on something so basic as a single, specific symbol. (I dislike $ for the same reason, even though I know full well that it's completely irrational) At any rate, I'm not particularly attached to that particular construct.
 Andy, feel free to abuse this post.  ;)
Oboy!
 The result is a meta syntax that lets you generate code as string data, which
 then is wrapped by the compiler to create an extension.  The extension is then
 invoked to add the appropriate handles to the D parser.  When a meta symbol is
 parsed, its handle is invoked which in turn generates a substitute expression,
 method or whatever.
 
The problem with this is that code is no longer being handled like code, but like a string of characters. This is certainly a powerful metaprogramming mechanism (it can say anything you can for obvious reasons), but it discards that notion of modelling the code in the abstract sense.

For instance, it becomes more cumbersome to filter a macro through another macro because all the compiler gets is a string. Compilers shouldn't have to parse the code they've just created.

My suggested meta{} syntax is just shorthand for creating AST nodes directly. For instance

     Expression e = meta { $x = $x + 1; }

would be more or less synonymous with

     Expression e = new AssignStatement(
         x, new AddExpression(x, new IntegerLiteral(1))
     );

If nothing else, this is going to be a bit faster, as no strings are generated just so they can be reconsumed.

The other advantage is that we're staying very close to the problem domain. When generating code, we want to have a "code literal". A meta{} block is precisely that. The $ notation is exactly what it is in Perl: interpolating a code literal with variables.

Frankly, though, I can't help but think that all this could itself be implemented as a macro. The core compiler would only have to expose a standard class hierarchy representing its various AST nodes (which would probably coincide with the compiler's internal structures, but doesn't necessarily have to--the compiler merely has to do some extra work to convert the two)

It may be worthwhile to require that macros be "activated" with something like:

     import macro foobar; // activates all macros defined in foobar.dll

This minimizes potential screwups due to unexpected macro expansion.
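
To make the "code as a tree, not a string" point concrete, here is a minimal sketch in present-day D. Every class name below (Expression, Identifier, IntegerLiteral, AddExpression, AssignStatement) and the helper increment are hypothetical stand-ins for the AST nodes named in the post; no such std.compiler API exists, and a real compiler's node types would carry far more information.

```d
// Hypothetical AST node classes. The thread's `std.compiler` module and
// these class names are assumptions for illustration only.
abstract class Expression
{
    // Render the node back to source text.
    abstract string toSource() const;
}

class Identifier : Expression
{
    string name;
    this(string name) { this.name = name; }
    override string toSource() const { return name; }
}

class IntegerLiteral : Expression
{
    long value;
    this(long value) { this.value = value; }
    override string toSource() const
    {
        import std.conv : to;
        return value.to!string;
    }
}

class AddExpression : Expression
{
    Expression lhs, rhs;
    this(Expression lhs, Expression rhs) { this.lhs = lhs; this.rhs = rhs; }
    override string toSource() const
    {
        return lhs.toSource() ~ " + " ~ rhs.toSource();
    }
}

class AssignStatement : Expression
{
    Expression target, value;
    this(Expression target, Expression value)
    {
        this.target = target;
        this.value = value;
    }
    override string toSource() const
    {
        return target.toSource() ~ " = " ~ value.toSource() ~ ";";
    }
}

// Roughly what `meta { $x = $x + 1; }` would build under this reading:
// nodes are constructed directly, with no intermediate string to re-parse.
Expression increment(Expression x)
{
    return new AssignStatement(x, new AddExpression(x, new IntegerLiteral(1)));
}

unittest
{
    auto e = increment(new Identifier("x"));
    assert(e.toSource() == "x = x + 1;");
}
```

Because increment returns a tree rather than text, its result could be handed to another tree-transforming function unchanged, which is the composability argument made above. The block can be exercised with dmd's -unittest and -main switches.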
 The compiler would expand the meta statement into the following to generate an
 extension .dll. (it could also make a first pass to collect all meta statements
 into a single dll if need be).
Proposed massacre: (dropping namespaces for brevity. assume all type names are standard lib type things)
 import std.compiler;
  
 Expression format_metahandle(Expression[] args) {
 	Expression result;
 	foreach (Expression arg; args) {
 		result = meta {
 			// still using $ because I can't think of anything better
 			( $result ).format( $arg );
 		};
 	}
 	return result;
 }
 
 MacroArgumentList format_arguments = MacroArgumentList.createVariadicArgumentList();
 
 // DLLNAME is the same name as the .dll file, sans extension.
 extern (C) MacroDefinition[] init_DLLNAME() { 
 	MacroDefinition[] macros;
 
 	// macros are expanded before argument resolution, so
 	// we don't need to talk about whether it has a return type
 	// or what that type is.  We do, however, need to tell the
 	// compiler how the macro is used syntactically.
 
 	macros ~= new MacroDefinition("format", &format_metahandle, format_arguments);
 
 	return macros;
 }
Additionally, I don't think any of this should be automatic at all. The programmer should have to type all this crap out manually for a very, very simple reason:

     We could write a macro to do it for us.

It would be cool if the meta{} syntax could also be written as a macro. If done as such, the compiler's responsibilities would amount to three things:

  (1) import macro x;
  (2) Recognizing macro invocations and replacing them with the code they return.
  (3) Converting the 'public' AST class hierarchy to and from its own internal classes. (easy if they're one and the same)

I'm starting to think that the compiler should not compile and run macros within the same project for a few reasons. First is the obvious simplification of the implementation. Requiring that the compiler execute macros defined in the very compilation unit it is working on necessitates either that the compiler be able to compile and link a complete DLL, or implement an interpreter. Relaxing this restriction allows the compiler to remain ignorant of linking.

Second, separate compilation makes it abundantly clear when the macro DLLs are used. The compiler needs them to build the software, the resulting application does not need them at all. This goes a long way towards clearing confusion as to what is being compiled and executed at what stage.

The last reason is that there is a necessarily huge potential for obfuscation. Extending the language is a pretty big deal and should not be done at the drop of a hat.

 -- andy
Jun 29 2004
parent reply pragma <EricAnderton at yahoo dot com> <pragma_member pathlink.com> writes:
In article <cbt0kv$dbb$1 digitaldaemon.com>, Andy Friesen says...
The problem with this is that code is no longer being handled like code, 
but like a string of characters.  This is certainly a powerful 
metaprogramming mechanism (it can say anything you can for obvious 
reasons), but it discards that notion of modelling the code in the 
abstract sense.
I'm with you there. My stab at using text to generate an expression was really based on my more practical experience using things like 'eval' in PHP and the like. Personally, I have yet to write anything even approaching a compiler, hence my being somewhat naive on the topic.
For instance, it becomes more cumbersome to filter a macro through 
another macro because all the compiler gets is a string.  Compilers 
shouldn't have to parse the code they've just created.
I agree that it puts more work on the compiler, but I can't help but feel that both styles of code generation have their place. After all, many (scripted) languages have an 'eval' statement somewhere that works with raw text; it must have some merit to have hung on conceptually.

Besides, what if you wanted to generate code based on external input, like a file? You could nest arbitrary code-snippets in that external file and they would fold neatly into a generic piece of metacode. Now you can take your code DB business-object generation suite and chuck it: D can build code out of virtually anything.

family of interfaces, you can work with raw expressions or go all the way down to IL opcodes if you really want to go there.
Frankly, though, I can't help but think that all this couldn't itself be 
implemented as a macro.  The core compiler would only have to expose a 
standard class hierarchy representing its various AST nodes (which would 
probably coincide with the compiler's internal structures, but doesn't 
necessarily have to--the compiler merely has to do some extra work to 
convert the two)
Now this really gets me thinking. Is there wiggle room in the DMD frontend to expose this kind of functionality? Or is this yet another motivation to construct a D-based D compiler from the ground up?
Additionally, I don't think any of this should be automatic at all.  The 
programmer should have to type all this crap out manually for a very, 
very simple reason:

     We could write a macro to do it for us.
Better yet, you could write the entire D language with this kind of facility. In essence, we're really turning the compiler inside out and making the entire mess accessible if not completely mutable. In fact, macros now wouldn't even have to be explicit or opaque. You could even write in those exception-stack-traces that you wanted. :)
I'm starting to think that the compiler should not compile and run 
macros within the same project for a few reasons.  First is the obvious 
simplification of the implementation.  Requiring that the compiler 
execute macros defined in the very compilation unit it is working on 
necessitates either that the compiler be able to compile and link a 
complete DLL, or implement an interpreter.  Relaxing this restriction 
allows the compiler to remain ignorant of linking.

Second, separate compilation makes it abundantly clear when the macro 
DLLs are used. The compiler needs them to build the software, the 
resulting application does not need them at all.  This goes a long way 
towards clearing confusion as to what is being compiled and executed at 
what stage.

The last reason is that there is a necessarily huge potential for 
obfuscation.  Extending the language is a pretty big deal and should not 
be done at the drop of a hat.
Okay, that makes things much more clear. My only motivation for including macros side-by-side with code was that the macro was very tightly coupled to the code that used it. Macros tend to be very domain-specific, at least in C programming. But then again, there probably isn't much merit in sharing C preprocessor macros due to how weak the language is.

But I see what you're saying with using 'import macro foobar;'. By explicitly prodding the compiler to treat a file as a compiler extension, rather than source code for the current target, it makes things far more manageable.

Maybe D might be one of the first C-style languages to get a full-on macro distribution. As long as you have the compiler extension for the macro grammar of your choice, things should compile along just peachy.

- Pragma
Jun 29 2004
parent Andy Friesen <andy ikagames.com> writes:
pragma <EricAnderton at yahoo dot com> wrote:

 Besides, what if you wanted to generate code based on external input, like a
 file?
I hadn't thought of that at all. I think you're right.
 Is there wiggle room in the DMD frontend to
 expose this kind of functionality?  Or is this yet another motivation to
 construct a D-based D compiler from the ground up? 
Since the whole point is to transform one piece of code into another, it should be wholly implementable in the frontend, just like templates. The trick is making DMD (which is implemented in C++) talk to D classes. Worst case: inline assembly. Second worst case: (probably best case) lots of C++ and D glue code. (a C API to manipulate AST things and D classes which connect to it) The third choice is to eschew objects in the compile-time API so that C++ can connect to D via the extern (C) ABI. Given that, though, it should be quite doable.
 Better yet, you could write the entire D language with this kind of facility. 
 
Probably, but that'd be a whole lot more work. ;)
 In essence, we're really turning the compiler inside out and making the entire
 mess accessible if not completely mutable.  In fact, macros now wouldn't even
 have to be explicit or opaque.  You could even write in those
 exception-stack-traces that you wanted. :)
There are literally a ton of interesting things you can do. The majority of the Nemerle language syntax is implemented with macros. (this includes the primitive operators and conditional constructs like if() and while()) -- andy
Jun 29 2004
prev sibling parent reply quetzal <quetzal_member pathlink.com> writes:
I think so.  By making dynamic arrays first class citizens of the 
language, they work a bit smoother.
But the programmer also loses control. He can't change how memory is managed in the array, how the array is sorted (bloody .sort) and other stuff like that.
Also, D arrays can do a number of things std::vector couldn't hope to, 
like vector operations.

     int[] a, b;
     a[] += b[]; // increment each element in a by the corresponding 
element in b
This is certainly possible in a library array implementation.
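
As a rough illustration of that claim, the sketch below shows a library-defined array type that accepts the same a[] += b[] syntax. It relies on present-day D operator overloading (opIndex and opIndexOpAssign), which postdates this 2004 thread; the type name Vec and its members are hypothetical and not part of any standard library.

```d
// Hypothetical library-defined dynamic array. `Vec` is an illustrative
// name, not a Phobos type.
struct Vec(T)
{
    private T[] data;

    // Copy the initial elements so each Vec owns its storage.
    this(T[] initial) { data = initial.dup; }

    // `v[]` yields a view of the elements.
    inout(T)[] opIndex() inout { return data; }

    // `v[] op= rhs` is rewritten by the compiler into this call.
    void opIndexOpAssign(string op)(const(T)[] rhs)
    {
        assert(rhs.length == data.length, "length mismatch");
        foreach (i, ref x; data)
        {
            mixin("x " ~ op ~ "= rhs[i];");
        }
    }
}

unittest
{
    auto a = Vec!int([1, 2, 3]);
    auto b = Vec!int([10, 20, 30]);
    a[] += b[]; // each element of a grows by the matching element of b
    assert(a[] == [11, 22, 33]);
}
```

This only shows that the syntax can be provided by a library; it says nothing about whether a compiler can optimize the element loop as well as it can a built-in array operation, which is the point argued elsewhere in the thread.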
There's also issues like std::string's inescapably second-classness. 
You can't concatenate two string literals with the + operator because, 
in the end, they're char*s, not strings.
This is not about strings, and this is not C++. D native string support is ok.
Compile-time language - A thousand times *YES*.  It wouldn't even be all 
that hard.  (macros compile to DLLs.  Compiler links with these DLLs at 
compile-time, calls into them when macros are invoked)  All that's 
needed is an API. (some shortcut syntax for composing and decomposing 
syntax trees would be extremely helpful, though)
A good compile-time language would set D far and away as being more 
powerful than just about any other language of its type. (then we could 
start laughing at the C++ people in earnest, like the Lisp people have 
been doing for the past 30 years now)
Agreed. Lisp-like macros should work quite well.
Jun 29 2004
next sibling parent Sean Kelly <sean f4.ca> writes:
In article <cbsqg4$4mt$1 digitaldaemon.com>, quetzal says...
I think so.  By making dynamic arrays first class citizens of the 
language, they work a bit smoother.
But the programmer also loses control. He can't change how memory is managed in the array, how the array is sorted (bloody .sort) and other stuff like that.
Only memory allocation is at issue. It's simple enough to write your own sort routine and pass an array to that. Sure, it's not the default property, but so what.

Sean
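
A minimal sketch of "write your own sort routine and pass an array to that", in present-day D. The function name sortBy is made up for this example, and insertion sort is chosen only because it is short, not because it is what anyone in the thread proposed.

```d
// Illustrative insertion sort taking a caller-supplied ordering.
// `sortBy` is a hypothetical name, not a library function.
void sortBy(alias less, T)(T[] items)
{
    foreach (i; 1 .. items.length)
    {
        T key = items[i];
        size_t j = i;
        // Shift larger elements right until key's position is found.
        while (j > 0 && less(key, items[j - 1]))
        {
            items[j] = items[j - 1];
            --j;
        }
        items[j] = key;
    }
}

unittest
{
    int[] values = [3, 1, 2];
    sortBy!((a, b) => a > b)(values); // descending order
    assert(values == [3, 2, 1]);
}
```

The built-in array is passed as an ordinary slice and sorted in place, so the ordering is fully under the caller's control. Current Phobos offers the same idea through std.algorithm.sorting.sort, which also accepts a predicate.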
Jun 29 2004
prev sibling parent Norbert Nemec <Norbert.Nemec gmx.de> writes:
quetzal wrote:

I think so.  By making dynamic arrays first class citizens of the
language, they work a bit smoother.
But the programmer also loses control. He can't change how memory is managed in the array, how the array is sorted (bloody .sort) and other stuff like that.
I agree with you that .sort as a language feature was not really a good idea. Of course, it seems convenient at first glance, but then, if everything that is convenient is packed into the language, it will become a complete mess.

Anyhow, for arrays in general: nothing stops you from ignoring the language-level arrays and writing your own.
Jun 29 2004
prev sibling next sibling parent Sean Kelly <sean f4.ca> writes:
In article <cbs2tm$23e7$1 digitaldaemon.com>, quetzal says...
Is there any rationale to make language support dynamic arrays natively? We got
standard library for that. IMHO C++ has got it almost right with std::vector.
Seriously, there's nothing that prevents one from coding a portable and reliable
implementation of dynamic array class. Why do we need a direct support from
language here?
Just to muddy the waters, the last version of the C standard included a primitive dynamic array type. AFAIK the C++ committee plans to include support for this type in the next iteration of the C++ standard as well.
RTTI, reflection, GOOD compile-time language (not like c++ template
metaprogramming) are the areas where language support is absolutely needed. No
need to add extra complexity into language.
I don't think Walter would have built it into D if he thought it were overly complicated :) Sean
Jun 29 2004
prev sibling next sibling parent reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
There is one aspect of arrays where no library will ever reach native
arrays: vectorizing expressions!

In C++, expression templates go some way in that direction, but they are
still way off what a good vectorizing compiler can do.

Years ago, this would only have been a matter for high-performance
specialists coding for multi-processor number-crunching machines. Nowadays,
every PC has plenty of vectorizing capabilities (super-scalar technology,
etc.), therefore, high-level language elements really are necessary to
allow the compiler to do the work of optimizing the code.


quetzal wrote:

 Is there any rationale to make language support dynamic arrays natively?
 We got standard library for that. IMHO C++ has got it almost right with
 std::vector. Seriously, there's nothing that prevents one from coding a
 portable and reliable implementation of dynamic array class. Why do we
 need a direct support from language here?
 
 RTTI, reflection, GOOD compile-time language (not like c++ template
 metaprogramming) are the areas where language support is absolutely
 needed. No need to add extra complexity into language.
 
 P.S. God, give us that stacktrace feature someday.
Jun 29 2004
parent reply quetzal <quetzal_member pathlink.com> writes:
In article <cbso6e$1g4$1 digitaldaemon.com>, Norbert Nemec says...
There is one aspect of arrays where no library will ever reach native
arrays: vectorizing expressions!

In C++, expression templates go some way in that direction, but they are
still way off what a good vectorizing compiler can do.
Years ago, this would only have been a matter for high-performance
specialists coding for multi-processor number-crunching machines. Nowadays,
every PC has plenty of vectorizing capabilities (super-scalar technology,
etc.), therefore, high-level language elements really are necessary to
allow the compiler to do the work of optimizing the code.
There's nothing that prevents a library from implementing dynamic arrays as a pointer to data + size (just like the language does now). So they can be vectorized in just the same way. The programmer also gets control and can fine-tune the array implementation for his own needs. I think the way to go is an interface-based standard library: if a programmer wants to change array behaviour, he can just write a class that implements the given interface and alias it.
Jun 29 2004
parent Norbert Nemec <Norbert.Nemec gmx.de> writes:
quetzal wrote:

 In article <cbso6e$1g4$1 digitaldaemon.com>, Norbert Nemec says...
There is one aspect of arrays where no library will ever reach native
arrays: vectorizing expressions!

In C++, expression templates go some way in that direction, but they are
still way off what a good vectorizing compiler can do.
Years ago, this would only have been a matter for high-performance
specialists coding for multi-processor number-crunching machines.
Nowadays, every PC has plenty of vectorizing capabilities (super-scalar
technology, etc.), therefore, high-level language elements really are
necessary to allow the compiler to do the work of optimizing the code.
There's nothing that prevents a library from implementing dynamic arrays as a pointer to data + size (just like the language does now). So they can be vectorized in just the same way. The programmer also gets control and can fine-tune the array implementation for his own needs. I think the way to go is an interface-based standard library: if a programmer wants to change array behaviour, he can just write a class that implements the given interface and alias it.
The problem is not the implementation of the array itself, but that of vectorized expressions like:

     A[] = B[] + 3*C[];

Implementing this in the library would be possible, of course, but you could never get to the same level of optimization as is possible for a good compiler that knows all the details of the processor. And, to have vectorized expressions in the language, you need, of course, to have the arrays in the language first.

Beyond this special example, there are plenty of other cases where the compiler can do optimizations on arrays that are not possible for library-implemented arrays.
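
For readers who have not seen the syntax, here is a small sketch of such an expression using D's built-in array operations, next to the equivalent hand-written loop. The function names combine and combineLoop are made up for this example, and nothing here measures or demonstrates the optimization claim itself; it only shows that the two forms compute the same values.

```d
// Element-wise expression written with built-in array operations.
// The destination must already have the same length as the operands.
double[] combine(const double[] b, const double[] c)
{
    auto a = new double[b.length];
    a[] = b[] + c[] * 3.0;
    return a;
}

// The same computation as an explicit loop, for comparison.
double[] combineLoop(const double[] b, const double[] c)
{
    auto a = new double[b.length];
    foreach (i; 0 .. a.length)
    {
        a[i] = b[i] + c[i] * 3.0;
    }
    return a;
}

unittest
{
    double[] b = [1.0, 2.0, 3.0];
    double[] c = [0.5, 1.0, 1.5];
    assert(combine(b, c) == [2.5, 5.0, 7.5]);
    assert(combine(b, c) == combineLoop(b, c));
}
```

In the array-operation form the compiler sees the whole expression at once, with no element order or temporaries spelled out by the programmer, which is what leaves it free to pick an evaluation strategy.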
Jun 29 2004
prev sibling parent reply Sam McCall <tunah.d tunah.net> writes:
quetzal wrote:

 Is there any rationale to make language support dynamic arrays natively? We got
 standard library for that. IMHO C++ has got it almost right with std::vector.
 Seriously, there's nothing that prevents one from coding a portable and
reliable
 implementation of dynamic array class. Why do we need a direct support from
 language here?
It's not about need, it's about optimisation as others have pointed out. Resizing-in-place particularly. I think if the compiler is clever enough and accessors for array elements were inlined, we might get vectorization anyway?

On the other hand, I'm tempted to agree with the desire for a class-based dynamic array as standard, for consistency with other objects. Having arrays in Java working the same as class references was very nice.

Sam
Jun 29 2004
parent "Walter" <newshound digitalmars.com> writes:
"Sam McCall" <tunah.d tunah.net> wrote in message
news:cbt02j$ci2$1 digitaldaemon.com...
 On the other hand, I'm tempted to agree with the desire for a
 class-based dynamic array as standard, for consistency with other
 objects. Having arrays in java working the same as class references was
 very nice.
I'm pretty familiar with Java arrays, since I implemented a Java compiler and worked on a Java VM. D arrays can do many things Java arrays cannot:

1) can be resized
2) can be sliced
3) can exist as 'lightweight' arrays on the stack
4) integrate seamlessly with C arrays
5) can have bounds checking turned off
6) have no extra length overhead when using static arrays
7) can exist in static data

Having arrays in the syntax rather than as a vector<> class offers the advantages:

1) you can declare them with a specialized array syntax, like:
     int[3] foo;
2) specialized array literal syntax is possible:
     int[3] foo = [0:1,2,3];
3) seamless interaction between static and dynamic arrays
4) the compiler knows about arrays, so can give sensible error messages rather than incomprehensible errors related to template implementation internals
5) the compiler knowing they are arrays means better code can be generated, particularly for things like foreach loops
6) vector ops are possible with specialized array syntax
7) arrays and strings can be the same thing, rather than incompatible
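
A few of the points above (resizing, slicing, static arrays interacting with dynamic ones, strings as arrays) can be sketched in present-day D as follows. The helper name inner is hypothetical and exists only to give the example something to call.

```d
// Returns a view of everything except the first and last element.
// No elements are copied; the slice aliases the caller's memory.
T[] inner(T)(T[] arr)
{
    return arr.length < 2 ? arr[0 .. 0] : arr[1 .. $ - 1];
}

unittest
{
    // Resizing: assigning to .length grows or shrinks a dynamic array.
    int[] dyn = [1, 2, 3];
    dyn.length = 5; // new elements are default-initialized to 0
    assert(dyn == [1, 2, 3, 0, 0]);

    // Slicing: writes through the slice are visible in the original.
    int[] middle = inner(dyn);
    middle[0] = 20;
    assert(dyn[1] == 20);

    // A static array lives on the stack; slicing it gives a dynamic view.
    int[3] fixed = [7, 8, 9];
    int[] view = fixed[];
    assert(view.length == 3 && view[2] == 9);

    // Strings are arrays of characters, so the same operations apply.
    string s = "dynamic" ~ " " ~ "arrays";
    assert(s[0 .. 7] == "dynamic");
    assert(inner("[abc]") == "abc");
}
```

The same generic function works on an int slice and on a string without any conversion, which is one practical consequence of arrays and strings being the same kind of thing.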
Jul 08 2004