
digitalmars.D - Musings on the 'auto' keyword

reply Sean Kelly <sean f4.ca> writes:
While thinking about RAII this morning I was suddenly struck by the oddness of
the 'auto' keyword in D.  Simply put, the only purpose of this keyword is to
allow stack-based destruction behavior in D.  Why do we need a keyword for this?
I'm beginning to feel like the way UDTs are designed in D is both awkward and
unnecessarily limiting.  First, we have structs, which seem to be intended
primarily for C compatibility.  Since structs are a POD type, it makes perfect
sense that they aren't allowed to have a ctor/dtor and that they are not
inheritable (though on some level it seems a tad odd that they can contain
visibility rules, static/dynamic functions, etc, even though these things don't
actually affect the data footprint).  Second, we have classes which are
full-featured but which may only be constructed on the heap (classes also break
the otherwise universal pointer syntax, so while their use looks stack-based,
it actually is not).

If I want to do RAII in D the solution is simple: use 'auto' with a class type.
This serves as a useful visual aid when scanning code and is probably a useful
hint to the compiler for exception handling.  But what if I want to avoid heap
allocation?  The spec suggests the use of alloca, but unless it guarantees that
this function will be available on all platforms it is not a general solution
(note that alloca is not included in the C or POSIX spec).  Since it is quite
common to desire stack-based allocation for objects whose size is known at
compile-time, I do have one other option: allocate a (properly aligned) array on
the stack and placement-construct my class into it.  But while this is
technically possible, doing so seems awkward and error-prone.

So there are two options: a type that can be allocated on the stack or the heap
(struct), but does not have the capacity to be used for RAII, and a type that
can only be allocated on the heap (class, based on general language support)
which may simulate stack-oriented behavior through the use of a keyword.
Perhaps it's my C/C++ background, but this simply doesn't make sense to me.  And
I'm becoming frustrated by my inability to define implicit copy semantics for
UDTs.  I suppose it's something I'm just going to have to live with, but it's
this aspect of D more than any other that is a constant source of irritation.
Should I just accept that I will need to do dynamic memory allocation in any D
program I write that contains class objects?


Sean
Sep 12 2005
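
The "aligned array on the stack plus placement construction" option mentioned in the post above can be sketched in C++, the language the post uses as its point of comparison. This is only an illustration under stated assumptions: Guard and live_count_inside_scope are hypothetical names, and the sketch shows the general technique, not how any D compiler or runtime does it.

```cpp
#include <cassert>
#include <new>      // placement new

// Hypothetical RAII type: bumps a counter while it is alive.
struct Guard {
    int* counter;
    explicit Guard(int* c) : counter(c) { ++*counter; }
    ~Guard() { --*counter; }
};

// Returns the counter value observed while the Guard is alive.
int live_count_inside_scope(int* counter) {
    // Raw stack storage, sized and aligned for Guard; no heap allocation.
    alignas(Guard) unsigned char buffer[sizeof(Guard)];

    // Construct the object in place inside the buffer.
    Guard* g = new (buffer) Guard(counter);
    int seen = *g->counter;

    // The destructor must be invoked by hand. Nothing runs it
    // automatically if an exception is thrown before this line.
    g->~Guard();
    return seen;
}
```

The manual destructor call is the awkward and error-prone part: unlike a true stack object, nothing guarantees cleanup on every exit path.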
next sibling parent reply "Ben Hinkle" <bhinkle mathworks.com> writes:
"Sean Kelly" <sean f4.ca> wrote in message 
news:dg4jj3$27g9$1 digitaldaemon.com...
 While thinking about RAII this morning I was suddenly struck by the oddness of
 the 'auto' keyword in D.  Simply put, the only purpose of this keyword is to
 allow stack-based destruction behavior in D.  Why do we need a keyword for this?
 I'm beginning to feel like the way UDTs are designed in D is both awkward and
 unnecessarily limiting.  First, we have structs, which seem to be intended
 primarily for C compatibility.  Since structs are a POD type, it makes perfect
 sense that they aren't allowed to have a ctor/dtor and that they are not
 inheritable (though on some level it seems a tad odd that they can contain
 visibility rules, static/dynamic functions, etc, even though these things don't
 actually affect the data footprint).  Second, we have classes which are
 full-featured but which may only be constructed on the heap (classes also break
 the otherwise universal pointer syntax, so while their use looks stack-based,
 it actually is not).

 If I want to do RAII in D the solution is simple: use 'auto' with a class type.
 This serves as a useful visual aid when scanning code and is probably a useful
 hint to the compiler for exception handling.  But what if I want to avoid heap
 allocation?  The spec suggests the use of alloca, but unless it guarantees that
 this function will be available on all platforms it is not a general solution
 (note that alloca is not included in the C or POSIX spec).  Since it is quite
 common to desire stack-based allocation for objects whose size is known at
 compile-time, I do have one other option: allocate a (properly aligned) array on
 the stack and placement-construct my class into it.  But while this is
 technically possible, doing so seems awkward and error-prone.

Isn't it true that the compiler could optimize auto Foo x = new Foo; to allocate from the stack automatically? Is there a way to initialize an auto var without 'new'? I haven't looked at that part of the language in a while. If that's true then the current behavior of heap alloc for auto vars is an implementation detail that user code shouldn't worry about (aside from performance of heap allocation).
 So there are two options: a type that can be allocated on the stack or the heap
 (struct), but does not have the capacity to be used for RAII, and a type that
 can only be allocated on the heap (class, based on general language support)
 which may simulate stack-oriented behavior through the use of a keyword.
 Perhaps it's my C/C++ background, but this simply doesn't make sense to me.  And
 I'm becoming frustrated by my inability to define implicit copy semantics for
 UDTs.  I suppose it's something I'm just going to have to live with, but it's
 this aspect of D more than any other that is a constant source of irritation.

I'm curious what use case you have. The only time I've bumped into small classes that would be nice to have auto on the stack is the ScopedLock that you suggested for automatically releasing Lock objects (at least I think that was you).
 Should I just accept that I will need to do dynamic memory allocation in any D
 program I write that contains class objects?

uhh - D code without any dynamic memory allocation is an interesting thought. What's the program you have in mind?
 Sean

Sep 13 2005
next sibling parent Sean Kelly <sean f4.ca> writes:
In article <dg6ntu$1ef6$1 digitaldaemon.com>, Ben Hinkle says...
Isn't it true that the compiler could optimize
 auto Foo x = new Foo;
to allocate from the stack automatically? Is there a way to initialize an 
auto var without 'new'? I haven't looked at that part of the language in a 
while. If that's true then the current behavior of heap alloc for auto vars 
is an implementation detail that user code shouldn't worry about (aside from 
performance of heap allocation).

It's nice to know the cost of creating a new object when writing library code (where premature optimization is the norm). I grant that some GC implementations can make dynamic allocation exceedingly efficient, but that's an implementation detail. Also, some code simply can't use dynamic allocation for performance or safety purposes. Free lists and such are often an option in those cases, but wouldn't stack-based allocation be a convenient alternative?
I'm curious what's the use case you have? The only time I've bumped into 
small classes that would be nice to have auto on the stack is your 
ScopedLock that you suggested for automatically releasing Lock objects (at 
least I think that was you).

I use such classes all the time in C++ programming. But what actually got me thinking about this was a paper on two-phase update. There are some instances where timely destruction (across scope boundaries) is important and GC is therefore not a feasible solution. Smart pointers are an obvious replacement in the C++ realm, though the paper suggested two-phase update as a more optimal one. The first time this cropped up for me however was when trying to create a nontrivial value type in D. A struct doesn't work in this case and classes seem somewhat awkward for the purpose.
 Should I just accept that I will need to do dynamic memory allocation in any D
 program I write that contains class objects?

uhh - D code without any dynamic memory allocation is an interesting thought. What's the program you have in mind?

I was thinking of the common objections to GC--embedded code and such--which aren't programs I tend to write. As far as this affects me personally, it is more of an idealistic issue--the lack of stack-based allocation just doesn't feel right from a syntax standpoint. That said, it also makes template programming more difficult, as classes have different construction semantics from other types in D. This can be solved by using some additional template code, but it is irksome to write code like this:

    T val = create!(T)( oldval );

just to be sure it will work for classes (which require 'new' even though T is not a 'pointer' type) as well as POD types (none of which are copy constructible).

I'll admit that it's probably too late in the game for this, as these are fundamental features of the language--it would break a tremendous amount of code if this:

    MyClass c = new MyClass();

were changed to this:

    MyClass* c = new MyClass();

even if the transition would be a one-time affair.

Sean
Sep 13 2005
prev sibling next sibling parent "Walter Bright" <newshound digitalmars.com> writes:
"Ben Hinkle" <bhinkle mathworks.com> wrote in message
news:dg6ntu$1ef6$1 digitaldaemon.com...
 Isn't it true that the compiler could optimize
  auto Foo x = new Foo;
 to allocate from the stack automatically?

Yes.
Sep 18 2005
prev sibling parent reply Sean Kelly <sean f4.ca> writes:
In article <dg6ntu$1ef6$1 digitaldaemon.com>, Ben Hinkle says...
 Should I just accept that I will need to do dynamic memory allocation in any D
 program I write that contains class objects?

uhh - D code without any dynamic memory allocation is an interesting thought. What's the program you have in mind?

I just ran into my first such instance today: I wanted to abstract the startup/teardown code in extern(C) main() to be done in an auto object. Since one job of this code is to initialize the GC, dynamic memory allocation (DMA) isn't an option. And as this has to be portable, neither is alloca. It would be the same issue if a designer wanted to use RAII in the GC code itself, as DMA obviously isn't allowed there either. I suppose my gripe is that basic language features should not be so constrained.

Sean
Sep 19 2005
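
For readers unfamiliar with the pattern being asked for, here is a minimal C++ sketch of a startup/teardown guard that lives entirely on the stack. All names (runtime_state, runtime_init, runtime_shutdown, RuntimeGuard, run_program) are hypothetical stand-ins invented for this illustration; they are not part of the D runtime or any real library.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for a runtime's startup and teardown hooks.
int runtime_state = 0;                 // 0 = down, 1 = up
void runtime_init()     { runtime_state = 1; }
void runtime_shutdown() { runtime_state = 0; }

// Constructor starts the runtime, destructor stops it.
struct RuntimeGuard {
    RuntimeGuard()  { runtime_init(); }
    ~RuntimeGuard() { runtime_shutdown(); }
    RuntimeGuard(const RuntimeGuard&) = delete;
    RuntimeGuard& operator=(const RuntimeGuard&) = delete;
};

// Hypothetical program body: returns the state seen while running.
int run_program(bool fail) {
    RuntimeGuard guard;                // the guard itself needs no allocator
    if (fail) throw std::runtime_error("user code failed");
    return runtime_state;
}                                      // teardown runs here on every exit path
```

Because the guard object is an ordinary local, its teardown runs on normal return and during exception unwinding alike, without the caller writing any cleanup code.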
parent reply "Ben Hinkle" <bhinkle mathworks.com> writes:
"Sean Kelly" <sean f4.ca> wrote in message 
news:dgmpfb$22f2$1 digitaldaemon.com...
 In article <dg6ntu$1ef6$1 digitaldaemon.com>, Ben Hinkle says...
 Should I just accept that I will need to do dynamic memory allocation in
 any D program I write that contains class objects?

uhh - D code without any dynamic memory allocation is an interesting thought. What's the program you have in mind?

I just ran into my first such instance today: I wanted to abstract the startup/teardown code in extern(C) main() to be done in an auto object. Since one job of this code is to initialize the GC, DMA isn't an option. And as this has to be portable, neither is alloca. It would be the same issue if a designer wanted to use RAII in the GC code itself, as DMA obviously isn't allowed there either. I suppose my gripe is that basic language features should not be so constrained. Sean

oh, that's what you meant by "any". I agree if the GC isn't initialized you can't allocate from the GC. That seems like a one-time state, though. What's wrong with try {...} finally {...} ?
Sep 19 2005
parent Sean Kelly <sean f4.ca> writes:
In article <dgmt9o$25vf$1 digitaldaemon.com>, Ben Hinkle says...
oh, that's what you meant by "any". I agree if the GC isn't initialized you 
can't allocate from the GC. That seems like a one-time state, though. What's 
wrong with
 try {...} finally {...}

Design issue. I wanted to conceal runtime-level implementation from other libraries and was considering putting an RAII class in the standard library to aid in creating Windows programs, as it seems odd that the current method requires users to call all sorts of runtime code manually to do this.

I was thinking about all of this on my way to work and thought of a few quirks that stack-based allocation might create. Consider the following code:

# interface I { void f2(); }
# class C { void f1() {} }
# class D : C, I { void f1() {} void f2() {} }
#
# void op1( C val ) {}
# void op2( I val ) {}
#
# void main() {
#     D val; // on stack
#     op1( val );
#     op2( val );
# }

What would be the behavior of this? By the C++ standard, op1 would pass the instance of D to C's copy ctor and then pass the new (stripped) object to the function body. A call to f1() inside op1 would then call C's f1 even though D has a more specialized version available. This seems reasonable but potentially confusing for non-C++ programmers. And 'inout' could be considered D's reference qualifier, though the fans of logical const-ness may well go crazy if this were their only option for pass-by-reference (to pass D as-is). That said, it's no different than things are now, so it wouldn't alter current behavior, only extend it to stack-based UDTs.

The more confusing issue is the behavior of op2. As interfaces can clearly not be considered base types, a copy operation simply makes no sense. Interfaces would have to be documented as always behaving as references in D (which I suppose they are). This would render the difference between:

# void op3( I val ) {}
# void op4( inout I val ) {}

meaningless. Though again, this would merely be extending D's current behavior to stack-based UDTs.

Things might get confusing when combining stack and heap-based objects with respect to references, however:

# void op5( I val ) {
#     class MyExcept : Exception {
#         this( I val ) { m_val = val; }
#         private I m_val;
#     }
#     throw new MyExcept( val );
# }
#
# void main() {
#     try {
#         D* dyna = new D;
#         D stat;
#         op5( dyna );
#         op5( stat );
#     }
#     catch( Exception e ) {}
# }

In the above code, op5 would behave correctly when passed a heap-based object but not a stack-based object, because the exception keeps a reference to an object that is destroyed when the try block unwinds. That said, the same code would break just as horribly in D as it is now:

# void op5( I val ) {
#     class MyExcept : Exception {
#         this( I val ) { m_val = val; }
#         private I m_val;
#     }
#     throw new MyExcept( val );
# }
#
# void main() {
#     try {
#         auto D dnow = new D;
#         op5( dnow );
#     }
#     catch( Exception e ) {}
# }

Having said all this, I think that the suggestion of stack-based UDTs could give rise to some confusing issues, but nearly all of these issues already exist. Are there any other issues I haven't considered?

Sean
Sep 19 2005
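
The C++ behavior described for op1 in the post above (copying a derived object into a by-value base parameter) is usually called object slicing. A minimal C++ sketch follows; it reuses the names C, D and f1 from the post, while op1_by_value and op1_by_reference are names assumed here for illustration. The interface I is left out because C++ has no direct equivalent of copying through an interface.

```cpp
#include <cassert>

struct C {
    virtual ~C() = default;
    virtual int f1() const { return 1; }    // base version
};

struct D : C {
    int f1() const override { return 2; }   // more specialized version
};

// By value: the argument is copied through C's copy constructor,
// so only the C part of a D survives and C::f1 is called.
int op1_by_value(C val) { return val.f1(); }

// By reference: no copy is made, so dynamic dispatch reaches D::f1.
int op1_by_reference(const C& val) { return val.f1(); }
```

Passing by reference is the C++ counterpart of the 'inout' option discussed in the post: the object arrives as-is and the specialized f1 is still reachable.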
prev sibling parent Dave <Dave_member pathlink.com> writes:
In article <dg4jj3$27g9$1 digitaldaemon.com>, Sean Kelly says...
While thinking about RAII this morning I was suddenly struck by the oddness of
the 'auto' keyword in D.  Simply put, the only purpose of this keyword is to
allow stack-based destruction behavior in D.  Why do we need a keyword for this?
I'm beginning to feel like the way UDTs are designed in D is both awkward and
unnecessarily limiting.  First, we have structs, which seem to be intended
primarily for C compatibility.  Since structs are a POD type, it makes perfect
sense that they aren't allowed to have a ctor/dtor and that they are not
inheritable (though on some level it seems a tad odd that they can contain
visibility rules, static/dynamic functions, etc, even though these things don't
actually affect the data footprint).  Second, we have classes which are
full-featured but which may only be constructed on the heap (classes also break
the otherwise universal pointer syntax, so while their use looks stack-based,
it actually is not).

Very good points all... IMO, the ideal would be current D class semantics with the addition of C++ style instantiation (heap or stack) and also the ability to return stack instantiated class objects from functions. Perhaps structs could have even been done away with altogether.

- Dave

Sep 13 2005