www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Generic and fundamental language design issue

reply "Tommi" <tommitissari hotmail.com> writes:
I have a fundamental language design talking point for you. It's 
not specific to D. I claim that, most of the time, a programmer 
cannot, and shouldn't have to, make the decision of whether to 
allocate on stack or heap. For example:

void func(T)()
{
     auto t = <allocate T from heap or stack?>
     ...
}

The question of whether variable t should lay in heap or stack, 
depends not only on the sizeof(T), but also on the context in 
which func is called at. If func is called at a context in which 
allocating T on stack would cause a stack overflow, then t should 
be allocated from the heap. On the other hand, if func is called 
at a context where T still fits nicely on the stack, it would 
probably be better and faster to allocate t on stack.

So, the question of whether to allocate that variable t on stack 
or heap is something that only the compiler (or runtime) can 
answer. Is there any language where you have the ability to say 
"allocate T from wherever it's best"?

I wonder if it would be possible in D to let the compiler 
allocate dynamic arrays on stack when it can statically guarantee 
that it's safe to do so (no dangling references, never increasing 
the size of the array, etc).
Nov 04 2012
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/4/12 12:35 PM, Tommi wrote:
 I have a fundamental language design talking point for you. It's not
 specific to D. I claim that, most of the time, a programmer cannot, and
 shouldn't have to, make the decision of whether to allocate on stack or
 heap.

I don't think that claim is valid. As a simple example, polymorphism requires indirection (due to variations in size of the dynamic type compared to the static type) and indirection is strongly correlated with dynamic allocation. Also, the value vs. reference semantics of type are strongly correlated with where objects should go. So on the contrary, quite often heap vs. stack allocation is forced by the nature of what's allocated. Andrei
Nov 04 2012
next sibling parent Paulo Pinto <pjmlp progtools.org> writes:
Am 04.11.2012 18:41, schrieb Andrei Alexandrescu:
 On 11/4/12 12:35 PM, Tommi wrote:
 I have a fundamental language design talking point for you. It's not
 specific to D. I claim that, most of the time, a programmer cannot, and
 shouldn't have to, make the decision of whether to allocate on stack or
 heap.

I don't think that claim is valid. As a simple example, polymorphism requires indirection (due to variations in size of the dynamic type compared to the static type) and indirection is strongly correlated with dynamic allocation. Also, the value vs. reference semantics of type are strongly correlated with where objects should go. So on the contrary, quite often heap vs. stack allocation is forced by the nature of what's allocated. Andrei

Java and Go work around this issue using escape analysis, in opposite directions though. In Java most JITs can allocate objects in the stack if they see the object does not escape the local scope. Go goes the other way, by allocating in the heap local objects that are usually allocated on the stack but are returned the caller. -- Paulo
Nov 04 2012
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
Le 04/11/2012 18:41, Andrei Alexandrescu a écrit :
 On 11/4/12 12:35 PM, Tommi wrote:
 I have a fundamental language design talking point for you. It's not
 specific to D. I claim that, most of the time, a programmer cannot, and
 shouldn't have to, make the decision of whether to allocate on stack or
 heap.

I don't think that claim is valid. As a simple example, polymorphism requires indirection (due to variations in size of the dynamic type compared to the static type) and indirection is strongly correlated with dynamic allocation. Also, the value vs. reference semantics of type are strongly correlated with where objects should go. So on the contrary, quite often heap vs. stack allocation is forced by the nature of what's allocated.

If it is proven that the object is scoped, then allocating it on stack make sens as well, even if it will be used only by reference.
Nov 04 2012
parent Jacob Carlborg <doob me.com> writes:
On 2012-11-05 01:09, deadalnix wrote:

 If it is proven that the object is scoped, then allocating it on stack
 make sens as well, even if it will be used only by reference.

Exactly, I do that sometimes with "scope", but that is deprecated these days. -- /Jacob Carlborg
Nov 04 2012
prev sibling next sibling parent "Jakob Ovrum" <jakobovrum gmail.com> writes:
On Sunday, 4 November 2012 at 17:35:09 UTC, Tommi wrote:
 I wonder if it would be possible in D to let the compiler 
 allocate dynamic arrays on stack when it can statically 
 guarantee that it's safe to do so (no dangling references, 
 never increasing the size of the array, etc).

David Nadlinger was talking about how LDC does an optimization exactly like this, quite recently.
Nov 04 2012
prev sibling next sibling parent "Tommi" <tommitissari hotmail.com> writes:
On Sunday, 4 November 2012 at 17:41:17 UTC, Andrei Alexandrescu 
wrote:
 I don't think that claim is valid. As a simple example, 
 polymorphism requires indirection (due to variations in size of 
 the dynamic type compared to the static type) and indirection 
 is strongly correlated with dynamic allocation.

Sure, there are situations where heap allocation is always needed. But couldn't the question of "whether to heap or stack allocate" be considered just an implementation detail of the language. And hide that implementation detail so that the programmer doesn't even need to know what the words heap and stack mean. I mean, wouldn't it be theoretically possible to sometimes even let the compiler allocate class objects in D from the stack, if the compiler can see that it's safe to do so (if that variable's exact type is known at compile time and never changes).
Nov 04 2012
prev sibling next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
11/4/2012 9:35 PM, Tommi пишет:
 I have a fundamental language design talking point for you. It's not
 specific to D. I claim that, most of the time, a programmer cannot, and
 shouldn't have to, make the decision of whether to allocate on stack or
 heap.

Actually it can and most definitely should. It's scoped lifetime vs infinite lifetime, reference and value semantics. And e.g. RAII structs have to be on stack to have their effect applied usefully. Storing the reference to 't' elsewhere implies heap allocation (be it GC heap or manually managed one). Going further allocation strategy is a deep topic with a lot of considerations and in the end it goes far beyond stack vs GC heap.
 For example:

 void func(T)()
 {
      auto t = <allocate T from heap or stack?>
      ...
 }

 The question of whether variable t should lay in heap or stack, depends
 not only on the sizeof(T), but also on the context in which func is
 called at. If func is called at a context in which allocating T on stack
 would cause a stack overflow, then t should be allocated from the heap.

In fact placing big object on stack just because it fits could introduce stack overflow few hundred calls later. In general compiler can't know beforehand how big stack space you'll need and thusly the portion of it to use safely.
 On the other hand, if func is called at a context where T still fits
 nicely on the stack, it would probably be better and faster to allocate
 t on stack.

Unless you escape references. Compiler could detect it but will stay on the conservative side. In the end it may end up with unexpected heap allocations.
 So, the question of whether to allocate that variable t on stack or heap
 is something that only the compiler (or runtime) can answer.

God no. It may as well depend on the code compiler knows nothing about (nor run-time for this matter). extern(C): int blah(void* ptr); void func(T)() { auto t = <allocate T from heap or stack?> blah(&t); //now what? ... } If blah doesn't store pointer somewhere inside you are fine with stack. If it does then not even GC heap will help you.
 Is there
 any language where you have the ability to say "allocate T from wherever
 it's best"?

In simplest case I'd consider ability to bypass allocation on heap as an optimization.
 I wonder if it would be possible in D to let the compiler allocate
 dynamic arrays on stack when it can statically guarantee that it's safe
 to do so (no dangling references, never increasing the size of the
 array, etc).

-- Dmitry Olshansky
Nov 04 2012
prev sibling next sibling parent "Tommi" <tommitissari hotmail.com> writes:
On Sunday, 4 November 2012 at 18:14:36 UTC, Dmitry Olshansky 
wrote:
 extern(C):
 int blah(void* ptr);

 void func(T)()
 {
       auto t = <allocate T from heap or stack?>
       blah(&t); //now what?
       ...
 }

 If blah doesn't store pointer somewhere inside you are fine 
 with stack. If it does then not even GC heap will help you.

In a perfect world, I think, the compiler would see that blah stores ptr and therefore would (implicitly) allocate t on the heap. The deallocation of the memory for t would be handled either by garbage collector or by some kind of implicit reference counting. I guess with extern C functions none of the above is doable.
Nov 04 2012
prev sibling next sibling parent "Tommi" <tommitissari hotmail.com> writes:
On Sunday, 4 November 2012 at 18:14:36 UTC, Dmitry Olshansky 
wrote:
 In fact placing big object on stack just because it fits could 
 introduce stack overflow few hundred calls later.

But that's kind of my point also. For example: void func(T)(int n) { if (n <= 0) return; auto t = <create variable t of type T> ... func!(T)(n - 1); } If func is called with a runtime argument n, then the programmer cannot determine the most sensible way to allocate for variable t. What I'd like to happen is that, at runtime, every time a variable t is created, the program would determine the best way to allocate the memory for it, based on the available stack size (prefer stack if there's room there). On Sunday, 4 November 2012 at 18:14:36 UTC, Dmitry Olshansky wrote:
 In general compiler can't know beforehand how big stack space 
 you'll need and thusly the portion of it to use safely.

Yeah, I just realized it can't be done at compile-time. Making the decision between stack or heap allocation should happen at run-time.
Nov 04 2012
prev sibling next sibling parent "Tommi" <tommitissari hotmail.com> writes:
On Sunday, 4 November 2012 at 19:42:29 UTC, Tommi wrote:
 void func(T)(int n)
 {
     if (n <= 0)
         return;

     auto t = <create variable t of type T>
     ...
     func!(T)(n - 1);
 }

 If func is called with a runtime argument n, then the 
 programmer cannot determine the most sensible way to allocate 
 for variable t.

The reason for that being that if n is large enough and t gets always allocated on stack, then there's going to be a stack overflow. And if n is small enough so that all those t's could be allocated on stack, then allocating them on heap would just make the code slower.
Nov 04 2012
prev sibling next sibling parent "David Nadlinger" <see klickverbot.at> writes:
On Sunday, 4 November 2012 at 17:41:19 UTC, Jakob Ovrum wrote:
 On Sunday, 4 November 2012 at 17:35:09 UTC, Tommi wrote:
 I wonder if it would be possible in D to let the compiler 
 allocate dynamic arrays on stack when it can statically 
 guarantee that it's safe to do so (no dangling references, 
 never increasing the size of the array, etc).

David Nadlinger was talking about how LDC does an optimization exactly like this, quite recently.

Yes, LDC has an optimization pass like this indeed. However, it is currently disabled because its implementation causes a seemingly unrelated, hard-to-diagnose test suite failure. If anyone is interested in doing a little compiler debugging, here is the current state: https://github.com/ldc-developers/ldc/pull/206. I unfortunately won't be able to work on it in the next few days… David
Nov 04 2012
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/4/2012 9:35 AM, Tommi wrote:
 The question of whether variable t should lay in heap or stack, depends not
only
 on the sizeof(T), but also on the context in which func is called at. If func
is
 called at a context in which allocating T on stack would cause a stack
overflow,
 then t should be allocated from the heap. On the other hand, if func is called
 at a context where T still fits nicely on the stack, it would probably be
better
 and faster to allocate t on stack.

You can't determine this, even at runtime, because you don't know what subsequent calls will use enough stack to overflow it. It is true, though, that you can do what is called "escape analysis", which if it can prove that a reference to the object cannot escape the function scope, then it can be allocated on the stack (if it isn't larger than some arbitrarily chosen size). Such escape analysis is entirely possible with D without changing any semantics.
Nov 04 2012
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/4/2012 1:27 PM, Walter Bright wrote:
 Such escape analysis is entirely possible with D without changing any
semantics.

The compiler already does a fairly primitive form of this analysis when deciding to allocate a closure on the stack or the heap.
Nov 04 2012
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/4/2012 3:20 PM, Tommi wrote:
 It's great that optimizations could be used to eliminate unnecessary heap
 allocations. But it could be interesting to imagine a language where it was
 impossible for the programmer to explicitly specify the allocation method.
 Things would go to heap if it was determined necessary by compile-time
analysis,
 and also when the heuristics determine there might a danger of stack overflow.
 Otherwise things would go to stack. Pointers (or rather re-directable
 references) would always be used for referring to existing variables (that
could
 be on heap or not). Pointers would never represent "a variable allocated on
heap".

That's what Java does today.
Nov 04 2012
prev sibling next sibling parent "Tommi" <tommitissari hotmail.com> writes:
On Sunday, 4 November 2012 at 21:27:36 UTC, Walter Bright wrote:
 You can't determine this, even at runtime, because you don't 
 know what subsequent calls will use enough stack to overflow it.

Right, I didn't take into account that heap allocations use some stack anyway to store the pointers. Since it's not possible to completely stop the stack from filling any further once it's been filled up, some kind of conservative heuristic should be used then for determining whether things should go on heap or not. That heuristic would be based on both compile-time and run-time analysis of the program. The heuristic cannot guarantee that there won't be a stack overflow, but neither can the programmer. It's great that optimizations could be used to eliminate unnecessary heap allocations. But it could be interesting to imagine a language where it was impossible for the programmer to explicitly specify the allocation method. Things would go to heap if it was determined necessary by compile-time analysis, and also when the heuristics determine there might a danger of stack overflow. Otherwise things would go to stack. Pointers (or rather re-directable references) would always be used for referring to existing variables (that could be on heap or not). Pointers would never represent "a variable allocated on heap".
Nov 04 2012
prev sibling parent deadalnix <deadalnix gmail.com> writes:
Le 04/11/2012 18:35, Tommi a écrit :
 I have a fundamental language design talking point for you. It's not
 specific to D. I claim that, most of the time, a programmer cannot, and
 shouldn't have to, make the decision of whether to allocate on stack or
 heap. For example:

 void func(T)()
 {
 auto t = <allocate T from heap or stack?>
 ...
 }

 The question of whether variable t should lay in heap or stack, depends
 not only on the sizeof(T), but also on the context in which func is
 called at. If func is called at a context in which allocating T on stack
 would cause a stack overflow, then t should be allocated from the heap.
 On the other hand, if func is called at a context where T still fits
 nicely on the stack, it would probably be better and faster to allocate
 t on stack.

 So, the question of whether to allocate that variable t on stack or heap
 is something that only the compiler (or runtime) can answer. Is there
 any language where you have the ability to say "allocate T from wherever
 it's best"?

 I wonder if it would be possible in D to let the compiler allocate
 dynamic arrays on stack when it can statically guarantee that it's safe
 to do so (no dangling references, never increasing the size of the
 array, etc).

BTW, Why can't we allocate 1Gb stack or something insanely big on 64bits systems ? It will not use actual physical memory unless it is used, and much more stuffs can be allocated on stack.
Nov 05 2012