digitalmars.D - Generic and fundamental language design issue

Tommi (24/24) Nov 04 2012 I have a fundamental language design talking point for you. It's

Andrei Alexandrescu (9/13) Nov 04 2012 I don't think that claim is valid. As a simple example, polymorphism

Tommi (12/16) Nov 04 2012 Sure, there are situations where heap allocation is always
Paulo Pinto (9/22) Nov 04 2012 Java and Go work around this issue using escape analysis, in opposite
deadalnix (3/15) Nov 04 2012 If it is proven that the object is scoped, then allocating it on stack

Jacob Carlborg (5/7) Nov 04 2012 Exactly, I do that sometimes with "scope", but that is deprecated these

Jakob Ovrum (3/7) Nov 04 2012 David Nadlinger was talking about how LDC does an optimization

David Nadlinger (9/16) Nov 04 2012 Yes, LDC has an optimization pass like this indeed. However, it

Dmitry Olshansky (31/57) Nov 04 2012 Actually it can and most definitely should. It's scoped lifetime vs

Tommi (8/18) Nov 04 2012 In a perfect world, I think, the compiler would see that blah
Tommi (22/26) Nov 04 2012 But that's kind of my point also. For example:

Tommi (6/17) Nov 04 2012 The reason for that being that if n is large enough and t gets

Walter Bright (8/14) Nov 04 2012 You can't determine this, even at runtime, because you don't know what

Walter Bright (3/4) Nov 04 2012 The compiler already does a fairly primitive form of this analysis when ...
Tommi (20/22) Nov 04 2012 Right, I didn't take into account that heap allocations use some

Walter Bright (2/10) Nov 04 2012 That's what Java does today.

deadalnix (5/29) Nov 05 2012 BTW, Why can't we allocate 1Gb stack or something insanely big on 64bits...

"Tommi" <tommitissari hotmail.com> writes:

I have a fundamental language design talking point for you. It's 
not specific to D. I claim that, most of the time, a programmer 
cannot, and shouldn't have to, make the decision of whether to 
allocate on stack or heap. For example:

void func(T)()
{
     auto t = <allocate T from heap or stack?>
     ...
}

The question of whether variable t should lay in heap or stack, 
depends not only on the sizeof(T), but also on the context in 
which func is called at. If func is called at a context in which 
allocating T on stack would cause a stack overflow, then t should 
be allocated from the heap. On the other hand, if func is called 
at a context where T still fits nicely on the stack, it would 
probably be better and faster to allocate t on stack.

So, the question of whether to allocate that variable t on stack 
or heap is something that only the compiler (or runtime) can 
answer. Is there any language where you have the ability to say 
"allocate T from wherever it's best"?

I wonder if it would be possible in D to let the compiler 
allocate dynamic arrays on stack when it can statically guarantee 
that it's safe to do so (no dangling references, never increasing 
the size of the array, etc).

Nov 04 2012

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 11/4/12 12:35 PM, Tommi wrote:
 I have a fundamental language design talking point for you. It's not
 specific to D. I claim that, most of the time, a programmer cannot, and
 shouldn't have to, make the decision of whether to allocate on stack or
 heap.

I don't think that claim is valid. As a simple example, polymorphism 
requires indirection (due to variations in size of the dynamic type 
compared to the static type) and indirection is strongly correlated with 
dynamic allocation. Also, the value vs. reference semantics of type are 
strongly correlated with where objects should go. So on the contrary, 
quite often heap vs. stack allocation is forced by the nature of what's 
allocated.

Andrei

Nov 04 2012

"Tommi" <tommitissari hotmail.com> writes:

On Sunday, 4 November 2012 at 17:41:17 UTC, Andrei Alexandrescu 
wrote:
 I don't think that claim is valid. As a simple example, 
 polymorphism requires indirection (due to variations in size of 
 the dynamic type compared to the static type) and indirection 
 is strongly correlated with dynamic allocation.

Sure, there are situations where heap allocation is always 
needed. But couldn't the question of "whether to heap or stack 
allocate" be considered just an implementation detail of the 
language. And hide that implementation detail so that the 
programmer doesn't even need to know what the words heap and 
stack mean.

I mean, wouldn't it be theoretically possible to sometimes even 
let the compiler allocate class objects in D from the stack, if 
the compiler can see that it's safe to do so (if that variable's 
exact type is known at compile time and never changes).

Nov 04 2012

Paulo Pinto <pjmlp progtools.org> writes:

Am 04.11.2012 18:41, schrieb Andrei Alexandrescu:
 On 11/4/12 12:35 PM, Tommi wrote:
 I have a fundamental language design talking point for you. It's not
 specific to D. I claim that, most of the time, a programmer cannot, and
 shouldn't have to, make the decision of whether to allocate on stack or
 heap.

 I don't think that claim is valid. As a simple example, polymorphism
 requires indirection (due to variations in size of the dynamic type
 compared to the static type) and indirection is strongly correlated with
 dynamic allocation. Also, the value vs. reference semantics of type are
 strongly correlated with where objects should go. So on the contrary,
 quite often heap vs. stack allocation is forced by the nature of what's
 allocated.

 Andrei

Java and Go work around this issue using escape analysis, in opposite 
directions though.

In Java most JITs can allocate objects in the stack if they see the 
object does not escape the local scope.

Go goes the other way, by allocating in the heap local objects that are 
usually allocated on the stack but are returned the caller.

--
Paulo

Nov 04 2012

deadalnix <deadalnix gmail.com> writes:

Le 04/11/2012 18:41, Andrei Alexandrescu a écrit :
 On 11/4/12 12:35 PM, Tommi wrote:
 I have a fundamental language design talking point for you. It's not
 specific to D. I claim that, most of the time, a programmer cannot, and
 shouldn't have to, make the decision of whether to allocate on stack or
 heap.

 I don't think that claim is valid. As a simple example, polymorphism
 requires indirection (due to variations in size of the dynamic type
 compared to the static type) and indirection is strongly correlated with
 dynamic allocation. Also, the value vs. reference semantics of type are
 strongly correlated with where objects should go. So on the contrary,
 quite often heap vs. stack allocation is forced by the nature of what's
 allocated.

If it is proven that the object is scoped, then allocating it on stack 
make sens as well, even if it will be used only by reference.

Nov 04 2012

Jacob Carlborg <doob me.com> writes:

On 2012-11-05 01:09, deadalnix wrote:

 If it is proven that the object is scoped, then allocating it on stack
 make sens as well, even if it will be used only by reference.

Exactly, I do that sometimes with "scope", but that is deprecated these 
days.

-- 
/Jacob Carlborg

Nov 04 2012

"Jakob Ovrum" <jakobovrum gmail.com> writes:

On Sunday, 4 November 2012 at 17:35:09 UTC, Tommi wrote:
 I wonder if it would be possible in D to let the compiler 
 allocate dynamic arrays on stack when it can statically 
 guarantee that it's safe to do so (no dangling references, 
 never increasing the size of the array, etc).

David Nadlinger was talking about how LDC does an optimization 
exactly like this, quite recently.

Nov 04 2012

"David Nadlinger" <see klickverbot.at> writes:

On Sunday, 4 November 2012 at 17:41:19 UTC, Jakob Ovrum wrote:
 On Sunday, 4 November 2012 at 17:35:09 UTC, Tommi wrote:
 I wonder if it would be possible in D to let the compiler 
 allocate dynamic arrays on stack when it can statically 
 guarantee that it's safe to do so (no dangling references, 
 never increasing the size of the array, etc).

 David Nadlinger was talking about how LDC does an optimization 
 exactly like this, quite recently.

Yes, LDC has an optimization pass like this indeed. However, it 
is currently disabled because its implementation causes a 
seemingly unrelated, hard-to-diagnose test suite failure. If 
anyone is interested in doing a little compiler debugging, here 
is the current state: 
https://github.com/ldc-developers/ldc/pull/206. I unfortunately 
won't be able to work on it in the next few days…

David

Nov 04 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

11/4/2012 9:35 PM, Tommi пишет:
 I have a fundamental language design talking point for you. It's not
 specific to D. I claim that, most of the time, a programmer cannot, and
 shouldn't have to, make the decision of whether to allocate on stack or
 heap.

Actually it can and most definitely should. It's scoped lifetime vs 
infinite lifetime, reference and value semantics. And e.g. RAII structs 
have to be on stack to have their effect applied usefully. Storing the 
reference to 't' elsewhere implies heap allocation (be it GC heap or 
manually managed one).

Going further allocation strategy is a deep topic with a lot of 
considerations and in the end it goes far beyond stack vs GC heap.

 For example:

 void func(T)()
 {
      auto t = <allocate T from heap or stack?>
      ...
 }

 The question of whether variable t should lay in heap or stack, depends
 not only on the sizeof(T), but also on the context in which func is
 called at. If func is called at a context in which allocating T on stack
 would cause a stack overflow, then t should be allocated from the heap.

In fact placing big object on stack just because it fits could introduce 
stack overflow few hundred calls later.
In general compiler can't know beforehand how big stack space you'll 
need and thusly the portion of it to use safely.

 On the other hand, if func is called at a context where T still fits
 nicely on the stack, it would probably be better and faster to allocate
 t on stack.

Unless you escape references. Compiler could detect it but will stay on 
the conservative side. In the end it may end up with unexpected heap 
allocations.

 So, the question of whether to allocate that variable t on stack or heap
 is something that only the compiler (or runtime) can answer.

God no. It may as well depend on the code compiler knows nothing about 
(nor run-time for this matter).

extern(C):
int blah(void* ptr);

void func(T)()
{
       auto t = <allocate T from heap or stack?>
       blah(&t); //now what?
       ...
}

If blah doesn't store pointer somewhere inside you are fine with stack. 
If it does then not even GC heap will help you.

 Is there
 any language where you have the ability to say "allocate T from wherever
 it's best"?

In simplest case I'd consider ability to bypass allocation on heap as an 
optimization.

 I wonder if it would be possible in D to let the compiler allocate
 dynamic arrays on stack when it can statically guarantee that it's safe
 to do so (no dangling references, never increasing the size of the
 array, etc).


-- 
Dmitry Olshansky

Nov 04 2012

"Tommi" <tommitissari hotmail.com> writes:

On Sunday, 4 November 2012 at 18:14:36 UTC, Dmitry Olshansky 
wrote:
 extern(C):
 int blah(void* ptr);

 void func(T)()
 {
       auto t = <allocate T from heap or stack?>
       blah(&t); //now what?
       ...
 }

 If blah doesn't store pointer somewhere inside you are fine 
 with stack. If it does then not even GC heap will help you.


In a perfect world, I think, the compiler would see that blah 
stores ptr and therefore would (implicitly) allocate t on the 
heap. The deallocation of the memory for t would be handled 
either by garbage collector or by some kind of implicit reference 
counting.

I guess with extern C functions none of the above is doable.

Nov 04 2012

"Tommi" <tommitissari hotmail.com> writes:

On Sunday, 4 November 2012 at 18:14:36 UTC, Dmitry Olshansky 
wrote:
 In fact placing big object on stack just because it fits could 
 introduce stack overflow few hundred calls later.

But that's kind of my point also. For example:

void func(T)(int n)
{
     if (n <= 0)
         return;

     auto t = <create variable t of type T>
     ...
     func!(T)(n - 1);
}

If func is called with a runtime argument n, then the programmer 
cannot determine the most sensible way to allocate for variable 
t. What I'd like to happen is that, at runtime, every time a 
variable t is created, the program would determine the best way 
to allocate the memory for it, based on the available stack size 
(prefer stack if there's room there).

On Sunday, 4 November 2012 at 18:14:36 UTC, Dmitry Olshansky 
wrote:
 In general compiler can't know beforehand how big stack space 
 you'll need and thusly the portion of it to use safely.

Yeah, I just realized it can't be done at compile-time. Making 
the decision between stack or heap allocation should happen at 
run-time.

Nov 04 2012

"Tommi" <tommitissari hotmail.com> writes:

On Sunday, 4 November 2012 at 19:42:29 UTC, Tommi wrote:
 void func(T)(int n)
 {
     if (n <= 0)
         return;

     auto t = <create variable t of type T>
     ...
     func!(T)(n - 1);
 }

 If func is called with a runtime argument n, then the 
 programmer cannot determine the most sensible way to allocate 
 for variable t.

The reason for that being that if n is large enough and t gets 
always allocated on stack, then there's going to be a stack 
overflow. And if n is small enough so that all those t's could be 
allocated on stack, then allocating them on heap would just make 
the code slower.

Nov 04 2012

Walter Bright <newshound2 digitalmars.com> writes:

On 11/4/2012 9:35 AM, Tommi wrote:
 The question of whether variable t should lay in heap or stack, depends not
only
 on the sizeof(T), but also on the context in which func is called at. If func
is
 called at a context in which allocating T on stack would cause a stack
overflow,
 then t should be allocated from the heap. On the other hand, if func is called
 at a context where T still fits nicely on the stack, it would probably be
better
 and faster to allocate t on stack.

You can't determine this, even at runtime, because you don't know what 
subsequent calls will use enough stack to overflow it.

It is true, though, that you can do what is called "escape analysis", which if 
it can prove that a reference to the object cannot escape the function scope, 
then it can be allocated on the stack (if it isn't larger than some arbitrarily 
chosen size).

Such escape analysis is entirely possible with D without changing any semantics.

Nov 04 2012

Walter Bright <newshound2 digitalmars.com> writes:

On 11/4/2012 1:27 PM, Walter Bright wrote:
 Such escape analysis is entirely possible with D without changing any
semantics.

The compiler already does a fairly primitive form of this analysis when
deciding 
to allocate a closure on the stack or the heap.

Nov 04 2012

"Tommi" <tommitissari hotmail.com> writes:

On Sunday, 4 November 2012 at 21:27:36 UTC, Walter Bright wrote:
 You can't determine this, even at runtime, because you don't 
 know what subsequent calls will use enough stack to overflow it.

Right, I didn't take into account that heap allocations use some 
stack anyway to store the pointers.

Since it's not possible to completely stop the stack from filling 
any further once it's been filled up, some kind of conservative 
heuristic should be used then for determining whether things 
should go on heap or not. That heuristic would be based on both 
compile-time and run-time analysis of the program. The heuristic 
cannot guarantee that there won't be a stack overflow, but 
neither can the programmer.

It's great that optimizations could be used to eliminate 
unnecessary heap allocations. But it could be interesting to 
imagine a language where it was impossible for the programmer to 
explicitly specify the allocation method. Things would go to heap 
if it was determined necessary by compile-time analysis, and also 
when the heuristics determine there might a danger of stack 
overflow. Otherwise things would go to stack. Pointers (or rather 
re-directable references) would always be used for referring to 
existing variables (that could be on heap or not). Pointers would 
never represent "a variable allocated on heap".

Nov 04 2012

Walter Bright <newshound2 digitalmars.com> writes:

On 11/4/2012 3:20 PM, Tommi wrote:
 It's great that optimizations could be used to eliminate unnecessary heap
 allocations. But it could be interesting to imagine a language where it was
 impossible for the programmer to explicitly specify the allocation method.
 Things would go to heap if it was determined necessary by compile-time
analysis,
 and also when the heuristics determine there might a danger of stack overflow.
 Otherwise things would go to stack. Pointers (or rather re-directable
 references) would always be used for referring to existing variables (that
could
 be on heap or not). Pointers would never represent "a variable allocated on
heap".

That's what Java does today.

Nov 04 2012

deadalnix <deadalnix gmail.com> writes:

Le 04/11/2012 18:35, Tommi a écrit :
 I have a fundamental language design talking point for you. It's not
 specific to D. I claim that, most of the time, a programmer cannot, and
 shouldn't have to, make the decision of whether to allocate on stack or
 heap. For example:

 void func(T)()
 {
 auto t = <allocate T from heap or stack?>
 ...
 }

 The question of whether variable t should lay in heap or stack, depends
 not only on the sizeof(T), but also on the context in which func is
 called at. If func is called at a context in which allocating T on stack
 would cause a stack overflow, then t should be allocated from the heap.
 On the other hand, if func is called at a context where T still fits
 nicely on the stack, it would probably be better and faster to allocate
 t on stack.

 So, the question of whether to allocate that variable t on stack or heap
 is something that only the compiler (or runtime) can answer. Is there
 any language where you have the ability to say "allocate T from wherever
 it's best"?

 I wonder if it would be possible in D to let the compiler allocate
 dynamic arrays on stack when it can statically guarantee that it's safe
 to do so (no dangling references, never increasing the size of the
 array, etc).

BTW, Why can't we allocate 1Gb stack or something insanely big on 64bits 
systems ?

It will not use actual physical memory unless it is used, and much more 
stuffs can be allocated on stack.

Nov 05 2012

D Programming

C/C++ Programming

Other

digitalmars.D - Generic and fundamental language design issue