
digitalmars.D - Variable-length stack allocated arrays

reply bearophile <bearophileHUGS lycos.com> writes:
You can see this as a small DEP :-)

Small variable-length arrays allocated on the stack can be somewhat useful in
D. They avoid most usages of alloca() and are a bit safer than alloca(), because
no pointer casting is needed.

alloca() gives some performance improvement over normal dynamic arrays when the
array is small and the function (here bar()) is called many times. I have seen
this experimentally in the Levenshtein distance function, which needs to
allocate a temporary buffer: if the two input strings are short, I use a
fixed-sized array as the buffer, and this improves performance somewhat in code
that computes the distance over many small strings. I think a variable-length
array is a bit better here. (In this situation Tango often adopts the strategy
of adding an extra function argument that lets the caller pass a preallocated
buffer to the function.)

The syntax that can be used for such arrays is nothing strange, it's the same
as fixed-sized arrays, but the length is a run-time variable:

void foo(int[] b) {} // OK
// void foo(int[100] b) {} // Wrong

void bar(int n) {
  int[n] a; // stack-allocated, variable-size
  foo(a);
}

When you pass one of these variable-length arrays to another function like
foo(), it must always become a dynamic array, because the length is generally
not known at compile time.

An alternative syntax that I don't like (too long, and it doesn't allow adding
=void):

void bar(int n) {
  scope int[] a = new int[n]; // stack-allocated
  foo(a);
}

The advantage of the scope int[] a = ... syntax is that it makes the stack
allocation easy to see, but I don't think that is important.

Bye,
bearophile
Jan 11 2010
next sibling parent reply grauzone <none example.net> writes:
bearophile wrote:
 void bar(int n) {
   scope int[] a = new int[n]; // stack-allocated
   foo(a);
 }

Why are you making such proposals, when one of the core developers even thought about removing normal "scope"? It's almost 100% guaranteed that nobody will listen.

I personally find it a good idea to find new ways to reduce the production of memory garbage. The D GC is slow and bad, so you'd better avoid it. Let's make this claim: it is impossible to write high-performance applications (that need to make use of dynamic memory allocation) in D without resorting to "unsafe" techniques. That would include allocating memory on the stack, or manually freeing memory.

Maybe D should have focused more on safe and efficient memory management concepts, like those found in languages such as Cyclone (though I don't know how useful/feasible this is, but it sounded promising).
Jan 11 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
grauzone wrote:
 bearophile wrote:
 void bar(int n) {
   scope int[] a = new int[n]; // stack-allocated
   foo(a);
 }

Why are you making such proposals, when one of the core developers even thought about removing normal "scope"? It's almost 100% guaranteed that nobody will listen. I personally find it a good idea to find new ways to reduce producing memory garbage. The D GC is slow and bad, so you'd better avoid it. Let's make this claim: it is impossible to write high performance applications (that need to make use of dynamic memory allocation) in D without resorting to "unsafe" techniques. That would include allocating memory on the stack, or manually freeing memory.

I write high-performance code in D without resorting to unsafe techniques. Much of my code allocates arrays only in the beginning and uses them throughout. Andrei
Jan 11 2010
next sibling parent reply grauzone <none example.net> writes:
Andrei Alexandrescu wrote:
 grauzone wrote:
 bearophile wrote:
 void bar(int n) {
   scope int[] a = new int[n]; // stack-allocated
   foo(a);
 }

Why are you making such proposals, when one of the core developers even thought about removing normal "scope"? It's almost 100% guaranteed that nobody will listen. I personally find it a good idea to find new ways to reduce producing memory garbage. The D GC is slow and bad, so you'd better avoid it. Let's make this claim: it is impossible to write high performance applications (that need to make use of dynamic memory allocation) in D without resorting to "unsafe" techniques. That would include allocating memory on the stack, or manually freeing memory.

I write high-performance code in D without resorting to unsafe techniques. Much of my code allocates arrays only in the beginning and uses them throughout.

I intended to exclude this case with the qualifier "that need to make use of dynamic memory allocation". But I guess that doesn't apply to programs that only allocate at initialization. So, how about programs that allocate/release memory while doing computations?
 
 Andrei

Jan 11 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
grauzone wrote:
 Andrei Alexandrescu wrote:
 grauzone wrote:
 bearophile wrote:
 void bar(int n) {
   scope int[] a = new int[n]; // stack-allocated
   foo(a);
 }

Why are you making such proposals, when one of the core developers even thought about removing normal "scope"? It's almost 100% guaranteed that nobody will listen. I personally find it a good idea to find new ways to reduce producing memory garbage. The D GC is slow and bad, so you'd better avoid it. Let's make this claim: it is impossible to write high performance applications (that need to make use of dynamic memory allocation) in D without resorting to "unsafe" techniques. That would include allocating memory on the stack, or manually freeing memory.

I write high-performance code in D without resorting to unsafe techniques. Much of my code allocates arrays only in the beginning and uses them throughout.

I intended to exclude this case with applications "that need to make use of dynamic memory allocation". But I guess this doesn't apply to programs that only allocate on initialization. So, how about programs that allocate/release memory while doing computations?

"Oil is found in the minds of men." Andrei
Jan 11 2010
next sibling parent reply grauzone <none example.net> writes:
Andrei Alexandrescu wrote:
 grauzone wrote:
 Andrei Alexandrescu wrote:
 grauzone wrote:
 bearophile wrote:
 void bar(int n) {
   scope int[] a = new int[n]; // stack-allocated
   foo(a);
 }

Why are you making such proposals, when one of the core developers even thought about removing normal "scope"? It's almost 100% guaranteed that nobody will listen. I personally find it a good idea to find new ways to reduce producing memory garbage. The D GC is slow and bad, so you'd better avoid it. Let's make this claim: it is impossible to write high performance applications (that need to make use of dynamic memory allocation) in D without resorting to "unsafe" techniques. That would include allocating memory on the stack, or manually freeing memory.

I write high-performance code in D without resorting to unsafe techniques. Much of my code allocates arrays only in the beginning and uses them throughout.

I intended to exclude this case with applications "that need to make use of dynamic memory allocation". But I guess this doesn't apply to programs that only allocate on initialization. So, how about programs that allocate/release memory while doing computations?

"Oil is found in the minds of men."

Can you explain what this means in the context of the problem mentioned above?
 Andrei

Jan 11 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
grauzone wrote:
 Andrei Alexandrescu wrote:
 grauzone wrote:
 Andrei Alexandrescu wrote:
 grauzone wrote:
 bearophile wrote:
 void bar(int n) {
   scope int[] a = new int[n]; // stack-allocated
   foo(a);
 }

Why are you making such proposals, when one of the core developers even thought about removing normal "scope"? It's almost 100% guaranteed that nobody will listen. I personally find it a good idea to find new ways to reduce producing memory garbage. The D GC is slow and bad, so you'd better avoid it. Let's make this claim: it is impossible to write high performance applications (that need to make use of dynamic memory allocation) in D without resorting to "unsafe" techniques. That would include allocating memory on the stack, or manually freeing memory.

I write high-performance code in D without resorting to unsafe techniques. Much of my code allocates arrays only in the beginning and uses them throughout.

I intended to exclude this case with applications "that need to make use of dynamic memory allocation". But I guess this doesn't apply to programs that only allocate on initialization. So, how about programs that allocate/release memory while doing computations?

"Oil is found in the minds of men."

Can you explain what this means in the context of the problem mentioned above?

It means that if you start with the goal of making high-performance applications safe, you may achieve that. One guaranteed way to not get there is to start with a claim that it can't be done.

That being said, I like stack-allocated arrays, and I asked Walter for a long time to introduce them in the language. They have their problems though. Andrei
Jan 11 2010
parent reply grauzone <none example.net> writes:
Andrei Alexandrescu wrote:
 grauzone wrote:
 Andrei Alexandrescu wrote:
 grauzone wrote:
 Andrei Alexandrescu wrote:
 grauzone wrote:
 bearophile wrote:
 void bar(int n) {
   scope int[] a = new int[n]; // stack-allocated
   foo(a);
 }

Why are you making such proposals, when one of the core developers even thought about removing normal "scope"? It's almost 100% guaranteed that nobody will listen. I personally find it a good idea to find new ways to reduce producing memory garbage. The D GC is slow and bad, so you'd better avoid it. Let's make this claim: it is impossible to write high performance applications (that need to make use of dynamic memory allocation) in D without resorting to "unsafe" techniques. That would include allocating memory on the stack, or manually freeing memory.

I write high-performance code in D without resorting to unsafe techniques. Much of my code allocates arrays only in the beginning and uses them throughout.

I intended to exclude this case with applications "that need to make use of dynamic memory allocation". But I guess this doesn't apply to programs that only allocate on initialization. So, how about programs that allocate/release memory while doing computations?

"Oil is found in the minds of men."

Can you explain what this means in the context of the problem mentioned above?

It means that if you start with the goal of making high performance applications safe, you may achieve that. One guaranteed way to not get there is to start with a claim that it can't be done.

I just wanted to leave that "claim" out there for anybody to attack. But in the end, solutions to this problem will only be workarounds, and programs will possibly be more convoluted than their C/C++ counterparts. E.g. you could just use freelists (reusing memory without going through the runtime memory manager), but then you'd lose constructors, etc...
 That being said, I like stack-allocated and I asked Walter for a long 
 time to introduce them in the language. They have their problems though.

They can just be disabled in safe mode, if that's the problem.
 
 Andrei

Jan 11 2010
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
grauzone wrote:
 Andrei Alexandrescu wrote:
 grauzone wrote:
 Andrei Alexandrescu wrote:
 grauzone wrote:
 Andrei Alexandrescu wrote:
 grauzone wrote:
 bearophile wrote:
 void bar(int n) {
   scope int[] a = new int[n]; // stack-allocated
   foo(a);
 }

Why are you making such proposals, when one of the core developers even thought about removing normal "scope"? It's almost 100% guaranteed that nobody will listen. I personally find it a good idea to find new ways to reduce producing memory garbage. The D GC is slow and bad, so you'd better avoid it. Let's make this claim: it is impossible to write high performance applications (that need to make use of dynamic memory allocation) in D without resorting to "unsafe" techniques. That would include allocating memory on the stack, or manually freeing memory.

I write high-performance code in D without resorting to unsafe techniques. Much of my code allocates arrays only in the beginning and uses them throughout.

I intended to exclude this case with applications "that need to make use of dynamic memory allocation". But I guess this doesn't apply to programs that only allocate on initialization. So, how about programs that allocate/release memory while doing computations?

"Oil is found in the minds of men."

Can you explain what this means in the context of the problem mentioned above?

It means that if you start with the goal of making high performance applications safe, you may achieve that. One guaranteed way to not get there is to start with a claim that it can't be done.

I just wanted to leave that "claim" for anybody to attack. But in the end, solutions to this problem will only be workarounds, and programs will possibly be more convoluted than their C/C++ counterparts.

C++ doesn't have stack-allocated arrays. Andrei
Jan 11 2010
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
grauzone wrote:
 Andrei Alexandrescu wrote:
 That being said, I like stack-allocated and I asked Walter for a long 
 time to introduce them in the language. They have their problems though.


One of the problems is that they can easily lead to stack exhaustion. Stack sizes for threads must all be preallocated.
Jan 11 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Walter Bright:
 One of the problems is they can easily lead to stack exhaustion. Stack 
 sizes for a thread must all be preallocated.

Yes, that's of course a problem. It's meaningful to allocate arrays on the stack only if they are small. You can exhaust the stack with normal fixed-sized arrays too:

int foo(int n) {
  int[100] a;
  ...
  y = foo(k);
  ...
}

The main difference is that here you can see more easily that each nested call to foo() burns 400+ bytes of stack (but as in the variable-length case, you may not statically know how many nested calls to foo() will happen at runtime, so the end result is the same: you don't know whether you will have a stack overflow at runtime).

In truth we can live without variable-length stack-allocated arrays; mine may be just feature envy, but I prefer variable-length stack-allocated arrays over raw alloca(). Thank you very much for showing up in this thread, I appreciate even a small answer a lot :-)

Bye,
bearophile
Jan 12 2010
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 "Oil is found in the minds of men."

I hope your book will put some good oil in my head :-) Bye, bearophile
Jan 11 2010
prev sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 grauzone wrote:
 bearophile wrote:
 void bar(int n) {
   scope int[] a = new int[n]; // stack-allocated
   foo(a);
 }

Why are you making such proposals, when one of the core developers even thought about removing normal "scope"? It's almost 100% guaranteed that nobody will listen. I personally find it a good idea to find new ways to reduce producing memory garbage. The D GC is slow and bad, so you'd better avoid it. Let's make this claim: it is impossible to write high performance applications (that need to make use of dynamic memory allocation) in D without resorting to "unsafe" techniques. That would include allocating memory on the stack, or manually freeing memory.

techniques. Much of my code allocates arrays only in the beginning and uses them throughout.

One thing I've always wondered about people who operate this way is, how do you do it without violating tons of encapsulation? For example, let's say you want foo() to be a well-encapsulated free function:

int foo(int num) {
  int[] temp = new int[num];
  // Do stuff.
}

Now how do you draw a line of abstraction that reuses temp in a well-encapsulated way, one that is invisible to the caller of foo()? Since bugs that occur at higher levels of abstraction are often caused by poor encapsulation and programs that are too hard to reason about, I'd rather have a well-encapsulated "unsafe" speed hack than a poorly encapsulated "safe" one.
Jan 11 2010
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
grauzone:
 Why are you making such proposals, when one of the core developers even 
 thought about removing normal "scope"?

I have a nice implementation of the Barnes-Hut algorithm translated from Java. Its running time goes from 18.1 seconds to 7.6 s after adding "scope" where some classes are instantiated (those classes are 3D vectors, which can also become structs). If scope classes are removed, then it may be necessary to add struct inheritance to D2, because in several situations objects can't be used. (And by the way, the very latest Java HotSpot needs about 10.9 s to run the same code, despite performing escape analysis, probably because that analysis is not as good as my manual scope annotations.) On request I can show all the code discussed here.

Bye,
bearophile
Jan 11 2010
parent bearophile <bearophileHUGS lycos.com> writes:
retard:
 latest = 17.0-b05 ?

It was introduced in 1.6.0_14, but you need to add -XX:+DoEscapeAnalysis to activate it. I am using build 1.7.0-ea-b76, where I think EA is activated by default. Look at the Sun docs for more info; I am not an expert on this. Bye, bearophile
Jan 11 2010
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
grauzone:
 Why are you making such proposals, when one of the core developers even 
 thought about removing normal "scope"?

And by the way, in the Python community one of the purposes of PEPs is to act as an archive of refused proposals, so people can avoid wasting brain energy over and over again on permanently refused ideas :-) Sometimes it's worth doing even wrong things well :-)

Bye,
bearophile
Jan 11 2010
next sibling parent grauzone <none example.net> writes:
bearophile wrote:
 grauzone:
 Why are you making such proposals, when one of the core developers even 
 thought about removing normal "scope"?

And by the way, in the Python community one of the purposes of PEPs is to act as an archive of refused proposals, so people can avoid wasting brain energy over and over again about permanently refused ideas :-) Sometimes it's worth doing well even wrong things :-)

You could try to write a real DIP: http://prowiki.org/wiki4d/wiki.cgi?LanguageDevel/DIPs
 Bye,
 bearophile

Jan 11 2010
prev sibling parent Lutger <lutger.blijdestijn gmail.com> writes:
On 01/11/2010 11:33 AM, bearophile wrote:
 grauzone:
 Why are you making such proposals, when one of the core developers even
 thought about removing normal "scope"?

And by the way, in the Python community one of the purposes of PEPs is to act as an archive of refused proposals, so people can avoid wasting brain energy over and over again about permanently refused ideas :-) Sometimes it's worth doing well even wrong things :-) Bye, bearophile

That's good, but the newsgroup is hardly an archive, and a post is not the same as a PEP. You put a lot of time in here, but this kind of information is fleeting and will be lost in the future. If you want this, you should take yourself seriously and organise your posts in the wiki, with a link to the ng.
Jan 12 2010
prev sibling parent retard <re tard.com.invalid> writes:
Mon, 11 Jan 2010 05:26:28 -0500, bearophile wrote:

 grauzone:
 Why are you making such proposals, when one of the core developers even
 thought about removing normal "scope"?

I have a nice implementation of Barnes-Hut algorithm translated from Java. Its running time goes from 18.1 seconds to 7.6 s adding "scope" where some classes are initialized (those classes are 3D vectors, that can also become structs). If scope classes are removed, then it may be necessary to add struct inheritance to D2, because in several situations objects can't be used (and by the way, the very latest Java HotSpot

latest = 17.0-b05 ?
 needs about 10.9 s to run
 the same code, despite it performs Escape Analysis, probably because
 such Analysis is not as good as my manual scope annotations). On request
 I can show all the code discussed here.
 
 Bye,
 bearophile

Jan 11 2010
prev sibling next sibling parent reply Chad J <chadjoan __spam.is.bad__gmail.com> writes:
bearophile wrote:
 You can see of this as a small DEP :-)
 
 Small variable-length arrays allocated on the stack can be somewhat useful in
D. They avoid most usages of alloca() and are a bit safer than alloca() because
no pointer casting is needed.
 
 alloca() gives some performance improvement compared to normal dynamic arrays
if the array is small and the function (here bar()) is called many times (I
have experimentally seen this in the Levenshtein distance function, that needs
to allocate a temporary buffer. If the length of the two input strings is small
I use a fixed-sized array as buffer, this improves performance some in code
that needs to compute this distance over many small strings. I think a
variable-length array is a bit better here. In this situation often Tango
adopts the strategy of adding an extra function argument that allows to give a
preallocated buffer to a function).
 

Tango adopting that convention is interesting to me because it seems like a convention that could be used very commonly to make things run faster. Note that it is more general than alloca: it allows one to allocate stack memory in the /caller's/ stack frame.

char[] stringModify( char[] someStr, char[] buffer )
{
    // Expand the buffer, but only if necessary.
    size_t len = calculateNeededLengthFrom(someStr);
    if ( buffer.length < len )
        buffer = new char[len];

    // Sometimes you can't calculate the buffer size needed ahead
    // of time, so you figure it out as you go and use the
    // buffer until you run out.

    // Do stuff with someStr, putting the result into buffer.

    return buffer; // You can do this. buffer is in parent frame.
}

So, is there any way to allocate memory in the caller's stack frame in D right now? Can an abstraction be made to do this for both the caller and callee without a lot of boilerplate?
 
 ... (btw hijacking ur thread)
 
 Bye,
 bearophile

- Chad
Jan 11 2010
parent reply Chad J <chadjoan __spam.is.bad__gmail.com> writes:
Chad J wrote:
 bearophile wrote:
 You can see of this as a small DEP :-)

 Small variable-length arrays allocated on the stack can be somewhat useful in
D. They avoid most usages of alloca() and are a bit safer than alloca() because
no pointer casting is needed.

 alloca() gives some performance improvement compared to normal dynamic arrays
if the array is small and the function (here bar()) is called many times (I
have experimentally seen this in the Levenshtein distance function, that needs
to allocate a temporary buffer. If the length of the two input strings is small
I use a fixed-sized array as buffer, this improves performance some in code
that needs to compute this distance over many small strings. I think a
variable-length array is a bit better here. In this situation often Tango
adopts the strategy of adding an extra function argument that allows to give a
preallocated buffer to a function).

Tango adopting that convention is interesting to me because it seems like a convention that could be used very commonly to make things run faster. Note that it is more general than alloca: it allows one to allocate stack memory in the /caller's/ stack frame. char[] stringModify( char[] someStr, char[] buffer ) { // Expand the buffer, but only if necessary. size_t len = calculateNeededLengthFrom(someStr); if ( buffer.length < len ) buffer = new buffer[len]; // Sometimes you can't calculate the buffer size needed ahead // of time, so you figure it out as you go and use the // buffer until you run out. // Do stuff with someStr, putting the result into buffer. return buffer; // You can do this. buffer is in parent frame. } So, is there any way to allocate memory in the caller's stack frame in D right now?

Err, I mean without explicit help from the caller.
 Can an abstraction be made to do this for both the caller and callee without a
lot of boilerplate?
 
 ... (btw hijacking ur thread)

 Bye,
 bearophile

- Chad

Jan 11 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Chad J wrote:
 Chad J wrote:
 bearophile wrote:
 You can see of this as a small DEP :-)

 Small variable-length arrays allocated on the stack can be somewhat useful in
D. They avoid most usages of alloca() and are a bit safer than alloca() because
no pointer casting is needed.

 alloca() gives some performance improvement compared to normal dynamic arrays
if the array is small and the function (here bar()) is called many times (I
have experimentally seen this in the Levenshtein distance function, that needs
to allocate a temporary buffer. If the length of the two input strings is small
I use a fixed-sized array as buffer, this improves performance some in code
that needs to compute this distance over many small strings. I think a
variable-length array is a bit better here. In this situation often Tango
adopts the strategy of adding an extra function argument that allows to give a
preallocated buffer to a function).

like a convention that could be used very commonly to make things run faster. Note that it is more general than alloca: it allows one to allocate stack memory in the /caller's/ stack frame. char[] stringModify( char[] someStr, char[] buffer ) { // Expand the buffer, but only if necessary. size_t len = calculateNeededLengthFrom(someStr); if ( buffer.length < len ) buffer = new buffer[len]; // Sometimes you can't calculate the buffer size needed ahead // of time, so you figure it out as you go and use the // buffer until you run out. // Do stuff with someStr, putting the result into buffer. return buffer; // You can do this. buffer is in parent frame. } So, is there any way to allocate memory in the caller's stack frame in D right now?

Err, I mean without explicit help from the caller.

First, it's good to use the idiom

buffer.length = len;

instead of

buffer = new char[len];

so there's a chance for in-place expansion of buffer.

There's no organized way to do what you want at the moment, but a small API could be provided. For example, the STL has get_temporary_buffer() and return_temporary_buffer():

http://www.sgi.com/tech/stl/get_temporary_buffer.html

There was a discussion on this group; google for superstack site:digitalmars.com. I wanted to implement that, it just fell through the cracks. I think it should be added to bugzilla so it's not forgotten.

Andrei
Jan 11 2010
parent reply Chad J <chadjoan __spam.is.bad__gmail.com> writes:
Andrei Alexandrescu wrote:
 
 First, it's good to use the idiom
 
 buffer.length = len;
 
 instead of
 
 buffer = new buffer[len];
 
 so there's a chance for in-place expansion of buffer.
 

Ah. Good to know.
 There's no organized way to do what you want at the moment, but a small
 API could be provided. For example the STL has get_temporary_buffer()
 and return_temporary_buffer():
 
 http://www.sgi.com/tech/stl/get_temporary_buffer.html
 
 There was a discussion on this group, google for superstack
 site:digitalmars.com. I wanted to implement that, it just fell through
 the cracks. I think it should be added to bugzilla so it's not forgotten.
 
 
 Andrei

That does sound very useful.

http://d.puremagic.com/issues/show_bug.cgi?id=3696

I'm realizing now that returning an object allocated on this stack is a tricky task unless there is help from the caller. The lifetime of the object has to be determined somehow. The Tango convention makes it fairly explicit that the parent's scope will be the beginning and the end for the memory. If it needs to live longer, the caller just passes a buffer that isn't stack/scoped memory. Only the caller really knows how long the memory needs to hang around. Hmmmm.

Still, that SuperStack thing would be useful. It's like alloca but better (look ma, no cast) and slightly more general (e.g. it can be freed in the parent's scope, if you are willing to play it risky).
Jan 11 2010
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Chad J (chadjoan __spam.is.bad__gmail.com)'s article
 http://d.puremagic.com/issues/show_bug.cgi?id=3696
 I'm realizing now that returning an object allocated on this stack is a
 tricky task unless there is help from the caller.  The lifetime of the
 object has to be determined somehow.  The Tango convention makes it
 fairly explicit that the parent's scope will be the beginning and the
 end for the memory.  If it needs to live longer then the caller just
 passes a buffer that isn't stack/scoped memory.  Only the caller really
 knows how long the memory needs to hang around.  hmmmmm.

The optional buffer idiom is great for stuff that's returned from a function. I use it all the time. On the other hand, for temporary buffers that are used internally, whose mere existence is an implementation detail, having the caller maintain and pass in a buffer is a horrible violation of encapsulation and good API design, even by the standards of performance-oriented APIs. Even if you know it's never going to change, it's still annoying for the caller to even have to think about these kinds of details.
 Still that SuperStack thing would be useful.  It's like alloca but
 better (look ma, no cast) and slightly more general (ex: it can be freed
 in parent's scope, if you are willing to play it risky).

Yeah, for about the past year I've been playing around with TempAlloc, which was basically me taking Andrei's SuperStack idea when it was first proposed and running with it, because I needed it sooner rather than later. It's basically encapsulated in dstats.alloc (http://svn.dsource.org/projects/dstats/docs/alloc.html).

Its one **huge** bug that I've been trying to fix is that it's not scanned by the GC, because scanning the whole stack would be too inefficient and I can't think of a way to do it more efficiently. However, it's still useful in cases where either you're just rearranging data and there's already a copy somewhere that is scanned by the GC (sorting for non-parametric statistical calculations is my killer use), or your data is pure value types like floats.

It's got the following advantages over alloca:

1.  It can't overflow unless you're completely out of address space. As a last resort, it allocates another chunk of memory from the heap.

2.  Nicer syntax. For example, to allocate an array of 1,000 uints:

    auto foo = newStack!uint(1_000);

3.  Since lifetimes are not bound to any function scope unless you want them to be, you can implement complex abstract data structures (I've got hash tables, hash sets and AVL trees so far) on this stack. This is useful when you need to build a lookup table as part of some algorithm.
Jan 11 2010
next sibling parent reply Chad J <chadjoan __spam.is.bad__gmail.com> writes:
dsimcha wrote:
 
 The optional buffer idiom is great for stuff that's returned from a function. I
 use it all the time.  On the other hand, for temporary buffers that are used
 internally, whose mere existence is an implementation detail, having the caller
 maintain and pass in a buffer is a horrible violation of encapsulation and good
 API design, even by the standards of performance-oriented APIs.  Even if you know
 it's never going to change, it's still annoying for the caller to even have to
 think about these kinds of details.
 

Yes, quite.
Jan 11 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Chad J wrote:
 dsimcha wrote:
 The optional buffer idiom is great for stuff that's returned from a function. I
 use it all the time.  On the other hand, for temporary buffers that are used
 internally, whose mere existence is an implementation detail, having the caller
 maintain and pass in a buffer is a horrible violation of encapsulation and good
 API design, even by the standards of performance-oriented APIs.  Even if you know
 it's never going to change, it's still annoying for the caller to even have to
 think about these kinds of details.

Yes, quite.

I think an API that is memory-safe relies on one global array like this:

template SuperStack(T) {
    private T[] buffer;
    private enum slackfactor = 2.5;

    T[] getBuffer(size_t n) {
        buffer.length = buffer.length + n;
        return buffer[$ - n .. $];
    }

    void releaseBuffer(T[] b) {
        enforce(b is buffer[$ - b.length .. $]);
        buffer.length = buffer.length - b.length;
        if (gc.capacity(buffer) > buffer.length * slackfactor) {
            // Reallocate buffer to allow collection of slack
            buffer = buffer.dup;
        }
    }
}

Further encapsulation is possible e.g. by defining a struct that pairs calls to getBuffer and releaseBuffer.

The idea is that the API offers a means to define and use temporary buffers without compromising memory safety. Even if you escape data allocated via getBuffer that persists after releaseBuffer, that will not cause undefined behavior. (It may, however, cause data to be overwritten because another call to getBuffer will reuse memory.) Leaks are also possible if you don't release the buffer. That can be solved by not offering getBuffer and releaseBuffer as they are, but instead only encapsulated in a struct with the appropriate constructor and destructor.

Andrei
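The "struct with the appropriate constructor and destructor" mentioned at the end might look like the following. This is a C++ RAII sketch of the idea only; ScopedBuffer and g_buffer are illustrative names, and g_buffer stands in for SuperStack's global array:

```cpp
#include <cstddef>
#include <vector>

// Illustrative global buffer, standing in for SuperStack's `buffer`.
static std::vector<int> g_buffer;

// RAII guard: grows the shared buffer on construction and shrinks it back on
// destruction, so the get/release pair can never be mismatched or leaked.
class ScopedBuffer {
    std::size_t offset_;  // where this slice starts in g_buffer
    std::size_t n_;       // slice length
public:
    explicit ScopedBuffer(std::size_t n)
        : offset_(g_buffer.size()), n_(n) {
        g_buffer.resize(g_buffer.size() + n);
    }
    ~ScopedBuffer() { g_buffer.resize(offset_); }

    // Recomputed from the stored offset, so it stays valid even if the
    // underlying array reallocated since construction.
    int* data() { return g_buffer.data() + offset_; }
    std::size_t size() const { return n_; }

    ScopedBuffer(const ScopedBuffer&) = delete;
    ScopedBuffer& operator=(const ScopedBuffer&) = delete;
};
```

Nested guards are destroyed in reverse order, which enforces the LIFO discipline automatically; releasing by stored offset rather than by comparing slices also keeps the guard usable if the underlying array reallocates while it is live.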
Jan 11 2010
next sibling parent reply grauzone <none example.net> writes:
Andrei Alexandrescu wrote:
 The idea is that the API offers a means to define and use temporary 
 buffers without compromising memory safety. Even if you escape data 
 allocated via getBuffer that persists after releaseBuffer, that will not 
 cause undefined behavior. (It may, however, cause data to be overwritten 
 because another call to getBuffer will reuse memory.) Leaks are also 
 possible if you don't release the buffer. That can be solved by not 
 offering getBuffer and releaseBuffer as they are, but instead only 
 encapsulated in a struct with the appropriate constructor and destructor.

That's an interesting point. You can have two things:

1. Correct memory management (one can never access an object that's supposed to be free'd)
2. Safe memory management (even if you access an object that's supposed to be free'd, it's memory safe)

In safe mode, 1. can't be allowed, and 2. is better than nothing. In normal D, I'd say 2. is quite useless, except as a debugging option.
 
 Andrei

Jan 11 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
grauzone wrote:
 Andrei Alexandrescu wrote:
 The idea is that the API offers a means to define and use temporary 
 buffers without compromising memory safety. Even if you escape data 
 allocated via getBuffer that persists after releaseBuffer, that will 
 not cause undefined behavior. (It may, however, cause data to be 
 overwritten because another call to getBuffer will reuse memory.) 
 Leaks are also possible if you don't release the buffer. That can be 
 solved by not offering getBuffer and releaseBuffer as they are, but 
 instead only encapsulated in a struct with the appropriate constructor 
 and destructor.

That's an interesting point. You can have two things: 1. Correct memory managment (one can never access an object that's supposed to be free'd) 2. Safe memory managment (event if you access an object that's supposed to be free'd, it's memory safe) In safe mode, 1. can't be allowed, and 2. is better than nothing. In normal D, I'd say 2. is quite useless, except as an debugging option.

Normal D must be debuggable D. Andrei
Jan 11 2010
parent reply grauzone <none example.net> writes:
Andrei Alexandrescu wrote:
 grauzone wrote:
 Andrei Alexandrescu wrote:
 The idea is that the API offers a means to define and use temporary 
 buffers without compromising memory safety. Even if you escape data 
 allocated via getBuffer that persists after releaseBuffer, that will 
 not cause undefined behavior. (It may, however, cause data to be 
 overwritten because another call to getBuffer will reuse memory.) 
 Leaks are also possible if you don't release the buffer. That can be 
 solved by not offering getBuffer and releaseBuffer as they are, but 
 instead only encapsulated in a struct with the appropriate 
 constructor and destructor.

That's an interesting point. You can have two things: 1. Correct memory managment (one can never access an object that's supposed to be free'd) 2. Safe memory managment (event if you access an object that's supposed to be free'd, it's memory safe) In safe mode, 1. can't be allowed, and 2. is better than nothing. In normal D, I'd say 2. is quite useless, except as an debugging option.

Normal D must be debuggable D.

That isn't the case right now and never will be.

That said, I'd prefer solutions that follow 1. instead of 2. I'm sure it can work somehow; it also worked with closures (at least partially) and normal stack variables + ref.
 Andrei

Jan 12 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
grauzone wrote:
 Andrei Alexandrescu wrote:
 grauzone wrote:
 Andrei Alexandrescu wrote:
 The idea is that the API offers a means to define and use temporary 
 buffers without compromising memory safety. Even if you escape data 
 allocated via getBuffer that persists after releaseBuffer, that will 
 not cause undefined behavior. (It may, however, cause data to be 
 overwritten because another call to getBuffer will reuse memory.) 
 Leaks are also possible if you don't release the buffer. That can be 
 solved by not offering getBuffer and releaseBuffer as they are, but 
 instead only encapsulated in a struct with the appropriate 
 constructor and destructor.

That's an interesting point. You can have two things: 1. Correct memory managment (one can never access an object that's supposed to be free'd) 2. Safe memory managment (event if you access an object that's supposed to be free'd, it's memory safe) In safe mode, 1. can't be allowed, and 2. is better than nothing. In normal D, I'd say 2. is quite useless, except as an debugging option.

Normal D must be debuggable D.

That isn't the case right now and never will be.

Then if I were you and I really believed that, I wouldn't waste one more minute hanging out in this group. Andrei
Jan 12 2010
parent reply grauzone <none example.net> writes:
Andrei Alexandrescu wrote:
 grauzone wrote:
 Andrei Alexandrescu wrote:
 grauzone wrote:
 Andrei Alexandrescu wrote:
 The idea is that the API offers a means to define and use temporary 
 buffers without compromising memory safety. Even if you escape data 
 allocated via getBuffer that persists after releaseBuffer, that 
 will not cause undefined behavior. (It may, however, cause data to 
 be overwritten because another call to getBuffer will reuse 
 memory.) Leaks are also possible if you don't release the buffer. 
 That can be solved by not offering getBuffer and releaseBuffer as 
 they are, but instead only encapsulated in a struct with the 
 appropriate constructor and destructor.

That's an interesting point. You can have two things: 1. Correct memory managment (one can never access an object that's supposed to be free'd) 2. Safe memory managment (event if you access an object that's supposed to be free'd, it's memory safe) In safe mode, 1. can't be allowed, and 2. is better than nothing. In normal D, I'd say 2. is quite useless, except as an debugging option.

Normal D must be debuggable D.

That isn't the case right now and never will be.

Then if I were you and I really believed that, I wouldn't waste one more minute hanging out in this group.

Don't worry, I won't leave so easily. That said, if you use "delete" (or GC.free) incorrectly, you may end up with an undebuggable program. Plus there's no way manual free will ever be removed from D. I'm sure you know that, but it sounds like you're denying that. Whatever.
 Andrei

Jan 12 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
grauzone wrote:
 Andrei Alexandrescu wrote:
 grauzone wrote:
 Andrei Alexandrescu wrote:
 grauzone wrote:
 Andrei Alexandrescu wrote:
 The idea is that the API offers a means to define and use 
 temporary buffers without compromising memory safety. Even if you 
 escape data allocated via getBuffer that persists after 
 releaseBuffer, that will not cause undefined behavior. (It may, 
 however, cause data to be overwritten because another call to 
 getBuffer will reuse memory.) Leaks are also possible if you don't 
 release the buffer. That can be solved by not offering getBuffer 
 and releaseBuffer as they are, but instead only encapsulated in a 
 struct with the appropriate constructor and destructor.

That's an interesting point. You can have two things: 1. Correct memory managment (one can never access an object that's supposed to be free'd) 2. Safe memory managment (event if you access an object that's supposed to be free'd, it's memory safe) In safe mode, 1. can't be allowed, and 2. is better than nothing. In normal D, I'd say 2. is quite useless, except as an debugging option.

Normal D must be debuggable D.

That isn't the case right now and never will be.

Then if I were you and I really believed that, I wouldn't waste one more minute hanging out in this group.

Don't worry, I won't leave so easily.

I know. Clearly there is something in D that you find irresistibly attractive; I wonder what that is :o).
 That said, if you use "delete" (or GC.free) incorrectly, you may end up 
 with an undebuggable program. Plus there's no way manual free will ever 
 be removed from D. I'm sure you know that, but it sounds like you're 
 denying that. Whatever.

Of course I'm denying it because it's false. SafeD will not allow programs containing manual free. And SafeD is quickly becoming a reality, your pessimism notwithstanding. Andrei
Jan 12 2010
parent grauzone <none example.net> writes:
Andrei Alexandrescu wrote:
 grauzone wrote:
 Andrei Alexandrescu wrote:
 grauzone wrote:
 Andrei Alexandrescu wrote:
 grauzone wrote:
 Andrei Alexandrescu wrote:
 The idea is that the API offers a means to define and use 
 temporary buffers without compromising memory safety. Even if you 
 escape data allocated via getBuffer that persists after 
 releaseBuffer, that will not cause undefined behavior. (It may, 
 however, cause data to be overwritten because another call to 
 getBuffer will reuse memory.) Leaks are also possible if you 
 don't release the buffer. That can be solved by not offering 
 getBuffer and releaseBuffer as they are, but instead only 
 encapsulated in a struct with the appropriate constructor and 
 destructor.

That's an interesting point. You can have two things: 1. Correct memory managment (one can never access an object that's supposed to be free'd) 2. Safe memory managment (event if you access an object that's supposed to be free'd, it's memory safe) In safe mode, 1. can't be allowed, and 2. is better than nothing. In normal D, I'd say 2. is quite useless, except as an debugging option.

Normal D must be debuggable D.

That isn't the case right now and never will be.

Then if I were you and I really believed that, I wouldn't waste one more minute hanging out in this group.

Don't worry, I won't leave so easily.

I know. Clearly there is something in D that you find irresistibly attractive; I wonder what that is :o).

It's like a narcotic drug.
 That said, if you use "delete" (or GC.free) incorrectly, you may end 
 up with an undebuggable program. Plus there's no way manual free will 
 ever be removed from D. I'm sure you know that, but it sounds like 
 you're denying that. Whatever.

Of course I'm denying it because it's false. SafeD will not allow programs containing manual free. And SafeD is quickly becoming a reality, your pessimism notwithstanding.

Does this imply you're going to make SafeD the default, and discourage use of "unsafe" D?
 
 Andrei

Jan 12 2010
prev sibling parent reply Rainer Deyke <rainerd eldwood.com> writes:
Andrei Alexandrescu wrote:
 template SuperStack(T) {
     private T[] buffer;
     private enum slackfactor = 2.5;
 
     T[] getBuffer(size_t n) {
         buffer.length = buffer.length + n;
         return buffer[$ - n .. $];
     }
 
     void releaseBuffer(T[] b) {
         enforce(b is buffer[$ - b.length .. $]);
         buffer.length = buffer.length - b.length;
         if (gc.capacity(buffer) > buffer.length * slackfactor) {
             // Reallocate buffer to allow collection of slack
             buffer = buffer.dup;
         }
     }
 }

Broken design is broken.

SuperStack!int stack;
auto buffer0 = stack.getBuffer(100);
auto buffer1 = stack.getBuffer(10000000); // Reallocates internal array.
stack.releaseBuffer(buffer1);
stack.releaseBuffer(buffer0); // Assertion failure.

-- 
Rainer Deyke - rainerd eldwood.com
Jan 12 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Rainer Deyke wrote:
 Andrei Alexandrescu wrote:
 template SuperStack(T) {
     private T[] buffer;
     private enum slackfactor = 2.5;

     T[] getBuffer(size_t n) {
         buffer.length = buffer.length + n;
         return buffer[$ - n .. $];
     }

     void releaseBuffer(T[] b) {
         enforce(b is buffer[$ - b.length .. $]);
         buffer.length = buffer.length - b.length;
         if (gc.capacity(buffer) > buffer.length * slackfactor) {
             // Reallocate buffer to allow collection of slack
             buffer = buffer.dup;
         }
     }
 }

Broken design is broken.

SuperStack!int stack;
auto buffer0 = stack.getBuffer(100);
auto buffer1 = stack.getBuffer(10000000); // Reallocates internal array.
stack.releaseBuffer(buffer1);
stack.releaseBuffer(buffer0); // Assertion failure.

Yikes, thanks. Now something happened with my news reader - the part of your message containing your improved solution got cut off :o). Andrei
Jan 12 2010
parent Rainer Deyke <rainerd eldwood.com> writes:
Andrei Alexandrescu wrote:
 Yikes, thanks. Now something happened with my news reader - the part of
 your message containing your improved solution got cut off :o).

template SuperStack(T) {
    private T[][] buffers;

    T[] getBuffer(size_t n) {
        if (buffers.length == 0) {
            buffers.length = 1;
            buffers[0].length = n;
            return buffers[0];
        } else if (gc.capacity(buffers[$ - 1]) < buffers[$ - 1].length + n) {
            buffers.length = buffers.length + 1;
            // Preallocate space, then truncate.
            buffers[$ - 1].length = max(buffers[$ - 2].length, n);
            buffers[$ - 1].length = n;
            return buffers[$ - 1];
        } else {
            buffers[$ - 1].length = buffers[$ - 1].length + n;
            return buffers[$ - 1][$ - n .. $];
        }
    }

    void releaseBuffer(T[] b) {
        enforce(b is buffers[$ - 1][$ - b.length .. $]);
        buffers[$ - 1].length = buffers[$ - 1].length - b.length;
        if (buffers[$ - 1].length == 0) {
            buffers.length = buffers.length - 1;
        }
    }
}

-- 
Rainer Deyke - rainerd eldwood.com
Jan 12 2010
prev sibling parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
dsimcha wrote:
 == Quote from Chad J (chadjoan __spam.is.bad__gmail.com)'s article
 http://d.puremagic.com/issues/show_bug.cgi?id=3696
 I'm realizing now that returning an object allocated on this stack is a
 tricky task unless there is help from the caller.  The lifetime of the
 object has to be determined somehow.  The Tango convention makes it
 fairly explicit that the parent's scope will be the beginning and the
 end for the memory.  If it needs to live longer then the caller just
 passes a buffer that isn't stack/scoped memory.  Only the caller really
 knows how long the memory needs to hang around.  hmmmmm.

The optional buffer idiom is great for stuff that's returned from a function. I use it all the time. On the other hand, for temporary buffers that are used internally, whose mere existence is an implementation detail, having the caller maintain and pass in a buffer is a horrible violation of encapsulation and good API design, even by the standards of performance-oriented APIs. Even if you know it's never going to change, it's still annoying for the caller to even have to think about these kinds of details.

The design I'm using for SciD is just that,

    double[] algo(params, double[] buffer=null, double[] workspace=null);

and it's already starting to annoy me. I never remember how much workspace memory is required for each algorithm, and constantly have to look it up. I definitely have to do something about it.

(The worst are the LAPACK linear algebra functions. There you have a minimum workspace size, which is easily found in the docs, but for many of them you also have an *optimal* workspace size, which you have to make LAPACK calculate for you. And even the minimum size depends on a lot of factors, like matrix size (naturally), whether the matrix is real or complex, symmetric or hermitian, triangular, etc.)
 Still that SuperStack thing would be useful.  It's like alloca but
 better (look ma, no cast) and slightly more general (ex: it can be freed
 in parent's scope, if you are willing to play it risky).

Yeah, for about the past year I've been playing around with TempAlloc, which was basically me taking Andrei's SuperStack idea when it was first proposed and running with it because I needed it sooner rather than later. It's basically encapsulated in dstats.alloc (http://svn.dsource.org/projects/dstats/docs/alloc.html).

TempAlloc looks really interesting. I was actually thinking about writing some kind of "preallocated memory pool" mechanism myself, to get rid of all the workspace[]s. If you don't mind, I'll look into using TempAlloc for SciD.
 [...]
 It's got the following advantages over alloca:
 
 1.  Can't overflow unless you're completely out of address space.  As a last
 resort, it allocates another chunk of memory from the heap.

When all chunks in a block have been TempAlloc.free()'ed, is the block GC.free()'ed as well? Or is the memory retained for future use? (And if the answer to the last question is 'no', is there a reason not to retain the memory until the program exits, or at least to include an option to?) -Lars
Jan 12 2010
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Lars T. Kyllingstad (public kyllingen.NOSPAMnet)'s article
  It's got the following advantages over alloca:

  1.  Can't overflow unless you're completely out of address space.  As a last
  resort, it allocates another chunk of memory from the heap.

 When all chunks in a block have been TempAlloc.free()'ed, is the block
 GC.free()'ed as well? Or is the memory retained for future use? (And if the
 answer to the last question is 'no', is there a reason not to retain the
 memory until the program exits, or at least to include an option to?)

 -Lars

The memory is freed according to heuristics. Here's a brief description:

1.  All state is thread-local. Unless TempAlloc needs to go to the main heap, there is never any type of synchronization.

2.  Each chunk is 4 MB. I've realized that this may be overkill and am thinking of reducing it to 1 MB or 512K. This is controlled by an enum, so it would be a trivial change.

3.  Chunks are stored in a stack (yes, a stack of stacks). Half of the unused chunks are freed anytime there are 2x as many unused chunks as used chunks. Usually 4MB per thread in temp buffer space is plenty, though.
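The freeing rule in point 3 might be implemented roughly like this. This is an illustrative C++ sketch of the heuristic as stated, not the dstats source; trimSpares and the parameter names are made up:

```cpp
#include <cstddef>
#include <vector>

// Illustrative chunk bookkeeping: `inUse` counts chunks holding live data,
// `spare` holds empty chunks cached for reuse. Whenever spares outnumber
// in-use chunks 2:1, half of the spares are returned to the heap.
void trimSpares(std::vector<std::vector<char>>& spare, std::size_t inUse) {
    if (spare.size() >= 2 * inUse && !spare.empty()) {
        // Dropping the vectors frees their backing memory.
        spare.resize(spare.size() - spare.size() / 2);
    }
}
```

Keeping some spares avoids round-tripping through the heap on every burst of allocations, while the 2:1 cap bounds how much idle memory a thread can hold.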
Jan 12 2010
parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
dsimcha wrote:
 == Quote from Lars T. Kyllingstad (public kyllingen.NOSPAMnet)'s article
   It's got the following advantages over alloca:

   1.  Can't overflow unless you're completely out of address space.  As a last
   resort, it allocates another chunk of memory from the heap.

  When all chunks in a block have been TempAlloc.free()'ed, is the block
  GC.free()'ed as well? Or is the memory retained for future use? (And if the
  answer to the last question is 'no', is there a reason not to retain the
  memory until the program exits, or at least to include an option to?)

  -Lars

The memory is freed according to heuristics. Here's a brief description:

1.  All state is thread-local. Unless TempAlloc needs to go to the main heap, there is never any type of synchronization.

2.  Each chunk is 4 MB. I've realized that this may be overkill and am thinking of reducing it to 1 MB or 512K. This is controlled by an enum, so it would be a trivial change.

I suppose it depends on what you're using it for. For maximum flexibility, you could do this:

    struct CustomTempAlloc(uint blockSize)
    { // current TempAlloc code here }

    // Default is 4MB
    alias CustomTempAlloc!(4U * 1024U * 1024U) TempAlloc;
 3.  Chunks are stored in a stack (yes, a stack of stacks).  Half of the unused
 chunks are freed anytime there are 2x as many unused chunks as used chunks.
 Usually 4MB per thread in temp buffer space is plenty, though.

Thanks for the explanation! -Lars
Jan 12 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Lars T. Kyllingstad:
    struct CustomTempAlloc(uint blockSize)
    { // current TempAlloc code here }

    // Default is 4MB
    alias CustomTempAlloc!(4U * 1024U * 1024U) TempAlloc;

Or:

struct TempAlloc(uint blockSize = 4U * 1024U * 1024U)  // Default is 4MB
{
    // current TempAlloc code here
}

Bye,
bearophile
Jan 12 2010
prev sibling parent Leandro Lucarella <llucax gmail.com> writes:
bearophile, el 12 de enero a las 04:18 me escribiste:
 Walter Bright:
 One of the problems is they can easily lead to stack exhaustion. Stack 
 sizes for a thread must all be preallocated.

Yes, that's of course a problem. It's meaningful to allocate arrays on the stack only if they are small. You can exhaust the/a stack even with normal fixed-sized arrays too:

int foo(int n) {
    int[100] a;
    ...
    y = foo(k);
    ...
}

Or with structs:

struct S {
    int[100000] a;
}

void foo() {
    S s;
}

See this post:

http://www.yosefk.com/blog/the-c-sucks-series-petrifying-functions.html

for a nice real story about this problem ;)

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
This homeless guy asked me for some money the other day. And I was gonna
give it to him but then I thought you're just gonna use it on drugs or
alcohol. And then I thought, that's what I'm gonna use it on. Why am
I judging this poor bastard.
Jan 12 2010