www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - TempAlloc Overhaul

reply dsimcha <dsimcha yahoo.com> writes:
I've overhauled my TempAlloc proposal based on some of the suggestions 
I've received.  Here are the major changes:

1.  I've reconsidered and decided TempAlloc belongs in its own Phobos 
module (std.tempalloc) instead of in core.memory, mainly because it uses 
Phobos in ways that aren't easy to get rid of.

2.  Move uninitializedArray (formerly newVoid) to its own pull request 
for inclusion in std.array.  This keeps the TempAlloc proposal more 
tightly focused.

3.  Make alignedMalloc and friends private for now, again to make the 
proposal more tightly focused.

4.  Rename tempdup to tempArray to emphasize that is semantics are more 
similar to std.array.array than .dup w.r.t. narrow strings.

5.  Move newStack into the TempAlloc namespace and rename it 
TempAlloc.newArray.

6.  TempAlloc.newArray now handles multidimensional arrays.  Its syntax 
is slightly modified to accommodate this.  Before:

double[] foo = newStack!double(100);

After:

double[] foo = TempAlloc.newArray!(double[])(100);

Code:

https://github.com/dsimcha/TempAlloc

Docs (NOT the same as the old URL):

http://cis.jhu.edu/~dsimcha/d/phobos/std_tempalloc.html
Jun 11 2011
next sibling parent Jose Armando Garcia <jsancio gmail.com> writes:
On Sat, Jun 11, 2011 at 1:26 PM, dsimcha <dsimcha yahoo.com> wrote:
 I've overhauled my TempAlloc proposal based on some of the suggestions I'=

 received. =A0Here are the major changes:

 1. =A0I've reconsidered and decided TempAlloc belongs in its own Phobos m=

 (std.tempalloc) instead of in core.memory, mainly because it uses Phobos =

 ways that aren't easy to get rid of.

 2. =A0Move uninitializedArray (formerly newVoid) to its own pull request =

 inclusion in std.array. =A0This keeps the TempAlloc proposal more tightly
 focused.

 3. =A0Make alignedMalloc and friends private for now, again to make the
 proposal more tightly focused.

 4. =A0Rename tempdup to tempArray to emphasize that is semantics are more
 similar to std.array.array than .dup w.r.t. narrow strings.

 5. =A0Move newStack into the TempAlloc namespace and rename it
 TempAlloc.newArray.

 6. =A0TempAlloc.newArray now handles multidimensional arrays. =A0Its synt=

 slightly modified to accommodate this. =A0Before:

 double[] foo =3D newStack!double(100);

 After:

 double[] foo =3D TempAlloc.newArray!(double[])(100);

 Code:

 https://github.com/dsimcha/TempAlloc

 Docs (NOT the same as the old URL):

 http://cis.jhu.edu/~dsimcha/d/phobos/std_tempalloc.html

Just a high-level comment. Now that you moved it to its own namespace do you have a need for a struct with only static method? I think it is an abuse of the language to use struct as a way of namespacing.
Jun 11 2011
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Sat, 11 Jun 2011 12:35:43 -0400, Jose Armando Garcia  
<jsancio gmail.com> wrote:

 On Sat, Jun 11, 2011 at 1:26 PM, dsimcha <dsimcha yahoo.com> wrote:
 I've overhauled my TempAlloc proposal based on some of the suggestions  
 I've
 received.  Here are the major changes:

 1.  I've reconsidered and decided TempAlloc belongs in its own Phobos  
 module
 (std.tempalloc) instead of in core.memory, mainly because it uses  
 Phobos in
 ways that aren't easy to get rid of.

 2.  Move uninitializedArray (formerly newVoid) to its own pull request  
 for
 inclusion in std.array.  This keeps the TempAlloc proposal more tightly
 focused.

 3.  Make alignedMalloc and friends private for now, again to make the
 proposal more tightly focused.

 4.  Rename tempdup to tempArray to emphasize that is semantics are more
 similar to std.array.array than .dup w.r.t. narrow strings.

 5.  Move newStack into the TempAlloc namespace and rename it
 TempAlloc.newArray.

 6.  TempAlloc.newArray now handles multidimensional arrays.  Its syntax  
 is
 slightly modified to accommodate this.  Before:

 double[] foo = newStack!double(100);

 After:

 double[] foo = TempAlloc.newArray!(double[])(100);

 Code:

 https://github.com/dsimcha/TempAlloc

 Docs (NOT the same as the old URL):

 http://cis.jhu.edu/~dsimcha/d/phobos/std_tempalloc.html

Just a high-level comment. Now that you moved it to its own namespace do you have a need for a struct with only static method? I think it is an abuse of the language to use struct as a way of namespacing.

First, TempAlloc was originally developed in its own namespace. Second, there are no name conflicts between core.memory and TempAlloc which would require a sub-namespace. Third, Structs don't provide a true namespace. What you're neglecting is that each of those function names can't stand by itself. For example, people assume malloc and free always refer to the C functions. So, the TempAlloc functions would normally be TempAllocMalloc or TempAllocFree. But once the number of functions like this grows past a certain size, using TempAlloc.free, becomes syntactically nicer than TempAllocFree.
Jun 11 2011
prev sibling next sibling parent Jose Armando Garcia <jsancio gmail.com> writes:
On Sat, Jun 11, 2011 at 4:12 PM, Robert Jacques <sandford jhu.edu> wrote:
 On Sat, 11 Jun 2011 12:35:43 -0400, Jose Armando Garcia <jsancio gmail.co=

 wrote:

 On Sat, Jun 11, 2011 at 1:26 PM, dsimcha <dsimcha yahoo.com> wrote:
 I've overhauled my TempAlloc proposal based on some of the suggestions
 I've
 received. =A0Here are the major changes:

 1. =A0I've reconsidered and decided TempAlloc belongs in its own Phobos
 module
 (std.tempalloc) instead of in core.memory, mainly because it uses Phobo=



 in
 ways that aren't easy to get rid of.

 2. =A0Move uninitializedArray (formerly newVoid) to its own pull reques=



 for
 inclusion in std.array. =A0This keeps the TempAlloc proposal more tight=



 focused.

 3. =A0Make alignedMalloc and friends private for now, again to make the
 proposal more tightly focused.

 4. =A0Rename tempdup to tempArray to emphasize that is semantics are mo=



 similar to std.array.array than .dup w.r.t. narrow strings.

 5. =A0Move newStack into the TempAlloc namespace and rename it
 TempAlloc.newArray.

 6. =A0TempAlloc.newArray now handles multidimensional arrays. =A0Its sy=



 is
 slightly modified to accommodate this. =A0Before:

 double[] foo =3D newStack!double(100);

 After:

 double[] foo =3D TempAlloc.newArray!(double[])(100);

 Code:

 https://github.com/dsimcha/TempAlloc

 Docs (NOT the same as the old URL):

 http://cis.jhu.edu/~dsimcha/d/phobos/std_tempalloc.html

Just a high-level comment. Now that you moved it to its own namespace do you have a need for a struct with only static method? I think it is an abuse of the language to use struct as a way of namespacing.

First, TempAlloc was originally developed in its own namespace. Second, there are no name conflicts between core.memory and TempAlloc which would require a sub-namespace. Third, Structs don't provide a true namespace. W=

 you're neglecting is that each of those function names can't stand by
 itself. For example, people assume malloc and free always refer to the C
 functions. So, the TempAlloc functions would normally be TempAllocMalloc =

 TempAllocFree. But once the number of functions like this grows past a
 certain size, using TempAlloc.free, becomes syntactically nicer than
 TempAllocFree.

Not exactly sure what your argument for using a struct is. The argument that the function names collide with existing functions and the assumption about user semantic of such names says something about the name of the function and not the use of a struct as namespacing construct. Having said that we don't have to call them malloc and free. It is a stack so maybe pushMemory() and popMemory() is better. Again, looking at the implementation I don't see a need for struct. True OO languages use classes/struct with only static methods when they want to expose an non-OO API. In those cases you can argue their decision on the fact that the language has limitation expressing non-OO concepts. D doesn't have such limitations. You can clearly implement this using a OOD so using struct and class is okay. But the API that is expose is not OO so I don't see the need for it. D provides so many features to get around this naming problem. Use them. Don't overload the struct concept.
Jun 11 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
I don't see what's so bad about it. core.memory.GC is a struct which
wraps malloc and free. If that were naked, and TempAlloc was too, then
you would have erratic behavior if you missplaced one import with
another by accident.
Jun 11 2011
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/11/11 11:26 AM, dsimcha wrote:
 I've overhauled my TempAlloc proposal based on some of the suggestions
 I've received.

Review: * Mention no safety at disadvantages: deallocating memory may leave pointers dangling. * The synopsis should use scope(exit) TempAlloc.frameFree(); as that is a robust idiom. * newArray -> "For performance reasons, the returned array is not initialized." Then call it newUninitializedArray. Make safety the default and lack thereof the verbose alternative. * newArray -> "It is not scanned for pointers by the garbage collector unless GC.addRange is called." I'd agree that malloc() shouldn't do so because it doesn't know the type, but with newArray you do know the type so you could automatically call GC.addRange appropriately. If the client doesn't want that, they can obtain it with extra syntax. Since now you have two things (uninitialized and noscan) you may want to make them function parameters instead of part of the function name. * I'm a bit unsure about free(). People already have the pointer, so free() should use it as a cheap way to check that they're freeing the right thing. There should be free(void*), freeLastAllocated(), and possibly destroyAndFree(T)(T *). * segmentSlack() -> segmentAvailable(). * stackCat() is a bit surprising. A better way would be something like a TempAlloc appender that accepts a chain(). * In fact looking at tempArray() just below that's all - it can accept chain(). No need for stackCat. * mixin newFrame should be renamed to scopedFrame. Andrei
Jun 16 2011
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
On 6/16/2011 11:46 PM, Andrei Alexandrescu wrote:
 On 6/11/11 11:26 AM, dsimcha wrote:
 I've overhauled my TempAlloc proposal based on some of the suggestions
 I've received.

Review:

Thank you for the review.
 * Mention no safety at disadvantages: deallocating memory may leave
 pointers dangling.

Good point.
 * The synopsis should use scope(exit) TempAlloc.frameFree(); as that is
 a robust idiom.

Good point.
 * newArray -> "For performance reasons, the returned array is not
 initialized." Then call it newUninitializedArray. Make safety the
 default and lack thereof the verbose alternative.

Here I'm going to have to strongly disagree with you. The whole point of TempAlloc is that it's a performance hack. In general I agree it's better to do the safe thing by default, but libraries specifically written as performance hacks are an exception. In TempAlloc it makes sense to do the efficient thing by default. I personally would never use newArray as opposed to newUninitializedArray and TempAlloc.newUninitializedArray (31 characters) feels **ridiculously** verbose and would clutter the API.
 * newArray -> "It is not scanned for pointers by the garbage collector
 unless GC.addRange is called." I'd agree that malloc() shouldn't do so
 because it doesn't know the type, but with newArray you do know the type
 so you could automatically call GC.addRange appropriately. If the client
 doesn't want that, they can obtain it with extra syntax. Since now you
 have two things (uninitialized and noscan) you may want to make them
 function parameters instead of part of the function name.

Again, I'm going to have to strongly disagree with you here. Adding the range to the GC would require taking the GC lock, which is to be avoided like the plague. In a library that exists specifically as a performance hack, things like this should be explicit if there's any doubt about the user's intent. This should come before safety. Once you require this explicitness, there's no need for the TempAlloc API to handle it anymore, because the user can just call GC.addRange. Furthermore, the GC's management of ranges isn't very efficient (I think it's O(N)) so if you have a whole bunch of TempAlloc-allocated arrays, things are going to grind to a halt. Lastly, extra bookkeeping would be necessary to decide whether to call removeRange().
 * I'm a bit unsure about free(). People already have the pointer, so
 free() should use it as a cheap way to check that they're freeing the
 right thing. There should be free(void*), freeLastAllocated(), and
 possibly destroyAndFree(T)(T *).

Probably a good idea. So free(void*) would just do a check and then call freeLastAllocated()? I'm not sure about destroyAndFree(). What semantics are you suggesting for it? Calling the d'tor if any and then freeing?
 * segmentSlack() -> segmentAvailable().

I guess that's a little more clear.
 * stackCat() is a bit surprising. A better way would be something like a
 TempAlloc appender that accepts a chain().

 * In fact looking at tempArray() just below that's all - it can accept
 chain(). No need for stackCat.

Good point. I kind of didn't like stackCat anyhow.
 * mixin newFrame should be renamed to scopedFrame.

Hmm. I kind of like newFrame. Why is scopedFrame a better name? If I understood why it's better, I'd be more inclined to change it.
Jun 17 2011
next sibling parent Rainer Schuetze <r.sagitario gmx.de> writes:
On 17.06.2011 16:12, dsimcha wrote:
 On 6/16/2011 11:46 PM, Andrei Alexandrescu wrote:
 * The synopsis should use scope(exit) TempAlloc.frameFree(); as that is
 a robust idiom.

Good point.

Please note that "scope(exit) { ... }" has its own significant performance overhead since it is implemented using the exception-handling stack-unwinding mechanics even in the uninterrupted case where no exception is thrown. I think dmd needs to improve on this.
Jun 17 2011
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Jose Armando Garcia:

 Hmm. If this is a hack then what is it doing in the std library for
 the language?!

A standard, well debugged, commented, and known "hack" is better (and causes less bugs) than a hack written by yourself in haste, when the need arises :-) Bye, bearophile
Jun 17 2011
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/17/11 11:50 AM, Jose Armando Garcia wrote:
 On Fri, Jun 17, 2011 at 11:12 AM, dsimcha<dsimcha yahoo.com>  wrote:
 On 6/16/2011 11:46 PM, Andrei Alexandrescu wrote:
 * newArray ->  "For performance reasons, the returned array is not
 initialized." Then call it newUninitializedArray. Make safety the
 default and lack thereof the verbose alternative.

Here I'm going to have to strongly disagree with you. The whole point of TempAlloc is that it's a performance hack. In general I agree it's better to do the safe thing by default, but libraries specifically written as performance hacks are an exception. In TempAlloc it makes sense to do the efficient thing by default. I personally would never use newArray as opposed to newUninitializedArray and TempAlloc.newUninitializedArray (31 characters) feels **ridiculously** verbose and would clutter the API.

Hmm. If this is a hack then what is it doing in the std library for the language?!

For speed purposes. We're trying to minimize patently unsafe constructs in the language proper, but I see nothing wrong with having library artifacts that sacrifice safety for speed when the gain is significant. Andrei
Jun 17 2011
next sibling parent Daniel Gibson <metalcaedes gmail.com> writes:
Am 17.06.2011 19:59, schrieb Jose Armando Garcia:
 On Fri, Jun 17, 2011 at 2:40 PM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 On 6/17/11 11:50 AM, Jose Armando Garcia wrote:
 On Fri, Jun 17, 2011 at 11:12 AM, dsimcha<dsimcha yahoo.com>  wrote:
 On 6/16/2011 11:46 PM, Andrei Alexandrescu wrote:
 * newArray ->  "For performance reasons, the returned array is not
 initialized." Then call it newUninitializedArray. Make safety the
 default and lack thereof the verbose alternative.

Here I'm going to have to strongly disagree with you. The whole point of TempAlloc is that it's a performance hack. In general I agree it's better to do the safe thing by default, but libraries specifically written as performance hacks are an exception. In TempAlloc it makes sense to do the efficient thing by default. I personally would never use newArray as opposed to newUninitializedArray and TempAlloc.newUninitializedArray (31 characters) feels **ridiculously** verbose and would clutter the API.

Hmm. If this is a hack then what is it doing in the std library for the language?!

For speed purposes. We're trying to minimize patently unsafe constructs in the language proper, but I see nothing wrong with having library artifacts that sacrifice safety for speed when the gain is significant.

Hmm. I don't know. Maybe we can create an internal package (std.internal) that std modules can use in their implementation. The daring programmer can also use if they dig around but maybe we shouldn't put the gun on the kitchen counter. Modules in std.internal should be ddoc but they are not linked from d-programming-language.org. This is also in reply to bearophile's comment. -Jose

I think it's a bad idea to not link documentation on d-programming-language.org just because its about a "unsafe" feature.. D is a system programming language after all. If we want to "hide" unsafe parts of the language or library we should as well remove the documentation for the std.c bindings, the documentation on pointers, ... Cheers, - Daniel
Jun 17 2011
prev sibling parent reply David Nadlinger <see klickverbot.at> writes:
On 6/17/11 7:59 PM, Jose Armando Garcia wrote:
 Hmm. I don't know. Maybe we can create an internal package
 (std.internal) that std modules can use in their implementation. The
 daring programmer can also use if they dig around but maybe we
 shouldn't put the gun on the kitchen counter. Modules in std.internal
 should be ddoc but they are not linked from
 d-programming-language.org.

I don't quite see how std.internal relates to the TempAlloc proposal. std.internal is a namespace for modules that are, well, Phobos-internal, and not part of the public API – hence, they are not publicly documented at d-programming-language.org as well. No guarantees about API stability, etc. are made, they should never be used from outside Phobos. On the other hand, TempAlloc is generally useful if you are writing performance-sensitive code. It is only a »hack« insofar as it allows you to break some guarantees in order to be able to gain performance, not because it has an unstable API or something like that. Obviously, you _can_ shoot yourself in the foot with it, but there isn't quite a way this could be avoided. One could even argue that for »unsafe« things, having well documented and tested library artifacts is even more important than for basic, »safe« things. David
Jun 17 2011
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/17/11 1:28 PM, David Nadlinger wrote:
 On 6/17/11 7:59 PM, Jose Armando Garcia wrote:
 Hmm. I don't know. Maybe we can create an internal package
 (std.internal) that std modules can use in their implementation. The
 daring programmer can also use if they dig around but maybe we
 shouldn't put the gun on the kitchen counter. Modules in std.internal
 should be ddoc but they are not linked from
 d-programming-language.org.

I don't quite see how std.internal relates to the TempAlloc proposal. std.internal is a namespace for modules that are, well, Phobos-internal, and not part of the public API – hence, they are not publicly documented at d-programming-language.org as well. No guarantees about API stability, etc. are made, they should never be used from outside Phobos. On the other hand, TempAlloc is generally useful if you are writing performance-sensitive code. It is only a »hack« insofar as it allows you to break some guarantees in order to be able to gain performance, not because it has an unstable API or something like that. Obviously, you _can_ shoot yourself in the foot with it, but there isn't quite a way this could be avoided. One could even argue that for »unsafe« things, having well documented and tested library artifacts is even more important than for basic, »safe« things. David

The one thing std.tempalloc should do is make it very clear it is unsafe, with the expected liabilities. I think the current version is very coy about it. Andrei
Jun 17 2011
prev sibling next sibling parent Jose Armando Garcia <jsancio gmail.com> writes:
On Fri, Jun 17, 2011 at 11:12 AM, dsimcha <dsimcha yahoo.com> wrote:
 On 6/16/2011 11:46 PM, Andrei Alexandrescu wrote:
 * newArray -> "For performance reasons, the returned array is not
 initialized." Then call it newUninitializedArray. Make safety the
 default and lack thereof the verbose alternative.

Here I'm going to have to strongly disagree with you. =A0The whole point =

 TempAlloc is that it's a performance hack. =A0In general I agree it's bet=

 to do the safe thing by default, but libraries specifically written as
 performance hacks are an exception. =A0In TempAlloc it makes sense to do =

 efficient thing by default. =A0I personally would never use newArray as
 opposed to newUninitializedArray and TempAlloc.newUninitializedArray (31
 characters) feels **ridiculously** verbose and would clutter the API.

Hmm. If this is a hack then what is it doing in the std library for the language?!
Jun 17 2011
prev sibling parent Jose Armando Garcia <jsancio gmail.com> writes:
On Fri, Jun 17, 2011 at 2:40 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 On 6/17/11 11:50 AM, Jose Armando Garcia wrote:
 On Fri, Jun 17, 2011 at 11:12 AM, dsimcha<dsimcha yahoo.com> =A0wrote:
 On 6/16/2011 11:46 PM, Andrei Alexandrescu wrote:
 * newArray -> =A0"For performance reasons, the returned array is not
 initialized." Then call it newUninitializedArray. Make safety the
 default and lack thereof the verbose alternative.

Here I'm going to have to strongly disagree with you. =A0The whole poin=



 TempAlloc is that it's a performance hack. =A0In general I agree it's
 better
 to do the safe thing by default, but libraries specifically written as
 performance hacks are an exception. =A0In TempAlloc it makes sense to d=



 the
 efficient thing by default. =A0I personally would never use newArray as
 opposed to newUninitializedArray and TempAlloc.newUninitializedArray (3=



 characters) feels **ridiculously** verbose and would clutter the API.

Hmm. If this is a hack then what is it doing in the std library for the language?!

For speed purposes. We're trying to minimize patently unsafe constructs i=

 the language proper, but I see nothing wrong with having library artifact=

 that sacrifice safety for speed when the gain is significant.

Hmm. I don't know. Maybe we can create an internal package (std.internal) that std modules can use in their implementation. The daring programmer can also use if they dig around but maybe we shouldn't put the gun on the kitchen counter. Modules in std.internal should be ddoc but they are not linked from d-programming-language.org. This is also in reply to bearophile's comment. -Jose
Jun 17 2011