digitalmars.D - std.allocator issues

Steven Schveighoffer <schveiguy yahoo.com> writes:
I'm trying to use the std.experimental.allocator API more in my new io 
library, and I'm having a few stumbling points:

1. GCAllocator only allocates void, which is marked as containing 
pointers. This is no good for a stream buffer, and can severely harm 
performance.

2. GCAllocator provides a free function. There's no way for me to tell 
generically that it's OK *not* to free allocations, though. I understand 
we should provide a hook if there exists a way to do so, but there needs 
to be a way to free this. I'd prefer not to free data if it's not 
strictly necessary.

3. GCAllocator needs a mechanism to specify type info for allocations so 
the dtors are properly run in the GC. While not critical for my 
purposes, this is going to be very important, and should be figured out 
before merged into mainline Phobos.

4. Various functions in std.allocator take a ref void[]. This is no 
good. I don't want to store a void[] in my type, because that's not very 
useful, but if I want to reallocate, or expand, then I need to do a 
dance where I copy the array to a void[], do the reallocation, and then 
copy it back (if successful). This fully defeats the point of having a 
ref in the first place. I understand there are wrappers that do this for 
reallocating, but I want to use the other tools as well (expand in 
particular). But principally, to have an API that is mostly unusable 
seems unnecessary.

I really do like the API, and it's fitting in quite nicely! I can work 
around all these at the moment, but I'd love to see these improvements 
in place before making official phobos.

-Steve
Feb 19 2016
rsw0x <anonymous anonymous.com> writes:
On Saturday, 20 February 2016 at 02:21:00 UTC, Steven 
Schveighoffer wrote:
 I'm trying to use the std.experimental.allocator API more in my 
 new io library, and I'm having a few stumbling points:

 1. GCAllocator only allocates void, which is marked as 
 containing pointers. This is no good for a stream buffer, and 
 can severely harm performance.
This is pretty bad. GCAllocator should ideally be able to optionally
take type information when allocating and forward it to the GC. The
GC.malloc interface can take a TypeInfo object, btw.
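
As a rough illustration of the point about stream buffers, here is a minimal sketch that goes to core.memory directly and marks the block as pointer-free. The helper name `allocNoScan` is hypothetical and not part of Phobos; GC.malloc also accepts an optional TypeInfo as its third argument, which this sketch leaves at its default.

```d
import core.memory : GC;

/// Hypothetical helper: allocate a byte buffer the GC will not scan.
ubyte[] allocNoScan(size_t n)
{
    // NO_SCAN tells the collector this block holds no pointers,
    // so it is skipped during the mark phase.
    void* p = GC.malloc(n, GC.BlkAttr.NO_SCAN);
    return p is null ? null : (cast(ubyte*) p)[0 .. n];
}

unittest
{
    auto buf = allocNoScan(4096);
    assert(buf.length == 4096);
    // The attribute can be read back from the block.
    assert(GC.getAttr(buf.ptr) & GC.BlkAttr.NO_SCAN);
}
```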
Feb 19 2016
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/19/16 9:21 PM, Steven Schveighoffer wrote:
 I'm trying to use the std.experimental.allocator API more in my new io
 library, and I'm having a few stumbling points:
Another thought: Allocators return void[] when requesting allocations,
reallocations, etc. For C malloc, the assumption must be that what you
requested is what you got, because the actual block size isn't given.
However, for GC (and I assume other allocators), we have mechanisms to
know upon allocation the amount of data received.

I would assume since we are returning both pointer and length, it would
be possible for an allocator to return more data than requested (why
waste it?). But it appears that the GC allocator just returns the amount
of data requested, even though it could return the extra data that was
received.

Should the API assumptions allow returning more data than requested? It
means code has to be wary of this, but creating a wrapper allocator that
truncates the data would be trivial, no?

I want to write some PRs to fix this, but I'm unclear what is expected.

-Steve
Feb 19 2016
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/19/2016 10:12 PM, Steven Schveighoffer wrote:
 On 2/19/16 9:21 PM, Steven Schveighoffer wrote:
 I'm trying to use the std.experimental.allocator API more in my new io
 library, and I'm having a few stumbling points:
 Another thought: Allocators return void[] when requesting allocations,
 reallocations, etc. For C malloc, the assumption must be that what you
 requested is what you got, because the actual block size isn't given.
 However, for GC (and I assume other allocators), we have mechanisms to
 know upon allocation the amount of data received.

 I would assume since we are returning both pointer and length, it would
 be possible for an allocator to return more data than requested (why
 waste it?). But it appears that the GC allocator just returns the amount
 of data requested, even though it could return the extra data that was
 received.

 Should the API assumptions allow returning more data than requested? It
 means code has to be wary of this, but creating a wrapper allocator that
 truncates the data would be trivial, no?

 I want to write some PRs to fix this, but I'm unclear what is expected.

 -Steve
Allocators are guaranteed to return the size requested. This is by design.

-- Andrei
Feb 19 2016
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/20/16 12:22 AM, Andrei Alexandrescu wrote:
 Allocators are guaranteed to return the size requested. This is by
 design. -- Andrei
Given that there is "goodAllocSize", this seems reasonable. But for
ease-of-use (and code efficiency), it may be more straightforward to
allow returning more data than requested (perhaps through another
interface function? allocateAtLeast?) and then wrap that with the other
allocation functions that simply slice the result down to size.

In the GC, expanding is done a page at a time. If I request an expansion
of 1 byte, that's 4095 wasted bytes. I don't expect goodAllocSize to
help me here.

BTW, just pushed an update to my i/o library that uses allocators
exclusively for buffers. But I make the default a custom GC allocator
that has the properties I need. I'm hoping there will be a way to get
the desired behavior from the phobos version at some point.

-Steve
Feb 19 2016
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/20/2016 12:39 AM, Steven Schveighoffer wrote:
 Given that there is "goodAllocSize", this seems reasonable. But for
 ease-of-use (and code efficiency), it may be more straightforward to
 allow returning more data than requested (perhaps through another
 interface function? allocateAtLeast?) and then wrap that with the other
 allocation functions that simply slice the result down to size.
Actually I confess that was the exact design for a while, and I, too,
found it ingenious. You ask me for 100 bytes but I'll allocate 256
anyway, why not I just return you the whole 256? But then that puts
burden everywhere. Does the user need to remember 256 so they can pass
me the whole buffer for freeing? That's bad for the user. Does the
allocator accept 100, 256, or any size in between? That complicates the
specification.

Then consider things like cast(T[]) a.allocate(T.sizeof * n). Folks
would legitimately expect to then have an array of just n objects. If
not, they end up with more than they asked for, or a misalignment
exception.

It's simpler with what we have: you ask me for n bytes, I give you n
bytes. You give me n bytes back to free.
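
The "ask for n bytes, get n bytes, give n bytes back" contract can be sketched as follows. This is only an illustration using Mallocator from std.experimental.allocator; the helper names `allocArray` and `freeArray` are hypothetical.

```d
import std.experimental.allocator.mallocator : Mallocator;

/// Hypothetical helper: allocate room for exactly n values of T.
T[] allocArray(T)(size_t n)
{
    void[] raw = Mallocator.instance.allocate(T.sizeof * n);
    // The slice has exactly the requested byte length (or is null),
    // so the cast yields n elements, never more.
    return cast(T[]) raw;
}

/// Hypothetical helper: hand the same slice back to the allocator.
void freeArray(T)(T[] arr)
{
    // The length passed back matches the length that was requested.
    Mallocator.instance.deallocate(arr);
}

unittest
{
    auto a = allocArray!int(10);
    assert(a.length == 10);
    freeArray(a);
}
```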
 In the GC, expanding is done a page at a time. If I request an expansion
 of 1 byte, that's 4095 wasted bytes. I don't expect goodAllocSize to
 help me here.
If you call goodAllocSize(n + 1) you should get the right thing. Then you can pass it to the call to expand().
 BTW, just pushed an update to my i/o library that uses allocators
 exclusively for buffers. But I make the default a custom GC allocator
 that has the properties I need. I'm hoping there will be a way to get
 the desired behavior from the phobos version at some point.
Sounds great. Did you measure performance?

Andrei
Feb 20 2016
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/20/16 8:47 AM, Andrei Alexandrescu wrote:
 On 02/20/2016 12:39 AM, Steven Schveighoffer wrote:
 Given that there is "goodAllocSize", this seems reasonable. But for
 ease-of-use (and code efficiency), it may be more straightforward to
 allow returning more data than requested (perhaps through another
 interface function? allocateAtLeast?) and then wrap that with the other
 allocation functions that simply slice the result down to size.
 Actually I confess that was the exact design for a while, and I, too,
 found it ingenious. You ask me for 100 bytes but I'll allocate 256
 anyway, why not I just return you the whole 256? But then that puts
 burden everywhere. Does the user need to remember 256 so they can pass
 me the whole buffer for freeing?
I hadn't considered that aspect, it's a good point. Some allocators may care what you pass for length on free. Using goodAllocSize, you can probably create wrapper primitives for all the allocators that do what I want, so the building blocks you have are likely the right choice.
 In the GC, expanding is done a page at a time. If I request an expansion
 of 1 byte, that's 4095 wasted bytes. I don't expect goodAllocSize to
 help me here.
 If you call goodAllocSize(n + 1) you should get the right thing. Then
 you can pass it to the call to expand().
Expand takes the delta, so it would be goodAllocSize(n + 1) - n, but it's doable. I found some bugs in GCAllocator.expand, will submit a PR.
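
A minimal sketch of that calculation against GCAllocator, assuming the current Phobos signatures for goodAllocSize and expand. The helper name `growToGoodSize` is hypothetical, and in-place expansion is allowed to fail, so the result has to be checked.

```d
import std.experimental.allocator.gc_allocator : GCAllocator;

/// Hypothetical helper: try to grow `buf` in place by at least
/// `minExtra` bytes, rounding the new total up to a good size.
bool growToGoodSize(ref void[] buf, size_t minExtra)
{
    immutable n = buf.length;
    immutable target = GCAllocator.instance.goodAllocSize(n + minExtra);
    // expand takes the number of bytes to add, not the new total.
    return GCAllocator.instance.expand(buf, target - n);
}

unittest
{
    void[] buf = GCAllocator.instance.allocate(100);
    immutable ok = growToGoodSize(buf, 1);
    // On success the slice grew; on failure it is left unchanged.
    assert(ok ? buf.length >= 101 : buf.length == 100);
}
```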
 BTW, just pushed an update to my i/o library that uses allocators
 exclusively for buffers. But I make the default a custom GC allocator
 that has the properties I need. I'm hoping there will be a way to get
 the desired behavior from the phobos version at some point.
 Sounds great. Did you measure performance?
Performance hasn't gotten worse, if that's any comfort. I wouldn't
expect the calls to Allocator to matter, since the library is built to
minimize calls for allocation :)

The current byline performance beats Phobos, likely not C getline, but I
do support UTF delimiters, whereas C getline and Phobos do not. I'd like
to get the performance better, but I need to write a test for getline to
see what I'm up against.

The performance for converting UTF data is good (no way to compare to
Phobos, since it only supports UTF8 streams), I think it beats my
previous incarnations of the i/o library that did the same thing.

My sample zip program is roughly the same performance as the gzip
command-line command.

I'm pretty happy with the performance for as little time as I spent
tuning. The zip pipes were pretty interesting, and I'm going to make
some more building blocks to generalize the concept of copying from one
buffer to another under the hood.

-Steve
Feb 20 2016
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/20/2016 10:00 AM, Steven Schveighoffer wrote:
 I found some bugs in GCAllocator.expand, will submit a PR.
Thx! -- Andrei
Feb 20 2016
Marco Leise <Marco.Leise gmx.de> writes:
Am Sat, 20 Feb 2016 08:47:47 -0500
schrieb Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:

 On 02/20/2016 12:39 AM, Steven Schveighoffer wrote:
 Given that there is "goodAllocSize", this seems reasonable. But for
 ease-of-use (and code efficiency), it may be more straightforward to
 allow returning more data than requested (perhaps through another
 interface function? allocateAtLeast?) and then wrap that with the other
 allocation functions that simply slice the result down to size.
 Actually I confess that was the exact design for a while, and I, too,
 found it ingenious. You ask me for 100 bytes but I'll allocate 256
 anyway, why not I just return you the whole 256? But then that puts
 burden everywhere. Does the user need to remember 256 so they can pass
 me the whole buffer for freeing? That's bad for the user. Does the
 allocator accept 100, 256, or any size in between? That complicates the
 specification.
I found it ingenious, too. The question of how much room was really made
free is common in programming. Aside from memory allocations it can also
appear in single reader/writer circular buffers, where the reader
consumes a chunk of memory and the waiting writer uses `needAtLeast()`
to wait until at least X bytes are available. The writer then caches the
actual number of free bytes and can potentially write several more
entries without querying the free size again, avoiding synchronization
if reader and writer are threads.

Most raw memory allocators overallocate, be it due to fixed pools or
alignment, and the extra bytes can be used at a higher level (in typed
allocators or containers) to grow a data structure or for potential
optimizations. I think for simplicity's sake they should return the
overallocated buffer and expect that length when returning it in
`free()`. So in the above case, 256 would have to be remembered. This is
not a conceptual burden, as we are already used to the 3 properties:
ptr, length = 100, capacity = 256

By the way: jemalloc has `mallocx()` to allocate at least N bytes and
`sallocx()` to ask for the actual size of an allocation.

-- Marco
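
For comparison, the ptr/length/capacity bookkeeping can also be layered on top of the exact-size contract by requesting goodAllocSize(n) up front and remembering the slack. The following is only a sketch; the type `SlackBuffer` and its members are hypothetical and not part of Phobos.

```d
import std.experimental.allocator.gc_allocator : GCAllocator;

/// Hypothetical wrapper that tracks length and capacity separately.
struct SlackBuffer
{
    private void[] block; // full allocation; block.length is the capacity
    private size_t used;  // bytes the caller actually asked for

    static SlackBuffer create(size_t n)
    {
        SlackBuffer b;
        // Request the rounded-up size so the slack belongs to us.
        immutable cap = GCAllocator.instance.goodAllocSize(n);
        b.block = GCAllocator.instance.allocate(cap);
        b.used = b.block is null ? 0 : n;
        return b;
    }

    size_t length() const { return used; }
    size_t capacity() const { return block.length; }
    void[] data() { return block[0 .. used]; }

    /// Grows within the existing capacity without calling the allocator.
    bool extend(size_t extra)
    {
        if (extra > capacity - used) return false;
        used += extra;
        return true;
    }

    void release()
    {
        // The allocator gets back exactly the slice it handed out.
        GCAllocator.instance.deallocate(block);
        block = null;
        used = 0;
    }
}

unittest
{
    auto b = SlackBuffer.create(100);
    assert(b.length == 100 && b.capacity >= 100);
    assert(b.extend(b.capacity - b.length)); // use up the slack
    assert(!b.extend(1));                    // nothing left
    b.release();
}
```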
Mar 07 2016
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/7/16 11:53 PM, Marco Leise wrote:
 By the way: jemalloc has `mallocx()` to allocate at least N
 bytes and `sallocx()` to ask for the actual size of an
 allocation.
I know. Jason added them at my behest. -- Andrei
Mar 08 2016
Marco Leise <Marco.Leise gmx.de> writes:
Am Tue, 8 Mar 2016 16:35:45 -0500
schrieb Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:

 On 3/7/16 11:53 PM, Marco Leise wrote:
 By the way: jemalloc has `mallocx()` to allocate at least N
 bytes and `sallocx()` to ask for the actual size of an
 allocation.
 I know. Jason added them at my behest. -- Andrei
No further questions, your honor. :) -- Marco
Mar 11 2016
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/19/2016 09:21 PM, Steven Schveighoffer wrote:
 I'm trying to use the std.experimental.allocator API more in my new io
 library, and I'm having a few stumbling points:

 1. GCAllocator only allocates void, which is marked as containing
 pointers. This is no good for a stream buffer, and can severely harm
 performance.
Yah, some work on the porcelain is needed. There is TypedAllocator and a few primitives that provide a start.
 2. GCAllocator provides a free function. There's no way for me to tell
 generically that it's OK *not* to free allocations, though. I understand
 we should provide a hook if there exists a way to do so, but there needs
 to be a way to free this. I'd prefer not to free data if it's not
 strictly necessary.
Indeed that's an interesting matter. I was thinking of simply having two types - GCAllocator and GCSafeAllocator that doesn't have free (and probably other unsafe functions such as realloc). They'd use alias this to share implementation.
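
One way the two-type idea could look, as a sketch only: the type names `NoFreeGC` and `FreeingGC` and the `canFree` check are hypothetical, and the allocation goes straight to core.memory rather than through the Phobos GCAllocator.

```d
import core.memory : GC;

/// Hypothetical allocator that offers allocation but no free.
struct NoFreeGC
{
    void[] allocate(size_t n)
    {
        if (n == 0) return null;
        return GC.malloc(n)[0 .. n];
    }
}

/// Hypothetical allocator that adds deallocate on top of NoFreeGC.
struct FreeingGC
{
    NoFreeGC base;
    alias base this; // allocate() is forwarded to NoFreeGC

    bool deallocate(void[] b)
    {
        GC.free(b.ptr);
        return true;
    }
}

/// Generic code can detect at compile time whether freeing is offered.
enum canFree(A) = __traits(hasMember, A, "deallocate");

static assert(!canFree!NoFreeGC);
static assert(canFree!FreeingGC);
```

A container could then call deallocate only under `static if (canFree!Allocator)` and otherwise leave the memory to the collector.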
 3. GCAllocator needs a mechanism to specify type info for allocations so
 the dtors are properly run in the GC. While not critical for my
 purposes, this is going to be very important, and should be figured out
 before merged into mainline Phobos.
That interface between types and the GC-based allocator is at this time embryonic. I'll need at some point to sit down and figure out a strategy that works for GCAllocator but also for other allocators.
 4. Various functions in std.allocator take a ref void[]. This is no
 good. I don't want to store a void[] in my type, because that's not very
 useful, but if I want to reallocate, or expand, then I need to do a
 dance where I copy the array to a void[], do the reallocation, and then
 copy it back (if successful). This fully defeats the point of having a
 ref in the first place. I understand there are wrappers that do this for
 reallocating, but I want to use the other tools as well (expand in
 particular). But principally, to have an API that is mostly unusable
 seems unnecessary.
Yeah, that's work in the porcelain - typed wrappers etc.
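
The copy-out, expand, copy-back dance from point 4 can be hidden in a small typed wrapper. This is a sketch under the assumption that the buffer came from GCAllocator; the name `expandTyped` is hypothetical.

```d
import std.experimental.allocator.gc_allocator : GCAllocator;

/// Hypothetical typed wrapper around the untyped expand primitive.
bool expandTyped(T)(ref T[] arr, size_t extraElems)
{
    void[] raw = arr; // copy the slice into a void[]
    if (!GCAllocator.instance.expand(raw, extraElems * T.sizeof))
        return false; // arr is left untouched on failure
    // Copy the grown slice back; the new elements are uninitialized.
    arr = cast(T[]) raw;
    return true;
}

unittest
{
    ubyte[] data = cast(ubyte[]) GCAllocator.instance.allocate(64);
    immutable ok = expandTyped(data, 16);
    assert(data.length == (ok ? 80 : 64));
}
```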
 I really do like the API, and it's fitting in quite nicely! I can work
 around all these at the moment, but I'd love to see these improvements
 in place before making official phobos.
Sounds good!

Andrei
Feb 19 2016