www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - assumeSafeAppend on slice of static array?

reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
I'm going through some code and thinking of ways to reduce GC pressure,
and came across a bit that needed to append some items to an array:

	T[] args;
	lex.expect("(");
	args ~= parseSingleItem(lex);
	while (!lex.empty) {
		lex.expect(",");
		args ~= parseSingleItem(lex);
	}
	lex.expect(")");
	return computeResult(args);

Now obviously, in the general case (with arbitrarily many number of
items) some GC allocations will be needed, but the most common use-cases
are actually only 1 or 2 items each time. Allocating lots of small
arrays seem to be rather wasteful, so I thought to use a static array as
a buffer instead.

The question is, is there a way to take a slice of the static array, set
the length to zero, and append to it with ~= such that when it runs out
of space in the static buffer, it will reallocate a longer array on the
GC heap? Or is this a bad idea?


T

-- 
Without geometry, life would be pointless. -- VS
Apr 22 2014
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 22 Apr 2014 14:10:30 -0400, H. S. Teoh via Digitalmars-d  
<digitalmars-d puremagic.com> wrote:

 I'm going through some code and thinking of ways to reduce GC pressure,
 and came across a bit that needed to append some items to an array:

 	T[] args;
 	lex.expect("(");
 	args ~= parseSingleItem(lex);
 	while (!lex.empty) {
 		lex.expect(",");
 		args ~= parseSingleItem(lex);
 	}
 	lex.expect(")");
 	return computeResult(args);

 Now obviously, in the general case (with arbitrarily many number of
 items) some GC allocations will be needed, but the most common use-cases
 are actually only 1 or 2 items each time. Allocating lots of small
 arrays seem to be rather wasteful, so I thought to use a static array as
 a buffer instead.

 The question is, is there a way to take a slice of the static array, set
 the length to zero, and append to it with ~= such that when it runs out
 of space in the static buffer, it will reallocate a longer array on the
 GC heap? Or is this a bad idea?

TL;DR: Yes, use Appender :) The reason appending even works is because of the metadata stored in the heap. Obviously, stack frames and fixed-length arrays do not have this data. When that metadata cannot be found, it reallocates because that's the only option. However, Appender, initialized with a static buffer, knows the length of its data, and stores its own capacity, separate from the heap. I'm not sure if it will fit your needs, give it a look. -Steve
Apr 22 2014
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Tue, Apr 22, 2014 at 02:24:41PM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 On Tue, 22 Apr 2014 14:10:30 -0400, H. S. Teoh via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 
I'm going through some code and thinking of ways to reduce GC
pressure, and came across a bit that needed to append some items to
an array:

	T[] args;
	lex.expect("(");
	args ~= parseSingleItem(lex);
	while (!lex.empty) {
		lex.expect(",");
		args ~= parseSingleItem(lex);
	}
	lex.expect(")");
	return computeResult(args);

Now obviously, in the general case (with arbitrarily many number of
items) some GC allocations will be needed, but the most common
use-cases are actually only 1 or 2 items each time. Allocating lots
of small arrays seem to be rather wasteful, so I thought to use a
static array as a buffer instead.

The question is, is there a way to take a slice of the static array,
set the length to zero, and append to it with ~= such that when it
runs out of space in the static buffer, it will reallocate a longer
array on the GC heap? Or is this a bad idea?

TL;DR: Yes, use Appender :) The reason appending even works is because of the metadata stored in the heap. Obviously, stack frames and fixed-length arrays do not have this data. When that metadata cannot be found, it reallocates because that's the only option. However, Appender, initialized with a static buffer, knows the length of its data, and stores its own capacity, separate from the heap.

[...] Unfortunately, the whole point of this exercise was to eliminate GC allocations for small arrays -- but since Appender's implementation allocates a private Data struct in its ctor, that kinda defeats the purpose. For the common case of 1 or 2 items only, it doesn't seem like Appender will perform any better, and it will introduce extra GC allocations to boot. :-( T -- What are you when you run out of Monet? Baroque.
Apr 22 2014
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 22 Apr 2014 14:31:07 -0400, H. S. Teoh via Digitalmars-d  
<digitalmars-d puremagic.com> wrote:

 On Tue, Apr 22, 2014 at 02:24:41PM -0400, Steven Schveighoffer via  
 Digitalmars-d wrote:
 On Tue, 22 Apr 2014 14:10:30 -0400, H. S. Teoh via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:

I'm going through some code and thinking of ways to reduce GC
pressure, and came across a bit that needed to append some items to
an array:

	T[] args;
	lex.expect("(");
	args ~= parseSingleItem(lex);
	while (!lex.empty) {
		lex.expect(",");
		args ~= parseSingleItem(lex);
	}
	lex.expect(")");
	return computeResult(args);

Now obviously, in the general case (with arbitrarily many number of
items) some GC allocations will be needed, but the most common
use-cases are actually only 1 or 2 items each time. Allocating lots
of small arrays seem to be rather wasteful, so I thought to use a
static array as a buffer instead.

The question is, is there a way to take a slice of the static array,
set the length to zero, and append to it with ~= such that when it
runs out of space in the static buffer, it will reallocate a longer
array on the GC heap? Or is this a bad idea?

TL;DR: Yes, use Appender :) The reason appending even works is because of the metadata stored in the heap. Obviously, stack frames and fixed-length arrays do not have this data. When that metadata cannot be found, it reallocates because that's the only option. However, Appender, initialized with a static buffer, knows the length of its data, and stores its own capacity, separate from the heap.

[...] Unfortunately, the whole point of this exercise was to eliminate GC allocations for small arrays -- but since Appender's implementation allocates a private Data struct in its ctor, that kinda defeats the purpose. For the common case of 1 or 2 items only, it doesn't seem like Appender will perform any better, and it will introduce extra GC allocations to boot. :-(

I advocated a long time ago that Appender should have a stack-based version. I still think that's the case. Because really what Appender has as an advantage over builtin slices is that it keeps a local copy of the capacity, so no heap metadata lookups need to occur. It's not conceptually that much added. Of course, back then, I'm not sure we had disable this(this), which such an appender should have. -Steve
Apr 22 2014
parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Tuesday, 22 April 2014 at 18:52:44 UTC, Steven Schveighoffer 
wrote:
 On Tue, 22 Apr 2014 14:31:07 -0400, H. S. Teoh via 
 Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On Tue, Apr 22, 2014 at 02:24:41PM -0400, Steven Schveighoffer 
 via Digitalmars-d wrote:
 On Tue, 22 Apr 2014 14:10:30 -0400, H. S. Teoh via 
 Digitalmars-d
 <digitalmars-d puremagic.com> wrote:

I'm going through some code and thinking of ways to reduce GC
pressure, and came across a bit that needed to append some 
items to
an array:

	T[] args;
	lex.expect("(");
	args ~= parseSingleItem(lex);
	while (!lex.empty) {
		lex.expect(",");
		args ~= parseSingleItem(lex);
	}
	lex.expect(")");
	return computeResult(args);

Now obviously, in the general case (with arbitrarily many 
number of
items) some GC allocations will be needed, but the most 
common
use-cases are actually only 1 or 2 items each time. 
Allocating lots
of small arrays seem to be rather wasteful, so I thought to 
use a
static array as a buffer instead.

The question is, is there a way to take a slice of the 
static array,
set the length to zero, and append to it with ~= such that 
when it
runs out of space in the static buffer, it will reallocate a 
longer
array on the GC heap? Or is this a bad idea?

TL;DR: Yes, use Appender :) The reason appending even works is because of the metadata stored in the heap. Obviously, stack frames and fixed-length arrays do not have this data. When that metadata cannot be found, it reallocates because that's the only option. However, Appender, initialized with a static buffer, knows the length of its data, and stores its own capacity, separate from the heap.

[...] Unfortunately, the whole point of this exercise was to eliminate GC allocations for small arrays -- but since Appender's implementation allocates a private Data struct in its ctor, that kinda defeats the purpose. For the common case of 1 or 2 items only, it doesn't seem like Appender will perform any better, and it will introduce extra GC allocations to boot. :-(

I advocated a long time ago that Appender should have a stack-based version. I still think that's the case. Because really what Appender has as an advantage over builtin slices is that it keeps a local copy of the capacity, so no heap metadata lookups need to occur. It's not conceptually that much added. Of course, back then, I'm not sure we had disable this(this), which such an appender should have. -Steve

What I said in the other thread, my ScopeAppender is stack based. Very soon to being finished :)
Apr 22 2014
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
22-Apr-2014 22:10, H. S. Teoh via Digitalmars-d пишет:
 I'm going through some code and thinking of ways to reduce GC pressure,
 and came across a bit that needed to append some items to an array:

 	T[] args;
 	lex.expect("(");
 	args ~= parseSingleItem(lex);
 	while (!lex.empty) {
 		lex.expect(",");
 		args ~= parseSingleItem(lex);
 	}
 	lex.expect(")");
 	return computeResult(args);

 Now obviously, in the general case (with arbitrarily many number of
 items) some GC allocations will be needed, but the most common use-cases
 are actually only 1 or 2 items each time. Allocating lots of small
 arrays seem to be rather wasteful, so I thought to use a static array as
 a buffer instead.

 The question is, is there a way to take a slice of the static array, set
 the length to zero, and append to it with ~= such that when it runs out
 of space in the static buffer, it will reallocate a longer array on the
 GC heap? Or is this a bad idea?

Should be a canonical use case for ScopeBuffer https://github.com/D-Programming-Language/phobos/blob/master/std/internal/scopebuffer.d Except that it has crippled usability e.g. you need to call free manually. -- Dmitry Olshansky
Apr 22 2014
parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Tuesday, 22 April 2014 at 18:47:16 UTC, Dmitry Olshansky wrote:
 22-Apr-2014 22:10, H. S. Teoh via Digitalmars-d пишет:
 I'm going through some code and thinking of ways to reduce GC 
 pressure,
 and came across a bit that needed to append some items to an 
 array:

 	T[] args;
 	lex.expect("(");
 	args ~= parseSingleItem(lex);
 	while (!lex.empty) {
 		lex.expect(",");
 		args ~= parseSingleItem(lex);
 	}
 	lex.expect(")");
 	return computeResult(args);

 Now obviously, in the general case (with arbitrarily many 
 number of
 items) some GC allocations will be needed, but the most common 
 use-cases
 are actually only 1 or 2 items each time. Allocating lots of 
 small
 arrays seem to be rather wasteful, so I thought to use a 
 static array as
 a buffer instead.

 The question is, is there a way to take a slice of the static 
 array, set
 the length to zero, and append to it with ~= such that when it 
 runs out
 of space in the static buffer, it will reallocate a longer 
 array on the
 GC heap? Or is this a bad idea?

Should be a canonical use case for ScopeBuffer https://github.com/D-Programming-Language/phobos/blob/master/std/internal/scopebuffer.d Except that it has crippled usability e.g. you need to call free manually.

I've been working on a "ScopedAppender" that is *bit* slower than ScopeBuffer, but can be used on any generic types, and is nothrow/ctfe/pure/"sometimes safe". I'm giving it the "finishing touches". But in the meantime, normal appender+clear can work: int[50] buf; auto app = appender(buf[]); app.clear(); //app is ready for use. The "issue" though is that appender itself as a reference type, so just declaring it allocates, which kind of gets in the way of setting up a local scratch space to begin with.
Apr 22 2014
parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Tue, Apr 22, 2014 at 06:59:05PM +0000, monarch_dodra via Digitalmars-d wrote:
 On Tuesday, 22 April 2014 at 18:47:16 UTC, Dmitry Olshansky wrote:
22-Apr-2014 22:10, H. S. Teoh via Digitalmars-d пишет:


[...]
The question is, is there a way to take a slice of the static array,
set the length to zero, and append to it with ~= such that when it
runs out of space in the static buffer, it will reallocate a longer
array on the GC heap? Or is this a bad idea?

Should be a canonical use case for ScopeBuffer https://github.com/D-Programming-Language/phobos/blob/master/std/internal/scopebuffer.d Except that it has crippled usability e.g. you need to call free manually.

I've been working on a "ScopedAppender" that is *bit* slower than ScopeBuffer, but can be used on any generic types, and is nothrow/ctfe/pure/"sometimes safe". I'm giving it the "finishing touches". But in the meantime, normal appender+clear can work: int[50] buf; auto app = appender(buf[]); app.clear(); //app is ready for use. The "issue" though is that appender itself as a reference type, so just declaring it allocates, which kind of gets in the way of setting up a local scratch space to begin with.

Yeah, I'm already using appender for the general case of n items; right now I'm just trying to optimize for the very common case where there are only 1 or two items in the list by eliminating GC allocations for that case. The fact that appender allocates defeats the whole purpose. :-/ T -- BREAKFAST.COM halted...Cereal Port Not Responding. -- YHL
Apr 22 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
H. S. Teoh:

 The question is, is there a way to take a slice of the static 
 array, set
 the length to zero, and append to it with ~= such that when it 
 runs out
 of space in the static buffer, it will reallocate a longer 
 array on the
 GC heap? Or is this a bad idea?

I suggested to add this to Phobos: https://d.puremagic.com/issues/show_bug.cgi?id=8550 Bye, bearophile
Apr 22 2014
prev sibling parent =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 04/22/2014 11:10 AM, H. S. Teoh via Digitalmars-d wrote:

 is there a way to take a slice of the static array, set the
 length to zero,

Currently, instead of setting the length to zero, you should start with the whole slice.
 and append to it with ~=

Instead, you can use the output range primitive put().
 such that when it runs out of space in the static buffer

When slice.empty, it would have covered the static array completely: import std.range; void foo() { int[2] buffer; int[] slice = buffer[]; foreach (int i, _; slice) { slice.put(i); } assert(buffer == [ 0, 1 ]); } void main() { foo(); } Ali
Apr 22 2014