digitalmars.D.learn - Tips and Tricks for D

mclysenk mtu.edu (12/12) Jul 02 2006 (Not sure if this belongs in announcements)

Derek Parnell (9/12) Jul 02 2006 An alternate or additional place for this information (which is quite go...
Deewiant (49/53) Jul 03 2006 I've always removed items from an array whilst preserving order in the f...

mclysenk mtu.edu (55/69) Jul 03 2006 First of all, thanks for the feedback! And second, you do have a good p...

Tom S (18/18) Jul 03 2006 Errm... array.dup is evil, since it causes an alloc -> GC stalls anyone ...

mclysenk mtu.edu (8/18) Jul 03 2006 I thought about using a templated solution like this, but I decided to a...

Deewiant (5/32) Jul 03 2006 My bad: originally I used a LENGTH of 10000 and then I neglected to chan...
Georg Wrede (33/116) Jul 06 2006 If I understand correctly, the last items in the array get shifted a

Henrik Eneroth (10/22) Jul 03 2006 Useful stuff!

Oskar Linde (27/56) Jul 03 2006 A class will always have a bit more space overhead than a struct. Other

mclysenk mtu.edu writes:

(Not sure if this belongs in announcements)

I've recently written up an article for intermediate and beginning D
programmers.  

In it I discuss some basic tricks, covering:
+ printf
+ import scope
+ struct initialization
+ converting static arrays
+ removing objects from dynamic arrays

You can read it at http://www.assertfalse.com/articles/tricks.shtml .

Any comments, questions, suggestions or other feedback is welcome!

-Mikola Lysenko

Jul 02 2006

Derek Parnell <derek nomail.afraid.org> writes:

On Sun, 2 Jul 2006 22:38:15 +0000 (UTC), mclysenk mtu.edu wrote:


 You can read it at http://www.assertfalse.com/articles/tricks.shtml .
 
 Any comments, questions, suggestions or other feedback is welcome!

An alternate or additional place for this information (which is quite good)
is the Wiki for D (http://www.prowiki.org/wiki4d/wiki.cgi). 


-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocrity!"
3/07/2006 10:50:07 AM

Jul 02 2006

Deewiant <deewiant.doesnotlike.spam gmail.com> writes:

mclysenk mtu.edu wrote:
 You can read it at http://www.assertfalse.com/articles/tricks.shtml .
 
 Any comments, questions, suggestions or other feedback is welcome!
 

I've always removed items from an array whilst preserving order in the following
way (using variables from your example):

queue = queue[0..idx] ~ queue[idx+1..$];

I did some test runs (benchmark source can be found at bottom of message), and
on my machine, at least, my version is consistently faster - it runs in about
60% of the time yours takes.

BTW, you write "tmp[].dup", which seems strange at least to my eyes: why not
just "tmp.dup"? It confused me for a moment, I wondered whether it was doing
something more than just a normal .dup operation.

All in all, yours is a fine article about D. I'm not sure I knew about slicing
pointers either. <g>

My benchmark code:

import std.stdio, std.perf;

void main() {
	const int REPEATS = 1000,
	          LENGTH  = 100000;

	const int[] removeThese = [0, 42, 999, 7, 1000, 6888, 999, 9992];
	int[] array;
	TickCounter t = new TickCounter();

	t.start;
	for (int i = 0; i < REPEATS; ++i) {
		array.length = LENGTH;

		foreach (n; removeThese) {
			if (n != array.length - 1) {
				int[] tmp = array[n+1..$];
				array[n..$-1] = tmp[].dup;
			}
			array.length = array.length - 1;
		}
	}
	t.stop;

	uint mTime = t.milliseconds;
	writefln("MLysenko: %6d ms for %6d repeats on %6d-length arrays (%.2f ms per
removal)", mTime, REPEATS, LENGTH, cast(real)mTime / REPEATS);

	t.start;
	for (int i = 0; i < REPEATS; ++i) {
		array.length = LENGTH;

		foreach (n; removeThese)
			array = array[0..n] ~ array[n+1..$];
	}
	t.stop;

	uint dTime = t.milliseconds;
	writefln("Deewiant: %6d ms for %6d repeats on %6d-length arrays (%.2f ms per
removal)", dTime, REPEATS, LENGTH, cast(real)dTime / REPEATS);
	
	writefln("D / M: %.2f", cast(real)dTime / mTime);
	writefln("M / D: %.2f", cast(real)mTime / dTime);
}

Jul 03 2006

mclysenk mtu.edu writes:

In article <e8ajmp$o9c$1 digitaldaemon.com>, Deewiant says...
mclysenk mtu.edu wrote:
 You can read it at http://www.assertfalse.com/articles/tricks.shtml .
 
 Any comments, questions, suggestions or other feedback is welcome!
 

I've always removed items from an array whilst preserving order in the following
way (using variables from your example):

queue = queue[0..idx] ~ queue[idx+1..$];

I did some test runs (benchmark source can be found at bottom of message), and
on my machine, at least, my version is consistently faster - it runs in about
60% of the time yours takes.

First of all, thanks for the feedback!  And second, you do have a good point.
However, there is a slight bias in your test.  Instead of removing elements from
arbitrary positions in the array, you are strictly removing elements from the
first 10000 positions.  In your method, you are copying the beginning of the
array, whereas in my method I am copying the end of the array.  If you try the
following benchmark, my version runs about 25-40 times faster on my machine:


import std.stdio, std.perf;

void main() {
const int REPEATS = 1000,
LENGTH  = 100000;

//Changed 0 to 1 in order to prevent bounds errors.
const int[] removeThese = [1, 42, 999, 7, 1000, 6888, 999, 9992];
int[] array;
TickCounter t = new TickCounter();

t.start;
for (int i = 0; i < REPEATS; ++i) {
array.length = LENGTH;

foreach (n; removeThese) {
n = array.length - n;  //Pull from the end of the array
if (n != array.length - 1) {
int[] tmp = array[n+1..$];
array[n..$-1] = tmp[].dup;
}
array.length = array.length - 1;
}
}
t.stop;

uint mTime = t.milliseconds;
writefln("MLysenko: %6d ms for %6d repeats on %6d-length arrays (%.2f ms per
removal)", mTime, REPEATS, LENGTH, cast(real)mTime / REPEATS);

t.start;
for (int i = 0; i < REPEATS; ++i) {
array.length = LENGTH;

foreach (n; removeThese)
{
n = array.length - n; // Remove from opposite side.
array = array[0..n] ~ array[n+1..$];
}
}
t.stop;

uint dTime = t.milliseconds;
writefln("Deewiant: %6d ms for %6d repeats on %6d-length arrays (%.2f ms per
removal)", dTime, REPEATS, LENGTH, cast(real)dTime / REPEATS);

writefln("D / M: %.2f", cast(real)dTime / mTime);
writefln("M / D: %.2f", cast(real)mTime / dTime);
}


In the end, there isn't a single best answer to this problem.  Both approaches
have their merit, and perhaps an adaptive strategy which combines the two might
best. (However yours performs far more consistently than mine, which may be a
big advantage in some cases.)


BTW, you write "tmp[].dup", which seems strange at least to my eyes: why not
just "tmp.dup"? It confused me for a moment, I wondered whether it was doing
something more than just a normal .dup operation.

There is no difference.  Since it is an array copy, I prefer to use the
brackets.

Once more, thanks for the comments!

- Mikola Lysenko

Jul 03 2006

Tom S <h3r3tic remove.mat.uni.torun.pl> writes:

Errm... array.dup is evil, since it causes an alloc -> GC stalls anyone ?

How about this ?

void removeNth(T)(inout T[] array, int n) {
	if (n < array.length - 1) {
		memmove(&array[n],
				&array[n+1],
				(array.length - n - 1) * T.sizeof);
	}

	array.length = array.length - 1;
}


then just:

foreach (n; removeThese)
{
	n = array.length - n;
	array.removeNth(n);
}



-- 
Tomasz Stachowiak  /+ a.k.a. h3r3tic +/

Jul 03 2006

mclysenk mtu.edu writes:

In article <e8b8js$1m53$1 digitaldaemon.com>, Tom S says...
Errm... array.dup is evil, since it causes an alloc -> GC stalls anyone ?

How about this ?

void removeNth(T)(inout T[] array, int n) {
	if (n < array.length - 1) {
		memmove(&array[n],
				&array[n+1],
				(array.length - n - 1) * T.sizeof);
	}

	array.length = array.length - 1;
}

I thought about using a templated solution like this, but I decided to avoid it
since it is perhaps a bit too complicated for an introductory article.  However,
upon consideration I think that this is probably the best way to go for any sort
of heavy duty application.  Of course, if you have 100k+ elements and you care
about order, even a naive a linked list will beat the fastest array
implementation any day.

-Mikola Lysenko

Jul 03 2006

Deewiant <deewiant.doesnotlike.spam gmail.com> writes:

mclysenk mtu.edu wrote:
 In article <e8ajmp$o9c$1 digitaldaemon.com>, Deewiant says...
 mclysenk mtu.edu wrote:
 You can read it at http://www.assertfalse.com/articles/tricks.shtml .

 Any comments, questions, suggestions or other feedback is welcome!

 I've always removed items from an array whilst preserving order in the
following
 way (using variables from your example):

 queue = queue[0..idx] ~ queue[idx+1..$];

 I did some test runs (benchmark source can be found at bottom of message), and
 on my machine, at least, my version is consistently faster - it runs in about
 60% of the time yours takes.

 
 First of all, thanks for the feedback!  And second, you do have a good point.
 However, there is a slight bias in your test.  Instead of removing elements
from
 arbitrary positions in the array, you are strictly removing elements from the
 first 10000 positions.  In your method, you are copying the beginning of the
 array, whereas in my method I am copying the end of the array.  If you try the
 following benchmark, my version runs about 25-40 times faster on my machine:

My bad: originally I used a LENGTH of 10000 and then I neglected to change
removeThese to include more testcases.

 In the end, there isn't a single best answer to this problem.  Both approaches
 have their merit, and perhaps an adaptive strategy which combines the two might
 best. (However yours performs far more consistently than mine, which may be a
 big advantage in some cases.)
 

You're right. Your method is a lot faster for removing elements near the end of
an array.

Jul 03 2006

Georg Wrede <georg.wrede nospam.org> writes:

mclysenk mtu.edu wrote:
 In article <e8ajmp$o9c$1 digitaldaemon.com>, Deewiant says...
 
mclysenk mtu.edu wrote:

You can read it at http://www.assertfalse.com/articles/tricks.shtml .

Any comments, questions, suggestions or other feedback is welcome!

I've always removed items from an array whilst preserving order in the following
way (using variables from your example):

queue = queue[0..idx] ~ queue[idx+1..$];

I did some test runs (benchmark source can be found at bottom of message), and
on my machine, at least, my version is consistently faster - it runs in about
60% of the time yours takes.

 
 
 First of all, thanks for the feedback!  And second, you do have a good point.
 However, there is a slight bias in your test.  Instead of removing elements
from
 arbitrary positions in the array, you are strictly removing elements from the
 first 10000 positions.  In your method, you are copying the beginning of the
 array, whereas in my method I am copying the end of the array.  If you try the
 following benchmark, my version runs about 25-40 times faster on my machine:
 
 
 import std.stdio, std.perf;
 
 void main() {
 const int REPEATS = 1000,
 LENGTH  = 100000;
 
 //Changed 0 to 1 in order to prevent bounds errors.
 const int[] removeThese = [1, 42, 999, 7, 1000, 6888, 999, 9992];
 int[] array;
 TickCounter t = new TickCounter();
 
 t.start;
 for (int i = 0; i < REPEATS; ++i) {
 array.length = LENGTH;
 
 foreach (n; removeThese) {
 n = array.length - n;  //Pull from the end of the array
 if (n != array.length - 1) {
 int[] tmp = array[n+1..$];
 array[n..$-1] = tmp[].dup;
 }
 array.length = array.length - 1;
 }
 }
 t.stop;
 
 uint mTime = t.milliseconds;
 writefln("MLysenko: %6d ms for %6d repeats on %6d-length arrays (%.2f ms per
 removal)", mTime, REPEATS, LENGTH, cast(real)mTime / REPEATS);
 
 t.start;
 for (int i = 0; i < REPEATS; ++i) {
 array.length = LENGTH;
 
 foreach (n; removeThese)
 {
 n = array.length - n; // Remove from opposite side.
 array = array[0..n] ~ array[n+1..$];
 }
 }
 t.stop;
 
 uint dTime = t.milliseconds;
 writefln("Deewiant: %6d ms for %6d repeats on %6d-length arrays (%.2f ms per
 removal)", dTime, REPEATS, LENGTH, cast(real)dTime / REPEATS);
 
 writefln("D / M: %.2f", cast(real)dTime / mTime);
 writefln("M / D: %.2f", cast(real)mTime / dTime);
 }
 
 
 In the end, there isn't a single best answer to this problem.  Both approaches
 have their merit, and perhaps an adaptive strategy which combines the two might
 best. (However yours performs far more consistently than mine, which may be a
 big advantage in some cases.)

If I understand correctly, the last items in the array get shifted a 
number of times, which takes unnecessary time.

Imagine a "history diagram" of the array. In it I've put an "R" for the 
removables, and those in between are denoted with single lower case letters:

RaaaaRbbbbbRccRddddddReeeeeeeeeeRfffff

(So all items between the first and second removable are called "a".) 
Now, the history would look like:

RaaaaRbbbbbRccRddddddReeeeeeeeeeRfffff
aaaaRbbbbbRccRddddddReeeeeeeeeeRfffff
aaaabbbbbRccRddddddReeeeeeeeeeRfffff
aaaabbbbbccRddddddReeeeeeeeeeRfffff
aaaabbbbbccddddddReeeeeeeeeeRfffff
aaaabbbbbccddddddeeeeeeeeeeRfffff
aaaabbbbbccddddddeeeeeeeeeefffff

A lot less copying would be done with the following:

RaaaaRbbbbbRccRddddddReeeeeeeeeeRfffff
aaaa RbbbbbRccRddddddReeeeeeeeeeRfffff
aaaabbbbb  RccRddddddReeeeeeeeeeRfffff
aaaabbbbbcc   RddddddReeeeeeeeeeRfffff
aaaabbbbbccdddddd    ReeeeeeeeeeRfffff
aaaabbbbbccddddddeeeeeeeeee     Rfffff
aaaabbbbbccddddddeeeeeeeeeefffff

With this strategy, no element ever gets moved more than once.
Of course, this is only applicable when all the elements-to-be-removed 
are known before the start.

If you have an array of 1M elements, and remove, say a few hundred 
elements from near the beginning, then these two methods have a huge 
performance difference.

(And yes, like someone pointed out, there do exist real-life situations 
where one simply has to use an array for this, as opposed to a linked list.)

PS: I have no objection if you include this reasoning in your excellent 
Tips and Tricks, should you want to.   :-)

Jul 06 2006

Henrik Eneroth <Henrik_member pathlink.com> writes:

In article <e89hsn$1rvh$1 digitaldaemon.com>, mclysenk mtu.edu says...
(Not sure if this belongs in announcements)

I've recently written up an article for intermediate and beginning D
programmers.  

In it I discuss some basic tricks, covering:
+ printf
+ import scope
+ struct initialization
+ converting static arrays
+ removing objects from dynamic arrays

You can read it at http://www.assertfalse.com/articles/tricks.shtml .

Any comments, questions, suggestions or other feedback is welcome!

-Mikola Lysenko

Useful stuff! 
A question about structs: I have in many cases used classes over structs because
of the lack of constructors in structs. Performance-wise, is it always better to
use structs over classes if you can? Or will a class where no virtual functions
etc. are used be similar to a struct performance-wise?
Also, if structs are copy-by-value, could this lead to unnecessary memory
allocations in those cases where just passing a reference (i.e. classes) would
do? That is, would classes be a better choice in this case?

/// Henrik

Jul 03 2006

Oskar Linde <oskar.lindeREM OVEgmail.com> writes:

Henrik Eneroth skrev:
 In article <e89hsn$1rvh$1 digitaldaemon.com>, mclysenk mtu.edu says...
 (Not sure if this belongs in announcements)

 I've recently written up an article for intermediate and beginning D
 programmers.  

 In it I discuss some basic tricks, covering:
 + printf
 + import scope
 + struct initialization
 + converting static arrays
 + removing objects from dynamic arrays

 You can read it at http://www.assertfalse.com/articles/tricks.shtml .

 Any comments, questions, suggestions or other feedback is welcome!

 -Mikola Lysenko

 
 Useful stuff! 
 A question about structs: I have in many cases used classes over structs
because
 of the lack of constructors in structs. Performance-wise, is it always better
to
 use structs over classes if you can? Or will a class where no virtual functions
 etc. are used be similar to a struct performance-wise?

A class will always have a bit more space overhead than a struct. Other 
than the implications of by-value vs by-reference there should not be 
any performance differences.

 Also, if structs are copy-by-value, could this lead to unnecessary memory
 allocations in those cases where just passing a reference (i.e. classes) would
 do? That is, would classes be a better choice in this case?

A pass-by-value struct will be allocated on the stack which generally 
costs nothing compared to a heap allocation. The cost of copying depends 
on the size of the struct. In general, passing large structs by value is 
expensive. Small structs may (or may not) be more efficiently passed by 
value. The actual speed difference of passing a small struct by value 
compared to by reference depends largely on how the struct is used and 
the ABI in use.

On some ABIs

struct XY { int x,y; }
int myfunc(XY p) { return p.x+p.y; }

will be identical to:

int myfunc(int x, int y) { return x+y; }

Returning a struct by value may be more expensive than passing it.

The only way to know for certain is to run the benchmarks yourself. I 
would guess that on x86, with its few registers, passing by reference 
could often be faster even for rather small structs. Does anyone have 
conclusive benchmark results and a rule of thumb here?

Rather than using a class, one could explicitly pass the struct by 
reference, by either an inout parameter or a pointer:

int myfunc(inout XY p)

int myfunc(XY *p)

Regards,

Oskar

Jul 03 2006

D Programming

C/C++ Programming

Other

digitalmars.D.learn - Tips and Tricks for D