www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Fixed-size arrays and randomShuffle()

reply Vidar Wahlberg <canidae exent.net> writes:
May be that this works as intended, but it fooled me:
---
import std.random;
import std.stdio;
void main() {
         int[5] a = 0;
         a[0] = 1;
         int[] b = [1, 0, 0, 0, 0];
         randomShuffle(a);
         writeln(a);
         randomShuffle(b);
         writeln(b);
}
---

In DMD 2.0.59 the fixed-size array "a" won't be shuffled (the dynamic 
array "b" will), and you won't get any warning about it.

I'm not sure whether this counts as something that should be reported as 
a bug/improvement, nor if only randomShuffle() displays this behaviour, 
perhaps you could enlighten me.
May 03 2012
next sibling parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 05/03/2012 06:30 AM, Vidar Wahlberg wrote:
 May be that this works as intended, but it fooled me:
 ---
 import std.random;
 import std.stdio;
 void main() {
 int[5] a = 0;
 a[0] = 1;
 int[] b = [1, 0, 0, 0, 0];
 randomShuffle(a);

Fixed-length arrays are value types. 'a' is copied to randomShuffle() so its copy is shuffled. Passing a slice of the whole array works: randomShuffle(a[]);
 writeln(a);
 randomShuffle(b);
 writeln(b);
 }
 ---

 In DMD 2.0.59 the fixed-size array "a" won't be shuffled (the dynamic
 array "b" will), and you won't get any warning about it.

 I'm not sure whether this counts as something that should be reported as
 a bug/improvement, nor if only randomShuffle() displays this behaviour,
 perhaps you could enlighten me.

Ali
May 03 2012
next sibling parent reply Vidar Wahlberg <canidae exent.net> writes:
On 2012-05-03 15:34, Ali Çehreli wrote:
 Fixed-length arrays are value types. 'a' is copied to randomShuffle() so
 its copy is shuffled. Passing a slice of the whole array works:

 randomShuffle(a[]);

True, it is however still not exceptionally newbie (or perhaps even user?) friendly (my question was more of "does it have to be this way?" rather than "how do you do this?", even though I appreciate the answer on how to do it). Is it not possible for the compilator to let you know that what you're doing doesn't make any sense? A quick follow-up: I've tried some various random number engines, but neither come even close to the performance of whatever is used for Java's "Collection.shuffle()" method. Perhaps someone can shed some light on this?
May 03 2012
parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 05/03/2012 06:55 AM, Vidar Wahlberg wrote:
 On 2012-05-03 15:34, Ali Çehreli wrote:
 Fixed-length arrays are value types. 'a' is copied to randomShuffle() so
 its copy is shuffled. Passing a slice of the whole array works:

 randomShuffle(a[]);

True, it is however still not exceptionally newbie (or perhaps even user?) friendly (my question was more of "does it have to be this way?" rather than "how do you do this?", even though I appreciate the answer on how to do it). Is it not possible for the compilator to let you know that what you're doing doesn't make any sense?

Random shuffle can work on a fixed-length array and there is a way for the implementation to know: import std.traits; // ... __traits(isStaticArray, a) That can be used in a template constraint.
 A quick follow-up:
 I've tried some various random number engines, but neither come even
 close to the performance of whatever is used for Java's
 "Collection.shuffle()" method. Perhaps someone can shed some light on 

I have no idea with that one. Ali
May 03 2012
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 03.05.2012 18:02, Ali Çehreli wrote:

  > A quick follow-up:
  > I've tried some various random number engines, but neither come even
  > close to the performance of whatever is used for Java's
  > "Collection.shuffle()" method. Perhaps someone can shed some light on
 this?

 I have no idea with that one.

It's all about RNG used behind the scenes. Default one is Mersane Twister which (AFAIK) is not particularly fast. But has a period of 2^19937 elements. You should probably use XorShift or MinstdRand generator and a version of shuffle with 2nd parameter. -- Dmitry Olshansky
May 03 2012
parent reply Vidar Wahlberg <canidae exent.net> writes:
On 2012-05-03 16:26, Dmitry Olshansky wrote:
 It's all about RNG used behind the scenes. Default one is Mersane
 Twister which (AFAIK) is not particularly fast. But has a period of
 2^19937 elements.
 You should probably use XorShift or MinstdRand generator and a version
 of shuffle with 2nd parameter.

I tried those two as well. Still significantly slower than what I can achieve in Java.
May 03 2012
parent Vidar Wahlberg <canidae exent.net> writes:
On 2012-05-03 17:31, Chris Cain wrote:
 You might want to post your code...

Sure! D: --- import std.random; import std.stdio; void main() { auto iterations = 10000000; int[] a; for (int i = 0; i < 42; ++i) a ~= i; for (int i = 0; i < iterations; ++i) randomShuffle(a); } naushika:~/projects> dmd random.d && time ./random ./random 38,35s user 0,05s system 99% cpu 38,420 total --- Java (7): --- import java.util.ArrayList; import java.util.Collections; public class Rnd { public static void main(String... args) { int iterations = 10000000; ArrayList<Integer> a = new ArrayList<Integer>(); for (int i = 0; i < 42; ++i) a.add(i); for (int i = 0; i < iterations; ++i) Collections.shuffle(a); } } naushika:~/projects> javac Rnd.java && time java Rnd java Rnd 9,92s user 0,03s system 100% cpu 9,922 total ---
May 03 2012
prev sibling next sibling parent "ixid" <nuaccount gmail.com> writes:
Why would you not want it to shuffle a fixed array? That's a very 
frustrating silent failure.
May 03 2012
prev sibling next sibling parent "Chris Cain" <clcain uncg.edu> writes:
On Thursday, 3 May 2012 at 14:41:20 UTC, Vidar Wahlberg wrote:
 I tried those two as well. Still significantly slower than what 
 I can achieve in Java.

You might want to post your code... I wrote this code in D: -=-=-=- import std.random, std.stdio, std.datetime; void main() { int[] arr = new int[5_000_000]; foreach(i, ref e; arr) e = i; StopWatch sw = AutoStart.yes; arr.randomShuffle(); sw.stop(); writeln("Took ", sw.peek().to!("msecs", double)(), "ms"); } -=-=-=- And it performed _identically_ to this in Java: -=-=-=- import java.util.ArrayList; import java.util.Collections; public class Main { public static void main(String[] args) { ArrayList<Integer> ints = new ArrayList<>(5000); for(int i = 0; i < 5_000_000; ++i) ints.add(i); long startTime = System.currentTimeMillis(); Collections.shuffle(ints); long endTime = System.currentTimeMillis(); System.out.println("Took " + (endTime - startTime) + "ms"); } } -=-=-=-
May 03 2012
prev sibling next sibling parent "Chris Cain" <clcain uncg.edu> writes:
On a related note, how did you get the other random generators 
working? I tried to compile this and it gives me an error:
-=-=-=-
import std.random, std.stdio, std.datetime;

void main() {
	int[] arr = new int[5_000_000];
     foreach(i, ref e; arr)
         e = i;
     auto rng = MinstdRand0(1);
     rng.seed(unpredictableSeed);
     StopWatch sw = AutoStart.yes;
     randomShuffle(arr, rng);
     sw.stop();

     writeln("Took ", sw.peek().to!("msecs", double)(), "ms");
}
-=-=-=-

The error:
-=-=-=-
C:\D\dmd2\windows\bin\..\..\src\phobos\std\random.d(1263): Error: 
cannot implicitly convert expression (rndGen()) of type 
MersenneTwisterEngine!(uint,32,624,397,31,-1727483681u,11,7,-1658038656
,15,-272236544u,18) 
to LinearCongruentialEngine!(uint,16807,0,2147483647)
-=-=-=-

Is this a bug in 2.059? Or am I doing something wrong?
May 03 2012
prev sibling parent "Chris Cain" <clcain uncg.edu> writes:
OK, I took a look at your example, and I saw the kind of 
performance you were seeing.

I performed an investigation on what could cause such a disparity 
and came up with a conclusion. The answer is two things: First of 
all, as noted above, D's default generator is a mersenne twister 
RNG. You can emulate Java's RNG like so:

auto rng = LinearCongruentialEngine!(ulong,
                         25214903917uL, 11uL, 2uL^^48uL)();

Once I equalized that, I looked into the various methods that are 
called and settled in on uniform.

https://github.com/D-Programming-Language/phobos/blob/master/std/random.d#L1154

As you can see, there's a division at line 1154 and another at 
line 1158. This means there's a minimum of two division 
operations every time uniform is called. Now, normally this isn't 
a big deal, but if we really want maximum performance, we need to 
eliminate at least one.

If you replace lines 1154-1161 (auto bucketSize ... to return...) 
with:
     CountType rnum, result;
     do
     {
         rnum = cast(CountType) uniform!CountType(urng);
         result = rnum % count;
     }
     while (rnum > count &&
             (rnum - result + (count - 1)) < (rnum - result - 1));
     return cast(typeof(return)) (min + result);

Then the time taken shrinks down to roughly the same (within a 
tenth of a second) as Java.

I'll probably clean this up (and write some comments on how this 
works) and see about submitting it as a patch unless anyone sees 
anything wrong with this approach.
May 03 2012
prev sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Vidar Wahlberg:

 I'm not sure whether this counts as something that should be 
 reported as a bug/improvement,

It's a badly written function (with insufficient unit tests): http://d.puremagic.com/issues/show_bug.cgi?id=8026 Bye, bearophile
May 03 2012