digitalmars.D - Reference semantic ranges and algorithms (and std.random)

monarch_dodra (51/51) Sep 18 2012 I've been developing a PRNG recently, and studying how it behaves

Jonathan M Davis (14/23) Sep 18 2012 Moving to classes would definitely break code, but it should be possible...

monarch_dodra (42/53) Sep 19 2012 I was thinking of doing that. The problem with this (as I've run

Jonathan M Davis (44/76) Sep 20 2012 There's always the check that the internals have been initialized on eve...

monarch_dodra (6/125) Sep 20 2012 TY for your insight. Good points. I'll try to do your "#1": It is

Johannes Pfau (43/54) Sep 20 2012 How would the internal state be allocated? malloc + refcounting or GC?

monarch_dodra (32/100) Sep 20 2012 That is a good point, I'd also make the "default" heap allocated,

Johannes Pfau (15/56) Sep 20 2012 Well for by reference types we'd need intialization, so there's

monarch_dodra (7/22) Sep 20 2012 These are good suggestions, but they are also breaking changes :/

Jonathan M Davis (9/11) Sep 20 2012 What std.container has is make, which is supposed to construct a type wh...

monarch_dodra (41/43) Sep 21 2012 #1

monarch_dodra (3/7) Sep 21 2012 Never mind I found out the answer: Using something deprecated is
Jonathan M Davis (13/27) Sep 21 2012 It is _very_ crippling to a range to not be a forward range. There's lot...

monarch_dodra (39/75) Sep 21 2012 I know it is crippling, but that is kind of the entire point: If
jerro (5/23) Sep 22 2012 Reference type ranges could have a payload property which would

Jesse Phillips (4/4) Sep 21 2012 I've ran into the problem with Random, there are some bugs
Joseph Rushton Wakeling (4/9) Sep 25 2012 Just to note, we already have an existing problem with this in Phobos, w...

monarch_dodra (13/27) Sep 25 2012 Thank you for bringing it up. The problem is indeed as you said,

Joseph Rushton Wakeling (3/7) Sep 27 2012 That would be fantastic. I'm a bit compromised right now time-wise but ...

monarch_dodra (7/18) Oct 26 2012 Well, it took me a month (I was working on other things), but I

"monarch_dodra" <monarchdodra gmail.com> writes:

I've been developing a PRNG recently, and studying how it behaves 
with phobos' other stuff, in particular, range and algorithm.

I quickly came to the conclusion that if a PRNG does not have 
reference semantics, it is just as good as useless.

Why? Because everything in phobos is passed "by value". For 
example:

--------
import std.random, std.stdio, std.range, std.algorithm;

void main()
{
     auto rng = MinstdRand(); rng.seed();
     auto r1 = rng.take(5).array();
     auto r2 = rng.take(5).array();
     writeln(r1);
     writeln(r2);
     copy(rng.take(5), r1);
     copy(rng.take(5), r2);
     writeln(r1);
     writeln(r2);
}
--------
[48271, 182605794, 1291394886, 1914720637, 2078669041]
[48271, 182605794, 1291394886, 1914720637, 2078669041]
[48271, 182605794, 1291394886, 1914720637, 2078669041]
[48271, 182605794, 1291394886, 1914720637, 2078669041]
--------
As you can see from the ouput, that is not very random. That's 
just the "tip of the iceberg". *Anything* in phobos that iterates 
on a range, such a fill, filter, or whatever, will not advance 
the PRNG, arguably breaking it.

At best, a "global" PRNG will work, provided it is never ever 
passed as an argument to a method.


migrated to classes (or be made reference structs), sooner than 
later. This might break some code, so I do not know how this is 
usually done, but I think it is necessary. I do not, however, 
propose that they should all derive from a base class.


reference semantics, there are still some issues regarding 
phobo's behavior with references ranges: Completly un-documented 
and inconsistent.

For example, if one were to call copy(source, target), would 
"source" be advanced? would "target" be advanced? Depends really. 
You'd be surprised too: For example, fill, with an infinite 
range, would save the filler before starting.

I was making a pull for optimizing fill/copy, and the thing kind 
of exploded in my face. I decided to take this opportunity to try 
to formalize and document this behavior.

I was hoping for some info/feedback/suggestions...
The pull request is here:
https://github.com/D-Programming-Language/phobos/pull/802

Sep 18 2012

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Tuesday, September 18, 2012 17:05:26 monarch_dodra wrote:

 migrated to classes (or be made reference structs), sooner than
 later. This might break some code, so I do not know how this is
 usually done, but I think it is necessary. I do not, however,
 propose that they should all derive from a base class.

Moving to classes would definitely break code, but it should be possible to 
make them reference types simply by making it so that their internal state is 
in a separate object held by a pointer.


 reference semantics, there are still some issues regarding
 phobo's behavior with references ranges: Completly un-documented
 and inconsistent.

Reference-type ranges are not properly tested by Phobos in general, so they 
just plain won't work in a number of cases. I've done some work in fixing that, 
but there's plenty more to do. I really need to fiinish the unit test helper 
stuff that I've been doing for ranges and get that into Phobos. It's almost 
done, but I was running into weird build errors and haven't sorted them out 
yet. Once that's done, it should be much easier to test a larger range of 
range types (including reference types). Regardless, std.range and 
std.algorithm in particular need to be properly tested for reference range 
types and the related bugs fixed.

- Jonathan M Davis

Sep 18 2012

"monarch_dodra" <monarchdodra gmail.com> writes:

On Tuesday, 18 September 2012 at 17:59:04 UTC, Jonathan M Davis 
wrote:
 On Tuesday, September 18, 2012 17:05:26 monarch_dodra wrote:

 migrated to classes (or be made reference structs), sooner than
 later. This might break some code, so I do not know how this is
 usually done, but I think it is necessary. I do not, however,
 propose that they should all derive from a base class.

 Moving to classes would definitely break code, but it should be 
 possible to
 make them reference types simply by making it so that their 
 internal state is
 in a separate object held by a pointer.

I was thinking of doing that. The problem with this (as I've run 
into and stated in another thread), is a problem of 
initialization: The simpler PRNGs are init'ed seeded, and are 
ready for use immediately. Changing to this approach would break 
the initialization, as shown in this post:

http://forum.dlang.org/thread/bvuquzwfykiytdwsqkky forum.dlang.org#post-yvtsivozyhqzscgddbrl:40forum.dlang.org

A "used to be valid" PRNG has now become an un-initialized PRNG". 
This is extremely insidious, as the code still compiles, but will 
crash.

I'm hoping my thread goes somewhere, but since I doubt it, I've 
thought up of two other solutions for a "clean" migration:


Keep as struct (as you suggest), but use the opCall hack. I HATE 
this solution.


Change to class, but leave behind some "opCall"s for each old 
constructor, plus an extra one for default:
class A
{
     static A opCall(){return new A();}
     static A opCall(int i){return new A(i);}

     this(){}
     this(int){}

     /// ...
}

void main()
{
     A a1 = A();  //Works AND constructs
     A a2 = A(5); //Still Works
     A a3 = new A();  //Works
     A a4 = new A(5); //Works
}

In this case, I'm fine with having some opCalls, because 
technically, they aren't mixed with the constructors and used as 
hacks: The constructors are all there, and the opCalls, merelly 
forward to them.

That said, they *would* be destined for deprecation.

Use of "A a;" would create a problem though, but nothing's 
perfect...

Is this second solution something you think I should look into?

Sep 19 2012

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday, September 20, 2012 08:51:28 monarch_dodra wrote:
 On Tuesday, 18 September 2012 at 17:59:04 UTC, Jonathan M Davis
 
 wrote:
 On Tuesday, September 18, 2012 17:05:26 monarch_dodra wrote:

 migrated to classes (or be made reference structs), sooner than
 later. This might break some code, so I do not know how this is
 usually done, but I think it is necessary. I do not, however,
 propose that they should all derive from a base class.

 
 Moving to classes would definitely break code, but it should be
 possible to
 make them reference types simply by making it so that their
 internal state is
 in a separate object held by a pointer.

 
 I was thinking of doing that. The problem with this (as I've run
 into and stated in another thread), is a problem of
 initialization: The simpler PRNGs are init'ed seeded, and are
 ready for use immediately. Changing to this approach would break
 the initialization, as shown in this post:
 
 http://forum.dlang.org/thread/bvuquzwfykiytdwsqkky forum.dlang.org#post-yvts
 ivozyhqzscgddbrl:40forum.dlang.org
 
 A "used to be valid" PRNG has now become an un-initialized PRNG".
 This is extremely insidious, as the code still compiles, but will
 crash.

There's always the check that the internals have been initialized on every 
call and initialize it if it hasn't been solution. It's not pretty, but it 
won't break code. It's actually a use case that makes me wish that we had 
something like the invariant which ran before every public function call 
except that it was always there (even in -release) and let you do anything you 
want.

In any case, while it's a bit ugly, I believe that simply adding checks for 
initialization in every function call is the cleanest solution from the 
standpoint of backwards compatibility, and the ugliness is all self-contained. 
As far as performance goes, it's only an issue if you're iterating over it in 
a tight loop, but the actual random number generation is so much more 
expensive than a check for a null pointer that it probably doesn't matter.


 Change to class, but leave behind some "opCall"s for each old
 constructor, plus an extra one for default:

 Is this second solution something you think I should look into?

Since

A a;

will just blow up in your face if you switch it to a class, it's not a non-
breaking change even as a migration path, so I don't see that as really being 
viable. Even if you've found a way to minimize the immediate code breakage, 
you didn't eliminate it. If you're going to break code immediately, you might 
as well just break it all at once and get people to fix their stuff rather than 
mostly fix it but not quite, especially when you're asking them to change their 
code later anyway as part of a migration path.

Regardless, when this came up previously, I believe that the conclusion was 
that if we were going to switch to classes, we needed to do something like 
create std.random2 and schedule std.random for deprecation rather than 
changing the current structs to classes (either that or rename _every_ type in 
there and schedule them for deprecation individually, but then you have to 
come up for new names for everything, and it's more of a pain to migrate, 
since all the names changed rather than just the import). So, I believe that 
the idea of switching to classes was pretty much rejected previously unless 
entirely new types were used so that no code would be broken.

I think that we have two options at this point:

1. Switch the internals so that they're in a separate struct pointed to by the 
outer struct and check for initialization on every function call to avoid the 
problem where init was used.

2. Create a new module to replace std.random and make them final classes in 
there, scheduling the old module for deprecation.


and there's increasing resistance to breaking code. Even Andrei, who was 
fairly willing to break code for improvements before, is almost paranoid about 
it now, and Walter was _always_ against it. So, if we have a viable solution 
that avoids breaking code (especially if any ugliness that comes with it is 
internal to the implementation), we should probably go with that.

- Jonathan M Davis

Sep 20 2012

"monarch_dodra" <monarchdodra gmail.com> writes:

On Thursday, 20 September 2012 at 07:26:01 UTC, Jonathan M Davis 
wrote:
 On Thursday, September 20, 2012 08:51:28 monarch_dodra wrote:
 On Tuesday, 18 September 2012 at 17:59:04 UTC, Jonathan M Davis
 
 wrote:
 On Tuesday, September 18, 2012 17:05:26 monarch_dodra wrote:

 std.random be
 migrated to classes (or be made reference structs), sooner 
 than
 later. This might break some code, so I do not know how 
 this is
 usually done, but I think it is necessary. I do not, 
 however,
 propose that they should all derive from a base class.

 
 Moving to classes would definitely break code, but it should 
 be
 possible to
 make them reference types simply by making it so that their
 internal state is
 in a separate object held by a pointer.

 
 I was thinking of doing that. The problem with this (as I've 
 run
 into and stated in another thread), is a problem of
 initialization: The simpler PRNGs are init'ed seeded, and are
 ready for use immediately. Changing to this approach would 
 break
 the initialization, as shown in this post:
 
 http://forum.dlang.org/thread/bvuquzwfykiytdwsqkky forum.dlang.org#post-yvts
 ivozyhqzscgddbrl:40forum.dlang.org
 
 A "used to be valid" PRNG has now become an un-initialized 
 PRNG".
 This is extremely insidious, as the code still compiles, but 
 will
 crash.

 There's always the check that the internals have been 
 initialized on every
 call and initialize it if it hasn't been solution. It's not 
 pretty, but it
 won't break code. It's actually a use case that makes me wish 
 that we had
 something like the invariant which ran before every public 
 function call
 except that it was always there (even in -release) and let you 
 do anything you
 want.

 In any case, while it's a bit ugly, I believe that simply 
 adding checks for
 initialization in every function call is the cleanest solution 
 from the
 standpoint of backwards compatibility, and the ugliness is all 
 self-contained.
 As far as performance goes, it's only an issue if you're 
 iterating over it in
 a tight loop, but the actual random number generation is so 
 much more
 expensive than a check for a null pointer that it probably 
 doesn't matter.


 Change to class, but leave behind some "opCall"s for each old
 constructor, plus an extra one for default:

 Is this second solution something you think I should look into?

 Since

 A a;

 will just blow up in your face if you switch it to a class, 
 it's not a non-
 breaking change even as a migration path, so I don't see that 
 as really being
 viable. Even if you've found a way to minimize the immediate 
 code breakage,
 you didn't eliminate it. If you're going to break code 
 immediately, you might
 as well just break it all at once and get people to fix their 
 stuff rather than
 mostly fix it but not quite, especially when you're asking them 
 to change their
 code later anyway as part of a migration path.

 Regardless, when this came up previously, I believe that the 
 conclusion was
 that if we were going to switch to classes, we needed to do 
 something like
 create std.random2 and schedule std.random for deprecation 
 rather than
 changing the current structs to classes (either that or rename 
 _every_ type in
 there and schedule them for deprecation individually, but then 
 you have to
 come up for new names for everything, and it's more of a pain 
 to migrate,
 since all the names changed rather than just the import). So, I 
 believe that
 the idea of switching to classes was pretty much rejected 
 previously unless
 entirely new types were used so that no code would be broken.

 I think that we have two options at this point:

 1. Switch the internals so that they're in a separate struct 
 pointed to by the
 outer struct and check for initialization on every function 
 call to avoid the
 problem where init was used.

 2. Create a new module to replace std.random and make them 
 final classes in
 there, scheduling the old module for deprecation.


 breaking code,
 and there's increasing resistance to breaking code. Even 
 Andrei, who was
 fairly willing to break code for improvements before, is almost 
 paranoid about
 it now, and Walter was _always_ against it. So, if we have a 
 viable solution
 that avoids breaking code (especially if any ugliness that 
 comes with it is
 internal to the implementation), we should probably go with 
 that.

 - Jonathan M Davis


simple and non breaking. *IF* we ever do decide to break, and 
rather it be done after a very thourough discussion of 
requirements, specifications, migration path, etc...

Sep 20 2012

Johannes Pfau <nospam example.com> writes:

Am Thu, 20 Sep 2012 08:51:28 +0200
schrieb "monarch_dodra" <monarchdodra gmail.com>:

 Moving to classes would definitely break code, but it should be 
 possible to
 make them reference types simply by making it so that their 
 internal state is
 in a separate object held by a pointer.  

 
 I was thinking of doing that. The problem with this (as I've run 
 into and stated in another thread), is a problem of 
 initialization: The simpler PRNGs are init'ed seeded, and are 
 ready for use immediately. Changing to this approach would break 
 the initialization, as shown in this post:

How would the internal state be allocated? malloc + refcounting or GC?

I'm not really happy about that as I'd like to avoid allocation
(especially GC) whenever possible. Using a PRNG only locally in a
function seems like a valid use case to me and it currently doesn't
require allocation. As the similar problem with
std.digest/std.algorithm.copy has showed value type ranges seem not
very well supported in phobos.

I wonder why we don't pass all struct/value type ranges by ref? Is
there a reason this wouldn't work?

Or we could leave the PRNGs as is and provide reference wrappers. This
would allow to place the PRNG on the stack and still make it work with
all range functions. But it's also cumbersome and error prone.

Best solution is probably to have wrappers which allocate by default,
but also allow to pass a reference to a stack allocated value. Then
make the wrappers default, so the default's safe and easy to use.

Pseudo-code:

struct RNG_Impl
{
    uint front();
    void popFront();
    bool empty();
}

struct RNG
{
    RNG_Impl* impl;

    this(ref RNG_Impl impl)
        impl = cast(RNG_Impl*) impl;

    void initialize()
    {
        assert(!impl); //Either initialize or constructor
        impl = new RNG_Impl();
    }
}

RNG_Impl impl;
RNG(impl).take(5); //No allocation (but must not leak references...)

Regarding the initialization check: I'd avoid the check in release
mode. Not initializing a struct is a developer mistake and should be
found in debug mode. I think it's unlikely that error handling code can
handle such a situation anyway. But you could check how
std.typecons.RefCounted handles this, as it also need explicit
initialization.

Sep 20 2012

"monarch_dodra" <monarchdodra gmail.com> writes:

On Thursday, 20 September 2012 at 09:58:41 UTC, Johannes Pfau 
wrote:
 Am Thu, 20 Sep 2012 08:51:28 +0200
 schrieb "monarch_dodra" <monarchdodra gmail.com>:

 Moving to classes would definitely break code, but it should 
 be possible to
 make them reference types simply by making it so that their 
 internal state is
 in a separate object held by a pointer.

 
 I was thinking of doing that. The problem with this (as I've 
 run into and stated in another thread), is a problem of 
 initialization: The simpler PRNGs are init'ed seeded, and are 
 ready for use immediately. Changing to this approach would 
 break the initialization, as shown in this post:

 How would the internal state be allocated? malloc + refcounting 
 or GC?

 I'm not really happy about that as I'd like to avoid allocation
 (especially GC) whenever possible. Using a PRNG only locally in 
 a
 function seems like a valid use case to me and it currently 
 doesn't
 require allocation. As the similar problem with
 std.digest/std.algorithm.copy has showed value type ranges seem 
 not
 very well supported in phobos.

 I wonder why we don't pass all struct/value type ranges by ref? 
 Is
 there a reason this wouldn't work?

 Or we could leave the PRNGs as is and provide reference 
 wrappers. This
 would allow to place the PRNG on the stack and still make it 
 work with
 all range functions. But it's also cumbersome and error prone.

 Best solution is probably to have wrappers which allocate by 
 default,
 but also allow to pass a reference to a stack allocated value. 
 Then
 make the wrappers default, so the default's safe and easy to 
 use.

 Pseudo-code:

 struct RNG_Impl
 {
     uint front();
     void popFront();
     bool empty();
 }

 struct RNG
 {
     RNG_Impl* impl;

     this(ref RNG_Impl impl)
         impl = cast(RNG_Impl*) impl;

     void initialize()
     {
         assert(!impl); //Either initialize or constructor
         impl = new RNG_Impl();
     }
 }

 RNG_Impl impl;
 RNG(impl).take(5); //No allocation (but must not leak 
 references...)

 Regarding the initialization check: I'd avoid the check in 
 release
 mode. Not initializing a struct is a developer mistake and 
 should be
 found in debug mode. I think it's unlikely that error handling 
 code can
 handle such a situation anyway. But you could check how
 std.typecons.RefCounted handles this, as it also need explicit
 initialization.

That is a good point, I'd also make the "default" heap allocated, 
but give a way to access a stack allocated payload.

Regarding the "developer mistake", the problem is that currently:
"Misnstdrand a;"
Will create a valid and seeded PRNG that is ready for use, so we 
can't break that.

Arguably though, the argument holds for Mersenne twister, which 
needs a function call to be (default) seeded.

HOWEVER I find that:
auto a = MersenneTwister();  //Not seeded and invalid
auto b = MersenneTwister(5); //Seeded and valid
Is a confusing, especially since MersenneTwister provides 
"seed()" to default seed.

I'd rather have:
auto a = MersenneTwister();  //Not *yet* seeded: It will be done 
on the fly...
auto b = MersenneTwister(5); //Seeded and valid

But AGAIN, on the other hand, if you change back again to your 
proposed stack allocated MersenneTwisterImpl:
auto a = MersenneTwister();  //Not Seeded, but not assertable
auto b = MersenneTwister(5); //Seeded and valid

It really feels like there is no perfect solution.

********
The truth is that I would have rather ALL prngs not have ANY 
constructors, and that they ALL required an explicit seed: This 
would be uniform, and not have any surprises.

OR

That creating a prng would seed it with the default seed if 
nothing is specified, meaning that a prng is ALWAYS valid.

It feels like the current behavior is a bastardly hybrid...

Sep 20 2012

Johannes Pfau <nospam example.com> writes:

Am Thu, 20 Sep 2012 12:23:12 +0200
schrieb "monarch_dodra" <monarchdodra gmail.com>:
 
 That is a good point, I'd also make the "default" heap allocated, 
 but give a way to access a stack allocated payload.
 
 Regarding the "developer mistake", the problem is that currently:
 "Misnstdrand a;"
 Will create a valid and seeded PRNG that is ready for use, so we 
 can't break that.

Well for by reference types we'd need intialization, so there's
probably no way not to break it. We'd probably have to do the
initialization check in release mode for some phobos releases.
But after some time it should be changed to a debug-only warning.

 
 Arguably though, the argument holds for Mersenne twister, which 
 needs a function call to be (default) seeded.
 
 HOWEVER I find that:
 auto a = MersenneTwister();  //Not seeded and invalid
 auto b = MersenneTwister(5); //Seeded and valid
 Is a confusing, especially since MersenneTwister provides 
 "seed()" to default seed.
 
 I'd rather have:
 auto a = MersenneTwister();  //Not *yet* seeded: It will be done 
 on the fly...
 auto b = MersenneTwister(5); //Seeded and valid
 
 But AGAIN, on the other hand, if you change back again to your 
 proposed stack allocated MersenneTwisterImpl:
 auto a = MersenneTwister();  //Not Seeded, but not assertable
 auto b = MersenneTwister(5); //Seeded and valid
 
 It really feels like there is no perfect solution.
 
 ********
 The truth is that I would have rather ALL prngs not have ANY 
 constructors, and that they ALL required an explicit seed: This 
 would be uniform, and not have any surprises.
 
 OR
 
 That creating a prng would seed it with the default seed if 
 nothing is specified, meaning that a prng is ALWAYS valid.
 
 It feels like the current behavior is a bastardly hybrid...

That's what I did in std.digest. All Digests have a start() method,
even if it's not necessary for that specific Digest. It's probably a
good solution if you want a uniform interface for types which need
initialization and types which don't.

But there's also another, nice solution which should work: Introduce a
new makeRNG template function. This func checks if the RNG has a seed
function and calls it if available. Then you can do this for all RNGs:

auto rng = makeRNG!MersenneTwister();
auto rng = makeRNG!Misnstdrand();

Sep 20 2012

"monarch_dodra" <monarchdodra gmail.com> writes:

On Thursday, 20 September 2012 at 10:46:22 UTC, Johannes Pfau 
wrote:
 That's what I did in std.digest. All Digests have a start() 
 method,
 even if it's not necessary for that specific Digest. It's 
 probably a
 good solution if you want a uniform interface for types which 
 need
 initialization and types which don't.

 But there's also another, nice solution which should work: 
 Introduce a
 new makeRNG template function. This func checks if the RNG has 
 a seed
 function and calls it if available. Then you can do this for 
 all RNGs:

 auto rng = makeRNG!MersenneTwister();
 auto rng = makeRNG!Misnstdrand();

These are good suggestions, but they are also breaking changes :/ 
I *AM* writing them down though, should we ever go down Jonathan 
M Davis's suggestion is changing the module.

BTW: std.container also has MakeContainter, but, AFAIK, I've 
never seen ANYONE use it :/

Sep 20 2012

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday, September 20, 2012 12:57:03 monarch_dodra wrote:
 BTW: std.container also has MakeContainter, but, AFAIK, I've
 never seen ANYONE use it :/

What std.container has is make, which is supposed to construct a type where it 
goes by default (classes on the heap with new, and structs on the stack). The 
idea is definitely useful, and I have a pull request to improve upon it:

https://github.com/D-Programming-Language/phobos/pull/756

But I don't think that I've ever seen anyone use std.container.make either. 
The new version will be more useful though (particularly, since it'll work on 
more than just structs and classes).

- Jonathan M Davis

Sep 20 2012

"monarch_dodra" <monarchdodra gmail.com> writes:

On Thursday, 20 September 2012 at 11:10:43 UTC, Jonathan M Davis 
wrote:
 [SNIP]

 - Jonathan M Davis


Hey, I've been working on this (locally): I've made all the PRNGs 
reference ranges. It actually works perfectly. I took the "ensure 
initialized" route, as you suggested. I was able to take an 
approach that moved little code, so there is a clean distinction 
between the old "Payload", which was not touched (much), and the 
wrappers. The advantage is that the diff is very clean.

I was able to do this without adding or removing any 
functionality. Great!

Regarding the cost of "is initialized", I think I found a very 
good work around. I'll show it in a pull.


Johannes Pfau: When asking for a generator, by default you get a 
reference prng. This is the _safe_ approach.

If somebody *really* wants to, they can always declare a 
PRNGPayload on the stack, but I advise against that as:
*Passing it by value could generate duplicate sequences
*Passing it by value (for certain PRNGs) can be prohibitively 
expansive.

But the option is there.


The only thing I'm having an issue with is "save". IMO, it is 
exceptionally dangerous to have a PRNG be a ForwardRange: It 
should only be saved if you have a damn good reason to do so. You 
can still "dup" if you want (manually) (if you think that is 
smart), but I don't think it should answer true to 
"isForwardRange".

You just don't know what an algorithm will do under the hood if 
it finds out the range is saveable. In particular, save can be 
non-trivial and expensive...

I think if somebody really wants to be able to access some random 
numbers more than once, they are just as well off doing:
Type[] randomNumbers = myPRNG.take(50).array();

The advantage here is that at no point to you ever risk having 
duplicate numbers. (Take advances the reference range generator.

QUESTION:
If I (were to) deprecate "save", how would that work with the 
range traits type? If a range has "save", but it is deprecated, 
does it answer true to isForwardRange?

Sep 21 2012

"monarch_dodra" <monarchdodra gmail.com> writes:

On Friday, 21 September 2012 at 13:19:55 UTC, monarch_dodra wrote:
 QUESTION:
 If I (were to) deprecate "save", how would that work with the 
 range traits type? If a range has "save", but it is deprecated, 
 does it answer true to isForwardRange?

Never mind I found out the answer: Using something deprecated is 
a compile error, so traits correctly report the change.

Sep 21 2012

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Friday, September 21, 2012 15:20:49 monarch_dodra wrote:

 The only thing I'm having an issue with is "save". IMO, it is
 exceptionally dangerous to have a PRNG be a ForwardRange: It
 should only be saved if you have a damn good reason to do so. You
 can still "dup" if you want (manually) (if you think that is
 smart), but I don't think it should answer true to
 "isForwardRange".

It is _very_ crippling to a range to not be a forward range. There's lots of 
stuff that requires it. And I really don't see what the problem is. _Maybe_ 
it's okay for them to be input ranges and not forward ranges, but in general, 
I think that we need to be _very_ careful about doing that. It can be really, 
really annoying when a range is not a forward range.

 You just don't know what an algorithm will do under the hood if
 it finds out the range is saveable. In particular, save can be
 non-trivial and expensive...

It shouldn't be. It's a copy, and copy's are not supposed to be expensive. 
We've discussed this before with regards to postlbit constructors. Certainly, 
beyond the extra cost of having to new things up in some cases, they should be 
relatively cheap.

 QUESTION:
 If I (were to) deprecate "save", how would that work with the
 range traits type? If a range has "save", but it is deprecated,
 does it answer true to isForwardRange?

You'd have to  test it. It might depend on whether -d is used, but it could 
easily be that it'll be true as long as save exists.

- Jonathan M Davis

Sep 21 2012

"monarch_dodra" <monarchdodra gmail.com> writes:

On Friday, 21 September 2012 at 19:47:16 UTC, Jonathan M Davis 
wrote:
 On Friday, September 21, 2012 15:20:49 monarch_dodra wrote:

 The only thing I'm having an issue with is "save". IMO, it is
 exceptionally dangerous to have a PRNG be a ForwardRange: It
 should only be saved if you have a damn good reason to do so. 
 You
 can still "dup" if you want (manually) (if you think that is
 smart), but I don't think it should answer true to
 "isForwardRange".

 It is _very_ crippling to a range to not be a forward range. 
 There's lots of
 stuff that requires it. And I really don't see what the problem 
 is. _Maybe_
 it's okay for them to be input ranges and not forward ranges, 
 but in general,
 I think that we need to be _very_ careful about doing that. It 
 can be really,
 really annoying when a range is not a forward range.

I know it is crippling, but that is kind of the entire point: If 
somebody wants to read the same numbers trough twice, he'd better 
have a damn good reason. For example, *even* with reference 
ranges:

----
auto rng = SomeReferenceForwardRange();
auto a = new int[](10);
auto b = new int[](10);
a.fill(rng);
b.fill(rng);
----

This will fill a and b with the *same* numbers (!), even though 
rng is a reference range (!) Arguably, the bug is in fill (and 
has since been fixed), but the point is that by the time you 
notice it, who knows how long you've debugged? And who knows 
which *other* algorithms will break your prnd?

Had the rng been only InputRange, the compilation would have 
raised an error. And the work around is also easy. Safety first, 
right?

It might not be worth deprecating *now*, but I do think it is a 
latent danger.

PS: In all of random, there is not one single use case for having 
a ForwardPrng. Just saying.

 You just don't know what an algorithm will do under the hood if
 it finds out the range is saveable. In particular, save can be
 non-trivial and expensive...

 It shouldn't be. It's a copy, and copy's are not supposed to be 
 expensive.
 We've discussed this before with regards to postlbit 
 constructors. Certainly,
 beyond the extra cost of having to new things up in some cases, 
 they should be
 relatively cheap.

The copy of the reference is designed to be cheap (and is), yes 
(we've discussed this).

However, when you save, you have to duplicate the payload, and 
even considering that there is no postblit cost, you have 
strictly no control over its size:

LCG: 1 Int
XORShift: 5 Ints
MersenneTwister: 397 Ints
LaggedFib: (607 to 44497) ulongs or doubles

Not the end of the world, but not trivially cheap either.

 QUESTION:
 If I (were to) deprecate "save", how would that work with the
 range traits type? If a range has "save", but it is deprecated,
 does it answer true to isForwardRange?

 You'd have to  test it. It might depend on whether -d is used, 
 but it could
 easily be that it'll be true as long as save exists.

 - Jonathan M Davis

I just tested it btw:
without -d: isForwardRange: false
with    -d: isForwardRange: true

It is kind of the logical behavior actually.

Sep 21 2012

"jerro" <a a.com> writes:

On Friday, 21 September 2012 at 19:47:16 UTC, Jonathan M Davis 
wrote:
 On Friday, September 21, 2012 15:20:49 monarch_dodra wrote:

 The only thing I'm having an issue with is "save". IMO, it is
 exceptionally dangerous to have a PRNG be a ForwardRange: It
 should only be saved if you have a damn good reason to do so. 
 You
 can still "dup" if you want (manually) (if you think that is
 smart), but I don't think it should answer true to
 "isForwardRange".

 It is _very_ crippling to a range to not be a forward range. 
 There's lots of
 stuff that requires it. And I really don't see what the problem 
 is. _Maybe_
 it's okay for them to be input ranges and not forward ranges, 
 but in general,
 I think that we need to be _very_ careful about doing that. It 
 can be really,
 really annoying when a range is not a forward range.

Reference type ranges could have a payload property which would 
return the underlying value type range which would be a forward 
range. If you need a forward range, you could just use that.

Sep 22 2012

"Jesse Phillips" <Jessekphillips+D gmail.com> writes:

I've ran into the problem with Random, there are some bugs 
entered by bearophile. Since then they added

http://dlang.org/phobos/std_range.html#RefRange

but I haven't used it to see how well it solves the problem.

Sep 21 2012

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 18/09/12 17:05, monarch_dodra wrote:
 As you can see from the ouput, that is not very random. That's just the "tip of
 the iceberg". *Anything* in phobos that iterates on a range, such a fill,
 filter, or whatever, will not advance the PRNG, arguably breaking it.

 At best, a "global" PRNG will work, provided it is never ever passed as an
 argument to a method.

Just to note, we already have an existing problem with this in Phobos, with the 
RandomSample functionality.  See:
http://d.puremagic.com/issues/show_bug.cgi?id=8247

Sep 25 2012

"monarch_dodra" <monarchdodra gmail.com> writes:

On Tuesday, 25 September 2012 at 16:15:24 UTC, Joseph Rushton 
Wakeling wrote:
 On 18/09/12 17:05, monarch_dodra wrote:
 As you can see from the ouput, that is not very random. That's 
 just the "tip of
 the iceberg". *Anything* in phobos that iterates on a range, 
 such a fill,
 filter, or whatever, will not advance the PRNG, arguably 
 breaking it.

 At best, a "global" PRNG will work, provided it is never ever 
 passed as an
 argument to a method.

 Just to note, we already have an existing problem with this in 
 Phobos, with the RandomSample functionality.  See:
 http://d.puremagic.com/issues/show_bug.cgi?id=8247

Thank you for bringing it up. The problem is indeed as you said, 
and making the PRNGs references types fixes it.

I have a pretty well developed first iteration. I hope that by 
the end of next week, I'll have something to show.

Right now, I'm at a crossroad between:
*Inserting (requested) new features, (forced stack allocation, 
pre-initialized ranges for performance)
*Strictly no new features, for an eventual (easier) breaking 
change.

The problem with inserting "workaround 'features'", is that you 
have to keep supporting even if it is not required anymore :p

Sep 25 2012

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 25/09/12 18:36, monarch_dodra wrote:
 Thank you for bringing it up. The problem is indeed as you said, and making the
 PRNGs references types fixes it.

 I have a pretty well developed first iteration. I hope that by the end of next
 week, I'll have something to show.

That would be fantastic.  I'm a bit compromised right now time-wise but do ping 
me if I can do anything to help you test stuff, particularly wrt RandomSample.

Sep 27 2012

"monarch_dodra" <monarchdodra gmail.com> writes:

On Thursday, 27 September 2012 at 10:06:54 UTC, Joseph Rushton 
Wakeling wrote:
 On 25/09/12 18:36, monarch_dodra wrote:
 Thank you for bringing it up. The problem is indeed as you 
 said, and making the
 PRNGs references types fixes it.

 I have a pretty well developed first iteration. I hope that by 
 the end of next
 week, I'll have something to show.

 That would be fantastic.  I'm a bit compromised right now 
 time-wise but do ping me if I can do anything to help you test 
 stuff, particularly wrt RandomSample.

Well, it took me a month (I was working on other things), but I 
have provided a change to make PRNGs reference ranges.

The pull request is here:
https://github.com/D-Programming-Language/phobos/pull/893

Please review or comment.

Oct 26 2012

D Programming

C/C++ Programming

Other

digitalmars.D - Reference semantic ranges and algorithms (and std.random)