www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Reference semantic ranges and algorithms (and std.random)

reply "monarch_dodra" <monarchdodra gmail.com> writes:
I've been developing a PRNG recently, and studying how it behaves 
with phobos' other stuff, in particular, range and algorithm.

I quickly came to the conclusion that if a PRNG does not have 
reference semantics, it is just as good as useless.

Why? Because everything in phobos is passed "by value". For 
example:

--------
import std.random, std.stdio, std.range, std.algorithm;

void main()
{
     auto rng = MinstdRand(); rng.seed();
     auto r1 = rng.take(5).array();
     auto r2 = rng.take(5).array();
     writeln(r1);
     writeln(r2);
     copy(rng.take(5), r1);
     copy(rng.take(5), r2);
     writeln(r1);
     writeln(r2);
}
--------
[48271, 182605794, 1291394886, 1914720637, 2078669041]
[48271, 182605794, 1291394886, 1914720637, 2078669041]
[48271, 182605794, 1291394886, 1914720637, 2078669041]
[48271, 182605794, 1291394886, 1914720637, 2078669041]
--------
As you can see from the ouput, that is not very random. That's 
just the "tip of the iceberg". *Anything* in phobos that iterates 
on a range, such a fill, filter, or whatever, will not advance 
the PRNG, arguably breaking it.

At best, a "global" PRNG will work, provided it is never ever 
passed as an argument to a method.

This is issue #1: I'd propose that all objects in std.random be 
migrated to classes (or be made reference structs), sooner than 
later. This might break some code, so I do not know how this is 
usually done, but I think it is necessary. I do not, however, 
propose that they should all derive from a base class.

The second issue I've run into (issue #2) is that even with 
reference semantics, there are still some issues regarding 
phobo's behavior with references ranges: Completly un-documented 
and inconsistent.

For example, if one were to call copy(source, target), would 
"source" be advanced? would "target" be advanced? Depends really. 
You'd be surprised too: For example, fill, with an infinite 
range, would save the filler before starting.

I was making a pull for optimizing fill/copy, and the thing kind 
of exploded in my face. I decided to take this opportunity to try 
to formalize and document this behavior.

I was hoping for some info/feedback/suggestions...
The pull request is here:
https://github.com/D-Programming-Language/phobos/pull/802
Sep 18 2012
next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, September 18, 2012 17:05:26 monarch_dodra wrote:
 This is issue #1: I'd propose that all objects in std.random be
 migrated to classes (or be made reference structs), sooner than
 later. This might break some code, so I do not know how this is
 usually done, but I think it is necessary. I do not, however,
 propose that they should all derive from a base class.

Moving to classes would definitely break code, but it should be possible to make them reference types simply by making it so that their internal state is in a separate object held by a pointer.
 The second issue I've run into (issue #2) is that even with
 reference semantics, there are still some issues regarding
 phobo's behavior with references ranges: Completly un-documented
 and inconsistent.

Reference-type ranges are not properly tested by Phobos in general, so they just plain won't work in a number of cases. I've done some work in fixing that, but there's plenty more to do. I really need to fiinish the unit test helper stuff that I've been doing for ranges and get that into Phobos. It's almost done, but I was running into weird build errors and haven't sorted them out yet. Once that's done, it should be much easier to test a larger range of range types (including reference types). Regardless, std.range and std.algorithm in particular need to be properly tested for reference range types and the related bugs fixed. - Jonathan M Davis
Sep 18 2012
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Tuesday, 18 September 2012 at 17:59:04 UTC, Jonathan M Davis 
wrote:
 On Tuesday, September 18, 2012 17:05:26 monarch_dodra wrote:
 This is issue #1: I'd propose that all objects in std.random be
 migrated to classes (or be made reference structs), sooner than
 later. This might break some code, so I do not know how this is
 usually done, but I think it is necessary. I do not, however,
 propose that they should all derive from a base class.

Moving to classes would definitely break code, but it should be possible to make them reference types simply by making it so that their internal state is in a separate object held by a pointer.

I was thinking of doing that. The problem with this (as I've run into and stated in another thread), is a problem of initialization: The simpler PRNGs are init'ed seeded, and are ready for use immediately. Changing to this approach would break the initialization, as shown in this post: http://forum.dlang.org/thread/bvuquzwfykiytdwsqkky forum.dlang.org#post-yvtsivozyhqzscgddbrl:40forum.dlang.org A "used to be valid" PRNG has now become an un-initialized PRNG". This is extremely insidious, as the code still compiles, but will crash. I'm hoping my thread goes somewhere, but since I doubt it, I've thought up of two other solutions for a "clean" migration: #1 Keep as struct (as you suggest), but use the opCall hack. I HATE this solution. #2 Change to class, but leave behind some "opCall"s for each old constructor, plus an extra one for default: class A { static A opCall(){return new A();} static A opCall(int i){return new A(i);} this(){} this(int){} /// ... } void main() { A a1 = A(); //Works AND constructs A a2 = A(5); //Still Works A a3 = new A(); //Works A a4 = new A(5); //Works } In this case, I'm fine with having some opCalls, because technically, they aren't mixed with the constructors and used as hacks: The constructors are all there, and the opCalls, merelly forward to them. That said, they *would* be destined for deprecation. Use of "A a;" would create a problem though, but nothing's perfect... Is this second solution something you think I should look into?
Sep 19 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, September 20, 2012 08:51:28 monarch_dodra wrote:
 On Tuesday, 18 September 2012 at 17:59:04 UTC, Jonathan M Davis
 
 wrote:
 On Tuesday, September 18, 2012 17:05:26 monarch_dodra wrote:
 This is issue #1: I'd propose that all objects in std.random be
 migrated to classes (or be made reference structs), sooner than
 later. This might break some code, so I do not know how this is
 usually done, but I think it is necessary. I do not, however,
 propose that they should all derive from a base class.

Moving to classes would definitely break code, but it should be possible to make them reference types simply by making it so that their internal state is in a separate object held by a pointer.

I was thinking of doing that. The problem with this (as I've run into and stated in another thread), is a problem of initialization: The simpler PRNGs are init'ed seeded, and are ready for use immediately. Changing to this approach would break the initialization, as shown in this post: http://forum.dlang.org/thread/bvuquzwfykiytdwsqkky forum.dlang.org#post-yvts ivozyhqzscgddbrl:40forum.dlang.org A "used to be valid" PRNG has now become an un-initialized PRNG". This is extremely insidious, as the code still compiles, but will crash.

There's always the check that the internals have been initialized on every call and initialize it if it hasn't been solution. It's not pretty, but it won't break code. It's actually a use case that makes me wish that we had something like the invariant which ran before every public function call except that it was always there (even in -release) and let you do anything you want. In any case, while it's a bit ugly, I believe that simply adding checks for initialization in every function call is the cleanest solution from the standpoint of backwards compatibility, and the ugliness is all self-contained. As far as performance goes, it's only an issue if you're iterating over it in a tight loop, but the actual random number generation is so much more expensive than a check for a null pointer that it probably doesn't matter.
 #2
 Change to class, but leave behind some "opCall"s for each old
 constructor, plus an extra one for default:

 Is this second solution something you think I should look into?

Since A a; will just blow up in your face if you switch it to a class, it's not a non- breaking change even as a migration path, so I don't see that as really being viable. Even if you've found a way to minimize the immediate code breakage, you didn't eliminate it. If you're going to break code immediately, you might as well just break it all at once and get people to fix their stuff rather than mostly fix it but not quite, especially when you're asking them to change their code later anyway as part of a migration path. Regardless, when this came up previously, I believe that the conclusion was that if we were going to switch to classes, we needed to do something like create std.random2 and schedule std.random for deprecation rather than changing the current structs to classes (either that or rename _every_ type in there and schedule them for deprecation individually, but then you have to come up for new names for everything, and it's more of a pain to migrate, since all the names changed rather than just the import). So, I believe that the idea of switching to classes was pretty much rejected previously unless entirely new types were used so that no code would be broken. I think that we have two options at this point: 1. Switch the internals so that they're in a separate struct pointed to by the outer struct and check for initialization on every function call to avoid the problem where init was used. 2. Create a new module to replace std.random and make them final classes in there, scheduling the old module for deprecation. Honestly, I'd just go with #1 at this point, because it avoids breaking code, and there's increasing resistance to breaking code. Even Andrei, who was fairly willing to break code for improvements before, is almost paranoid about it now, and Walter was _always_ against it. So, if we have a viable solution that avoids breaking code (especially if any ugliness that comes with it is internal to the implementation), we should probably go with that. - Jonathan M Davis
Sep 20 2012
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 20 September 2012 at 07:26:01 UTC, Jonathan M Davis 
wrote:
 On Thursday, September 20, 2012 08:51:28 monarch_dodra wrote:
 On Tuesday, 18 September 2012 at 17:59:04 UTC, Jonathan M Davis
 
 wrote:
 On Tuesday, September 18, 2012 17:05:26 monarch_dodra wrote:
 This is issue #1: I'd propose that all objects in 
 std.random be
 migrated to classes (or be made reference structs), sooner 
 than
 later. This might break some code, so I do not know how 
 this is
 usually done, but I think it is necessary. I do not, 
 however,
 propose that they should all derive from a base class.

Moving to classes would definitely break code, but it should be possible to make them reference types simply by making it so that their internal state is in a separate object held by a pointer.

I was thinking of doing that. The problem with this (as I've run into and stated in another thread), is a problem of initialization: The simpler PRNGs are init'ed seeded, and are ready for use immediately. Changing to this approach would break the initialization, as shown in this post: http://forum.dlang.org/thread/bvuquzwfykiytdwsqkky forum.dlang.org#post-yvts ivozyhqzscgddbrl:40forum.dlang.org A "used to be valid" PRNG has now become an un-initialized PRNG". This is extremely insidious, as the code still compiles, but will crash.

There's always the check that the internals have been initialized on every call and initialize it if it hasn't been solution. It's not pretty, but it won't break code. It's actually a use case that makes me wish that we had something like the invariant which ran before every public function call except that it was always there (even in -release) and let you do anything you want. In any case, while it's a bit ugly, I believe that simply adding checks for initialization in every function call is the cleanest solution from the standpoint of backwards compatibility, and the ugliness is all self-contained. As far as performance goes, it's only an issue if you're iterating over it in a tight loop, but the actual random number generation is so much more expensive than a check for a null pointer that it probably doesn't matter.
 #2
 Change to class, but leave behind some "opCall"s for each old
 constructor, plus an extra one for default:

 Is this second solution something you think I should look into?

Since A a; will just blow up in your face if you switch it to a class, it's not a non- breaking change even as a migration path, so I don't see that as really being viable. Even if you've found a way to minimize the immediate code breakage, you didn't eliminate it. If you're going to break code immediately, you might as well just break it all at once and get people to fix their stuff rather than mostly fix it but not quite, especially when you're asking them to change their code later anyway as part of a migration path. Regardless, when this came up previously, I believe that the conclusion was that if we were going to switch to classes, we needed to do something like create std.random2 and schedule std.random for deprecation rather than changing the current structs to classes (either that or rename _every_ type in there and schedule them for deprecation individually, but then you have to come up for new names for everything, and it's more of a pain to migrate, since all the names changed rather than just the import). So, I believe that the idea of switching to classes was pretty much rejected previously unless entirely new types were used so that no code would be broken. I think that we have two options at this point: 1. Switch the internals so that they're in a separate struct pointed to by the outer struct and check for initialization on every function call to avoid the problem where init was used. 2. Create a new module to replace std.random and make them final classes in there, scheduling the old module for deprecation. Honestly, I'd just go with #1 at this point, because it avoids breaking code, and there's increasing resistance to breaking code. Even Andrei, who was fairly willing to break code for improvements before, is almost paranoid about it now, and Walter was _always_ against it. So, if we have a viable solution that avoids breaking code (especially if any ugliness that comes with it is internal to the implementation), we should probably go with that. - Jonathan M Davis

TY for your insight. Good points. I'll try to do your "#1": It is simple and non breaking. *IF* we ever do decide to break, and rather it be done after a very thourough discussion of requirements, specifications, migration path, etc...
Sep 20 2012
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Thu, 20 Sep 2012 08:51:28 +0200
schrieb "monarch_dodra" <monarchdodra gmail.com>:

 Moving to classes would definitely break code, but it should be 
 possible to
 make them reference types simply by making it so that their 
 internal state is
 in a separate object held by a pointer.  

I was thinking of doing that. The problem with this (as I've run into and stated in another thread), is a problem of initialization: The simpler PRNGs are init'ed seeded, and are ready for use immediately. Changing to this approach would break the initialization, as shown in this post:

How would the internal state be allocated? malloc + refcounting or GC? I'm not really happy about that as I'd like to avoid allocation (especially GC) whenever possible. Using a PRNG only locally in a function seems like a valid use case to me and it currently doesn't require allocation. As the similar problem with std.digest/std.algorithm.copy has showed value type ranges seem not very well supported in phobos. I wonder why we don't pass all struct/value type ranges by ref? Is there a reason this wouldn't work? Or we could leave the PRNGs as is and provide reference wrappers. This would allow to place the PRNG on the stack and still make it work with all range functions. But it's also cumbersome and error prone. Best solution is probably to have wrappers which allocate by default, but also allow to pass a reference to a stack allocated value. Then make the wrappers default, so the default's safe and easy to use. Pseudo-code: struct RNG_Impl { uint front(); void popFront(); bool empty(); } struct RNG { RNG_Impl* impl; this(ref RNG_Impl impl) impl = cast(RNG_Impl*) impl; void initialize() { assert(!impl); //Either initialize or constructor impl = new RNG_Impl(); } } RNG_Impl impl; RNG(impl).take(5); //No allocation (but must not leak references...) Regarding the initialization check: I'd avoid the check in release mode. Not initializing a struct is a developer mistake and should be found in debug mode. I think it's unlikely that error handling code can handle such a situation anyway. But you could check how std.typecons.RefCounted handles this, as it also need explicit initialization.
Sep 20 2012
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 20 September 2012 at 09:58:41 UTC, Johannes Pfau 
wrote:
 Am Thu, 20 Sep 2012 08:51:28 +0200
 schrieb "monarch_dodra" <monarchdodra gmail.com>:

 Moving to classes would definitely break code, but it should 
 be possible to
 make them reference types simply by making it so that their 
 internal state is
 in a separate object held by a pointer.

I was thinking of doing that. The problem with this (as I've run into and stated in another thread), is a problem of initialization: The simpler PRNGs are init'ed seeded, and are ready for use immediately. Changing to this approach would break the initialization, as shown in this post:

How would the internal state be allocated? malloc + refcounting or GC? I'm not really happy about that as I'd like to avoid allocation (especially GC) whenever possible. Using a PRNG only locally in a function seems like a valid use case to me and it currently doesn't require allocation. As the similar problem with std.digest/std.algorithm.copy has showed value type ranges seem not very well supported in phobos. I wonder why we don't pass all struct/value type ranges by ref? Is there a reason this wouldn't work? Or we could leave the PRNGs as is and provide reference wrappers. This would allow to place the PRNG on the stack and still make it work with all range functions. But it's also cumbersome and error prone. Best solution is probably to have wrappers which allocate by default, but also allow to pass a reference to a stack allocated value. Then make the wrappers default, so the default's safe and easy to use. Pseudo-code: struct RNG_Impl { uint front(); void popFront(); bool empty(); } struct RNG { RNG_Impl* impl; this(ref RNG_Impl impl) impl = cast(RNG_Impl*) impl; void initialize() { assert(!impl); //Either initialize or constructor impl = new RNG_Impl(); } } RNG_Impl impl; RNG(impl).take(5); //No allocation (but must not leak references...) Regarding the initialization check: I'd avoid the check in release mode. Not initializing a struct is a developer mistake and should be found in debug mode. I think it's unlikely that error handling code can handle such a situation anyway. But you could check how std.typecons.RefCounted handles this, as it also need explicit initialization.

That is a good point, I'd also make the "default" heap allocated, but give a way to access a stack allocated payload. Regarding the "developer mistake", the problem is that currently: "Misnstdrand a;" Will create a valid and seeded PRNG that is ready for use, so we can't break that. Arguably though, the argument holds for Mersenne twister, which needs a function call to be (default) seeded. HOWEVER I find that: auto a = MersenneTwister(); //Not seeded and invalid auto b = MersenneTwister(5); //Seeded and valid Is a confusing, especially since MersenneTwister provides "seed()" to default seed. I'd rather have: auto a = MersenneTwister(); //Not *yet* seeded: It will be done on the fly... auto b = MersenneTwister(5); //Seeded and valid But AGAIN, on the other hand, if you change back again to your proposed stack allocated MersenneTwisterImpl: auto a = MersenneTwister(); //Not Seeded, but not assertable auto b = MersenneTwister(5); //Seeded and valid It really feels like there is no perfect solution. ******** The truth is that I would have rather ALL prngs not have ANY constructors, and that they ALL required an explicit seed: This would be uniform, and not have any surprises. OR That creating a prng would seed it with the default seed if nothing is specified, meaning that a prng is ALWAYS valid. It feels like the current behavior is a bastardly hybrid...
Sep 20 2012
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Thu, 20 Sep 2012 12:23:12 +0200
schrieb "monarch_dodra" <monarchdodra gmail.com>:
 
 That is a good point, I'd also make the "default" heap allocated, 
 but give a way to access a stack allocated payload.
 
 Regarding the "developer mistake", the problem is that currently:
 "Misnstdrand a;"
 Will create a valid and seeded PRNG that is ready for use, so we 
 can't break that.

Well for by reference types we'd need intialization, so there's probably no way not to break it. We'd probably have to do the initialization check in release mode for some phobos releases. But after some time it should be changed to a debug-only warning.
 
 Arguably though, the argument holds for Mersenne twister, which 
 needs a function call to be (default) seeded.
 
 HOWEVER I find that:
 auto a = MersenneTwister();  //Not seeded and invalid
 auto b = MersenneTwister(5); //Seeded and valid
 Is a confusing, especially since MersenneTwister provides 
 "seed()" to default seed.
 
 I'd rather have:
 auto a = MersenneTwister();  //Not *yet* seeded: It will be done 
 on the fly...
 auto b = MersenneTwister(5); //Seeded and valid
 
 But AGAIN, on the other hand, if you change back again to your 
 proposed stack allocated MersenneTwisterImpl:
 auto a = MersenneTwister();  //Not Seeded, but not assertable
 auto b = MersenneTwister(5); //Seeded and valid
 
 It really feels like there is no perfect solution.
 
 ********
 The truth is that I would have rather ALL prngs not have ANY 
 constructors, and that they ALL required an explicit seed: This 
 would be uniform, and not have any surprises.
 
 OR
 
 That creating a prng would seed it with the default seed if 
 nothing is specified, meaning that a prng is ALWAYS valid.
 
 It feels like the current behavior is a bastardly hybrid...

That's what I did in std.digest. All Digests have a start() method, even if it's not necessary for that specific Digest. It's probably a good solution if you want a uniform interface for types which need initialization and types which don't. But there's also another, nice solution which should work: Introduce a new makeRNG template function. This func checks if the RNG has a seed function and calls it if available. Then you can do this for all RNGs: auto rng = makeRNG!MersenneTwister(); auto rng = makeRNG!Misnstdrand();
Sep 20 2012
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 20 September 2012 at 10:46:22 UTC, Johannes Pfau 
wrote:
 That's what I did in std.digest. All Digests have a start() 
 method,
 even if it's not necessary for that specific Digest. It's 
 probably a
 good solution if you want a uniform interface for types which 
 need
 initialization and types which don't.

 But there's also another, nice solution which should work: 
 Introduce a
 new makeRNG template function. This func checks if the RNG has 
 a seed
 function and calls it if available. Then you can do this for 
 all RNGs:

 auto rng = makeRNG!MersenneTwister();
 auto rng = makeRNG!Misnstdrand();

These are good suggestions, but they are also breaking changes :/ I *AM* writing them down though, should we ever go down Jonathan M Davis's suggestion is changing the module. BTW: std.container also has MakeContainter, but, AFAIK, I've never seen ANYONE use it :/
Sep 20 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, September 20, 2012 12:57:03 monarch_dodra wrote:
 BTW: std.container also has MakeContainter, but, AFAIK, I've
 never seen ANYONE use it :/

What std.container has is make, which is supposed to construct a type where it goes by default (classes on the heap with new, and structs on the stack). The idea is definitely useful, and I have a pull request to improve upon it: https://github.com/D-Programming-Language/phobos/pull/756 But I don't think that I've ever seen anyone use std.container.make either. The new version will be more useful though (particularly, since it'll work on more than just structs and classes). - Jonathan M Davis
Sep 20 2012
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 20 September 2012 at 11:10:43 UTC, Jonathan M Davis 
wrote:
 [SNIP]

 - Jonathan M Davis

#1 Hey, I've been working on this (locally): I've made all the PRNGs reference ranges. It actually works perfectly. I took the "ensure initialized" route, as you suggested. I was able to take an approach that moved little code, so there is a clean distinction between the old "Payload", which was not touched (much), and the wrappers. The advantage is that the diff is very clean. I was able to do this without adding or removing any functionality. Great! Regarding the cost of "is initialized", I think I found a very good work around. I'll show it in a pull. #2 Johannes Pfau: When asking for a generator, by default you get a reference prng. This is the _safe_ approach. If somebody *really* wants to, they can always declare a PRNGPayload on the stack, but I advise against that as: *Passing it by value could generate duplicate sequences *Passing it by value (for certain PRNGs) can be prohibitively expansive. But the option is there. #3 The only thing I'm having an issue with is "save". IMO, it is exceptionally dangerous to have a PRNG be a ForwardRange: It should only be saved if you have a damn good reason to do so. You can still "dup" if you want (manually) (if you think that is smart), but I don't think it should answer true to "isForwardRange". You just don't know what an algorithm will do under the hood if it finds out the range is saveable. In particular, save can be non-trivial and expensive... I think if somebody really wants to be able to access some random numbers more than once, they are just as well off doing: Type[] randomNumbers = myPRNG.take(50).array(); The advantage here is that at no point to you ever risk having duplicate numbers. (Take advances the reference range generator. QUESTION: If I (were to) deprecate "save", how would that work with the range traits type? If a range has "save", but it is deprecated, does it answer true to isForwardRange?
Sep 21 2012
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Friday, 21 September 2012 at 13:19:55 UTC, monarch_dodra wrote:
 QUESTION:
 If I (were to) deprecate "save", how would that work with the 
 range traits type? If a range has "save", but it is deprecated, 
 does it answer true to isForwardRange?

Never mind I found out the answer: Using something deprecated is a compile error, so traits correctly report the change.
Sep 21 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, September 21, 2012 15:20:49 monarch_dodra wrote:
 #3
 The only thing I'm having an issue with is "save". IMO, it is
 exceptionally dangerous to have a PRNG be a ForwardRange: It
 should only be saved if you have a damn good reason to do so. You
 can still "dup" if you want (manually) (if you think that is
 smart), but I don't think it should answer true to
 "isForwardRange".

It is _very_ crippling to a range to not be a forward range. There's lots of stuff that requires it. And I really don't see what the problem is. _Maybe_ it's okay for them to be input ranges and not forward ranges, but in general, I think that we need to be _very_ careful about doing that. It can be really, really annoying when a range is not a forward range.
 You just don't know what an algorithm will do under the hood if
 it finds out the range is saveable. In particular, save can be
 non-trivial and expensive...

It shouldn't be. It's a copy, and copy's are not supposed to be expensive. We've discussed this before with regards to postlbit constructors. Certainly, beyond the extra cost of having to new things up in some cases, they should be relatively cheap.
 QUESTION:
 If I (were to) deprecate "save", how would that work with the
 range traits type? If a range has "save", but it is deprecated,
 does it answer true to isForwardRange?

You'd have to test it. It might depend on whether -d is used, but it could easily be that it'll be true as long as save exists. - Jonathan M Davis
Sep 21 2012
prev sibling next sibling parent "Jesse Phillips" <Jessekphillips+D gmail.com> writes:
I've ran into the problem with Random, there are some bugs 
entered by bearophile. Since then they added

http://dlang.org/phobos/std_range.html#RefRange

but I haven't used it to see how well it solves the problem.
Sep 21 2012
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Friday, 21 September 2012 at 19:47:16 UTC, Jonathan M Davis 
wrote:
 On Friday, September 21, 2012 15:20:49 monarch_dodra wrote:
 #3
 The only thing I'm having an issue with is "save". IMO, it is
 exceptionally dangerous to have a PRNG be a ForwardRange: It
 should only be saved if you have a damn good reason to do so. 
 You
 can still "dup" if you want (manually) (if you think that is
 smart), but I don't think it should answer true to
 "isForwardRange".

It is _very_ crippling to a range to not be a forward range. There's lots of stuff that requires it. And I really don't see what the problem is. _Maybe_ it's okay for them to be input ranges and not forward ranges, but in general, I think that we need to be _very_ careful about doing that. It can be really, really annoying when a range is not a forward range.

I know it is crippling, but that is kind of the entire point: If somebody wants to read the same numbers trough twice, he'd better have a damn good reason. For example, *even* with reference ranges: ---- auto rng = SomeReferenceForwardRange(); auto a = new int[](10); auto b = new int[](10); a.fill(rng); b.fill(rng); ---- This will fill a and b with the *same* numbers (!), even though rng is a reference range (!) Arguably, the bug is in fill (and has since been fixed), but the point is that by the time you notice it, who knows how long you've debugged? And who knows which *other* algorithms will break your prnd? Had the rng been only InputRange, the compilation would have raised an error. And the work around is also easy. Safety first, right? It might not be worth deprecating *now*, but I do think it is a latent danger. PS: In all of random, there is not one single use case for having a ForwardPrng. Just saying.
 You just don't know what an algorithm will do under the hood if
 it finds out the range is saveable. In particular, save can be
 non-trivial and expensive...

It shouldn't be. It's a copy, and copy's are not supposed to be expensive. We've discussed this before with regards to postlbit constructors. Certainly, beyond the extra cost of having to new things up in some cases, they should be relatively cheap.

The copy of the reference is designed to be cheap (and is), yes (we've discussed this). However, when you save, you have to duplicate the payload, and even considering that there is no postblit cost, you have strictly no control over its size: LCG: 1 Int XORShift: 5 Ints MersenneTwister: 397 Ints LaggedFib: (607 to 44497) ulongs or doubles Not the end of the world, but not trivially cheap either.
 QUESTION:
 If I (were to) deprecate "save", how would that work with the
 range traits type? If a range has "save", but it is deprecated,
 does it answer true to isForwardRange?

You'd have to test it. It might depend on whether -d is used, but it could easily be that it'll be true as long as save exists. - Jonathan M Davis

I just tested it btw: without -d: isForwardRange: false with -d: isForwardRange: true It is kind of the logical behavior actually.
Sep 21 2012
prev sibling next sibling parent "jerro" <a a.com> writes:
On Friday, 21 September 2012 at 19:47:16 UTC, Jonathan M Davis 
wrote:
 On Friday, September 21, 2012 15:20:49 monarch_dodra wrote:
 #3
 The only thing I'm having an issue with is "save". IMO, it is
 exceptionally dangerous to have a PRNG be a ForwardRange: It
 should only be saved if you have a damn good reason to do so. 
 You
 can still "dup" if you want (manually) (if you think that is
 smart), but I don't think it should answer true to
 "isForwardRange".

It is _very_ crippling to a range to not be a forward range. There's lots of stuff that requires it. And I really don't see what the problem is. _Maybe_ it's okay for them to be input ranges and not forward ranges, but in general, I think that we need to be _very_ careful about doing that. It can be really, really annoying when a range is not a forward range.

Reference type ranges could have a payload property which would return the underlying value type range which would be a forward range. If you need a forward range, you could just use that.
Sep 22 2012
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 18/09/12 17:05, monarch_dodra wrote:
 As you can see from the ouput, that is not very random. That's just the "tip of
 the iceberg". *Anything* in phobos that iterates on a range, such a fill,
 filter, or whatever, will not advance the PRNG, arguably breaking it.

 At best, a "global" PRNG will work, provided it is never ever passed as an
 argument to a method.

Just to note, we already have an existing problem with this in Phobos, with the RandomSample functionality. See: http://d.puremagic.com/issues/show_bug.cgi?id=8247
Sep 25 2012
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Tuesday, 25 September 2012 at 16:15:24 UTC, Joseph Rushton 
Wakeling wrote:
 On 18/09/12 17:05, monarch_dodra wrote:
 As you can see from the ouput, that is not very random. That's 
 just the "tip of
 the iceberg". *Anything* in phobos that iterates on a range, 
 such a fill,
 filter, or whatever, will not advance the PRNG, arguably 
 breaking it.

 At best, a "global" PRNG will work, provided it is never ever 
 passed as an
 argument to a method.

Just to note, we already have an existing problem with this in Phobos, with the RandomSample functionality. See: http://d.puremagic.com/issues/show_bug.cgi?id=8247

Thank you for bringing it up. The problem is indeed as you said, and making the PRNGs references types fixes it. I have a pretty well developed first iteration. I hope that by the end of next week, I'll have something to show. Right now, I'm at a crossroad between: *Inserting (requested) new features, (forced stack allocation, pre-initialized ranges for performance) *Strictly no new features, for an eventual (easier) breaking change. The problem with inserting "workaround 'features'", is that you have to keep supporting even if it is not required anymore :p
Sep 25 2012
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 25/09/12 18:36, monarch_dodra wrote:
 Thank you for bringing it up. The problem is indeed as you said, and making the
 PRNGs references types fixes it.

 I have a pretty well developed first iteration. I hope that by the end of next
 week, I'll have something to show.

That would be fantastic. I'm a bit compromised right now time-wise but do ping me if I can do anything to help you test stuff, particularly wrt RandomSample.
Sep 27 2012
prev sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 27 September 2012 at 10:06:54 UTC, Joseph Rushton 
Wakeling wrote:
 On 25/09/12 18:36, monarch_dodra wrote:
 Thank you for bringing it up. The problem is indeed as you 
 said, and making the
 PRNGs references types fixes it.

 I have a pretty well developed first iteration. I hope that by 
 the end of next
 week, I'll have something to show.

That would be fantastic. I'm a bit compromised right now time-wise but do ping me if I can do anything to help you test stuff, particularly wrt RandomSample.

Well, it took me a month (I was working on other things), but I have provided a change to make PRNGs reference ranges. The pull request is here: https://github.com/D-Programming-Language/phobos/pull/893 Please review or comment.
Oct 26 2012