digitalmars.D - Dconf 2015 talks...

Era Scarecrow (29/29) Jan 25 2016 Finally getting around to watching all the talks, all good stuff

Dicebot (3/3) Jan 25 2016 Main problem is with both making it @safe and avoid unforced
Joseph Rushton Wakeling (15/45) Jan 25 2016 It's still an issue, at least partly because amid all other

Era Scarecrow (5/12) Jan 25 2016 Hmm i wonder... If recognizes it as infinite, could it avoid

Joseph Rushton Wakeling (10/13) Jan 25 2016 They shouldn't be forward ranges anyway, because if their content

Joseph Rushton Wakeling (6/6) Jan 25 2016 BTW, I apologize for the rather terse replies so far; busy day
Jonathan M Davis (16/25) Jan 25 2016 As long as the numbers are pseudo-random, then in theory, there's

Joseph Rushton Wakeling (47/62) Jan 25 2016 I thought that too, for a long time, but I came to the conclusion

Era Scarecrow (10/27) Jan 25 2016 So in short, the RNG shouldn't be a range at all. Of course it

Jonathan M Davis (13/24) Jan 25 2016 Why? They work fine as input ranges regardless of whether having
Joseph Rushton Wakeling (47/57) Jan 25 2016 Well, an idea that was suggested to me at DConf (by several

Andrei Alexandrescu (2/4) Jan 25 2016 That's semantically the same as an input range. -- Andrei

Joseph Rushton Wakeling (12/17) Jan 25 2016 “Yes, but...” :-P

Andrei Alexandrescu (4/6) Jan 25 2016 I see no problem with adding a category of rngs that are not forward.

Joseph Rushton Wakeling (5/9) Jan 25 2016 It's not about different categories of RNG in this case --

Andrei Alexandrescu (6/12) Jan 26 2016 I think a pseudo-rng as a forward range is useful. It's good in testing

Joseph Rushton Wakeling (16/21) Jan 26 2016 Yes, but there are other ways to fork that sequence that don't

Era Scarecrow (10/16) Jan 25 2016 What about an alternative allocator? Specifically I'm thinking

Joseph Rushton Wakeling (6/14) Jan 25 2016 I have been wondering about how allocators could help to deal

Era Scarecrow (56/70) Jan 25 2016 Most likely alloca would have to be built into the compiler.

Joseph Rushton Wakeling (6/9) Jan 25 2016 An apology in advance: I have an early start tomorrow, and a busy
Joseph Rushton Wakeling (27/67) Feb 07 2016 Apologies for the delay in writing back about this.

Era Scarecrow (33/65) Feb 07 2016 That's the low level assembly language that's generated i'm

Joseph Rushton Wakeling (24/34) Feb 08 2016 It's that "passed outside of the scope where they were generated"

John Colvin (27/28) Feb 08 2016 This might be a stupid idea, but perhaps there's something useful

Era Scarecrow <rtcvb32 yahoo.com> writes:

  Finally getting around to watching all the talks, all good stuff 
:) Figured I'd make a thread talking about things I came across 
and perhaps get into a discussion on them in detail.

  Anyways I'm watching the talk with Joseph Wakeling involving RNG 
and I wonder, is that still a problem or not?

  I ask since the simplest and most correct thing I can think of 
is to use the heap for a referenced state, and dup would then 
duplicate the state properly while otherwise the RNG problems go 
away that most of the talk went on about.

  It's possible I've been out of the loop and this has already 
been solved. Not sure yet, I haven't looked closely at any of the 
sources or the language updates.

  So... time for some rusty D prototyping.

[code]
struct RNG {
   int* state;

   this(int seed) {
     state = new int;
     *state = seed;
   }

   //if you provide a pointer, we don't rely on the heap/gc at 
all...
   //maybe a small stack based allocator could be useful for this?
   this(int* seed) { state = seed; }

   //dup instead of save, so it's not a forward range
   RNG dup() { return RNG(*state); }

   //popfront, front & empty as expected
}
[/code]

Jan 25 2016

Dicebot <public dicebot.lv> writes:

Main problem is with both making it  safe and avoid unforced 
allocations (i.e. allowing shared state to be kept on stack of 
`main`).

Jan 25 2016

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On Monday, 25 January 2016 at 13:05:46 UTC, Era Scarecrow wrote:
  Finally getting around to watching all the talks, all good 
 stuff :) Figured I'd make a thread talking about things I came 
 across and perhaps get into a discussion on them in detail.

  Anyways I'm watching the talk with Joseph Wakeling involving 
 RNG and I wonder, is that still a problem or not?

It's still an issue, at least partly because amid all other 
things in life, I haven't had much opportunity to sit down and 
focus on it :-(

  I ask since the simplest and most correct thing I can think of 
 is to use the heap for a referenced state, and dup would then 
 duplicate the state properly while otherwise the RNG problems 
 go away that most of the talk went on about.

  It's possible I've been out of the loop and this has already 
 been solved. Not sure yet, I haven't looked closely at any of 
 the sources or the language updates.

  So... time for some rusty D prototyping.

 [code]
 struct RNG {
   int* state;

   this(int seed) {
     state = new int;
     *state = seed;
   }

   //if you provide a pointer, we don't rely on the heap/gc at 
 all...
   //maybe a small stack based allocator could be useful for 
 this?
   this(int* seed) { state = seed; }

   //dup instead of save, so it's not a forward range
   RNG dup() { return RNG(*state); }

   //popfront, front & empty as expected
 }
 [/code]

The major problems are not so much with RNGs per se (note that 
your approach can actually be much more nicely implemented as a 
final class).  It's all the functionality that _wraps_ RNGs, such 
as random algorithms like randomCover and randomSample, or a D 
equivalent of C++'s random distributions.  These need to also be 
reference types (for their internal state as well as the RNG they 
wrap), and here it gets very tricky to avoid nasty situations 
with lots of allocations and frees (when ideally, you'd like 
stuff like this to be stack only).

Don't even ask about how complicated a `.dup` method gets when 
you start trying to extend to such entities. :-(

Jan 25 2016

Era Scarecrow <rtcvb32 yahoo.com> writes:

On Monday, 25 January 2016 at 14:31:12 UTC, Joseph Rushton 
Wakeling wrote:
 It's all the functionality that _wraps_ RNGs, such as random 
 algorithms like randomCover and randomSample, or a D equivalent 
 of C++'s random distributions.  These need to also be reference 
 types (for their internal state as well as the RNG they wrap), 
 and here it gets very tricky to avoid nasty situations with 
 lots of allocations and frees (when ideally, you'd like stuff 
 like this to be stack only).

  Hmm i wonder... If recognizes it as infinite, could it avoid 
treating them as forward ranges? As a struct it still wouldn't 
work, but as a class/reference type it would work then...

Jan 25 2016

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On Monday, 25 January 2016 at 15:38:45 UTC, Era Scarecrow wrote:
  Hmm i wonder... If recognizes it as infinite, could it avoid 
 treating them as forward ranges? As a struct it still wouldn't 
 work, but as a class/reference type it would work then...

They shouldn't be forward ranges anyway, because if their content 
is randomly generated then it's not legit for them to implement 
the .save property.  The whole implementation of stuff in 
std.random via forward ranges is a massive design mistake.

Implementing the random algorithms/other wrappers as a class is 
problematic because then you get into the hassle of potentially 
having to new/free a lot of individual heap objects deep in the 
inner loop of your program.  I already tried this in hap.random, 
and came to the conclusion that it wasn't a valid approach.

Jan 25 2016

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

BTW, I apologize for the rather terse replies so far; busy day 
:-)  I'll try and write out a more complete summary of the 
problem at some point in the near future (though it might have to 
wait 'til the weekend).

Thanks for the interest in contributing to solving this problem 
:-)

Jan 25 2016

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday, 25 January 2016 at 17:19:05 UTC, Joseph Rushton 
Wakeling wrote:
 On Monday, 25 January 2016 at 15:38:45 UTC, Era Scarecrow wrote:
  Hmm i wonder... If recognizes it as infinite, could it avoid 
 treating them as forward ranges? As a struct it still wouldn't 
 work, but as a class/reference type it would work then...

 They shouldn't be forward ranges anyway, because if their 
 content is randomly generated then it's not legit for them to 
 implement the .save property.  The whole implementation of 
 stuff in std.random via forward ranges is a massive design 
 mistake.

As long as the numbers are pseudo-random, then in theory, there's 
no problem with a range of random numbers being a forward range. 
That could be useful upon occasion (similar to how it can be 
useful that the same seed results in the same sequence of 
numbers). The problem is that if they're a forward range, then 
you tend to get code that accidentally ends up reusing numbers in 
the sequence and that definitely is a problem. So ultimately, 
they probably should be input ranges rather than forward ranges, 
but I think that it's more of an issue with human error than with 
what makes sense from a technical perspective.

Regardless, non-pseudo random number generators obviously can't 
be forward ranges though, since their state isn't savable or 
repeatable.

- Jonathan M Davis

Jan 25 2016

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On Monday, 25 January 2016 at 18:40:59 UTC, Jonathan M Davis 
wrote:
 As long as the numbers are pseudo-random, then in theory, 
 there's no problem with a range of random numbers being a 
 forward range.

I thought that too, for a long time, but I came to the conclusion 
it's not the case.

Here's the essence of the problem: a forward range is a promise 
to the code that uses it, "I am a deterministic sequence."  But 
the whole point of pseudo-random sequences is that you're not 
supposed to be able to tell them from actual randomness.

If you _tell_ downstream code "I am deterministic", that code may 
make assumptions about how it can use you, that are valid for 
"normal" deterministic sequences, but aren't valid for what are 
supposedly random sequences.

Consider the typical handling of forward ranges:

     auto foo (R) (R range)
         if (isForwardRange!R)
     {
         auto rcopy = range.save;

         // do something with rcopy
     }

That's fine if you're dealing with something whose behaviour is 
meant to be deterministic, but if you call this with a 
pseudo-random sequence, than every time you call it, you will get 
the same result.  It won't matter if what you pass is a 
reference-type -- the .save (which is correct behaviour for 
handling a forward range) means you're just repeatedly using a 
copy of the random sequence.

 That could be useful upon occasion (similar to how it can be 
 useful that the same seed results in the same sequence of 
 numbers).

Right, but the point is that re-use of a pseudo-random sequence 
should happen at programmer request only, not under the hood in 
library code they call -- which is what unfortunately happens if 
you implement RNGs and other random sequences as forward ranges 
:-(


 The problem is that if they're a forward range, then you tend 
 to get code that accidentally ends up reusing numbers in the 
 sequence and that definitely is a problem. So ultimately, they 
 probably should be input ranges rather than forward ranges, but 
 I think that it's more of an issue with human error than with 
 what makes sense from a technical perspective.

Again, I thought that too, but I came to the conclusion that I'd 
been wrong.  It's not about fallible humans, it's about the 
promise a forward range makes.

Superficially, it looks like that promise is: "You can use my 
deterministic nature if you need to."  But actually, I think it 
is really something stronger: "You can use my deterministic 
nature freely and nothing will go wrong in doing so."

That's a much stronger promise than can be allowed for a 
pseudo-RNG, and I think it's borne out in the way in which e.g. 
phobos functionality makes use of forward ranges.

 Regardless, non-pseudo random number generators obviously can't 
 be forward ranges though, since their state isn't savable or 
 repeatable.

Right -- non-PRNGs must be input ranges by design.  I came to the 
conclusion that pseudo-RNGs need to be input ranges, but that 
implement an alternative to .save: a .dup method whose promise 
is, "You can duplicate the state and hence behaviour of this 
range, but you can't make any assumptions that this is a safe or 
sane thing to do."

Jan 25 2016

Era Scarecrow <rtcvb32 yahoo.com> writes:

On Monday, 25 January 2016 at 21:20:09 UTC, Joseph Rushton 
Wakeling wrote:
 I thought that too, for a long time, but I came to the 
 conclusion it's not the case.

 <snip>

 That's fine if you're dealing with something whose behaviour is 
 meant to be deterministic, but if you call this with a 
 pseudo-random sequence, than every time you call it, you will 
 get the same result.  It won't matter if what you pass is a  
 reference-type -- the .save (which is correct behaviour for 
 handling a forward range) means you're just repeatedly using a 
 copy of the random sequence.

 <snip>

 Right -- non-PRNGs must be input ranges by design.  I came to 
 the conclusion that pseudo-RNGs need to be input ranges, but 
 that implement an alternative to .save: a .dup method whose 
 promise is, "You can duplicate the state and hence behaviour of 
 this range, but you can't make any assumptions that this is a 
 safe or sane thing to do."


  So in short, the RNG shouldn't be a range at all. Of course it 
could be a struct (for sanity and other reasons), but not a range.

  I wonder then, assuming we remove RNG from being a range, the a 
RNG could give out a delegate of it's current data, and that 
delegate get passed to a range-like wrapper? Or maybe the RNG 
returns a Voldemort range that referrs to the original RNG which 
isn't a range anymore... Actually that sounds promising. Aliasing 
could deal with it automatically converting/creating the range.

Jan 25 2016

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday, 25 January 2016 at 21:30:47 UTC, Era Scarecrow wrote:
 On Monday, 25 January 2016 at 21:20:09 UTC, Joseph Rushton 
 Wakeling wrote:
 Right -- non-PRNGs must be input ranges by design.  I came to 
 the conclusion that pseudo-RNGs need to be input ranges, but 
 that implement an alternative to .save: a .dup method whose 
 promise is, "You can duplicate the state and hence behaviour 
 of this range, but you can't make any assumptions that this is 
 a safe or sane thing to do."


  So in short, the RNG shouldn't be a range at all. Of course it 
 could be a struct (for sanity and other reasons), but not a 
 range.

Why? They work fine as input ranges regardless of whether having 
them be forward ranges make sense. By its very nature, an input 
range cannot be saved, and it's not assumed that it's 
deterministic. The issue that we run into isn't that they're 
ranges but that they're not implemented as reference types, and 
copies of them accidentally get used, resulting in subtle bugs. 
But I'd strongly argue that we should at least consider requiring 
that all input ranges be reference types, since they don't make 
sense as value types, and pseudo-reference types are just plain 
buggy. That's a wider discussion than random number generator 
ranges though.

- Jonathan M Davis

Jan 25 2016

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On Monday, 25 January 2016 at 21:30:47 UTC, Era Scarecrow wrote:
  So in short, the RNG shouldn't be a range at all. Of course it 
 could be a struct (for sanity and other reasons), but not a 
 range.

  I wonder then, assuming we remove RNG from being a range, the 
 a RNG could give out a delegate of it's current data, and that 
 delegate get passed to a range-like wrapper? Or maybe the RNG 
 returns a Voldemort range that referrs to the original RNG 
 which isn't a range anymore... Actually that sounds promising. 
 Aliasing could deal with it automatically converting/creating 
 the range.

Well, an idea that was suggested to me at DConf (by several 
people; thanks in particular to Steve S. and to DiceBot!) was 
indeed to separate RNG state from the range interface, and to 
disable copy-by-value for the state data structure.

One option would be to implement the basic RNG data structor à la 
C++, as a functor: so it'd be something like:

     struct SomeRNG(UIntType, ...)
     {
       private:
         // the RNG state variables

       public:
         UIntType opCall()
         {
             // returns a random variate
         }
     }

... and again, you'd disable copy-by-value for this entity.  I 
had some fun a while ago playing with this and the appropriate 
technique to wrap it into a range (it's not as trivial as you 
think, because you need to use some tricks to ensure truly lazy 
evaluation of the range, where D ranges typically evaluate the 
first element eagerly).

Where I ran into trouble was considering how to extend this 
approach in a viable way to stuff like randomSample, i.e. the 
stuff which wraps randomness, which also needs to ensure its 
internal state is never copied by value, and yet which needs 
(ideally) to fit nicely into a UFCS chain of ranges:

     someInput.filter!(WHATEVER).randomSample(n, 
rng).map!(...).writeln;

I suppose it might be possible to implement a functor-based 
approach for this, that could be wrapped in a range, but it feels 
nasty for something which really ought to fit more naturally into 
the range syntax: a random sample is just an algorithm (similar 
to those in std.algorithm), but one whose elements need to be 
truly lazily evaluated and whose values are not deterministic but 
depend on a source of randomness.

The entire complication comes out of the fact that in order to 
play nice in a statistical sense, you need the RandomSample range 
to be a reference type, but in order to play nice with the kind 
of in-the-middle-of-the-inner-loop use above, you need it to be 
cheap and quick to instantiate and destroy (so classes and heap 
allocation are a problem).

Basically, what you probably want is for RandomSample to be a 
struct, but with a reference-type internal state that is 
nevertheless allocated on the stack and that is cheap to create 
and let go of when you're done with it.

Jan 25 2016

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 01/25/2016 05:05 PM, Joseph Rushton Wakeling wrote:
 One option would be to implement the basic RNG data structor à la C++,
 as a functor

That's semantically the same as an input range. -- Andrei

Jan 25 2016

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On Monday, 25 January 2016 at 23:37:25 UTC, Andrei Alexandrescu 
wrote:
 On 01/25/2016 05:05 PM, Joseph Rushton Wakeling wrote:
 One option would be to implement the basic RNG data structor à 
 la C++,
 as a functor

 That's semantically the same as an input range. -- Andrei

“Yes, but...” :-P

There are actually some interesting subtleties required for the 
input range design, not just for RNGs but for ANY range whose 
elements are generated randomly.

I will try and write this up properly later, but the TL;DR is, it 
involves doing extra work to ensure every element is _truly_ 
lazily evaluated.

To some extent, splitting into functor and range wrapper can help 
clarify the code design there (even if the functor stays hidden 
as an implementation detail).

Jan 25 2016

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 01/25/2016 04:20 PM, Joseph Rushton Wakeling wrote:
 I thought that too, for a long time, but I came to the conclusion it's
 not the case.

I see no problem with adding a category of rngs that are not forward. 
Naming is of course the hardest problem in our community :o). A good 
stard would be a /dev/random wrapper. -- Andrei

Jan 25 2016

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On Monday, 25 January 2016 at 23:34:56 UTC, Andrei Alexandrescu 
wrote:
 I see no problem with adding a category of rngs that are not 
 forward. Naming is of course the hardest problem in our 
 community :o). A good stard would be a /dev/random wrapper. -- 
 Andrei

It's not about different categories of RNG in this case -- 
really, NO RNGs should be forward ranges.

If ONLY it was just a naming problem... :-(

Jan 25 2016

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 01/26/2016 02:07 AM, Joseph Rushton Wakeling wrote:
 On Monday, 25 January 2016 at 23:34:56 UTC, Andrei Alexandrescu wrote:
 I see no problem with adding a category of rngs that are not forward.
 Naming is of course the hardest problem in our community :o). A good
 stard would be a /dev/random wrapper. -- Andrei

 It's not about different categories of RNG in this case -- really, NO
 RNGs should be forward ranges.

I think a pseudo-rng as a forward range is useful. It's good in testing 
and experimentation to fork a sequence of pseudo-random numbers, turn 
the clock back, etc. Essentially I see no harm in it; it's always easy 
to make a forward range into an input range, it's the opposite that's 
hard. -- Andrei

Jan 26 2016

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On Tuesday, 26 January 2016 at 12:07:59 UTC, Andrei Alexandrescu 
wrote:
 I think a pseudo-rng as a forward range is useful. It's good in 
 testing and experimentation to fork a sequence of pseudo-random 
 numbers, turn the clock back, etc. Essentially I see no harm in 
 it; it's always easy to make a forward range into an input 
 range, it's the opposite that's hard. -- Andrei

Yes, but there are other ways to fork that sequence that don't 
involve implementing the .save primitive.  That's why in my talk 
(and here) I distinguish between the .save primitive versus some 
other possible primitive -- say, .dup -- whose promise is, "You 
can use me to duplicate the state of this range, but you must use 
me carefully and in an appropriate context."

This is important, because if you just define pseudo-RNGs as 
forward ranges, most people won't read the small print about when 
you need to wrap it in an input range (which is pretty much any 
time you want to pass it into phobos code and have it behave like 
an actual source of randomness).  As a result, they'll get 
unintended and possibly unnoticed statistical correlations -- 
annoying for something like a computer game, possibly a serious 
consequence if occurring in something like a research simulation.

Jan 26 2016

Era Scarecrow <rtcvb32 yahoo.com> writes:

On Monday, 25 January 2016 at 17:19:05 UTC, Joseph Rushton 
Wakeling wrote:
 Implementing the random algorithms/other wrappers as a class is 
 problematic because then you get into the hassle of potentially 
 having to new/free a lot of individual heap objects deep in the 
 inner loop of your program.  I already tried this in 
 hap.random, and came to the conclusion that it wasn't a valid 
 approach.

  What about an alternative allocator? Specifically I'm thinking 
in C's equivalent which is alloca (allocates directly on the 
stack with the rest of the variables). If the constructor is a 
function then that won't work; but if it's inlined then it should 
work.

  I suppose the alternate is an option to skip/throw away some 
numbers that should've been consumed (assuming you want to keep 
using the same seed), or seeding each per use.

Jan 25 2016

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On Monday, 25 January 2016 at 20:26:12 UTC, Era Scarecrow wrote:
  What about an alternative allocator? Specifically I'm thinking 
 in C's equivalent which is alloca (allocates directly on the 
 stack with the rest of the variables). If the constructor is a 
 function then that won't work; but if it's inlined then it 
 should work.

I have been wondering about how allocators could help to deal 
with these problems.  Could you put forward a minimal example of 
how you would see it working?

  I suppose the alternate is an option to skip/throw away some 
 numbers that should've been consumed (assuming you want to keep 
 using the same seed), or seeding each per use.

I'm not sure I follow what you mean here or why you think this 
would work?  Could you give a concrete example?

Jan 25 2016

Era Scarecrow <rtcvb32 yahoo.com> writes:

On Monday, 25 January 2016 at 21:22:13 UTC, Joseph Rushton 
Wakeling wrote:
 On Monday, 25 January 2016 at 20:26:12 UTC, Era Scarecrow wrote:
  What about an alternative allocator? Specifically I'm 
 thinking in C's equivalent which is alloca (allocates directly 
 on the stack with the rest of the variables). If the 
 constructor is a function then that won't work; but if it's 
 inlined then it should work.

 I have been wondering about how allocators could help to deal 
 with these problems.  Could you put forward a minimal example 
 of how you would see it working?

  Most likely alloca would have to be built into the compiler. 
Here's a crash course in how the stack memory management works. 
sp=stack pointer, bp=base pointer (more relevant pre 386).

  When you prepare to enter a function (or block) the bp register 
is backed up, the sp is saved in the bp register, then you add 
your arguments, then the function is entered. After the function 
is entered it adds to the sp register which give it a block of 
space for all known local variables at once; then does any 
assignments it needs to. When you leave the function/block you 
simply replace the sp register with your backup the bp register. 
Once you return you restore the old bp register.

  So something close to this:
[code]
   ...
   push bp
   push calling_argument(s)
   call func
   sub sp, -N //size of total arguments pushed
   pop bp //completely done with function call.
   ...


   func: mov bp, sp
   add sp, +N //size of local variables
   // setup variables
   // code

   // cleanup/return
   mov sp, bp
   ret
[/code]

  Technically alloca simply returns the current sp, then adds to 
it the number of bytes you requested. This means you have to run 
it at the function stack where you want to use it (and not in a 
called function, otherwise corruption). So inlined functions 
where alloca's data would remain would be a must. Then it comes 
down to a simple:

[code]
struct RNG {
   int *state; //add assert to functions that this isn't null
   this(int seed) {
     state = alloca(sizeof(int));
     *state = seed;
   }
}
[/code]

  I suppose the alternate is an option to skip/throw-away some 
 numbers that should've been consumed (assuming you want to 
 keep using the same seed), or seeding each per use.

 I'm not sure I follow what you mean here or why you think this 
 would work?  Could you give a concrete example?

certainly.

[code]
struct RNG {
   ...

   void skip(int x) {assert(x>0); while(x--) popfront();}
}

RNG rnd = RND(seed);
rnd.take(10).writeln;  //loosely based on talk examples
rnd.skip(10);          //skips the 'consumed' numbers.
rnd.take(10).writeln;  //won't have identical output

[/code]

Jan 25 2016

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On Monday, 25 January 2016 at 22:06:31 UTC, Era Scarecrow wrote:
  Most likely alloca would have to be built into the compiler. 
 Here's a crash course in how the stack memory management works. 
 sp=stack pointer, bp=base pointer (more relevant pre 386).

An apology in advance: I have an early start tomorrow, and a busy 
few days coming up, so I probably won't be able to follow up on 
this discussion until the weekend.

In the meantime, thanks for the food for thought.  I'll try to be 
in touch about this topic soon :-)

Jan 25 2016

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On Monday, 25 January 2016 at 22:06:31 UTC, Era Scarecrow wrote:
 On Monday, 25 January 2016 at 21:22:13 UTC, Joseph Rushton 
 Wakeling wrote:
 I have been wondering about how allocators could help to deal 
 with these problems.  Could you put forward a minimal example 
 of how you would see it working?

  Most likely alloca would have to be built into the compiler. 
 Here's a crash course in how the stack memory management works. 
 sp=stack pointer, bp=base pointer (more relevant pre 386).

Apologies for the delay in writing back about this.

What you describe makes sense, but I don't quite follow what you 
mean in one particular case:

  Technically alloca simply returns the current sp, then adds to 
 it the number of bytes you requested. This means you have to 
 run it at the function stack where you want to use it (and not 
 in a called function, otherwise corruption). So inlined 
 functions where alloca's data would remain would be a must.

I don't quite follow your remark about inlined functions; do you 
mean that the function where the RNG instance is generated must 
be inlined?  (That would make sense in order to avoid the 
internal state being deallocated immediately.)

I think there might be more complications here than just 
allocating individual RNG instances, though (which can happen 
quite early on in the program); what about stuff like random 
algorithms (RandomCover, RandomSample) which might be generated 
deep in internal loops, passed to other functionality as rvalues, 
etc. etc.?

 Then it comes down to a simple:

 [code]
 struct RNG {
   int *state; //add assert to functions that this isn't null
   this(int seed) {
     state = alloca(sizeof(int));
     *state = seed;
   }
 }
 [/code]

Yes, missing your understanding of the details of how it would 
have to happen, this is pretty much what I had in mind for random 
ranges; a pointer to internal state nevertheless allocated on the 
stack.  But the already-mentioned concerns about some of the ways 
that stack could be deallocated make for some concerns.

It might be simpler, in practice, to just have the state 
refcounted.

  I suppose the alternate is an option to skip/throw-away some 
 numbers that should've been consumed (assuming you want to 
 keep using the same seed), or seeding each per use.

 I'm not sure I follow what you mean here or why you think this 
 would work?  Could you give a concrete example?

 certainly.

 [code]
 struct RNG {
   ...

   void skip(int x) {assert(x>0); while(x--) popfront();}
 }

 RNG rnd = RND(seed);
 rnd.take(10).writeln;  //loosely based on talk examples
 rnd.skip(10);          //skips the 'consumed' numbers.
 rnd.take(10).writeln;  //won't have identical output

 [/code]

I'm afraid that's not really viable :-(  In the best case, it's 
just working around the fundamental design problem via programmer 
virtue.  But the problem is, in the general case, you can't 
anticipate how many random variates may be popped from your 
random number generator inside a function (it might depend on 
other input factors over which you as programmer have no control).

Feb 07 2016

Era Scarecrow <rtcvb32 yahoo.com> writes:

On Sunday, 7 February 2016 at 22:27:40 UTC, Joseph Rushton 
Wakeling wrote:
 On Monday, 25 January 2016 at 22:06:31 UTC, Era Scarecrow wrote:
 What you describe makes sense, but I don't quite follow what 
 you mean in one particular case:

  Technically alloca simply returns the current sp, then adds 
 to it the number of bytes you requested. This means you have 
 to run it at the function stack where you want to use it (and 
 not in a called function, otherwise corruption). So inlined 
 functions where alloca's data would remain would be a must.


  That's the low level assembly language that's generated i'm 
referring to; At least based on what i've read and seen for code 
output from a compiler to a .s file or similar.

 I don't quite follow your remark about inlined functions; do 
 you mean that the function where the RNG instance is generated 
 must be inlined?  (That would make sense in order to avoid the 
 internal state being deallocated immediately.)

Assuming alloca moves to the inlined function. Although i had 
another idea thrown in my head where the memory would be 
pre-allocated and you could just point to it when requested via 
an allocator. So assume

 alloca(sizeof(int)) struct RNG {

  During instantiation it would know the size ahead of time and 
just append that to the end of the structure. That extra padding 
space could be handled manually instead.

  this(int seed) {
    state = cast(void*)(this+1);
  }

  But this forced type breaking is quite clunky (and obviously the 
wrong way to write it).

  I recall in C there was suppose to be a way to attach an array 
(of unknown size) immediately following a struct by making the 
length of the array 0, then accessing it directly. But you'd 
still need to guarantee somehow that the access rights are in 
place and not referencing other data which you could screw up 
(via optimizations or something).

 I think there might be more complications here than just 
 allocating individual RNG instances, though (which can happen 
 quite early on in the program); what about stuff like random 
 algorithms (RandomCover, RandomSample) which might be generated 
 deep in internal loops, passed to other functionality as 
 rvalues, etc. etc.?

  Either they use more stack space, or they act normally after 
their call is done and are deallocated normally (automatically, 
unless they are passed outside of the scope where they were 
generated).

 It might be simpler, in practice, to just have the state 
 refcounted.

  I suppose the alternate is an option to skip/throw-away 
 some numbers that should've been consumed

 I'm not sure I follow what you mean here or why you think 
 this would work?  Could you give a concrete example?



   void skip(int x) {assert(x>0); while(x--) popfront();}


 rnd.take(10).writeln;  //loosely based on talk examples
 rnd.skip(10);          //skips the 'consumed' numbers.
 rnd.take(10).writeln;  //won't have identical output



 I'm afraid that's not really viable :-(
 But the problem is, in the general case, you can't anticipate 
 how many random variates may be popped from your random number 
 generator inside a function.

  True; Perhaps have one RNG for seeding and one RNG for passing, 
then reseed after passing the function off, although how far deep 
some of this could go with it's deeper copying; I don't know.

  Perhaps RNG should be a class outright, which probably removes a 
lot of these problems.

Feb 07 2016

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On Monday, 8 February 2016 at 00:54:24 UTC, Era Scarecrow wrote:
  Either they use more stack space, or they act normally after 
 their call is done and are deallocated normally (automatically, 
 unless they are passed outside of the scope where they were 
 generated).

It's that "passed outside of the scope where they were generated" 
that is bothering me.

Consider:

     auto foo (Range) (Range r)
         if (isInputRange!Range)
     {
         return r.randomSample(100).take(10);
     }

     void main ()
     {
         iota(1000).foo.writeln;
     }

If the RandomSample internals are allocated on the stack using 
the technique you propose, what's going to happen here?


  True; Perhaps have one RNG for seeding and one RNG for 
 passing, then reseed after passing the function off, although 
 how far deep some of this could go with it's deeper copying; I 
 don't know.

No, I really don't think that's an approach that's worth 
pursuing, except as a desperate workaround for the faults of the 
existing design.  RNGs should as much as possible "just work" and 
the user shouldn't have to concern themselves like this with


  Perhaps RNG should be a class outright, which probably removes 
 a lot of these problems.

It does indeed give you the desired reference semantics very 
cheaply.  But now suppose you have lots of RandomSamples being 
instantiated in the inner loop of your program.  If _they_ are 
also a class, how do you handle their instantiation and 
deallocation?

Feb 08 2016

John Colvin <john.loughran.colvin gmail.com> writes:

On Monday, 8 February 2016 at 19:46:19 UTC, Joseph Rushton 
Wakeling wrote:
 [snip]

This might be a stupid idea, but perhaps there's something useful 
in it:

Determinism isn't the same thing as "one long chain of numbers 
that everybody reads from".

It can be acceptable to seed a set of reasonable pseudo-random 
number generators with consecutive integers (indeed seeding 
randomly can be dangerous because of the birthday problem). More 
generally, any change of the state of the rng in "seed-space" 
should produce an output equivalent to taking a sample from the 
output distribution.

Can you not have a random number generator make a copy of itself 
like this:

struct RNG
{
     State state;
     static State.ModifierT modifier;
     this(this)
     {
         this.state.alterBy(modifier++);
         //recalculate output sample etc...
     }
}

Then any time you copy a RNG, the copy is kicked on to a new path 
in state-space. Basically we're deterministically re-seeding on 
copy.

Feb 08 2016

D Programming

C/C++ Programming

Other

digitalmars.D - Dconf 2015 talks...