www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - default random object?

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Leonardo suggested that some functions in std.random should not require 
their user to be bothered with creating a random object, i.e.:

auto r = Random(unpredictableSeed);
auto n = uniform(r, 0, 100);

Instead the library should simply support:

auto n = uniform(0, 100);

and do this by presumably using a global RNG under the hood. So I wanted 
to ask all y'all:

1. Are you cool with making the rng the last parameter and give it a 
default value?

2. The global random generator will be allocated per thread. Are you 
cool with this too?

3. How should the global rng be initialized? To always generate the same 
sequence, or not?

4. While we're at it, should uniform(a, b) generate by default something 
in [a, b] or [a, b)? Someone once explained to me that generating [a, b] 
for floating point numbers is the source of all evils and that Hitler, 
Stalin and Kim Il Sung (should he still be alive) must be using that 
kind of generator. Conversely, generating [a, b) is guaranteed to bring 
in the long term everlasting peace to Earth. My problem however is that 
in the integer realm I always want to generate [a, b]. Furthermore, I 
wouldn't be happy if the shape of the interval was different for 
integers and floating point numbers. How to break this conundrum? Don't 
forget that we're only worrying about defaults, explicit generation is 
always possible with self-explanatory code:

auto rng = Random(unpredictableSeed);
auto a = 0.0, b = 1.0;
auto x1 = uniform!("[]")(rng, a, b);
auto x2 = uniform!("[)")(rng, a, b);
auto x3 = uniform!("(]")(rng, a, b);
auto x4 = uniform!("()")(rng, a, b);


Thanks,

Andrei
Feb 13 2009
next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Sat, 14 Feb 2009 04:10:29 +0300, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 Leonardo suggested that some functions in std.random should not require  
 their user to be bothered with creating a random object, i.e.:

 auto r = Random(unpredictableSeed);
 auto n = uniform(r, 0, 100);

 Instead the library should simply support:

 auto n = uniform(0, 100);

I think that both should be available.
 and do this by presumably using a global RNG under the hood. So I wanted  
 to ask all y'all:

 1. Are you cool with making the rng the last parameter and give it a  
 default value?

Yes.
 2. The global random generator will be allocated per thread. Are you  
 cool with this too?

Yes.
 3. How should the global rng be initialized? To always generate the same  
 sequence, or not?

Yes, it should be deterministic.
 4. While we're at it, should uniform(a, b) generate by default something  
 in [a, b] or [a, b)? Someone once explained to me that generating [a, b]  
 for floating point numbers is the source of all evils and that Hitler,  
 Stalin and Kim Il Sung (should he still be alive) must be using that  
 kind of generator. Conversely, generating [a, b) is guaranteed to bring  
 in the long term everlasting peace to Earth. My problem however is that  
 in the integer realm I always want to generate [a, b]. Furthermore, I  
 wouldn't be happy if the shape of the interval was different for  
 integers and floating point numbers. How to break this conundrum? Don't  
 forget that we're only worrying about defaults, explicit generation is  
 always possible with self-explanatory code:

 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

It should definitely include a but not b for floating point types. It'd be satisfied if the same would be for integer types, too. My typical use-case is "auto x = rand() % t;" which yields numbers in [0 to t). For example, to get random range element, I'd like to write the following: auto elem = range[uniform(0, range.length)];
 Thanks,

 Andrei

Feb 13 2009
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 3. How should the global rng be initialized?

Automatically seeded with the time at the beginning of the program. Of course the seed can be set again in any moment.
 4. While we're at it, should uniform(a, b) generate by default something 
 in [a, b] or [a, b)? Someone once explained to me that generating [a, b] 
 for floating point numbers is the source of all evils and that Hitler, 
 Stalin and Kim Il Sung (should he still be alive) must be using that 
 kind of generator. Conversely, generating [a, b) is guaranteed to bring 
 in the long term everlasting peace to Earth. My problem however is that 
 in the integer realm I always want to generate [a, b]. Furthermore, I 
 wouldn't be happy if the shape of the interval was different for 
 integers and floating point numbers. How to break this conundrum? Don't 
 forget that we're only worrying about defaults, explicit generation is 
 always possible with self-explanatory code:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

That's awful and ugly. My suggestions are simple (copied from my dlibs and the random std lib of Python): random() => floating point [0, 1) randInt(a=0, b) => integral [a, b] randRange(a=0, b) => integral [a, b) uniform(a, b) => floating point [a, b) normal(a, b) => good quality normally-distributed number with given std dev and avg. Bye, bearophile
Feb 13 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
bearophile wrote:
 Andrei Alexandrescu:
 
 3. How should the global rng be initialized?

Automatically seeded with the time at the beginning of the program. Of course the seed can be set again in any moment.
 4. While we're at it, should uniform(a, b) generate by default something 
 in [a, b] or [a, b)? Someone once explained to me that generating [a, b] 
 for floating point numbers is the source of all evils and that Hitler, 
 Stalin and Kim Il Sung (should he still be alive) must be using that 
 kind of generator. Conversely, generating [a, b) is guaranteed to bring 
 in the long term everlasting peace to Earth. My problem however is that 
 in the integer realm I always want to generate [a, b]. Furthermore, I 
 wouldn't be happy if the shape of the interval was different for 
 integers and floating point numbers. How to break this conundrum? Don't 
 forget that we're only worrying about defaults, explicit generation is 
 always possible with self-explanatory code:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

That's awful and ugly. My suggestions are simple (copied from my dlibs and the random std lib of Python): random() => floating point [0, 1) randInt(a=0, b) => integral [a, b] randRange(a=0, b) => integral [a, b) uniform(a, b) => floating point [a, b) normal(a, b) => good quality normally-distributed number with given std dev and avg.

Leaving normal() aside (does your implementation implement the ziggurat algorithm? If so, would you agree to contribute it to phobos?), I don't see how having to memorize four names instead of one is necessarily awesome and beautiful. Besides uniform() renders "random" redundant. Hardly a pinnacle of good API design, see realloc() which essentially makes malloc() and free() redundant. Andrei
Feb 13 2009
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
 Leaving normal() aside (does your implementation implement the ziggurat 
 algorithm?

You can use two well known algorithms, one is a simple one, and the other is more precise. The simple one is just to extract 10 or 12 uniform numbers and sum them.
 I don't see how having to memorize four names instead of one is necessarily 
 awesome and beautiful. Besides uniform() renders "random" redundant. 
 Hardly a pinnacle of good API design

You never need this, so it's over-generalized, and it's also ugly to see and noisy too: uniform!("(]")(rng, a, b); Those names are almost self-explaining, you don't need much time to remember them. And each one of them represent a very common operation you too want to do. Of course your tastes can be different from mine (and Python ones too). When the phobos of D2 will be "finished" it will require one year or more of "tuning" (discussing names here on the newsgroup) to find the best names, API, etc. While we are discussing Phobos, I'd also like to have a lazy prime generator, prime tests a binomial, etc. Such things are useful in many programs (you can find them all in my dlibs, but that hyperfast lazy prime generator is GNU, unfortunately). Bye, bearophile
Feb 13 2009
next sibling parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
bearophile wrote:
 Andrei Alexandrescu:
 I don't see how having to memorize four names instead of one is necessarily 
 awesome and beautiful. Besides uniform() renders "random" redundant. 
 Hardly a pinnacle of good API design

You never need this, so it's over-generalized, and it's also ugly to see and noisy too: uniform!("(]")(rng, a, b);

My pages of code implementing it in Python would like to disagree with you on that. So would the safety science guys I write simulation and analysis code for. Then again "I never need this" isn't as convincing an argument. :)
 ...
 
 While we are discussing Phobos, I'd also like to have a lazy prime generator,
prime tests a binomial, etc. Such things are useful in many programs (you can
find them all in my dlibs, but that hyperfast lazy prime generator is GNU,
unfortunately).

Prime test a binomial? I never need this, so it's not necessary.
 Bye,
 bearophile

-- Daniel
Feb 13 2009
parent bearophile <bearophileHUGS lycos.com> writes:
Daniel Keep:
 My pages of code implementing it in Python would like to disagree with
 you on that.  So would the safety science guys I write simulation and
 analysis code for.

Python devs (and the same is true for other people, like Mathematica devs) have seen such thing isn't a common need, I have followed when they voted for functions to put and functions to keep out of the random module. I think a std rnd module can't contain everything, but only the things useful for most people, to keep low the number of simple functions, so you can remember them better. For more specialized usages you can import a third-part module for stats, etc.
 Then again "I never need this" isn't as convincing an argument.  :)

"Never" is of course meant in statistical terms, a std lib can't contain everything, so you put there only the things that most people need.
 Prime test a binomial?  I never need this, so it's not necessary.

Sorry: Prime test, binomial, are two separated things. And lot of people need that, all basic statistics modules have such things. Bye, bearophile
Feb 13 2009
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
bearophile wrote:
 When the phobos of D2 will be "finished" it will require one year or
 more of "tuning" (discussing names here on the newsgroup) to find the
 best names, API, etc.

I know. This is an exciting time to be around D! Andrei
Feb 13 2009
prev sibling parent reply Yigal Chripun <yigal100 gmail.com> writes:
Andrei Alexandrescu wrote:
 bearophile wrote:
 Andrei Alexandrescu:

 3. How should the global rng be initialized?

Automatically seeded with the time at the beginning of the program. Of course the seed can be set again in any moment.
 4. While we're at it, should uniform(a, b) generate by default
 something in [a, b] or [a, b)? Someone once explained to me that
 generating [a, b] for floating point numbers is the source of all
 evils and that Hitler, Stalin and Kim Il Sung (should he still be
 alive) must be using that kind of generator. Conversely, generating
 [a, b) is guaranteed to bring in the long term everlasting peace to
 Earth. My problem however is that in the integer realm I always want
 to generate [a, b]. Furthermore, I wouldn't be happy if the shape of
 the interval was different for integers and floating point numbers.
 How to break this conundrum? Don't forget that we're only worrying
 about defaults, explicit generation is always possible with
 self-explanatory code:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

That's awful and ugly. My suggestions are simple (copied from my dlibs and the random std lib of Python): random() => floating point [0, 1) randInt(a=0, b) => integral [a, b] randRange(a=0, b) => integral [a, b) uniform(a, b) => floating point [a, b) normal(a, b) => good quality normally-distributed number with given std dev and avg.

Leaving normal() aside (does your implementation implement the ziggurat algorithm? If so, would you agree to contribute it to phobos?), I don't see how having to memorize four names instead of one is necessarily awesome and beautiful. Besides uniform() renders "random" redundant. Hardly a pinnacle of good API design, see realloc() which essentially makes malloc() and free() redundant. Andrei

your example counters your claim. The same reason we prefer to use D with so many concepts and keywords over something like brainf**k that is turing-complete with just 8 or so commands can be apllied here. sure, the *implementation* of realloc() makes the *implementations* of free() and malloc() redundunt but it's easier to remember to use free() when you actually want to free something. if free() wasn't available, the first thing I'd do was to use a macro to define it.
Feb 14 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Yigal Chripun wrote:
 Andrei Alexandrescu wrote:
 bearophile wrote:
 random() => floating point [0, 1)
 randInt(a=0, b) => integral [a, b]
 randRange(a=0, b) => integral [a, b)
 uniform(a, b) => floating point [a, b)
 normal(a, b) => good quality normally-distributed number with given
 std dev and avg.

Leaving normal() aside (does your implementation implement the ziggurat algorithm? If so, would you agree to contribute it to phobos?), I don't see how having to memorize four names instead of one is necessarily awesome and beautiful. Besides uniform() renders "random" redundant. Hardly a pinnacle of good API design, see realloc() which essentially makes malloc() and free() redundant. Andrei

your example counters your claim. The same reason we prefer to use D with so many concepts and keywords over something like brainf**k that is turing-complete with just 8 or so commands can be apllied here. sure, the *implementation* of realloc() makes the *implementations* of free() and malloc() redundunt but it's easier to remember to use free() when you actually want to free something. if free() wasn't available, the first thing I'd do was to use a macro to define it.

My point was that realloc() is wrong, not free(). The sorely missing allocation primitive is expand(), and we've been paying for it through the nose for decades. Andrei
Feb 14 2009
next sibling parent reply Yigal Chripun <yigal100 gmail.com> writes:
Andrei Alexandrescu wrote:
 Yigal Chripun wrote:
 Andrei Alexandrescu wrote:
 bearophile wrote:
 random() => floating point [0, 1)
 randInt(a=0, b) => integral [a, b]
 randRange(a=0, b) => integral [a, b)
 uniform(a, b) => floating point [a, b)
 normal(a, b) => good quality normally-distributed number with given
 std dev and avg.

Leaving normal() aside (does your implementation implement the ziggurat algorithm? If so, would you agree to contribute it to phobos?), I don't see how having to memorize four names instead of one is necessarily awesome and beautiful. Besides uniform() renders "random" redundant. Hardly a pinnacle of good API design, see realloc() which essentially makes malloc() and free() redundant. Andrei

your example counters your claim. The same reason we prefer to use D with so many concepts and keywords over something like brainf**k that is turing-complete with just 8 or so commands can be apllied here. sure, the *implementation* of realloc() makes the *implementations* of free() and malloc() redundunt but it's easier to remember to use free() when you actually want to free something. if free() wasn't available, the first thing I'd do was to use a macro to define it.

My point was that realloc() is wrong, not free(). The sorely missing allocation primitive is expand(), and we've been paying for it through the nose for decades. Andrei

I see. Having 3 functions: malloc(), expand() and free() is indeed a very good design. regarding the original topic, I still generally prefer bearophile's API, random may be omitted and other small changes are also possible but it looks better to me than using something like: uniform!("[]")(rng, a, b); maybe some aliases are in order?
Feb 14 2009
parent reply Derek Parnell <derek psych.ward> writes:
On Sat, 14 Feb 2009 19:07:52 +0200, Yigal Chripun wrote:


 I see.
 Having 3 functions: malloc(), expand() and free() is indeed a very good 
 design.

Unless you need to contract the allocated memory ;-) I believe that realloc() does expansions and contractions, no? -- Derek Parnell Melbourne, Australia skype: derek.j.parnell
Feb 14 2009
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Derek Parnell wrote:
 On Sat, 14 Feb 2009 19:07:52 +0200, Yigal Chripun wrote:
 
 
 I see.
 Having 3 functions: malloc(), expand() and free() is indeed a very good 
 design.

Unless you need to contract the allocated memory ;-) I believe that realloc() does expansions and contractions, no?

expand() is a poorly-chosen name. People who are criticizing C's allocation API really mean reallocInPlace. Then realloc() can be implemented in terms of malloc, free, and reallocInPlace as a convenience function. Andrei
Feb 14 2009
prev sibling next sibling parent grauzone <none example.net> writes:
Derek Parnell wrote:
 On Sat, 14 Feb 2009 19:07:52 +0200, Yigal Chripun wrote:
 
 
 I see.
 Having 3 functions: malloc(), expand() and free() is indeed a very good 
 design.

Unless you need to contract the allocated memory ;-) I believe that realloc() does expansions and contractions, no?

expand() could take a negative value! I'm not sure though, what you guys are up to. The really bad thing about realloc() is error handling. How do you check for errors? You need some additional lines of code just to find out if (re-)allocation has failed.
Feb 14 2009
prev sibling parent Yigal Chripun <yigal100 gmail.com> writes:
Derek Parnell wrote:
 On Sat, 14 Feb 2009 19:07:52 +0200, Yigal Chripun wrote:


 I see.
 Having 3 functions: malloc(), expand() and free() is indeed a very good
 design.

Unless you need to contract the allocated memory ;-) I believe that realloc() does expansions and contractions, no?

use free() to free the part the is not needed anymore? T* arr = malloc(sizeof(T) * n); ... tail = arr[x]; free(tail);
Feb 14 2009
prev sibling parent reply Lionello Lunesu <lio lunesu.remove.com> writes:
 My point was that realloc() is wrong, not free(). The sorely missing 
 allocation primitive is expand(), and we've been paying for it through 
 the nose for decades.
 
 Andrei

What would expand's signature be? It needs the current pointer, it needs the new size. Seems that it's nothing else than a renamed realloc. The problem is just that realloc allows null-pointer (=malloc) and 0-size (=free), thereby making malloc/free obsolete. L.
Feb 15 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Lionello Lunesu wrote:
 My point was that realloc() is wrong, not free(). The sorely missing 
 allocation primitive is expand(), and we've been paying for it through 
 the nose for decades.

 Andrei

What would expand's signature be? It needs the current pointer, it needs the new size. Seems that it's nothing else than a renamed realloc.

realloc moves memory, expand wouldn't. That delivered a fatal blow to C++ making its allocation primitives inferior to C's.
 The problem is just that realloc allows null-pointer (=malloc) and 
 0-size (=free), thereby making malloc/free obsolete.

That's the least of its problems. Andrei
Feb 15 2009
prev sibling next sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Leonardo suggested that some functions in std.random should not require
 their user to be bothered with creating a random object, i.e.:
 auto r = Random(unpredictableSeed);
 auto n = uniform(r, 0, 100);
 Instead the library should simply support:
 auto n = uniform(0, 100);
 and do this by presumably using a global RNG under the hood. So I wanted
 to ask all y'all:
 1. Are you cool with making the rng the last parameter and give it a
 default value?

Yes.
 2. The global random generator will be allocated per thread. Are you
 cool with this too?

Yes. I've found D2's random number generation scheme somewhat annoying up until now because it requires the caller to explicitly do something with the underlying RNG object between random number generations, in a sense lessening the encapsulation of the higher level random number generation functions such as uniform() and stuff that's written by the user. Even if you make a Uniform object or a Normal object or something, the implementation detail that pseudorandom number generation requires state to be maintained in between calls still leaks out into the interface. IMHO, the whole concept of RNGs requiring state to be maintained between calls is an implementation detail and therefore should ideally be encapsulated. For small, single-threaded programs, most of the time I just declare a global RNG and seed it in a module initializer anyhow, for convenience. This thread-local RNG is a perfect way to provide a simple default behavior that "just works" and is well-encapsulated, while still allowing more flexibility and control in the relatively rare cases when it's really needed.
 3. How should the global rng be initialized? To always generate the same
 sequence, or not?

It should be properly seeded, though I don't know how this would work for RNGs for anything but the main thread. This keeps with the principle that the default should "just work" reasonably well without the user having to think about implementation details. Maybe core.thread.Thread needs to have the ability to execute a callback function on the creation of a new thread. This is probably not a bad feature anyhow.
 4. While we're at it, should uniform(a, b) generate by default something
 in [a, b] or [a, b)? Someone once explained to me that generating [a, b]
 for floating point numbers is the source of all evils and that Hitler,
 Stalin and Kim Il Sung (should he still be alive) must be using that
 kind of generator. Conversely, generating [a, b) is guaranteed to bring
 in the long term everlasting peace to Earth. My problem however is that
 in the integer realm I always want to generate [a, b]. Furthermore, I
 wouldn't be happy if the shape of the interval was different for
 integers and floating point numbers. How to break this conundrum? Don't
 forget that we're only worrying about defaults, explicit generation is
 always possible with self-explanatory code:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

I like uniform the way it is, [a, b), even for integers, because it jives well with array slice notation, and a major use for uniform integers is randomly selecting an element from an array. This is not that important, though, because no matter what you do, it's going to seem stupid and unintuitive to someone.
Feb 13 2009
prev sibling next sibling parent Derek Parnell <derek psych.ward> writes:
On Fri, 13 Feb 2009 17:10:29 -0800, Andrei Alexandrescu wrote:


 1. Are you cool with making the rng the last parameter and give it a 
 default value?

Yes
 2. The global random generator will be allocated per thread. Are you 
 cool with this too?

Yes
 3. How should the global rng be initialized? To always generate the same 
 sequence, or not?

New sequence each time. A predetermined sequence should be under the application's control and be explicit, IMHO.
 4. While we're at it, should uniform(a, b) generate by default something 
 in [a, b] or [a, b)? 

I prefer [a,b] as the default, and I'm thinking integers too. -- Derek Parnell Melbourne, Australia skype: derek.j.parnell
Feb 13 2009
prev sibling next sibling parent reply Steve Schveighoffer <schveiguy yahoo.com> writes:
On Fri, 13 Feb 2009 17:10:29 -0800, Andrei Alexandrescu wrote:


 2. The global random generator will be allocated per thread. Are you
 cool with this too?

Good idea. Random number generators should not require locking in the most common case.
 
 3. How should the global rng be initialized? To always generate the same
 sequence, or not?

Not. And even between threads not. So some combination of time and thread id for seed is probably in order. If I want to "just use" random numbers in a simple program, I don't want to see the same values every time. If I want to repeat the values every time, I'll take the time to seed it so that happens.
 4. While we're at it, should uniform(a, b) generate by default something
 in [a, b] or [a, b)?

[a,b) Every other piece of range-like code is zero based, and excludes the upper bound. This should be no different. It makes the code simpler too. -Steve
Feb 13 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steve Schveighoffer wrote:
 4. While we're at it, should uniform(a, b) generate by default something
 in [a, b] or [a, b)?

[a,b) Every other piece of range-like code is zero based, and excludes the upper bound. This should be no different. It makes the code simpler too.

I tried both versions, and it turns out my code is almost never simpler with open integral intervals. Most of the time I need something like: auto x = uniform(rng, -100, 100); auto y = uniform(rng, 0, 100); and I need to remember to actually ask for 101 instead of 100. True, when you want a random index in an array, open intervals are more convenient. One purity-based argument is that in a random number you may actually ask for the total range: auto big = uniform(rng, uint.max / 2, uint.max); If the interval is open I can't generate uint.max. Anyway, I checked the C++ API and it turns out they use closed intervals for integers and open intervals for reals. I know there's been a lot of expert scrutiny there, so I suppose I better copy their design. Andrei
Feb 13 2009
next sibling parent reply grauzone <none example.net> writes:
 Anyway, I checked the C++ API and it turns out they use closed intervals 
 for integers and open intervals for reals. I know there's been a lot of 
 expert scrutiny there, so I suppose I better copy their design.

Because C++ is the pinnacle of language and library design. SCNR
Feb 13 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
grauzone wrote:
 Anyway, I checked the C++ API and it turns out they use closed 
 intervals for integers and open intervals for reals. I know there's 
 been a lot of expert scrutiny there, so I suppose I better copy their 
 design.

Because C++ is the pinnacle of language and library design.

No, just because their random numbers library is very good. Andrei
Feb 13 2009
prev sibling next sibling parent KennyTM~ <kennytm gmail.com> writes:
Andrei Alexandrescu wrote:
 Steve Schveighoffer wrote:
 4. While we're at it, should uniform(a, b) generate by default something
 in [a, b] or [a, b)?

[a,b) Every other piece of range-like code is zero based, and excludes the upper bound. This should be no different. It makes the code simpler too.

I tried both versions, and it turns out my code is almost never simpler with open integral intervals. Most of the time I need something like: auto x = uniform(rng, -100, 100); auto y = uniform(rng, 0, 100); and I need to remember to actually ask for 101 instead of 100. True, when you want a random index in an array, open intervals are more convenient. One purity-based argument is that in a random number you may actually ask for the total range: auto big = uniform(rng, uint.max / 2, uint.max); If the interval is open I can't generate uint.max.

Use uniform!("[]")(rng, uint.max/2, uint.max) for it then.
 Anyway, I checked the C++ API and it turns out they use closed intervals 
 for integers and open intervals for reals. I know there's been a lot of 
 expert scrutiny there, so I suppose I better copy their design.
 
 
 Andrei

FYI, GSL (GNU Scientific Library) uses [0,1) for their uniform() and [0,n) for uniform_int(). -- Kenny.
Feb 13 2009
prev sibling next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 auto big = uniform(rng, uint.max / 2, uint.max);
 
 If the interval is open I can't generate uint.max.

auto big = uniform(rng, uint.max / 2 - 1, uint.max) + 1;
Feb 14 2009
parent Walter Bright <newshound1 digitalmars.com> writes:
Denis Koroskin wrote:
 On Sat, 14 Feb 2009 14:04:36 +0300, Walter Bright 
 <newshound1 digitalmars.com> wrote:
 
 Andrei Alexandrescu wrote:
 auto big = uniform(rng, uint.max / 2, uint.max);
  If the interval is open I can't generate uint.max.

auto big = uniform(rng, uint.max / 2 - 1, uint.max) + 1;

Yeah, but now about uniform(uint.min, uint.max)?

auto big = cast(uint)uniform(rng, cast(ulong)uint.min, cast(ulong)uint.max + 1);
Feb 14 2009
prev sibling next sibling parent reply Jason House <jason.james.house gmail.com> writes:
Andrei Alexandrescu Wrote:

 Steve Schveighoffer wrote:
 4. While we're at it, should uniform(a, b) generate by default something
 in [a, b] or [a, b)?

[a,b) Every other piece of range-like code is zero based, and excludes the upper bound. This should be no different. It makes the code simpler too.

I tried both versions, and it turns out my code is almost never simpler with open integral intervals. Most of the time I need something like: auto x = uniform(rng, -100, 100); auto y = uniform(rng, 0, 100); and I need to remember to actually ask for 101 instead of 100. True, when you want a random index in an array, open intervals are more convenient. One purity-based argument is that in a random number you may actually ask for the total range: auto big = uniform(rng, uint.max / 2, uint.max); If the interval is open I can't generate uint.max. Anyway, I checked the C++ API and it turns out they use closed intervals for integers and open intervals for reals. I know there's been a lot of expert scrutiny there, so I suppose I better copy their design. Andrei

D uses a..b notation as part of the language to mean "[)". Because of that, it makes more sense to make that the default in D vs. other languages that don't have such constructs.
Feb 14 2009
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Jason House wrote:
 D uses a..b notation as part of the language to mean "[)". Because of
 that, it makes more sense to make that the default in D vs. other
 languages that don't have such constructs.

I think that's exactly right. Consistency is very important.
Feb 14 2009
parent Don <nospam nospam.com> writes:
Walter Bright wrote:
 Jason House wrote:
 D uses a..b notation as part of the language to mean "[)". Because of
 that, it makes more sense to make that the default in D vs. other
 languages that don't have such constructs.

I think that's exactly right. Consistency is very important.

Agreed. Note that arr[random(0..arr.length)] is likely to be an _extremely_ common usage pattern. That argues very strongly for "[)".
Feb 15 2009
prev sibling parent reply Benji Smith <dlanguage benjismith.net> writes:
Andrei Alexandrescu wrote:
 Steve Schveighoffer wrote:
 4. While we're at it, should uniform(a, b) generate by default something
 in [a, b] or [a, b)?

[a,b) Every other piece of range-like code is zero based, and excludes the upper bound. This should be no different. It makes the code simpler too.

I tried both versions, and it turns out my code is almost never simpler with open integral intervals. Most of the time I need something like: auto x = uniform(rng, -100, 100); auto y = uniform(rng, 0, 100); and I need to remember to actually ask for 101 instead of 100. True, when you want a random index in an array, open intervals are more convenient. One purity-based argument is that in a random number you may actually ask for the total range: auto big = uniform(rng, uint.max / 2, uint.max); If the interval is open I can't generate uint.max. Anyway, I checked the C++ API and it turns out they use closed intervals for integers and open intervals for reals. I know there's been a lot of expert scrutiny there, so I suppose I better copy their design. Andrei

Maybe a NumericInterval struct would be a good idea. It could be specialized to any numeric type (float, double, int, etc), it would know its own boundaries, and it'd keep track of whether those boundaries were open or closed. The random functions would take an RND and an interval (with some reasonable default intervals for common tasks like choosing elements from arrays and random-access ranges). I have a Java implementation around here somewhere that I could port to D if anyone is interested. --benji
Feb 15 2009
parent reply Benji Smith <dlanguage benjismith.net> writes:
Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be 
 specialized to any numeric type (float, double, int, etc), it would know 
 its own boundaries, and it'd keep track of whether those boundaries were 
 open or closed.
 
 The random functions would take an RND and an interval (with some 
 reasonable default intervals for common tasks like choosing elements 
 from arrays and random-access ranges).
 
 I have a Java implementation around here somewhere that I could port to 
 D if anyone is interested.
 
 --benji

Incidentally, the NumericInterval has lots of other interesting applications. For example auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE); bool safelyPolysemious = i.contains(someByteValue); auto array = new double[123]; auto i = NumericInterval.indexInterval(array); bool indexIsLegal = i.contains(someIndex); Using a numeric interval for generating random numbers would be, in my opinion, totally ideal. double d = uniform(NumericInterval.DOUBLE); // Any double value auto i = NumericInterval.parse("[ 123, 456.789 )"); double random = uniform!(double)(i, rng); --benji
Feb 15 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Benji Smith wrote:
 Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be 
 specialized to any numeric type (float, double, int, etc), it would 
 know its own boundaries, and it'd keep track of whether those 
 boundaries were open or closed.

 The random functions would take an RND and an interval (with some 
 reasonable default intervals for common tasks like choosing elements 
 from arrays and random-access ranges).

 I have a Java implementation around here somewhere that I could port 
 to D if anyone is interested.

 --benji

Incidentally, the NumericInterval has lots of other interesting applications. For example auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE); bool safelyPolysemious = i.contains(someByteValue); auto array = new double[123]; auto i = NumericInterval.indexInterval(array); bool indexIsLegal = i.contains(someIndex); Using a numeric interval for generating random numbers would be, in my opinion, totally ideal. double d = uniform(NumericInterval.DOUBLE); // Any double value

I've never been in a situation in my life where I thought, hey, a random double is exactly what I'd need right now. It's a ginormous interval! Andrei
Feb 15 2009
parent reply Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
 Benji Smith wrote:
 Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be 
 specialized to any numeric type (float, double, int, etc), it would 
 know its own boundaries, and it'd keep track of whether those 
 boundaries were open or closed.

 The random functions would take an RND and an interval (with some 
 reasonable default intervals for common tasks like choosing elements 
 from arrays and random-access ranges).

 I have a Java implementation around here somewhere that I could port 
 to D if anyone is interested.

 --benji

Incidentally, the NumericInterval has lots of other interesting applications. For example auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE); bool safelyPolysemious = i.contains(someByteValue); auto array = new double[123]; auto i = NumericInterval.indexInterval(array); bool indexIsLegal = i.contains(someIndex); Using a numeric interval for generating random numbers would be, in my opinion, totally ideal. double d = uniform(NumericInterval.DOUBLE); // Any double value

I've never been in a situation in my life where I thought, hey, a random double is exactly what I'd need right now. It's a ginormous interval! Andrei

It's worse than that. Since the range of double includes infinity, a uniform distribution must return +-infinity with probability 1. It's nonsense.
Feb 16 2009
next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Don" <nospam nospam.com> wrote in message 
news:gnb7mt$2osv$1 digitalmars.com...
 Andrei Alexandrescu wrote:
 Benji Smith wrote:
 Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be 
 specialized to any numeric type (float, double, int, etc), it would 
 know its own boundaries, and it'd keep track of whether those 
 boundaries were open or closed.

 The random functions would take an RND and an interval (with some 
 reasonable default intervals for common tasks like choosing elements 
 from arrays and random-access ranges).

 I have a Java implementation around here somewhere that I could port to 
 D if anyone is interested.

 --benji

Incidentally, the NumericInterval has lots of other interesting applications. For example auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE); bool safelyPolysemious = i.contains(someByteValue); auto array = new double[123]; auto i = NumericInterval.indexInterval(array); bool indexIsLegal = i.contains(someIndex); Using a numeric interval for generating random numbers would be, in my opinion, totally ideal. double d = uniform(NumericInterval.DOUBLE); // Any double value

I've never been in a situation in my life where I thought, hey, a random double is exactly what I'd need right now. It's a ginormous interval! Andrei

It's worse than that. Since the range of double includes infinity, a uniform distribution must return +-infinity with probability 1. It's nonsense.

I would think that, with the possible exception of mathematicians, most of the time anyone would want a random floating-point number they would consider +/-infinity a special case. Most non-math-oriented code isn't really designed to handle infinity anyway. Besides, technically, double also includes NaN ;)
Feb 16 2009
parent Don <nospam nospam.com> writes:
Nick Sabalausky wrote:
 "Don" <nospam nospam.com> wrote in message 
 news:gnb7mt$2osv$1 digitalmars.com...
 Andrei Alexandrescu wrote:
 Benji Smith wrote:
 Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be 
 specialized to any numeric type (float, double, int, etc), it would 
 know its own boundaries, and it'd keep track of whether those 
 boundaries were open or closed.

 The random functions would take an RND and an interval (with some 
 reasonable default intervals for common tasks like choosing elements 
 from arrays and random-access ranges).

 I have a Java implementation around here somewhere that I could port to 
 D if anyone is interested.

 --benji

applications. For example auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE); bool safelyPolysemious = i.contains(someByteValue); auto array = new double[123]; auto i = NumericInterval.indexInterval(array); bool indexIsLegal = i.contains(someIndex); Using a numeric interval for generating random numbers would be, in my opinion, totally ideal. double d = uniform(NumericInterval.DOUBLE); // Any double value

double is exactly what I'd need right now. It's a ginormous interval! Andrei

uniform distribution must return +-infinity with probability 1. It's nonsense.

I would think that, with the possible exception of mathematicians, most of the time anyone would want a random floating-point number they would consider +/-infinity a special case. Most non-math-oriented code isn't really designed to handle infinity anyway. Besides, technically, double also includes NaN ;)

You're asking for a random *number* and NaN is not a number :). If you have a double x, which could be double.max, -double.max, 0.0, -double.min*double.epsilon, or +double.min*double.epsilon, there's hardly any operations you can do on it which won't generate an infinity or NaN. But cutting out infinity doesn't help much -- anything from the range of double will be really close to double.max. If the code is correct, it's still going to have to deal with infinity. double d = uniform(NumericInterval.DOUBLE); d *=2; // BUG: There's a 50% chance that d is now +- infinity.
Feb 17 2009
prev sibling parent reply Benji Smith <dlanguage benjismith.net> writes:
Don wrote:
 Andrei Alexandrescu wrote:
 Benji Smith wrote:
 Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be 
 specialized to any numeric type (float, double, int, etc), it would 
 know its own boundaries, and it'd keep track of whether those 
 boundaries were open or closed.

 The random functions would take an RND and an interval (with some 
 reasonable default intervals for common tasks like choosing elements 
 from arrays and random-access ranges).

 I have a Java implementation around here somewhere that I could port 
 to D if anyone is interested.

 --benji

Incidentally, the NumericInterval has lots of other interesting applications. For example auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE); bool safelyPolysemious = i.contains(someByteValue); auto array = new double[123]; auto i = NumericInterval.indexInterval(array); bool indexIsLegal = i.contains(someIndex); Using a numeric interval for generating random numbers would be, in my opinion, totally ideal. double d = uniform(NumericInterval.DOUBLE); // Any double value

I've never been in a situation in my life where I thought, hey, a random double is exactly what I'd need right now. It's a ginormous interval! Andrei

It's worse than that. Since the range of double includes infinity, a uniform distribution must return +-infinity with probability 1. It's nonsense.

Way to miss the forest for the trees. You guys telling me can't see any legitimate use for a NumericInterval type? And that it wouldn't be convenient to use for random number generation within that interval? So the full double range was a dumb example. But that wasn't really the point, was it? --benji
Feb 18 2009
parent reply Don <nospam nospam.com> writes:
Benji Smith wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 Benji Smith wrote:
 Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be 
 specialized to any numeric type (float, double, int, etc), it would 
 know its own boundaries, and it'd keep track of whether those 
 boundaries were open or closed.

 The random functions would take an RND and an interval (with some 
 reasonable default intervals for common tasks like choosing 
 elements from arrays and random-access ranges).

 I have a Java implementation around here somewhere that I could 
 port to D if anyone is interested.

 --benji

Incidentally, the NumericInterval has lots of other interesting applications. For example auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE); bool safelyPolysemious = i.contains(someByteValue); auto array = new double[123]; auto i = NumericInterval.indexInterval(array); bool indexIsLegal = i.contains(someIndex); Using a numeric interval for generating random numbers would be, in my opinion, totally ideal. double d = uniform(NumericInterval.DOUBLE); // Any double value

I've never been in a situation in my life where I thought, hey, a random double is exactly what I'd need right now. It's a ginormous interval! Andrei

It's worse than that. Since the range of double includes infinity, a uniform distribution must return +-infinity with probability 1. It's nonsense.

Way to miss the forest for the trees. You guys telling me can't see any legitimate use for a NumericInterval type? And that it wouldn't be convenient to use for random number generation within that interval? So the full double range was a dumb example. But that wasn't really the point, was it? --benji

On the contrary, I've been giving NumericInterval considerable thought. One key issue is whether a NumericInterval(x1, x2) must satisfy x1 <= x2 (the _strict_ definition), or whether it is also permissible to have x2<=x1 (ie, you can specify the two endpoints in reverse order; the interval is then between min(x1,x2) and max(x1, x2)). This is an issue because I've noticed is that when I want to use it, I often have related pairs of values. eg. Suppose u is the interval {x1, x2}. There's a related v = {f(x1), f(x2)}. Unfortunately although x1<=x2, f(x1) may not be <= f(x2). So v is not an interval in the _strict_ sense. But it satisfies the _relaxed_ definition.
Feb 19 2009
parent reply Benji Smith <dlanguage benjismith.net> writes:
Don wrote:
 Benji Smith wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 Benji Smith wrote:
 Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be 
 specialized to any numeric type (float, double, int, etc), it 
 would know its own boundaries, and it'd keep track of whether 
 those boundaries were open or closed.

 The random functions would take an RND and an interval (with some 
 reasonable default intervals for common tasks like choosing 
 elements from arrays and random-access ranges).

 I have a Java implementation around here somewhere that I could 
 port to D if anyone is interested.

 --benji

Incidentally, the NumericInterval has lots of other interesting applications. For example auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE); bool safelyPolysemious = i.contains(someByteValue); auto array = new double[123]; auto i = NumericInterval.indexInterval(array); bool indexIsLegal = i.contains(someIndex); Using a numeric interval for generating random numbers would be, in my opinion, totally ideal. double d = uniform(NumericInterval.DOUBLE); // Any double value

I've never been in a situation in my life where I thought, hey, a random double is exactly what I'd need right now. It's a ginormous interval! Andrei

It's worse than that. Since the range of double includes infinity, a uniform distribution must return +-infinity with probability 1. It's nonsense.

Way to miss the forest for the trees. You guys telling me can't see any legitimate use for a NumericInterval type? And that it wouldn't be convenient to use for random number generation within that interval? So the full double range was a dumb example. But that wasn't really the point, was it? --benji

On the contrary, I've been giving NumericInterval considerable thought. One key issue is whether a NumericInterval(x1, x2) must satisfy x1 <= x2 (the _strict_ definition), or whether it is also permissible to have x2<=x1 (ie, you can specify the two endpoints in reverse order; the interval is then between min(x1,x2) and max(x1, x2)). This is an issue because I've noticed is that when I want to use it, I often have related pairs of values. eg. Suppose u is the interval {x1, x2}. There's a related v = {f(x1), f(x2)}. Unfortunately although x1<=x2, f(x1) may not be <= f(x2). So v is not an interval in the _strict_ sense. But it satisfies the _relaxed_ definition.

I don't see any idealogical reason for requiring x2 >= x1. But the public API of the interval will probably have functions or properties returning the "lowerBound" and "upperBound". And the implementations of the "containsValue", "intersect", and "overlap" functions are all more straightforward to write if you know in advance which value is which, potentially switching them in the constructor. Of course, if you switch the values, do you also switch the boundary open/closed boundaries? What about this case: auto i = Interval!("[)")(1000, -1000); Which side of the range is open, and which is closed? Does the "[)" argument apply to the natural order of the range (closed on its lower bound) or does it apply to the order of the arguments in the function (closed on its leftmost argument)? As long as the behavior is well documented, I think it'd be fine either way. But I also think it'd be reasonable to throw an exception if the arguments are in the wrong order. --benji
Feb 19 2009
parent Don <nospam nospam.com> writes:
Benji Smith wrote:
 Don wrote:
 Benji Smith wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 Benji Smith wrote:
 Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be 
 specialized to any numeric type (float, double, int, etc), it 
 would know its own boundaries, and it'd keep track of whether 
 those boundaries were open or closed.

 The random functions would take an RND and an interval (with some 
 reasonable default intervals for common tasks like choosing 
 elements from arrays and random-access ranges).

 I have a Java implementation around here somewhere that I could 
 port to D if anyone is interested.

 --benji

Incidentally, the NumericInterval has lots of other interesting applications. For example auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE); bool safelyPolysemious = i.contains(someByteValue); auto array = new double[123]; auto i = NumericInterval.indexInterval(array); bool indexIsLegal = i.contains(someIndex); Using a numeric interval for generating random numbers would be, in my opinion, totally ideal. double d = uniform(NumericInterval.DOUBLE); // Any double value

I've never been in a situation in my life where I thought, hey, a random double is exactly what I'd need right now. It's a ginormous interval! Andrei

It's worse than that. Since the range of double includes infinity, a uniform distribution must return +-infinity with probability 1. It's nonsense.

Way to miss the forest for the trees. You guys telling me can't see any legitimate use for a NumericInterval type? And that it wouldn't be convenient to use for random number generation within that interval? So the full double range was a dumb example. But that wasn't really the point, was it? --benji

On the contrary, I've been giving NumericInterval considerable thought. One key issue is whether a NumericInterval(x1, x2) must satisfy x1 <= x2 (the _strict_ definition), or whether it is also permissible to have x2<=x1 (ie, you can specify the two endpoints in reverse order; the interval is then between min(x1,x2) and max(x1, x2)). This is an issue because I've noticed is that when I want to use it, I often have related pairs of values. eg. Suppose u is the interval {x1, x2}. There's a related v = {f(x1), f(x2)}. Unfortunately although x1<=x2, f(x1) may not be <= f(x2). So v is not an interval in the _strict_ sense. But it satisfies the _relaxed_ definition.

I don't see any idealogical reason for requiring x2 >= x1. But the public API of the interval will probably have functions or properties returning the "lowerBound" and "upperBound". And the implementations of the "containsValue", "intersect", and "overlap" functions are all more straightforward to write if you know in advance which value is which, potentially switching them in the constructor. Of course, if you switch the values, do you also switch the boundary open/closed boundaries? What about this case: auto i = Interval!("[)")(1000, -1000); Which side of the range is open, and which is closed? Does the "[)" argument apply to the natural order of the range (closed on its lower bound) or does it apply to the order of the arguments in the function (closed on its leftmost argument)?

That's a really good question, and probably the killer argument against relaxed ranges. A NumericInterval would always be stored internally as "[]", so the "[)" is only required at construction/assignment (it's not part of the type). So instead, it seems as though a different type is required (a pair of values), which provides conversion to a strict range. Not as simple as I hoped.
 As long as the behavior is well documented, I think it'd be fine either 
 way. But I also think it'd be reasonable to throw an exception if the 
 arguments are in the wrong order.
 
 --benji

Feb 19 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Sat, Feb 14, 2009 at 3:04 PM, grauzone <none example.net> wrote:
 Anyway, I checked the C++ API and it turns out they use closed intervals
 for integers and open intervals for reals. I know there's been a lot of
 expert scrutiny there, so I suppose I better copy their design.

Because C++ is the pinnacle of language and library design.

(Score: -1, flamebait) --bb
Feb 13 2009
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Sat, 14 Feb 2009 14:04:36 +0300, Walter Bright <newshound1 digitalmars.com>
wrote:

 Andrei Alexandrescu wrote:
 auto big = uniform(rng, uint.max / 2, uint.max);
  If the interval is open I can't generate uint.max.

auto big = uniform(rng, uint.max / 2 - 1, uint.max) + 1;

Yeah, but now about uniform(uint.min, uint.max)?
Feb 14 2009
prev sibling next sibling parent Steve Schveighoffer <schveiguy yahoo.com> writes:
On Sat, 14 Feb 2009 14:11:10 +0300, Denis Koroskin wrote:

 On Sat, 14 Feb 2009 14:04:36 +0300, Walter Bright
 <newshound1 digitalmars.com> wrote:
 
 Andrei Alexandrescu wrote:
 auto big = uniform(rng, uint.max / 2, uint.max);
  If the interval is open I can't generate uint.max.

auto big = uniform(rng, uint.max / 2 - 1, uint.max) + 1;

Yeah, but now about uniform(uint.min, uint.max)?

I think we're working under the assumption that the uniform function is based on some function that returns a random int or uint, no? So just call the base function. -Steve
Feb 14 2009
prev sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Thu, 19 Feb 2009 11:35:04 +0300, Don <nospam nospam.com> wrote:

 Benji Smith wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 Benji Smith wrote:
 Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be  
 specialized to any numeric type (float, double, int, etc), it would  
 know its own boundaries, and it'd keep track of whether those  
 boundaries were open or closed.

 The random functions would take an RND and an interval (with some  
 reasonable default intervals for common tasks like choosing  
 elements from arrays and random-access ranges).

 I have a Java implementation around here somewhere that I could  
 port to D if anyone is interested.

 --benji

Incidentally, the NumericInterval has lots of other interesting applications. For example auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE); bool safelyPolysemious = i.contains(someByteValue); auto array = new double[123]; auto i = NumericInterval.indexInterval(array); bool indexIsLegal = i.contains(someIndex); Using a numeric interval for generating random numbers would be, in my opinion, totally ideal. double d = uniform(NumericInterval.DOUBLE); // Any double value

I've never been in a situation in my life where I thought, hey, a random double is exactly what I'd need right now. It's a ginormous interval! Andrei

It's worse than that. Since the range of double includes infinity, a uniform distribution must return +-infinity with probability 1. It's nonsense.

You guys telling me can't see any legitimate use for a NumericInterval type? And that it wouldn't be convenient to use for random number generation within that interval? So the full double range was a dumb example. But that wasn't really the point, was it? --benji

On the contrary, I've been giving NumericInterval considerable thought. One key issue is whether a NumericInterval(x1, x2) must satisfy x1 <= x2 (the _strict_ definition), or whether it is also permissible to have x2<=x1 (ie, you can specify the two endpoints in reverse order; the interval is then between min(x1,x2) and max(x1, x2)). This is an issue because I've noticed is that when I want to use it, I often have related pairs of values. eg. Suppose u is the interval {x1, x2}. There's a related v = {f(x1), f(x2)}. Unfortunately although x1<=x2, f(x1) may not be <= f(x2). So v is not an interval in the _strict_ sense. But it satisfies the _relaxed_ definition.

Same as NumericInterval's bounds ([],(),[),(]), strictness might be an interval's template parameter.
Feb 19 2009
prev sibling next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 2. The global random generator will be allocated per thread. Are you 
 cool with this too?

That could have speed problems with a tight loop accessing thread local storage. If it's a shared global with no locking, the thread randomness will increase its randomness <g>.
 3. How should the global rng be initialized? To always generate the same 
 sequence, or not?

Always different, because that's what I want if I just want a freakin' random number!
Feb 14 2009
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Walter Bright (newshound1 digitalmars.com)'s article
 Andrei Alexandrescu wrote:
 2. The global random generator will be allocated per thread. Are you
 cool with this too?

storage.

I don't think this matters, since for apps that really need it, a RNG can be explicitly passed in. This is purely a convenience feature for when you want random number generation to "just work" even if it's not the most efficient thing in the world. Also, checking thread_needLock() is dirt cheap, in my experience faster than accessing TLS. You could also have a global, non-thread local RNG that is used if a check to thread_needLock() returns false.
 If it's a shared global with no locking, the thread randomness
 will increase its randomness <g>.

Yay, D's getting hardware random number generators!!!
Feb 14 2009
parent reply grauzone <none example.net> writes:
dsimcha wrote:
 == Quote from Walter Bright (newshound1 digitalmars.com)'s article
 Andrei Alexandrescu wrote:
 2. The global random generator will be allocated per thread. Are you
 cool with this too?

storage.

I don't think this matters, since for apps that really need it, a RNG can be explicitly passed in. This is purely a convenience feature for when you want random number generation to "just work" even if it's not the most efficient thing in the world. Also, checking thread_needLock() is dirt cheap, in my experience faster than accessing TLS. You could also have a global, non-thread local RNG that is used if a check to thread_needLock() returns false.

I don't understand this. If implemented right (D2.0 or gcc thread variables), TLS can be as fast as a normal global variable. You only need an additional check (a simple if()) to lazily initialize the RNG. Regarding "just work": it seems D's philosophy, to prefer to do the simple and "right" thing instead of raw efficiency, seems to be abandoned. Like std.algorithm uses weird string mixins, and the predicate is a template argument instead of a normal one (that's IMHO).
 If it's a shared global with no locking, the thread randomness
 will increase its randomness <g>.


Huh, non-blocking random generators?
 Yay, D's getting hardware random number generators!!!

Feb 14 2009
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
grauzone wrote:
 I don't understand this. If implemented right (D2.0 or gcc thread 
 variables), TLS can be as fast as a normal global variable. You only 
 need an additional check (a simple if()) to lazily initialize the RNG.

That's the notorious double-checked-locking bug, an incredibly seductive pattern that is wrong and very illustrative about how hard it is to do multithreaded programming correctly. http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
Feb 14 2009
parent grauzone <none example.net> writes:
Walter Bright wrote:
 grauzone wrote:
 I don't understand this. If implemented right (D2.0 or gcc thread 
 variables), TLS can be as fast as a normal global variable. You only 
 need an additional check (a simple if()) to lazily initialize the RNG.

That's the notorious double-checked-locking bug, an incredibly seductive pattern that is wrong and very illustrative about how hard it is to do multithreaded programming correctly. http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html

No, I meant something like this: thread RandomGenerator rng; if (!rng) { rng = new RandomGenerator(); }
Feb 14 2009
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
grauzone wrote:
 dsimcha wrote:
 == Quote from Walter Bright (newshound1 digitalmars.com)'s article
 Andrei Alexandrescu wrote:
 2. The global random generator will be allocated per thread. Are you
 cool with this too?

storage.

I don't think this matters, since for apps that really need it, a RNG can be explicitly passed in. This is purely a convenience feature for when you want random number generation to "just work" even if it's not the most efficient thing in the world. Also, checking thread_needLock() is dirt cheap, in my experience faster than accessing TLS. You could also have a global, non-thread local RNG that is used if a check to thread_needLock() returns false.

I don't understand this. If implemented right (D2.0 or gcc thread variables), TLS can be as fast as a normal global variable. You only need an additional check (a simple if()) to lazily initialize the RNG. Regarding "just work": it seems D's philosophy, to prefer to do the simple and "right" thing instead of raw efficiency, seems to be abandoned. Like std.algorithm uses weird string mixins, and the predicate is a template argument instead of a normal one (that's IMHO).

Well I can only welcome the return of persistent flamebaiting to this group. D's philosophy is to prefer to do the right thing, even when that doesn't align with what you believe that is. I don't understand why the pattern of some of your posts is to throw some random (sic) semantic grenade in hope to pick a fight. Speaking of "normal" passing of the predicate, I'd been curious this morning to see how a hand-written max fares compared to reduce!(max), with reduce!("b > a ? b : a"), and with an indirect reduce using a delegate passed, ahem, "normally". This is because a nearest-neighbor algorithm I work on needs to compute max over an array in its inner loop. I wanted to see whether I need to roll my own max implementation versus using std.algorithm. For 20000 evaluations over a 100000-integers array, reduce!(min), reduce!("b > a ? b : a"), and handMax all finished within 3.4 to 3.5 seconds on my machine. The "normal" version took 11.6 seconds to complete. Trials with various sizes reveal a similar pattern throughout: pass-by-delegate is way behind the others, which work about as fast. This means that, had std.algorithm taken the "normal" route, it would have been plenty useless for any serious needs. Not taking the "normal" route means that its abstractions rival hand-tuned code so I can use them without worrying they will come unglued when shown a real challenge. So, to paraphrase the ugly guy in "The Good, The Bad, and The Ugly": If you want to shoot, shoot, don't troll. Andrei
Feb 14 2009
next sibling parent reply grauzone <none example.net> writes:
Andrei Alexandrescu wrote:
 grauzone wrote:
 dsimcha wrote:
 == Quote from Walter Bright (newshound1 digitalmars.com)'s article
 Andrei Alexandrescu wrote:
 2. The global random generator will be allocated per thread. Are you
 cool with this too?

storage.

I don't think this matters, since for apps that really need it, a RNG can be explicitly passed in. This is purely a convenience feature for when you want random number generation to "just work" even if it's not the most efficient thing in the world. Also, checking thread_needLock() is dirt cheap, in my experience faster than accessing TLS. You could also have a global, non-thread local RNG that is used if a check to thread_needLock() returns false.

I don't understand this. If implemented right (D2.0 or gcc thread variables), TLS can be as fast as a normal global variable. You only need an additional check (a simple if()) to lazily initialize the RNG. Regarding "just work": it seems D's philosophy, to prefer to do the simple and "right" thing instead of raw efficiency, seems to be abandoned. Like std.algorithm uses weird string mixins, and the predicate is a template argument instead of a normal one (that's IMHO).

Well I can only welcome the return of persistent flamebaiting to this group. D's philosophy is to prefer to do the right thing, even when that

Why is it "flamebaiting" when I say my opinion? Others are doing it too. Here, I even used "IMHO" to mark that this is my opinion, and not necessarily some kind of objective truth. And if you think I'm trolling, then you obviously shouldn't take the bait. Else you'd add quite some fuel for an actual flamewar. I really don't understand this.
 doesn't align with what you believe that is. I don't understand why the 
 pattern of some of your posts is to throw some random (sic) semantic 
 grenade in hope to pick a fight.

Well, sorry. I didn't mean to offend anyone. Actually, I'm a bit shocked by your reaction.
 Speaking of "normal" passing of the predicate, I'd been curious this 
 morning to see how a hand-written max fares compared to reduce!(max), 
 with reduce!("b > a ? b : a"), and with an indirect reduce using a 
 delegate passed, ahem, "normally". This is because a nearest-neighbor 
 algorithm I work on needs to compute max over an array in its inner 
 loop. I wanted to see whether I need to roll my own max implementation 
 versus using std.algorithm.
 
 For 20000 evaluations over a 100000-integers array, reduce!(min), 
 reduce!("b > a ? b : a"), and handMax all finished within 3.4 to 3.5 
 seconds on my machine. The "normal" version took 11.6 seconds to 
 complete. Trials with various sizes reveal a similar pattern throughout: 
 pass-by-delegate is way behind the others, which work about as fast.
 
 This means that, had std.algorithm taken the "normal" route, it would 
 have been plenty useless for any serious needs. Not taking the "normal" 
 route means that its abstractions rival hand-tuned code so I can use 
 them without worrying they will come unglued when shown a real challenge.

That's what I said. You preferred this design because it's faster. One can agree or disagree if this approach introduces "syntax artifacts" or if it violates KISS or whatever. Everything has its pros and cons. I just said I disagree. Also, dsimcha said:
 This is purely a convenience feature for when you want
 random number generation to "just work" even if it's not the most
 efficient thing
 in the world.

I agree to him. If you need efficiency, you can write your own code or use a highly specialized library fit for the job. Most users probably wouldn't rely on the standard library if they're optimizing their code. As long as the user _can_ write his own code, the standard library should prefer to be simple and easy to use. It also prevents bitrot, bloat, and incomprehensible library code.
 So, to paraphrase the ugly guy in "The Good, The Bad, and The Ugly": If 
 you want to shoot, shoot, don't troll.
 
 
 Andrei

Feb 14 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
grauzone wrote:
 Andrei Alexandrescu wrote:
 grauzone wrote:
 dsimcha wrote:
 == Quote from Walter Bright (newshound1 digitalmars.com)'s article
 Andrei Alexandrescu wrote:
 2. The global random generator will be allocated per thread. Are you
 cool with this too?

local storage.

I don't think this matters, since for apps that really need it, a RNG can be explicitly passed in. This is purely a convenience feature for when you want random number generation to "just work" even if it's not the most efficient thing in the world. Also, checking thread_needLock() is dirt cheap, in my experience faster than accessing TLS. You could also have a global, non-thread local RNG that is used if a check to thread_needLock() returns false.

I don't understand this. If implemented right (D2.0 or gcc thread variables), TLS can be as fast as a normal global variable. You only need an additional check (a simple if()) to lazily initialize the RNG. Regarding "just work": it seems D's philosophy, to prefer to do the simple and "right" thing instead of raw efficiency, seems to be abandoned. Like std.algorithm uses weird string mixins, and the predicate is a template argument instead of a normal one (that's IMHO).

Well I can only welcome the return of persistent flamebaiting to this group. D's philosophy is to prefer to do the right thing, even when that

Why is it "flamebaiting" when I say my opinion? Others are doing it too. Here, I even used "IMHO" to mark that this is my opinion, and not necessarily some kind of objective truth. And if you think I'm trolling, then you obviously shouldn't take the bait. Else you'd add quite some fuel for an actual flamewar. I really don't understand this.

Yah, you're right. I shouldn't have answered, and my answer ended up going firmly further out of line. I am sorry about that. If you care for my reasons, please read on. Now I do understand in the end you're doing nothing but stating an opinion, which is of course what we do most of the time. All I'm saying is that it's more beneficial to discuss an opinion of the form: "I prefer froobenglobs to std.brooblenglabs. I think std.brooblenglabs are unnecessarily complicated with too few redeeming qualities. In contrast, froobenglobs have the simplicity given by using the universal trilegate feature, and their inherent loss of quantic traction is negligible. So I suggest froobenglobs to be considered as a supplement or replacement for a future std.brooblenglabs." This is an opinion that can be discussed because it brings supporting arguments that can in turn be debated. Many good things transpire that way. Should we reach the conclusion that our views on the matter differ in some irreducible manner, we'll still be gained from the discussion. Contrast this with not an opinion, but an umbrella statement that implicitly assumes a number of opinions are true facts: "Everything about C++, language and libraries, reeks of bad design" (this is how I read your ironic statement to the contrary). or "It looks like D's philosophy to do the simple and right thing instead of raw efficiency has been abandoned." This (and the subsequent example) after I've spent time on several occasions on explaining you how and why exactly pass-by-delegate is not defensible versus pass-by-alias, which includes pass-by-delegate in its jeans pocket. The "IMHO" license you use does little to help such statements. I agree the odd rules of pass-by-string may seem unsavory to some, but please agree with me that there is little in the way of a salient answer that could be given to the umbrella statement above.
 This means that, had std.algorithm taken the "normal" route, it would 
 have been plenty useless for any serious needs. Not taking the 
 "normal" route means that its abstractions rival hand-tuned code so I 
 can use them without worrying they will come unglued when shown a real 
 challenge.

That's what I said. You preferred this design because it's faster. One can agree or disagree if this approach introduces "syntax artifacts" or if it violates KISS or whatever. Everything has its pros and cons. I just said I disagree.

I preferred that design not only because it's faster, but also because it _includes_ *all* /benefits/ of the "slower" rule of pass-by-delegate, i.e. it can be slower and more flexible if that's what's needed. The only "cost" is a minor change in usage syntax. Essentially I think D has hit mother lode with the local instantiation feature because it has all the benefits and next to no problem. I sincerely believe it will be instrumental to the success of D. Given its advantages, it would be an extraordinary uphill battle to argue that pass-by-alias with local instantiation is a lesser design. What pushed my button is not that you undertook said battle, but that you made statements that imply you've already won it.
 Also, dsimcha said:
  > This is purely a convenience feature for when you want
  > random number generation to "just work" even if it's not the most
  > efficient thing
  > in the world.
 
 I agree to him. If you need efficiency, you can write your own code or 
 use a highly specialized library fit for the job. Most users probably 
 wouldn't rely on the standard library if they're optimizing their code. 

Well, you see, here's where we can "agree to disagree" (as much as I dislike the PCness of the phrase). IMHO, if most users in search for efficiency need to roll their own counterparts of the abstractions provided by Phobos, I've majorly wasted my time. Costly abstractions are a dime a dozen. I have nothing but contempt for them. Every dog language has them in spades. The real challenge is not to write sort in three lines. It's to write it in three lines and have it beat the 2000 lines sort. (Alas, we're not there yet.)
 As long as the user _can_ write his own code, the standard library
 should prefer to be simple and easy to use. It also prevents bitrot,
 bloat, and incomprehensible library code.

This approach seems to be pushing bitrot, bloat, and incomprehensibility into the user code, while keeping the library a collection of useless toys. Hardly an approach I'd go for. I am sorry. I just disagree. Andrei
Feb 14 2009
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Also, dsimcha said:
  > This is purely a convenience feature for when you want
  > random number generation to "just work" even if it's not the most
  > efficient thing
  > in the world.

 I agree to him. If you need efficiency, you can write your own code or
 use a highly specialized library fit for the job. Most users probably
 wouldn't rely on the standard library if they're optimizing their code.

dislike the PCness of the phrase). IMHO, if most users in search for efficiency need to roll their own counterparts of the abstractions provided by Phobos, I've majorly wasted my time. Costly abstractions are a dime a dozen. I have nothing but contempt for them. Every dog language has them in spades. The real challenge is not to write sort in three lines. It's to write it in three lines and have it beat the 2000 lines sort. (Alas, we're not there yet.) > As long as the user _can_ write his own code, the standard library > should prefer to be simple and easy to use. It also prevents bitrot, > bloat, and incomprehensible library code. This approach seems to be pushing bitrot, bloat, and incomprehensibility into the user code, while keeping the library a collection of useless toys. Hardly an approach I'd go for. I am sorry. I just disagree. Andrei

Just to set the record straight, I actually agree with Andrei here. Elegant but costly abstractions are simply not D's niche, and are relatively easy to come by. If you want them, use a good dynamic interpreted language instead. The quote above was taken out of context. I believe that a library should provide efficient and flexible abstractions that eliminate the need for people to roll their own and that the goal of Phobos2 in this regard is a good one. I probably wouldn't use much of what was in std.algorithm, at least in the performance critical parts of my code, if it were pass-by-delegate instead of pass-by-alias. My only point was that, when complexity in the API (or language) is unavoidable if efficiency and flexibility are to be achieved, I think it's ok to slap a few not-so-efficient convenience features on top of it as long as the core remains efficient, flexible and reasonably usable.
Feb 14 2009
parent grauzone <none example.net> writes:
dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Also, dsimcha said:
  > This is purely a convenience feature for when you want
  > random number generation to "just work" even if it's not the most
  > efficient thing
  > in the world.

 I agree to him. If you need efficiency, you can write your own code or
 use a highly specialized library fit for the job. Most users probably
 wouldn't rely on the standard library if they're optimizing their code.

dislike the PCness of the phrase). IMHO, if most users in search for efficiency need to roll their own counterparts of the abstractions provided by Phobos, I've majorly wasted my time. Costly abstractions are a dime a dozen. I have nothing but contempt for them. Every dog language has them in spades. The real challenge is not to write sort in three lines. It's to write it in three lines and have it beat the 2000 lines sort. (Alas, we're not there yet.) > As long as the user _can_ write his own code, the standard library > should prefer to be simple and easy to use. It also prevents bitrot, > bloat, and incomprehensible library code. This approach seems to be pushing bitrot, bloat, and incomprehensibility into the user code, while keeping the library a collection of useless toys. Hardly an approach I'd go for. I am sorry. I just disagree. Andrei

Just to set the record straight, I actually agree with Andrei here. Elegant but costly abstractions are simply not D's niche, and are relatively easy to come by. If you want them, use a good dynamic interpreted language instead. The quote

Well, I'm not necessarily talking about "elegant but costly abstractions". Rather, D is full of design decisions, where convenience and simplicity was preferred to maximal achievable performance: - relies on GC, rather bad support for fully manual MM (also consider the new closures in D2.0) - expensive checks in non-release mode (array bounds checking, object invariants) - methods are virtual by default - foreach used to call the enclosed code as a delegate - lazy keyword - associative arrays, also .sort - many runtime core and std-lib functions are designed that way Maybe there's even more. D always has been a nice blend of performance centered low level languages like C/C++, and higher level ones liek dynamic or functional languages.
 above was taken out of context.  I believe that a library should provide
efficient
 and flexible abstractions that eliminate the need for people to roll their own
and
 that the goal of Phobos2 in this regard is a good one.  I probably wouldn't use
 much of what was in std.algorithm, at least in the performance critical parts
of
 my code, if it were pass-by-delegate instead of pass-by-alias.  My only point
was
 that, when complexity in the API (or language) is unavoidable if efficiency and
 flexibility are to be achieved, I think it's ok to slap a few not-so-efficient
 convenience features on top of it as long as the core remains efficient,
flexible
 and reasonably usable.

Feb 14 2009
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Denis Koroskin wrote:
 On Sat, 14 Feb 2009 21:38:19 +0300, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 For 20000 evaluations over a 100000-integers array, reduce!(min), 
 reduce!("b > a ? b : a"), and handMax all finished within 3.4 to 3.5 
 seconds on my machine. The "normal" version took 11.6 seconds to 
 complete. Trials with various sizes reveal a similar pattern 
 throughout: pass-by-delegate is way behind the others, which work 
 about as fast.

 This means that, had std.algorithm taken the "normal" route, it would 
 have been plenty useless for any serious needs. Not taking the 
 "normal" route means that its abstractions rival hand-tuned code so I 
 can use them without worrying they will come unglued when shown a real 
 challenge.

 So, to paraphrase the ugly guy in "The Good, The Bad, and The Ugly": 
 If you want to shoot, shoot, don't troll.


 Andrei

Did you try passing a comparator (C++ way to do things)? It should be pretty fast, to.

Is this what you meant? struct StructMax { static uint opCall(uint a, uint b) { return b > a ? b : a; } } uint reduceStruct(uint[] x, StructMax s) { invariant n = x.length; uint m = x[0]; for (uint i = 1; i < n; ++i) { m = s(m, x[i]); } return m; } I did, and as expected the running time is similar to the other fast approaches. Removing "static" does not affect the timings sensibly. Andrei
Feb 14 2009
prev sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Sat, 14 Feb 2009 21:38:19 +0300, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 grauzone wrote:
 dsimcha wrote:
 == Quote from Walter Bright (newshound1 digitalmars.com)'s article
 Andrei Alexandrescu wrote:
 2. The global random generator will be allocated per thread. Are you
 cool with this too?

local storage.

I don't think this matters, since for apps that really need it, a RNG can be explicitly passed in. This is purely a convenience feature for when you want random number generation to "just work" even if it's not the most efficient thing in the world. Also, checking thread_needLock() is dirt cheap, in my experience faster than accessing TLS. You could also have a global, non-thread local RNG that is used if a check to thread_needLock() returns false.

variables), TLS can be as fast as a normal global variable. You only need an additional check (a simple if()) to lazily initialize the RNG. Regarding "just work": it seems D's philosophy, to prefer to do the simple and "right" thing instead of raw efficiency, seems to be abandoned. Like std.algorithm uses weird string mixins, and the predicate is a template argument instead of a normal one (that's IMHO).

Well I can only welcome the return of persistent flamebaiting to this group. D's philosophy is to prefer to do the right thing, even when that doesn't align with what you believe that is. I don't understand why the pattern of some of your posts is to throw some random (sic) semantic grenade in hope to pick a fight. Speaking of "normal" passing of the predicate, I'd been curious this morning to see how a hand-written max fares compared to reduce!(max), with reduce!("b > a ? b : a"), and with an indirect reduce using a delegate passed, ahem, "normally". This is because a nearest-neighbor algorithm I work on needs to compute max over an array in its inner loop. I wanted to see whether I need to roll my own max implementation versus using std.algorithm. For 20000 evaluations over a 100000-integers array, reduce!(min), reduce!("b > a ? b : a"), and handMax all finished within 3.4 to 3.5 seconds on my machine. The "normal" version took 11.6 seconds to complete. Trials with various sizes reveal a similar pattern throughout: pass-by-delegate is way behind the others, which work about as fast. This means that, had std.algorithm taken the "normal" route, it would have been plenty useless for any serious needs. Not taking the "normal" route means that its abstractions rival hand-tuned code so I can use them without worrying they will come unglued when shown a real challenge. So, to paraphrase the ugly guy in "The Good, The Bad, and The Ugly": If you want to shoot, shoot, don't troll. Andrei

Did you try passing a comparator (C++ way to do things)? It should be pretty fast, to.
Feb 14 2009
prev sibling next sibling parent Sergey Gromov <snake.scaly gmail.com> writes:
Fri, 13 Feb 2009 17:10:29 -0800, Andrei Alexandrescu wrote:

 4. While we're at it, should uniform(a, b) generate by default something 
 in [a, b] or [a, b)? Someone once explained to me that generating [a, b] 
 for floating point numbers is the source of all evils and that Hitler, 
 Stalin and Kim Il Sung (should he still be alive) must be using that 
 kind of generator. Conversely, generating [a, b) is guaranteed to bring 
 in the long term everlasting peace to Earth. My problem however is that 
 in the integer realm I always want to generate [a, b]. Furthermore, I 
 wouldn't be happy if the shape of the interval was different for 
 integers and floating point numbers. How to break this conundrum?

I'm used to float generators [0,1] and integer generators [0,max). OTOH, if uniform(0.0,256.0) is uniform [0.0,256.0) then cast(int)(uniform(0.0,256.0)) is also uniform [0,256) which is good.
Feb 14 2009
prev sibling next sibling parent reply Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
 4. While we're at it, should uniform(a, b) generate by default something 
 in [a, b] or [a, b)? Someone once explained to me that generating [a, b] 
 for floating point numbers is the source of all evils and that Hitler, 
 Stalin and Kim Il Sung (should he still be alive) must be using that 
 kind of generator. Conversely, generating [a, b) is guaranteed to bring 
 in the long term everlasting peace to Earth. My problem however is that 
 in the integer realm I always want to generate [a, b]. Furthermore, I 
 wouldn't be happy if the shape of the interval was different for 
 integers and floating point numbers. How to break this conundrum? Don't 
 forget that we're only worrying about defaults, explicit generation is 
 always possible with self-explanatory code:
 
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

This is a general issue applying to any numeric range. I've been giving the issue of numeric ranges some thought, and I have begun an implementation of a general abstraction. Any open range can be converted into a closed range, but the converse does not apply. So any implementation will be using "[]" internally. -range("[)", a, b) == range("(]", -b, -a) range("[)", a, b) == range("[]", a, predecessor(b)) range("()", a, b) == range("[]", successor(a), predecessor(b)) There's a couple of difficult situations involving floating-point numbers. * "[)" has the uncomfortable property that (-2,-1, rng) includes -2 but not -1, whereas (1, 2, rng) includes 1 but not 2. * any floating point range which includes 0 is difficult, because there are so many numbers which are almost zero. The probability of getting a zero for an 80-bit real is so small that you probably wouldn't encounter it in your lifetime. I think this weakens arguments based on analogy with the integer case. However, it is much easier to make an unbiased rng for [1,2) than for [1,2] or (1,2) (since the number of members in the range is even).
Feb 14 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Don wrote:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

This is a general issue applying to any numeric range. I've been giving the issue of numeric ranges some thought, and I have begun an implementation of a general abstraction. Any open range can be converted into a closed range, but the converse does not apply. So any implementation will be using "[]" internally. -range("[)", a, b) == range("(]", -b, -a) range("[)", a, b) == range("[]", a, predecessor(b)) range("()", a, b) == range("[]", successor(a), predecessor(b)) There's a couple of difficult situations involving floating-point numbers. * "[)" has the uncomfortable property that (-2,-1, rng) includes -2 but not -1, whereas (1, 2, rng) includes 1 but not 2. * any floating point range which includes 0 is difficult, because there are so many numbers which are almost zero. The probability of getting a zero for an 80-bit real is so small that you probably wouldn't encounter it in your lifetime. I think this weakens arguments based on analogy with the integer case. However, it is much easier to make an unbiased rng for [1,2) than for [1,2] or (1,2) (since the number of members in the range is even).

So what would you recommend? [a, b) for floats and [a, b] for ints, or [a, b) for everything? Andrei
Feb 14 2009
parent reply Don Clugston <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

This is a general issue applying to any numeric range. I've been giving the issue of numeric ranges some thought, and I have begun an implementation of a general abstraction. Any open range can be converted into a closed range, but the converse does not apply. So any implementation will be using "[]" internally. -range("[)", a, b) == range("(]", -b, -a) range("[)", a, b) == range("[]", a, predecessor(b)) range("()", a, b) == range("[]", successor(a), predecessor(b)) There's a couple of difficult situations involving floating-point numbers. * "[)" has the uncomfortable property that (-2,-1, rng) includes -2 but not -1, whereas (1, 2, rng) includes 1 but not 2. * any floating point range which includes 0 is difficult, because there are so many numbers which are almost zero. The probability of getting a zero for an 80-bit real is so small that you probably wouldn't encounter it in your lifetime. I think this weakens arguments based on analogy with the integer case. However, it is much easier to make an unbiased rng for [1,2) than for [1,2] or (1,2) (since the number of members in the range is even).

So what would you recommend? [a, b) for floats and [a, b] for ints, or [a, b) for everything? Andrei

I'm leaning towards [a,b) for everything (consistency with arrays), but I'd want to know what the reasoning of the boost/c++0x guys was.
Feb 15 2009
next sibling parent reply Don <nospam nospam.com> writes:
Bill Baxter wrote:
 On Mon, Feb 16, 2009 at 2:33 AM, Don Clugston <nospam nospam.com> wrote:
 Andrei Alexandrescu wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

the issue of numeric ranges some thought, and I have begun an implementation of a general abstraction. Any open range can be converted into a closed range, but the converse does not apply. So any implementation will be using "[]" internally. -range("[)", a, b) == range("(]", -b, -a) range("[)", a, b) == range("[]", a, predecessor(b)) range("()", a, b) == range("[]", successor(a), predecessor(b)) There's a couple of difficult situations involving floating-point numbers. * "[)" has the uncomfortable property that (-2,-1, rng) includes -2 but not -1, whereas (1, 2, rng) includes 1 but not 2. * any floating point range which includes 0 is difficult, because there are so many numbers which are almost zero. The probability of getting a zero for an 80-bit real is so small that you probably wouldn't encounter it in your lifetime. I think this weakens arguments based on analogy with the integer case. However, it is much easier to make an unbiased rng for [1,2) than for [1,2] or (1,2) (since the number of members in the range is even).

b) for everything? Andrei

want to know what the reasoning of the boost/c++0x guys was.

How do you create a random uint that can take on any of uint's values with [a,b)? That's the main reason I can think of to go with [a,b] for integral types. With floats it's never useful to use the entire value range. --bb

I think that is the primary argument. BUT: * You _can_ still use "[]" in that case. * I also think it'd be worth having a "create a random n-byte number", as well, which would be the main use for a full uint random number. * If you're creating a number in the range 0..uint.max+1, you're going to have to be careful in lots of places. You can't get that number from an array length, for example. * I think that hard-core scientific/mathematical users are the main users of the more esoteric cases, and can be expected to get it right (and have no problem with the "[]","()","(]"... notation. I think that what's important for the default is that be correct and obvious for as many cases as possible. The strength of "[)" is that if we can say "ALL ranges in D are [) unless otherwise stated by the user", it's hard to ever justify breaking that convention.
Feb 15 2009
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Don:
 * I also think it'd be worth having a "create a random n-byte number", 
 as well, which would be the main use for a full uint random number.

I mostly agree. For example a basic randUint is useful (I have it in dlibs and I use it once in a while). I think a randomUint and randUlong may be enough (for the other situations you can use something like randInt with a limited range). I am not sure, probably others have to state what they would like the random functions to be. Bye, bearophile
Feb 15 2009
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bill Baxter wrote:
 I guess I'd find it more useful to have a function that gave a random
 value of a given type, like random_value!(short).  That way you get
 the signed/unsigned specification automatically.  But both could be
 useful.

You can, just pass short values into uniform() or specify type arguments for it. Actually uniform() used to ask compulsively for its parameter type, but I changed that because it was too verbose for no good reason.
 * If you're creating a number in the range 0..uint.max+1, you're going to
 have to be careful in lots of places. You can't get that number from an
 array length, for example.
 * I think that hard-core scientific/mathematical users are the main users of
 the more esoteric cases, and can be expected to get it right (and have no
 problem with the "[]","()","(]"... notation. I think that what's important
 for the default is that be correct and obvious for as many cases as
 possible.

 The strength of "[)" is that if we can say "ALL ranges in D are [) unless
 otherwise stated by the user", it's hard to ever justify breaking that
 convention.

That sounds reasonable to me. As long as there's an easy way to also create a random number that covers an entire range of a given type (perhaps limited to integral types).

It's a good litmus test. With the API suggested by Don, in order to generate a random byte you'd have to say: auto b = uniform!("[]")(rng, byte.min, byte.max); Is this acceptable? Andrei
Feb 15 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bill Baxter wrote:
 On Mon, Feb 16, 2009 at 3:51 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 auto b = uniform!("[]")(rng, byte.min, byte.max);

 Is this acceptable?

If that's what it took I'd probably try this instead: ubyte b = uniform(rng, 0,256); And then add an explicit cast to ubyte if the compiler didn't like that. So no, I don't really think that's good enough. Others may disagree about how important a use case this is, though.

This will indeed work: ubyte b = cast(ubyte) uniform(rng, 0,256); I just showed the solution involving no cast. So are you ok with that? Andrei
Feb 15 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bill Baxter wrote:
 On Mon, Feb 16, 2009 at 4:07 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Bill Baxter wrote:
 On Mon, Feb 16, 2009 at 3:51 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 auto b = uniform!("[]")(rng, byte.min, byte.max);

 Is this acceptable?

ubyte b = uniform(rng, 0,256); And then add an explicit cast to ubyte if the compiler didn't like that. So no, I don't really think that's good enough. Others may disagree about how important a use case this is, though.

ubyte b = cast(ubyte) uniform(rng, 0,256); I just showed the solution involving no cast. So are you ok with that?

No, but if you are then I'll just make my own 1-liner library functions, so it's not a big deal.

What signatures would the functions have? Andrei
Feb 15 2009
prev sibling next sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Bill Baxter (wbaxter gmail.com)'s article
 On Mon, Feb 16, 2009 at 2:33 AM, Don Clugston <nospam nospam.com> wrote:
 Andrei Alexandrescu wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

This is a general issue applying to any numeric range. I've been giving the issue of numeric ranges some thought, and I have begun an implementation of a general abstraction. Any open range can be converted into a closed range, but the converse does not apply. So any implementation will be using "[]" internally. -range("[)", a, b) == range("(]", -b, -a) range("[)", a, b) == range("[]", a, predecessor(b)) range("()", a, b) == range("[]", successor(a), predecessor(b)) There's a couple of difficult situations involving floating-point numbers. * "[)" has the uncomfortable property that (-2,-1, rng) includes -2 but not -1, whereas (1, 2, rng) includes 1 but not 2. * any floating point range which includes 0 is difficult, because there are so many numbers which are almost zero. The probability of getting a zero for an 80-bit real is so small that you probably wouldn't encounter it in your lifetime. I think this weakens arguments based on analogy with the integer case. However, it is much easier to make an unbiased rng for [1,2) than for [1,2] or (1,2) (since the number of members in the range is even).

So what would you recommend? [a, b) for floats and [a, b] for ints, or [a, b) for everything? Andrei

I'm leaning towards [a,b) for everything (consistency with arrays), but I'd want to know what the reasoning of the boost/c++0x guys was.

with [a,b)? That's the main reason I can think of to go with [a,b] for integral types. With floats it's never useful to use the entire value range. --bb

Keep in mind that we're talking only about defaults here. At least in the current implementation, "[)" vs. "[]" can be specified by template parameters. Wanting the entire range of an integer type is an edge case, so it's not important that it be well-supported by the defaults, as long as it can be done without resorting to serious kludges.
Feb 15 2009
prev sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Mon, Feb 16, 2009 at 4:36 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Bill Baxter wrote:
 On Mon, Feb 16, 2009 at 4:07 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Bill Baxter wrote:
 On Mon, Feb 16, 2009 at 3:51 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 auto b = uniform!("[]")(rng, byte.min, byte.max);

 Is this acceptable?

If that's what it took I'd probably try this instead: ubyte b = uniform(rng, 0,256); And then add an explicit cast to ubyte if the compiler didn't like that. So no, I don't really think that's good enough. Others may disagree about how important a use case this is, though.

This will indeed work: ubyte b = cast(ubyte) uniform(rng, 0,256); I just showed the solution involving no cast. So are you ok with that?

No, but if you are then I'll just make my own 1-liner library functions, so it's not a big deal.

What signatures would the functions have?

What I'd like best is if all the things you listed would work, but in addition T x = uniform!(T)(); would generate a uniform value over T's entire range, using a default rng. And rng can be specified if you care what gets used: T x = uniform!(T)(rng); --bb
Feb 15 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Mon, Feb 16, 2009 at 2:33 AM, Don Clugston <nospam nospam.com> wrote:
 Andrei Alexandrescu wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

This is a general issue applying to any numeric range. I've been giving the issue of numeric ranges some thought, and I have begun an implementation of a general abstraction. Any open range can be converted into a closed range, but the converse does not apply. So any implementation will be using "[]" internally. -range("[)", a, b) == range("(]", -b, -a) range("[)", a, b) == range("[]", a, predecessor(b)) range("()", a, b) == range("[]", successor(a), predecessor(b)) There's a couple of difficult situations involving floating-point numbers. * "[)" has the uncomfortable property that (-2,-1, rng) includes -2 but not -1, whereas (1, 2, rng) includes 1 but not 2. * any floating point range which includes 0 is difficult, because there are so many numbers which are almost zero. The probability of getting a zero for an 80-bit real is so small that you probably wouldn't encounter it in your lifetime. I think this weakens arguments based on analogy with the integer case. However, it is much easier to make an unbiased rng for [1,2) than for [1,2] or (1,2) (since the number of members in the range is even).

So what would you recommend? [a, b) for floats and [a, b] for ints, or [a, b) for everything? Andrei

I'm leaning towards [a,b) for everything (consistency with arrays), but I'd want to know what the reasoning of the boost/c++0x guys was.

How do you create a random uint that can take on any of uint's values with [a,b)? That's the main reason I can think of to go with [a,b] for integral types. With floats it's never useful to use the entire value range. --bb
Feb 15 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Mon, Feb 16, 2009 at 3:33 AM, Don <nospam nospam.com> wrote:
 Bill Baxter wrote:
 On Mon, Feb 16, 2009 at 2:33 AM, Don Clugston <nospam nospam.com> wrote:
 Andrei Alexandrescu wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

This is a general issue applying to any numeric range. I've been giving the issue of numeric ranges some thought, and I have begun an implementation of a general abstraction. Any open range can be converted into a closed range, but the converse does not apply. So any implementation will be using "[]" internally. -range("[)", a, b) == range("(]", -b, -a) range("[)", a, b) == range("[]", a, predecessor(b)) range("()", a, b) == range("[]", successor(a), predecessor(b)) There's a couple of difficult situations involving floating-point numbers. * "[)" has the uncomfortable property that (-2,-1, rng) includes -2 but not -1, whereas (1, 2, rng) includes 1 but not 2. * any floating point range which includes 0 is difficult, because there are so many numbers which are almost zero. The probability of getting a zero for an 80-bit real is so small that you probably wouldn't encounter it in your lifetime. I think this weakens arguments based on analogy with the integer case. However, it is much easier to make an unbiased rng for [1,2) than for [1,2] or (1,2) (since the number of members in the range is even).

So what would you recommend? [a, b) for floats and [a, b] for ints, or [a, b) for everything? Andrei

I'm leaning towards [a,b) for everything (consistency with arrays), but I'd want to know what the reasoning of the boost/c++0x guys was.

How do you create a random uint that can take on any of uint's values with [a,b)? That's the main reason I can think of to go with [a,b] for integral types. With floats it's never useful to use the entire value range. --bb

I think that is the primary argument. BUT: * You _can_ still use "[]" in that case. * I also think it'd be worth having a "create a random n-byte number", as well, which would be the main use for a full uint random number.

I guess I'd find it more useful to have a function that gave a random value of a given type, like random_value!(short). That way you get the signed/unsigned specification automatically. But both could be useful.
 * If you're creating a number in the range 0..uint.max+1, you're going to
 have to be careful in lots of places. You can't get that number from an
 array length, for example.
 * I think that hard-core scientific/mathematical users are the main users of
 the more esoteric cases, and can be expected to get it right (and have no
 problem with the "[]","()","(]"... notation. I think that what's important
 for the default is that be correct and obvious for as many cases as
 possible.

 The strength of "[)" is that if we can say "ALL ranges in D are [) unless
 otherwise stated by the user", it's hard to ever justify breaking that
 convention.

That sounds reasonable to me. As long as there's an easy way to also create a random number that covers an entire range of a given type (perhaps limited to integral types). --bb
Feb 15 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Mon, Feb 16, 2009 at 3:41 AM, dsimcha <dsimcha yahoo.com> wrote:
 == Quote from Bill Baxter (wbaxter gmail.com)'s article
 On Mon, Feb 16, 2009 at 2:33 AM, Don Clugston <nospam nospam.com> wrote:
 Andrei Alexandrescu wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

This is a general issue applying to any numeric range. I've been giving the issue of numeric ranges some thought, and I have begun an implementation of a general abstraction. Any open range can be converted into a closed range, but the converse does not apply. So any implementation will be using "[]" internally. -range("[)", a, b) == range("(]", -b, -a) range("[)", a, b) == range("[]", a, predecessor(b)) range("()", a, b) == range("[]", successor(a), predecessor(b)) There's a couple of difficult situations involving floating-point numbers. * "[)" has the uncomfortable property that (-2,-1, rng) includes -2 but not -1, whereas (1, 2, rng) includes 1 but not 2. * any floating point range which includes 0 is difficult, because there are so many numbers which are almost zero. The probability of getting a zero for an 80-bit real is so small that you probably wouldn't encounter it in your lifetime. I think this weakens arguments based on analogy with the integer case. However, it is much easier to make an unbiased rng for [1,2) than for [1,2] or (1,2) (since the number of members in the range is even).

So what would you recommend? [a, b) for floats and [a, b] for ints, or [a, b) for everything? Andrei

I'm leaning towards [a,b) for everything (consistency with arrays), but I'd want to know what the reasoning of the boost/c++0x guys was.

with [a,b)? That's the main reason I can think of to go with [a,b] for integral types. With floats it's never useful to use the entire value range. --bb

Keep in mind that we're talking only about defaults here. At least in the current implementation, "[)" vs. "[]" can be specified by template parameters. Wanting the entire range of an integer type is an edge case, so it's not important that it be well-supported by the defaults, as long as it can be done without resorting to serious kludges.

I don't agree that it's such an edge case. I often want a random ubyte that covers the entire range. --bb
Feb 15 2009
prev sibling next sibling parent reply "Joel C. Salomon" <joelcsalomon gmail.com> writes:
Don wrote:
 There's a couple of difficult situations involving floating-point numbers.

 * any floating point range which includes 0 is difficult, because there
 are so many numbers which are almost zero. The probability of getting a
 zero for an 80-bit real is so small that you probably wouldn't encounter
 it in your lifetime. I think this weakens arguments based on analogy
 with the integer case.

In that vein: for the floating-point case, do you want to emulate a “true” uniform random distribution in the range [x, y), or make every expressible IEEE 754 value in that range come out with equal likelihood? One is the logarithm of the other, in some sense. —Joel Salomon
Feb 15 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Joel C. Salomon wrote:
 Don wrote:
 There's a couple of difficult situations involving floating-point numbers.

 * any floating point range which includes 0 is difficult, because there
 are so many numbers which are almost zero. The probability of getting a
 zero for an 80-bit real is so small that you probably wouldn't encounter
 it in your lifetime. I think this weakens arguments based on analogy
 with the integer case.

In that vein: for the floating-point case, do you want to emulate a true uniform random distribution in the range [x, y), or make every expressible IEEE 754 value in that range come out with equal likelihood? One is the logarithm of the other, in some sense.

The former! Otherwise the distribution will be biased toward numbers closer to 0 (which are denser than larger numbers). That being said, for the range [0, 1) the exponent is constant so it's nice to generate all possible mantissa values, something that simple division by uint.max does not achieve. Andrei
Feb 15 2009
parent Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
 Joel C. Salomon wrote:
 Don wrote:
 There's a couple of difficult situations involving floating-point 
 numbers.

 * any floating point range which includes 0 is difficult, because there
 are so many numbers which are almost zero. The probability of getting a
 zero for an 80-bit real is so small that you probably wouldn't encounter
 it in your lifetime. I think this weakens arguments based on analogy
 with the integer case.

In that vein: for the floating-point case, do you want to emulate a true uniform random distribution in the range [x, y), or make every expressible IEEE 754 value in that range come out with equal likelihood? One is the logarithm of the other, in some sense.

The former! Otherwise the distribution will be biased toward numbers closer to 0 (which are denser than larger numbers). That being said, for the range [0, 1) the exponent is constant so it's nice to generate all possible mantissa values, something that simple division by uint.max does not achieve.

You mean [1, 2). The code for complete uniform distribution is in tango.math.random, thanks to Fawzi. Looking closely at his code, I find he's using (a,b) for the floating-point case, not the [a,b) which I had expected.
Feb 15 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Mon, Feb 16, 2009 at 3:51 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Bill Baxter wrote:
 I guess I'd find it more useful to have a function that gave a random
 value of a given type, like random_value!(short).  That way you get
 the signed/unsigned specification automatically.  But both could be
 useful.

You can, just pass short values into uniform() or specify type arguments for it. Actually uniform() used to ask compulsively for its parameter type, but I changed that because it was too verbose for no good reason.
 * If you're creating a number in the range 0..uint.max+1, you're going to
 have to be careful in lots of places. You can't get that number from an
 array length, for example.
 * I think that hard-core scientific/mathematical users are the main users
 of
 the more esoteric cases, and can be expected to get it right (and have no
 problem with the "[]","()","(]"... notation. I think that what's
 important
 for the default is that be correct and obvious for as many cases as
 possible.

 The strength of "[)" is that if we can say "ALL ranges in D are [) unless
 otherwise stated by the user", it's hard to ever justify breaking that
 convention.

That sounds reasonable to me. As long as there's an easy way to also create a random number that covers an entire range of a given type (perhaps limited to integral types).

It's a good litmus test. With the API suggested by Don, in order to generate a random byte you'd have to say: auto b = uniform!("[]")(rng, byte.min, byte.max); Is this acceptable?

If that's what it took I'd probably try this instead: ubyte b = uniform(rng, 0,256); And then add an explicit cast to ubyte if the compiler didn't like that. So no, I don't really think that's good enough. Others may disagree about how important a use case this is, though. --bb
Feb 15 2009
prev sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Mon, Feb 16, 2009 at 4:07 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Bill Baxter wrote:
 On Mon, Feb 16, 2009 at 3:51 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 auto b = uniform!("[]")(rng, byte.min, byte.max);

 Is this acceptable?

If that's what it took I'd probably try this instead: ubyte b = uniform(rng, 0,256); And then add an explicit cast to ubyte if the compiler didn't like that. So no, I don't really think that's good enough. Others may disagree about how important a use case this is, though.

This will indeed work: ubyte b = cast(ubyte) uniform(rng, 0,256); I just showed the solution involving no cast. So are you ok with that?

No, but if you are then I'll just make my own 1-liner library functions, so it's not a big deal. --bb
Feb 15 2009
prev sibling next sibling parent reply Christopher Wright <dhasenan gmail.com> writes:
Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

I always have to look up whether "[]" is exclusive or inclusive. It's succinct, but you only have to write code once, and you have to read it often. Granted, people who are more math-intensive than I can read that without thinking about it, but the preponderance of programmers have no more math skills than I. So, might I suggest using an enum rather than a string mixin?
Feb 14 2009
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Christopher Wright:
 I always have to look up whether "[]" is exclusive or inclusive.

A cute thing for you: where I live in Mathematics we use a different syntax: [1, 10] <= closed interval ]1, 10[ <= open interval [1, 10[ <= interval open on the right ]1, 10] <= interval open on the left
 So, might I suggest using an enum rather than a string mixin?

Argh! :-) Bye, bearophile
Feb 14 2009
prev sibling parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
Christopher Wright escribi:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

I always have to look up whether "[]" is exclusive or inclusive. It's succinct, but you only have to write code once, and you have to read it often.

You can remember "]" is more in touch with the number of the right (the stick is in full touch with the number) while the ")" only touches the number in that tiny right bit of the parenthesis. So the "]" includes it, but the ")" merely touches it but doesn't include it. At least that's what comes to my mind with those symbols. :-P
Feb 14 2009
parent Christopher Wright <dhasenan gmail.com> writes:
Ary Borenszweig wrote:
 Christopher Wright escribi:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

I always have to look up whether "[]" is exclusive or inclusive. It's succinct, but you only have to write code once, and you have to read it often.

You can remember "]" is more in touch with the number of the right (the stick is in full touch with the number) while the ")" only touches the number in that tiny right bit of the parenthesis. So the "]" includes it, but the ")" merely touches it but doesn't include it. At least that's what comes to my mind with those symbols. :-P

Right, but these helpful mnemonics will flee well before I see any use of uniform.
Feb 14 2009
prev sibling next sibling parent reply Lionello Lunesu <lio lunesu.remove.com> writes:
Andrei Alexandrescu wrote:
 1. Are you cool with making the rng the last parameter and give it a 
 default value?

Yes.
 2. The global random generator will be allocated per thread. Are you 
 cool with this too?

Yes.
 3. How should the global rng be initialized? To always generate the same 
 sequence, or not?

If you go with a rng per thread, then it's probably safest NOT to generate the same sequence by default. Somebody how knows what he's doing can always seed with a constant (for tests and such.)
 4. While we're at it, should uniform(a, b) generate by default something 
 in [a, b] or [a, b)? Someone once explained to me that generating [a, b] 
 for floating point numbers is the source of all evils and that Hitler, 
 Stalin and Kim Il Sung (should he still be alive) must be using that 
 kind of generator. Conversely, generating [a, b) is guaranteed to bring 
 in the long term everlasting peace to Earth. My problem however is that 
 in the integer realm I always want to generate [a, b]. Furthermore, I 
 wouldn't be happy if the shape of the interval was different for 
 integers and floating point numbers. How to break this conundrum? Don't 
 forget that we're only worrying about defaults, explicit generation is 
 always possible with self-explanatory code:
 
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

I love your templates as much as the next guy, but let's not forget the possibility of using different function names instead of templating everything!? And, as other have said, I'm ok with [) since that's what rand()%max boiled down to as well. Futhermore, since arrays in D are also [), users of D get used to it so it's not odd at all. L.
Feb 15 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Lionello Lunesu wrote:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

I love your templates as much as the next guy, but let's not forget the possibility of using different function names instead of templating everything!?

I couldn't find good names. uniformOpenClosed, uniformOC, uniformOpenLeft... what would be some good names?
 And, as other have said, I'm ok with [) since that's what rand()%max 
 boiled down to as well. Futhermore, since arrays in D are also [), users 
 of D get used to it so it's not odd at all.

Ok. Let me just note that rand()%max is a lousy method of generating random numbers between 0 and max-1 and everybody should put that in the bin with Popular Examples That Should Never Be Used, together with exponential Fibonacci, linear-space factorial, and bubblesort. Andrei
Feb 15 2009
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steve Schveighoffer wrote:
 On Mon, 16 Feb 2009 12:20:34 +0900, Bill Baxter wrote:
 
 On Mon, Feb 16, 2009 at 12:07 PM, Steve Schveighoffer
 <schveiguy yahoo.com> wrote:
 On Sun, 15 Feb 2009 17:27:38 -0800, Andrei Alexandrescu wrote:

 Ok. Let me just note that rand()%max is a lousy method of generating
 random numbers between 0 and max-1 and everybody should put that in
 the bin with Popular Examples That Should Never Be Used, together with
 exponential Fibonacci, linear-space factorial, and bubblesort.

()? Or using %max on any random number generator in general. If the first case, I totally agree, and I found out that from experience first, then googling second :) But if the latter, I wasn't aware that all RNGs were bad at randomizing the lower bits, so could you explain why? It's also possible that Lionello meant the latter case as well.

between 0 and 10. Now you say RNG()%7 to get values 0..6 Not all inputs map to outputs evenly now. More inputs map to 0 than to 6, so the distribution is no longer uniform. Of course if you % by something that divides evenly into the RNG's range then it shouldn't be a problem, other than the lack of randomness in lower bits of some rand() implementations. --bb

Hm... good point, but probably not significant unless you are generating numbers in a range close to the maximum range. For example, the 0..6 range isn't going to divide evenly into int.max, but the bias towards certain values is going to be very slight. I would guess then that the correct method is to add n to the nth random number generated? Or is there a better way?

See the implementation of uniform in std.random. Andrei
Feb 15 2009
prev sibling next sibling parent reply "Lionello Lunesu" <lionello lunesu.remove.com> writes:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:gnafec$1cko$1 digitalmars.com...
 Ok. Let me just note that rand()%max is a lousy method of generating 
 random numbers between 0 and max-1 and everybody should put that in the 
 bin with Popular Examples That Should Never Be Used, together with 
 exponential Fibonacci, linear-space factorial, and bubblesort.

Yes yes, I know I know, but, honestly, when I'm generating noise for some terrain generation or a random color, or some fuzzy AI behaviour, I don't care about random periods that much and %max does the job. Nobody is using %max for private key generation, AFAIK :) L.
Feb 15 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Lionello Lunesu wrote:
 
 "Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
 news:gnafec$1cko$1 digitalmars.com...
 Ok. Let me just note that rand()%max is a lousy method of generating 
 random numbers between 0 and max-1 and everybody should put that in 
 the bin with Popular Examples That Should Never Be Used, together with 
 exponential Fibonacci, linear-space factorial, and bubblesort.

Yes yes, I know I know, but, honestly, when I'm generating noise for some terrain generation or a random color, or some fuzzy AI behaviour, I don't care about random periods that much and %max does the job.

Ahhh, so that's why all my enemies in Far Cry 1 and 2 behave so predictably! :o) Andrei
Feb 16 2009
prev sibling parent Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
 Lionello Lunesu wrote:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

I love your templates as much as the next guy, but let's not forget the possibility of using different function names instead of templating everything!?

I couldn't find good names. uniformOpenClosed, uniformOC, uniformOpenLeft... what would be some good names?

I don't like that approach. The "[)", "[]" issue comes up every time that you have a numeric range. It's not just random number generation. Unfortunately something like the following is too ugly. uniform(rng, Interval!("[)")(a, b)); A more radical possibility would be to make .nextUp, .nextDown built-in primitives for floating-point types. Then you could make all the ranges "[]".
 
 And, as other have said, I'm ok with [) since that's what rand()%max 
 boiled down to as well. Futhermore, since arrays in D are also [), 
 users of D get used to it so it's not odd at all.

Ok. Let me just note that rand()%max is a lousy method of generating random numbers between 0 and max-1 and everybody should put that in the bin with Popular Examples That Should Never Be Used, together with exponential Fibonacci, linear-space factorial, and bubblesort. Andrei

Feb 16 2009
prev sibling next sibling parent reply Steve Schveighoffer <schveiguy yahoo.com> writes:
On Sun, 15 Feb 2009 17:27:38 -0800, Andrei Alexandrescu wrote:

 Ok. Let me just note that rand()%max is a lousy method of generating
 random numbers between 0 and max-1 and everybody should put that in the
 bin with Popular Examples That Should Never Be Used, together with
 exponential Fibonacci, linear-space factorial, and bubblesort.

Do you mean rand()%max specifically in the case of the C function rand ()? Or using %max on any random number generator in general. If the first case, I totally agree, and I found out that from experience first, then googling second :) But if the latter, I wasn't aware that all RNGs were bad at randomizing the lower bits, so could you explain why? It's also possible that Lionello meant the latter case as well. -Steve
Feb 15 2009
parent "Lionello Lunesu" <lionello lunesu.remove.com> writes:
"Steve Schveighoffer" <schveiguy yahoo.com> wrote in message 
news:gnal9n$19a9$3 digitalmars.com...
 On Sun, 15 Feb 2009 17:27:38 -0800, Andrei Alexandrescu wrote:

 Ok. Let me just note that rand()%max is a lousy method of generating
 random numbers between 0 and max-1 and everybody should put that in the
 bin with Popular Examples That Should Never Be Used, together with
 exponential Fibonacci, linear-space factorial, and bubblesort.

Do you mean rand()%max specifically in the case of the C function rand ()? Or using %max on any random number generator in general. If the first case, I totally agree, and I found out that from experience first, then googling second :) But if the latter, I wasn't aware that all RNGs were bad at randomizing the lower bits, so could you explain why? It's also possible that Lionello meant the latter case as well.

Actually, I forgot to add that I always use divisors of RAND_MAX ;-) L.
Feb 15 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Mon, Feb 16, 2009 at 12:07 PM, Steve Schveighoffer
<schveiguy yahoo.com> wrote:
 On Sun, 15 Feb 2009 17:27:38 -0800, Andrei Alexandrescu wrote:

 Ok. Let me just note that rand()%max is a lousy method of generating
 random numbers between 0 and max-1 and everybody should put that in the
 bin with Popular Examples That Should Never Be Used, together with
 exponential Fibonacci, linear-space factorial, and bubblesort.

Do you mean rand()%max specifically in the case of the C function rand ()? Or using %max on any random number generator in general. If the first case, I totally agree, and I found out that from experience first, then googling second :) But if the latter, I wasn't aware that all RNGs were bad at randomizing the lower bits, so could you explain why? It's also possible that Lionello meant the latter case as well.

It's bad in general. Say your RNG generates perfectly uniform numbers between 0 and 10. Now you say RNG()%7 to get values 0..6 Not all inputs map to outputs evenly now. More inputs map to 0 than to 6, so the distribution is no longer uniform. Of course if you % by something that divides evenly into the RNG's range then it shouldn't be a problem, other than the lack of randomness in lower bits of some rand() implementations. --bb
Feb 15 2009
prev sibling next sibling parent Steve Schveighoffer <schveiguy yahoo.com> writes:
On Mon, 16 Feb 2009 12:20:34 +0900, Bill Baxter wrote:

 On Mon, Feb 16, 2009 at 12:07 PM, Steve Schveighoffer
 <schveiguy yahoo.com> wrote:
 On Sun, 15 Feb 2009 17:27:38 -0800, Andrei Alexandrescu wrote:

 Ok. Let me just note that rand()%max is a lousy method of generating
 random numbers between 0 and max-1 and everybody should put that in
 the bin with Popular Examples That Should Never Be Used, together with
 exponential Fibonacci, linear-space factorial, and bubblesort.

Do you mean rand()%max specifically in the case of the C function rand ()? Or using %max on any random number generator in general. If the first case, I totally agree, and I found out that from experience first, then googling second :) But if the latter, I wasn't aware that all RNGs were bad at randomizing the lower bits, so could you explain why? It's also possible that Lionello meant the latter case as well.

It's bad in general. Say your RNG generates perfectly uniform numbers between 0 and 10. Now you say RNG()%7 to get values 0..6 Not all inputs map to outputs evenly now. More inputs map to 0 than to 6, so the distribution is no longer uniform. Of course if you % by something that divides evenly into the RNG's range then it shouldn't be a problem, other than the lack of randomness in lower bits of some rand() implementations. --bb

Hm... good point, but probably not significant unless you are generating numbers in a range close to the maximum range. For example, the 0..6 range isn't going to divide evenly into int.max, but the bias towards certain values is going to be very slight. I would guess then that the correct method is to add n to the nth random number generated? Or is there a better way? -Steve
Feb 15 2009
prev sibling parent Steve Schveighoffer <schveiguy yahoo.com> writes:
On Sun, 15 Feb 2009 20:30:06 -0800, Andrei Alexandrescu wrote:

 Steve Schveighoffer wrote:
 On Mon, 16 Feb 2009 12:20:34 +0900, Bill Baxter wrote:
 
 On Mon, Feb 16, 2009 at 12:07 PM, Steve Schveighoffer
 <schveiguy yahoo.com> wrote:
 On Sun, 15 Feb 2009 17:27:38 -0800, Andrei Alexandrescu wrote:

 Ok. Let me just note that rand()%max is a lousy method of generating
 random numbers between 0 and max-1 and everybody should put that in
 the bin with Popular Examples That Should Never Be Used, together
 with exponential Fibonacci, linear-space factorial, and bubblesort.

rand ()? Or using %max on any random number generator in general. If the first case, I totally agree, and I found out that from experience first, then googling second :) But if the latter, I wasn't aware that all RNGs were bad at randomizing the lower bits, so could you explain why? It's also possible that Lionello meant the latter case as well.

between 0 and 10. Now you say RNG()%7 to get values 0..6 Not all inputs map to outputs evenly now. More inputs map to 0 than to 6, so the distribution is no longer uniform. Of course if you % by something that divides evenly into the RNG's range then it shouldn't be a problem, other than the lack of randomness in lower bits of some rand() implementations. --bb

Hm... good point, but probably not significant unless you are generating numbers in a range close to the maximum range. For example, the 0..6 range isn't going to divide evenly into int.max, but the bias towards certain values is going to be very slight. I would guess then that the correct method is to add n to the nth random number generated? Or is there a better way?

See the implementation of uniform in std.random. Andrei

Ugh, so you have to repeatedly generate random numbers until it gives you a value inside your range? Using your calculation for bucketsize, if I want to generate a random number between 0 and (urng.max - urng.min) / 2 + 1 will enter a loop for which it throws away half the random numbers generated? It seems like there should be a better method, but I can't think of one right now. You also can add an optimization to your code, change your final code to: auto maxvalue = urng.min + myRange * bucketSize + (bucketSize - 1); do { r = urng.next; // avoid calculation in loop } while (r > maxvalue); return _a + (r - urng.min) / bucketSize; Not sure if it makes a huge difference, or has a huge negative impact when it's not important. -Steve
Feb 15 2009
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, el 13 de febrero a las 17:10 me escribiste:
 Leonardo suggested that some functions in std.random should not require their 
 user to be bothered with creating a random object, i.e.:
 
 auto r = Random(unpredictableSeed);
 auto n = uniform(r, 0, 100);
 
 Instead the library should simply support:
 
 auto n = uniform(0, 100);
 
 and do this by presumably using a global RNG under the hood. So I wanted to
ask 
 all y'all:
 
 1. Are you cool with making the rng the last parameter and give it a default 
 value?

Yes
 2. The global random generator will be allocated per thread. Are you cool with 
 this too?

Yes
 3. How should the global rng be initialized? To always generate the same 
 sequence, or not?

NO! I think this feature is for small script-like programs, where the most common case is you want a random number to be random =) If you want to do some more complicated stuff, just set the seed yourself.
 4. While we're at it, should uniform(a, b) generate by default something in
[a, 
 b] or [a, b)? Someone once explained to me that generating [a, b] for floating 
 point numbers is the source of all evils and that Hitler, Stalin and Kim Il
Sung 
 (should he still be alive) must be using that kind of generator. Conversely, 
 generating [a, b) is guaranteed to bring in the long term everlasting peace to 
 Earth. My problem however is that in the integer realm I always want to
generate 
 [a, b]. Furthermore, I wouldn't be happy if the shape of the interval was 
 different for integers and floating point numbers. How to break this
conundrum? 
 Don't forget that we're only worrying about defaults, explicit generation is 
 always possible with self-explanatory code:

Being a C and Python programmer mostly, I find more intuitive [a, b) (The 'for' statement is usually used that way, same for the Python's range() function) -- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------------- Un paracaidista, que no deja de caer. Lo que me lleva hacia arriba, es lo que me tira hacia abajo.
Feb 16 2009
prev sibling next sibling parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Fri, Feb 13, 2009 at 8:10 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

I feel like I'm mimicking everyone else by now, but:

 1. Are you cool with making the rng the last parameter and give it a default
 value?

Yes.
 2. The global random generator will be allocated per thread. Are you cool
 with this too?

Yes.
 3. How should the global rng be initialized? To always generate the same
 sequence, or not?

Hell no.
 4. While we're at it, should uniform(a, b) generate by default something in
 [a, b] or [a, b)? Someone once explained to me that generating [a, b] for
 floating point numbers is the source of all evils and that Hitler, Stalin
 and Kim Il Sung (should he still be alive) must be using that kind of
 generator. Conversely, generating [a, b) is guaranteed to bring in the long
 term everlasting peace to Earth. My problem however is that in the integer
 realm I always want to generate [a, b]. Furthermore, I wouldn't be happy if
 the shape of the interval was different for integers and floating point
 numbers. How to break this conundrum? Don't forget that we're only worrying
 about defaults, explicit generation is always possible with self-explanatory
 code:

Almost all of my uses of random numbers involve indexing arrays, so [a, b) would be most useful for me. But again, I could certainly "alias uniform!("[)") uniformIdx;".
Feb 16 2009
prev sibling parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Slightly off-topic, but I just found this and thought it might be of
interest.

http://mjolnirstudios.com/IanBullard/files/79ffbca75a75720f066d491e9ea935a0-10.php

For reference, here is the ASM source for the mentioned Flipcode random
number generator:

 pshufw mm1, mm0, 0x1E
 paddd mm0, mm1

Finally, here's a link to the DieHard test on Wikipedia: http://en.wikipedia.org/wiki/Diehard_tests -- Daniel
Feb 19 2009
next sibling parent BCS <none anon.com> writes:
Hello Daniel,

 Slightly off-topic, but I just found this and thought it might be of
 interest.
 
 http://mjolnirstudios.com/IanBullard/files/79ffbca75a75720f066d491e9ea
 935a0-10.php
 
 For reference, here is the ASM source for the mentioned Flipcode
 random number generator:
 
 pshufw mm1, mm0, 0x1E
 paddd mm0, mm1

http://en.wikipedia.org/wiki/Diehard_tests -- Daniel

someone should make a "double DieHard(uint function())" so we can test the quality of rand's.
Feb 19 2009
prev sibling parent Don <nospam nospam.com> writes:
Daniel Keep wrote:
 Slightly off-topic, but I just found this and thought it might be of
 interest.
 
 http://mjolnirstudios.com/IanBullard/files/79ffbca75a75720f066d491e9ea935a0-10.php
 
 For reference, here is the ASM source for the mentioned Flipcode random
 number generator:
 
 pshufw mm1, mm0, 0x1E
 paddd mm0, mm1

Finally, here's a link to the DieHard test on Wikipedia: http://en.wikipedia.org/wiki/Diehard_tests -- Daniel

Yeah. It's worth knowing that reason that linear congruential RNGs have historically been used so much is because _they are the easiest to analyze mathematically_, NOT because they are good in either speed or randomness. Since the early days, the techniques for evaluating RNGs have improved dramatically (and the DieHard test suite incorporates much of the analysis), so their original selling point is gone. There is NO REASON to use a linear congruential RNG as the default any more.
Feb 20 2009