digitalmars.D - floating point - nan initializers

Dave (32/32) Feb 18 2006 Rationale: nan will propagate through an entire calculation if the develo...

Anders F Björklund (9/14) Feb 18 2006 Characters also have non-zero init values...

Dave (5/19) Feb 18 2006 Yea - that should have read 'native numeric types', which is what I had ...

Ivan Senji (7/23) Feb 18 2006 And this is good because you know that there is a problem. What if you

Anders F Björklund (33/37) Feb 18 2006 An age-old bug made it not initialize dynamic arrays with .init values.

Walter Bright (3/4) Feb 18 2006 That's a bug. I'll add it to the list.

Anders F Björklund (3/6) Feb 19 2006 Oh, I thought it was already on it... Glad I could help out, though.

Dave (17/48) Feb 18 2006 I'd get 0 instead of nan and just as easily know there was something wro...

Jarrett Billingsley (21/22) Feb 18 2006 I agree. I've ended up getting _more_ bugs with the auto-initialization...

Dave (23/42) Feb 18 2006 Exactly - it just kind of feels like an academic addition to the languag...

Jarrett Billingsley (9/14) Feb 18 2006 They used to be, but Walter changed it when Arcane Jill (i.e. the founde...
Walter Bright (32/36) Feb 18 2006 I'll go out on a limb here, and say that Java was designed by people who...

Anders F Björklund (7/11) Feb 19 2006 IMHO:

Walter Bright (4/7) Feb 19 2006 I suspect any such machines will not support 80 bit floats, so the 128 b...

Anders F Björklund (11/15) Feb 19 2006 Right...

Jarrett Billingsley (11/18) Feb 19 2006 And most of the time, I *think* that I want it to be 0. It's a rare

Walter Bright (19/34) Feb 19 2006 If it was default initialized to 0, there'd be no way to detect the erro...

Dave (48/65) Feb 19 2006 Excepting nan init., I'm all for the great support that D gives to numer...

Ivan Senji (16/61) Feb 19 2006 or maybe sum *= dx + dy;

Anders F Björklund (7/16) Feb 19 2006 Then again, it's usually better to catch those bugs at compile time ?

Ivan Senji (4/19) Feb 19 2006 Compile time would be best. C# compiler does the same thing, and

Anders F Björklund (3/8) Feb 19 2006 D doesn't do warnings, so run-time errors seem to be preferred.

Walter Bright (9/13) Feb 19 2006 "may" be? That's why it isn't in D. Wishy-washy messages aren't a soluti...

Anders F Björklund (20/31) Feb 19 2006 Jikes is always so polite about it, other compilers are more terse:

Walter Bright (5/7) Feb 20 2006 No two compilers emit the same warnings. So a general statement about

Anders F Björklund (8/9) Feb 20 2006 Haven't used DMC enough, I'm afraid, just GCC...

Walter Bright (9/11) Feb 19 2006 That's right. The default initialization is *not* about being convenient...

Anders F Björklund (8/15) Feb 19 2006 So it's an error to use ints before they're initialized ?

Walter Bright (8/22) Feb 19 2006 Sometimes I get lazy. :-) I'll say it's poor style, and yes, I'm guilty ...

Anders F Björklund (6/15) Feb 19 2006 As long as it is clear... (In some other languages, it's stylish)

Derek Parnell (12/12) Feb 20 2006 I support the use of NaN as an initializer. I only wish it could also be...

Walter Bright (5/7) Feb 20 2006 That appears to implement in software my idea that memory should have a

Sean Kelly (4/12) Feb 20 2006 Could make it an optional/debug feature, but it might be a lot of work
Derek Parnell (13/21) Feb 20 2006 Yes, I know that, and I wasn't suggesting that D adopt this strategy.

Walter Bright (3/10) Feb 20 2006 Thanks for clarifying this.

John Stoneham (20/32) Feb 19 2006 I agree. I'm currently working on an involved combinatorial calculation,...

Sean Kelly (3/30) Feb 19 2006 This would be nice.

Sean Kelly (4/34) Feb 19 2006 I take it back:

Sean Kelly (6/41) Feb 19 2006 Actually, does this work?

Ben Hinkle (17/23) Feb 19 2006 [snip]

Dave <Dave_member pathlink.com> writes:

Rationale: nan will propagate through an entire calculation if the developer
forgets to initialize a floating point variable.
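
To make the rationale concrete, here is a minimal sketch of the propagation being described. The function name forgetful and the values passed to it are made up for illustration; only the default double.nan initializer and std.math.isNaN come from the language and its standard library.

```d
import std.math : isNaN;
import std.stdio : writeln;

// Hypothetical function: 'scale' is declared but never assigned.
double forgetful(double a, double b)
{
    double scale;                   // default-initialized to double.nan
    return (a + b) * scale - 1.0;   // any arithmetic involving nan gives nan
}

void main()
{
    double result = forgetful(2.0, 3.0);
    writeln(result);                // the forgotten initialization is still visible here
    assert(isNaN(result));
}
```

The point of the sketch is only that the nan survives every later operation, so it is still there when the result is finally inspected.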

Problems:

- Inconsistent:
- Every other native type is initialized to '0'.
- Makes compilers and/or GC's more complicated to implement and maintain.
- Will probably make 'generic' programming with templates more complicated.
- using array.length on any native type will result in the elements being
initialized to 0, which is consistent with new for everything but floating
point. So for native types there's not only the default initializer
inconsistency but also another inconsistency between new and array.length.

- Unexpected/Non-intuitive:
- Because it's inconsistent within D itself (as above)
- No major language prior to D (that I'm aware of) does this
- Confusing to D newbies: Automatic initialization for a numeric type means '0'

fp types so 'auto-initialization' to them will mean '0' as well.

- Inefficient:
- instead of just memset() to 0, the compiler has to also init to nan when
new'ing an array. new is literally 6x slower in a simple loop when compared to
array.length.
- Instead of using this feature, I've recently found myself working around it
using array.length (instead of new) to avoid the nan initialization overhead.
For me this also has the added benefit of initializing the added elements to 0,
which is what I expect.

- Kludgy:
- The cases where people will have to needlessly complicate their code with
things like: "fparr[] = 0;" will happen much more often than people will want to
do: "fparr[] = double.nan;"

Maybe I'm wrong.. Opinions?

Thanks,

- Dave

Feb 18 2006

Anders F Björklund <afb algonet.se> writes:

Dave wrote:

 Problems:
 
 - Inconsistent:
 - Every other native type is initialized to '0'.

[...]

 Maybe I'm wrong.. Opinions?

Characters also have non-zero init values...

http://www.digitalmars.com/d/type.html:
char 	unsigned 8 bit UTF-8 	0xFF
wchar 	unsigned 16 bit UTF-16 	0xFFFF
dchar 	unsigned 32 bit UTF-32 	0x0000FFFF

But I think those should all be zero, as well.
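
The values in that table can be checked directly. This is a small sketch, not part of the spec page; the helper name defaultOf is hypothetical.

```d
import std.stdio : writefln;

// Hypothetical helper: returns the numeric value a variable of type T starts with.
uint defaultOf(T)()
{
    T value;                  // no initializer, so it picks up T.init
    return cast(uint) value;
}

void main()
{
    // These mirror the three rows of the table above.
    static assert(char.init  == 0xFF);
    static assert(wchar.init == 0xFFFF);
    static assert(dchar.init == 0x0000FFFF);

    writefln("%02X", defaultOf!char());
    writefln("%04X", defaultOf!wchar());
    writefln("%08X", defaultOf!dchar());
}
```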

--anders

Feb 18 2006

"Dave" <Dave_member pathlink.com> writes:

In article <dt862v$2vi4$1 digitaldaemon.com>,
Anders F Björklund says...
Dave wrote:

 Problems:

 - Inconsistent:
 - Every other native type is initialized to '0'.

[...]

 Maybe I'm wrong.. Opinions?

Characters also have non-zero init values...

Yea - that should have read 'native numeric types', which is what I had in 
mind.

http://www.digitalmars.com/d/type.html:
char unsigned 8 bit UTF-8 0xFF
wchar unsigned 16 bit UTF-16 0xFFFF
dchar unsigned 32 bit UTF-32 0x0000FFFF

But I think those should all be zero, as well.

Another vote for that as well.

--anders

Feb 18 2006

Ivan Senji <ivan.senji_REMOVE_ _THIS__gmail.com> writes:

Dave wrote:
 Rationale: nan will propagate through an entire calculation if the developer
 forgets to initialize a floating point variable.
 

And this is good because you know that there is a problem. What if you 
are multiplying a bunch of numbers and didn't initialize to 1?

 - Inefficient:
 - instead of just memset() to 0, the compiler has to also init to nan when
 new'ing an array. new is literally 6x slower in a simple loop when compared to
 array.length.
 - Instead of using this feature, I've recently found myself working around it
 using array.length (instead of new) to avoid the nan initialization overhead.

What? This sounds like a bug to me. Setting an array's length with the length
property should also init the array. Shouldn't it?

 For me this also has the added benefit of initializing the added elements to 0,
 which is what I expect.
 
 - Kludgy:
 - The cases where people will have to needlessly complicate their code with
 things like: "fparr[] = 0;" will happen much more often than people will want
to
 do: "fparr[] = double.nan;"

Maybe, but you can always create an uninitialized array with:
float[100] fparr = void; // or something like that (don't remember exactly)
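
A slightly fuller sketch of that idea, assuming a compiler that accepts the void initializer. The helper name zeroed is made up; the important part is that nothing is read from the array before it has been assigned.

```d
// Hypothetical helper: skips the default nan fill, then sets every element itself.
float[100] zeroed()
{
    float[100] fparr = void;   // storage is left uninitialized, no nan fill
    fparr[] = 0;               // contents are indeterminate until this assignment
    return fparr;
}

void main()
{
    auto arr = zeroed();
    foreach (x; arr)
        assert(x == 0);
}
```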

Feb 18 2006

Anders F Björklund <afb algonet.se> writes:

Ivan Senji wrote:

 - Instead of using this feature, I've recently found myself working around it
 using array.length (instead of new) to avoid the nan initialization overhead.


 What? This sounds like a bug to me. Setting an array's length with the length
 property should also init the array. Shouldn't it?

An age-old bug made it not initialize dynamic arrays with .init values.
This seems to be a leftover, showing up when later resizing it by using .length...


Here's an example D program, showing that using length makes it zero:

import std.stdio;

void main()
{
    int i;                      // int.init is 0
    writefln("%d", i);
    int[] ia;
    ia.length = 1;              // grow through .length
    foreach (int x; ia)         // loop variable renamed so it doesn't shadow i
        writefln("%d", x);

    float f;                    // float.init is nan
    writefln("%f", f);
    float[] fa;
    fa.length = 1;
    foreach (float x; fa)
        writefln("%f", x);

    char c;                     // char.init is 0xFF
    writefln("%x", c);
    char[] ca;
    ca.length = 1;
    foreach (char x; ca)
        writefln("%x", x);
}

Changing the above to allocate with new makes the elements get the proper .init values:

    int[] ia = new int[1];
    float[] fa = new float[1];
    char[] ca = new char[1];

If you resize an array, only the new parts will be zeroed out. (realloc)
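
For comparison, here is a small check of the behaviour Ivan expects, where growing through .length and allocating with new both give .init values. It is only a sketch and assumes a compiler on which the two paths agree; with the zero fill described in this post, the .length assertions for float and char would fail. The helper name grown is made up.

```d
import std.math : isNaN;

// Hypothetical helper: builds an array of n elements purely by setting .length.
T[] grown(T)(size_t n)
{
    T[] a;
    a.length = n;
    return a;
}

void main()
{
    float[] viaNew = new float[1];
    assert(isNaN(viaNew[0]));               // new fills with float.init

    // These hold only if .length also fills new elements with .init.
    assert(isNaN(grown!float(1)[0]));
    assert(grown!char(1)[0] == char.init);
    assert(grown!int(1)[0] == 0);           // int.init is 0 either way
}
```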


See also http://www.digitalmars.com/d/archives/digitalmars/D/19780.html

--anders

Feb 18 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Anders F Björklund" <afb algonet.se> wrote in message
news:dt8eo8$5ol$1 digitaldaemon.com...
 Here's an example D program, showing that using length makes it zero:

That's a bug. I'll add it to the list.

Feb 18 2006

Anders F Björklund <afb algonet.se> writes:

Walter Bright wrote:

Here's an example D program, showing that using length makes it zero:

 
 That's a bug. I'll add it to the list. 

Oh, I thought it was already on it... Glad I could help out, though.

--anders

Feb 19 2006

"Dave" <Dave_member pathlink.com> writes:

"Ivan Senji" <ivan.senji_REMOVE_ _THIS__gmail.com> wrote in message 
news:dt8agm$2pm$1 digitaldaemon.com...
 Dave wrote:
 Rationale: nan will propagate through an entire calculation if the developer
 forgets to initialize a floating point variable.

 And this is good because you know that there is a problem. What if you are 
 multiplying a bunch of numbers and didn't initialize to 1?

I'd get 0 instead of nan and just as easily know there was something wrong. 
Plus it would be consistent with shorts, ints and longs.

Look, the majority of programmers out there want and expect floating point 
types to act like integral types except with a decimal point and greater 
precision, at least in the great majority of cases.

Consistency is huge - if there was a way to initialize integrals with nan 
then I'd say go for it across the board. But since fp types are the odd man 
out here I say make it consistent (0) for all.

For the programmers who want nan init they can typedef:

typedef double ndouble = double.nan;
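
The same trick works in the other direction. As a sketch only, and assuming the nan default stays as it is, someone who wants zero-initialized doubles could wrap the type. ZDouble is a made-up name, and this uses a struct with alias this rather than the typedef syntax shown above.

```d
// Hypothetical wrapper type: a double whose default value is 0.0 instead of nan.
struct ZDouble
{
    double value = 0.0;   // explicit zero default
    alias value this;     // lets a ZDouble be used where a double is expected
}

void main()
{
    ZDouble sum;          // starts at 0.0, no explicit initializer needed
    sum += 1.5;           // forwarded to sum.value through alias this
    sum += 2.5;
    assert(sum == 4.0);

    double plain = sum;   // implicit conversion back to double
    assert(plain == 4.0);
}
```

Either way the choice of default becomes opt-in at the point where the type is named, which is the consistency argument being made here.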

 - Inefficient:
 - instead of just memset() to 0, the compiler has to also init to nan 
 when
 new'ing an array. new is literally 6x slower in a simple loop when 
 compared to
 array.length.
 - Instead of using this feature, I've recently found myself working 
 around it
 using array.length (instead of new) to avoid the nan initialization 
 overhead.

 What? This sounds like a bug to me. Setting an array's length with the length
 property should also init the array. Shouldn't it?

This is consistent for everything - native types and typedefs with init, so 
I'm not sure this is an un-intended bug. .length will initialize the 
*memory*, but not the elements to the default initializer.

 For me this also has the added benefit of initializing the added elements 
 to 0,
 which is what I expect.

 - Kludgy:
 - The cases where people will have to needlessly complicate their code 
 with
 things like: "fparr[] = 0;" will happen much more often than people will 
 want to
 do: "fparr[] = double.nan;"

 Maybe, but you can always create an uninitialized array with:
 float[100] fparr = void; // or something like that (don't remember exactly)

Yes, but static arrays only though.

- Dave

Feb 18 2006

"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:

"Dave" <Dave_member pathlink.com> wrote in message 
news:dt85ll$2vis$1 digitaldaemon.com...
 Maybe I'm wrong.. Opinions?

I agree.  I've ended up getting _more_ bugs with the auto-initialization to 
nan, especially in large, complex calculations (i.e. calculating the 
transform matrices from position, rotation, and scaling for hierarchies of 
3D objects).

On paper, having the nan propagate sounds like a good idea, but in practice, 
it usually ends up just being a hard, hard bug to find.  Most programs don't 
involve just printing out lists of floating point numbers, and so a nan in 
the calculations just ends up making things act strange instead of 
predictably (i.e. objects disappear or get skewed horribly instead of 
sitting still, in my 3D engine).  And 99.999% of the time I need a float, I 
need it initialized to 0.

Finally, the "uninitialized variables are an error" proposal seems to have 
fallen by the wayside (thankfully), and I make heavy use of the default 
value of 0 for int types, so I think that a useful default value for floats 
would make sense.

As for [w|d]char initializers - I think 0xFF[FF[FFFF]] is fine.  It's not 
like I really need chars initialized to anything - most of the time I use 
chars, I'm reading them in from something and so they get immediately filled 
with a value anyway.

Feb 18 2006

"Dave" <Dave_member pathlink.com> writes:

"Jarrett Billingsley" <kb3ctd2 yahoo.com> wrote in message 
news:dt8fmn$6vs$1 digitaldaemon.com...
 "Dave" <Dave_member pathlink.com> wrote in message 
 news:dt85ll$2vis$1 digitaldaemon.com...
 Maybe I'm wrong.. Opinions?

 On paper, having the nan propagate sounds like a good idea, but in 
 practice, it usually ends up just being a hard, hard bug to find.  Most 
 programs don't

Exactly - it just kind of feels like an academic addition to the language 
that sounds good in the lab but doesn't work out in the field.

Why do no other wide-spread languages (that auto-initialize) initialize 
floats to nan? I find it hard to believe that D gets this right and the rest 
somehow missed it, since hardware support for nan has been wide-spread for a 
long time. Heck even Fortran 95 doesn't auto-initialize at all, much less to 
nan.

 involve just printing out lists of floating point numbers, and so a nan in 
 the calculations just ends up making things act strange instead of 
 predictably (i.e. objects disappear or get skewed horribly instead of 
 sitting still, in my 3D engine).  And 99.999% of the time I need a float, 
 I need it initialized to 0.

Auto-initialization is great, just as long as it's consistent. One of the 
consistent hacks against C++ is all the special little corner cases you have 
to remember. nan init for floats is one of those for D, and I'm convinced 
will cause a lot of lively chatter on D newsgroups for years to come if left
as-is once more people (like you) start using D for floating point.


fp types to 0 - I never hear anyone complaining "if doubles were initialized 
to nans instead of 0, I wouldn't have all these bugs in my arithmetic".

 Finally, the "uninitialized variables are an error" proposal seems to have 
 fallen by the wayside (thankfully), and I make heavy use of the default 
 value of 0 for int types, so I think that a useful default value for 
 floats would make sense.

 As for [w|d]char initializers - I think 0xFF[FF[FFFF]] is fine.  It's not 
 like I really need chars initialized to anything - most of the time I use 
 chars, I'm reading them in from something and so they get immediately 
 filled with a value anyway.

I'm with you, but would like to see these init'd to 0, primarily for 
consistency (but the OP was really about arithmetic types anyhow). Also, 

0, as they do with *all* native types (again easy to remember consistency).

But, I'd be real happy to just get rid of nan.

- Dave

Feb 18 2006

"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:

"Dave" <Dave_member pathlink.com> wrote in message 
news:dt8o24$djh$1 digitaldaemon.com...
 I'm with you, but would like to see these init'd to 0, primarily for 
 consistency (but the OP was really about arithmetic types anyhow). Also, 

 to 0, as they do with *all* native types (again easy to remember 
 consistency).

They used to be, but Walter changed it when Arcane Jill (i.e. the founder of 
Unicodism, a rabid religion whose platform is that of strict acceptance and 
compliance of Unicode) said that it should be 0xFF, which is the "nan" 
equivalent for Unicode characters - that is, 0xFF means "not a valid 
character."  Of course, this still leaves out integral types, for which 
there is no nan equivalent.

One thing that all three types can represent, however, is 0 :)

Feb 18 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Dave" <Dave_member pathlink.com> wrote in message 
news:dt8o24$djh$1 digitaldaemon.com...

 initializing fp types to 0 - I never hear anyone complaining "if doubles 
 were initialized to nans instead of 0, I wouldn't have all these bugs in 
 my arithmetic".

I'll go out on a limb here, and say that Java was designed by people who 
were unfamiliar with numerical analysis. Prof. W. Kahan points out in great 
detail how they missed the boat with floating point. I am not too familiar 

of numerical analysts either.

Neither have the designers of C and C++, nor the vendors. Heck, look how 
most C++ compilers turned their backs on 80 bit floating point!!!

Any language that doesn't even fully support the floating point precision on 
the chip is not a precedent worth following, as nobody with understanding of 
numerical analysis has had any influence on its design. Such languages 
typically just copy what was done before without thinking about it.

In a previous engineering life, I *have* done numerical analysis engineering 
work. Not a great deal, but enough to be familiar with why floating point is 
the way it is, and what sorts of problems and tradeoffs it is set up to deal 
with. I've participated in the Numerical C Extensions Group, which tried to 
make C into a more reasonable language for numerical work. Prof. Kahan was 
instrumental in the design of IEEE 754 floating point, and its first 
hardware implementation on the 8087 numeric coprocessor. I've read some of 
his papers, and have corresponded with him. He makes a whole lot of sense.

I've also heard from people who do serious numerical work that, at last, D 
is a language that cares about numerical analysis and its needs. Default 
initializing to nan is part of that - it forces the user to *think* about 
what he wants the initial value to be. Initializing it by default to 0 means 
that it can easily be overlooked, and 0.0 can introduce undetected, subtle
errors in the result.

I know FORTRAN doesn't deal with nans. That's unsurprising, since FORTRAN 
had been around for 25 years before nans were invented.

There is a 'nan' value for pointers - null, a 'nan' value for UTF-8 chars - 
0xFF - which is an illegal UTF-8 character. If there was a 'nan' value for 
ints, D would use it as the default, too.

Feb 18 2006

Anders F Björklund <afb algonet.se> writes:

Walter Bright wrote:

 Any language that doesn't even fully support the floating point precision on 
 the chip is not a precedent worth following, as nobody with understanding of 
 numerical analysis has had any influence on its design. Such languages 
 typically just copy what was done before without thinking about it.

IMHO:
Adding 128-bit floating point to the D language spec would help here...
It fits in nicely with the other types, and shows that D is looking forward ?

quad         128 bit floating point (reserved for future use)

--anders


See http://www.digitalmars.com/d/archives/digitalmars/D/31899.html (RFE)

Feb 19 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Anders F Björklund" <afb algonet.se> wrote in message
news:dt9bsu$1s3o$1 digitaldaemon.com...
 Adding 128-bit floating point to the D language spec would help here...
 It fits in nicely with the other types, and shows that D is looking forward ?

 quad         128 bit floating point (reserved for future use)

I suspect any such machines will not support 80 bit floats, so the 128 bit 
ones would just be 'real'.

Feb 19 2006

Anders F Björklund <afb algonet.se> writes:

Walter Bright wrote:

quad         128 bit floating point (reserved for future use)

 
 I suspect any such machines will not support 80 bit floats, so the 128 bit 
 ones would just be 'real'. 

Right...


The list I had was for fixed-size formats, not variable ones.

i.e.
"half" - 16 bit floating point (storage only)
"float" - 32 bit floating point (a.k.a. single)
"double" - 64 bit floating point
"extend" - 80 bit floating point (or "extended")
"quad" - 128 bit floating point (future use)

Where "real" would just be an alias, and not a language type.

--anders

Feb 19 2006

"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:

"Walter Bright" <newshound digitalmars.com> wrote in message 
news:dt98cn$1ord$4 digitaldaemon.com...
 Default initializing to nan is part of that - it forces the user to 
 *think* about what he wants the initial value to be. Initializing it by 
 default to 0 means that it can easily be overlooked, and 0.0 can
 introduce undetected, subtle errors in the result.

And most of the time, I *think* that I want it to be 0.  It's a rare 
occasion that I don't want it to be.  And I already explained that nan gives 
me more problems than 0 would - especially in systems where the values of 
the numbers aren't displayed in numerical form.

 There is a 'nan' value for pointers - null, a 'nan' value for UTF-8 
 chars - 0xFF - which is an illegal UTF-8 character. If there was a 'nan' 
 value for ints, D would use it as the default, too.

And so it is, then, that the two numerical types in D - integral and 
floating point - have two different initializers.  It makes no sense.

And again, I wonder to myself why I'm trying to convince you, because I know 
it won't happen.  I keep forgetting that one has to be named Don, Jill, or 
Matthew for one's suggestions to even be considered.

Feb 19 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Jarrett Billingsley" <kb3ctd2 yahoo.com> wrote in message 
news:dt9cl0$1son$1 digitaldaemon.com...
 "Walter Bright" <newshound digitalmars.com> wrote in message 
 news:dt98cn$1ord$4 digitaldaemon.com...
 Default initializing to nan is part of that - it forces the user to 
 *think* about what he wants the initial value to be. Initializing it by 
 default to 0 means that it can easily be overlooked, and 0.0 can
 introduce undetected, subtle errors in the result.

 And most of the time, I *think* that I want it to be 0.  It's a rare 
 occasion that I don't want it to be.  And I already explained that nan 
 gives me more problems than 0 would - especially in systems where the 
 values of the numbers aren't displayed in numerical form.

If it was default initialized to 0, there'd be no way to detect the errors 
that would happen in those rare cases where you didn't want it initialized 
to 0. At least with the nan initialization you *know* there's a bug. The nan 
default initialization is not there for convenience, it's there to flush out 
bugs.

 And so it is, then, that the two numerical types in D - integral and 
 floating point - have two different initializers.  It makes no sense.

Every type has the default initializer that is most appropriate for that 
type, and each makes sense in the context of its type. Float doesn't
need to make sense in the context of int - it's very different from int, 
more than just nans, and treating it like an int is a sure route to bugs.

 And again, I wonder to myself why I'm trying to convince you, because I 
 know it won't happen.  I keep forgetting that one has to be named Don, 
 Jill, or Matthew for one's suggestions to even be considered.

Consideration is not the same thing as adopting. D can't adopt every 
suggestion - especially since most are mutually contradictory. (Jill thought 
default 0 initialization for chars makes no sense, you think non-zero makes 
no sense.) I'm trying to explain the rationale for how it works, that it is 
not some thoughtless irrational decision. If you don't agree with the 
rationale, that's fine, but there *is* a rationale.

If anyone thinks I've agreed with every proposal they've made, or even a 
fraction of them, I think they'd vehemently disagree <g>.

Feb 19 2006

Dave <Dave_member pathlink.com> writes:

In article <dt98cn$1ord$4 digitaldaemon.com>, Walter Bright says...
"Dave" <Dave_member pathlink.com> wrote in message 
news:dt8o24$djh$1 digitaldaemon.com...

 initializing fp types to 0 - I never hear anyone complaining "if doubles 
 were initialized to nans instead of 0, I wouldn't have all these bugs in 
 my arithmetic".

I've also heard from people who do serious numerical work that, at last, D 
is a language that cares about numerical analysis and its needs. Default 
initializing to nan is part of that - it forces the user to *think* about 
what he wants the initial value to be. Initializing it by default to 0 means 
that it can easily be overlooked, and 0.0 can introduce undetected, subtle
errors in the result.

Excepting nan init., I'm all for the great support that D gives to numerical
work, but not at the expense of the great majority of developers who don't do a
lot of "serious" numerical work day-in and day-out. Great numerics support won't
mean anything if the general developer community does not pick up the language
because they don't like (what they may well see as) inconsistencies in the
initialization of basic types. You mentioned that you recently got some good
feedback on D from the NWCUG - I wonder what they would think of initializing
with nan vs. 0 for integral types?

I know that internal to the machine, integral and fp types are worlds apart. But
a 'typical' developer uses floating point because they need precision past the
decimal point, but beyond that expects integrals and floats to pretty much act
the same.

IMO, the typical developer writing the typical program using some fp will end up
writing most of their code like this:

double foo(...)
{
    int i,j,k;
    double sum, dx, dy;
    //...
    for(...)
    {
        //...
        sum += dx + dy;
        //...
    }
    return sum;
}

[compile, run, oh shit]

double foo(...)
{
    int i,j,k;
    double sum = 0;
    //...
    for(...)
    {
        double dx = ..., dy = ...;
        //...
        sum += dx + dy;
        //...
    }
    return sum;
}

I know FORTRAN doesn't deal with nans. That's unsurprising, since FORTRAN 
had been around for 25 years before nans were invented.

Yes, but even Fortran 95 doesn't do nan initialization - why?

There is a 'nan' value for pointers - null, a 'nan' value for UTF-8 chars - 
0xFF - which is an illegal UTF-8 character. If there was a 'nan' value for 
ints, D would use it as the default, too. 

As I mentioned previously, I wouldn't care about fp nans then because it would
be consistent for both integral and fp math operations (as seen from a 'typical'
developer's perspective - they'd be used to initializing everything, integral and
fp alike).

Feb 19 2006

Ivan Senji <ivan.senji_REMOVE_ _THIS__gmail.com> writes:

Dave wrote:
 In article <dt98cn$1ord$4 digitaldaemon.com>, Walter Bright says...
 
"Dave" <Dave_member pathlink.com> wrote in message 
news:dt8o24$djh$1 digitaldaemon.com...


initializing fp types to 0 - I never hear anyone complaining "if doubles 
were initialized to nans instead of 0, I wouldn't have all these bugs in 
my arithmetic".

I've also heard from people who do serious numerical work that, at last, D 
is a language that cares about numerical analysis and its needs. Default 
initializing to nan is part of that - it forces the user to *think* about 
what he wants the initial value to be. Initializing it by default to 0 means 
that it can easily be overlooked, and 0.0 can introduce undetected, subtle
errors in the result.

 
 
 Excepting nan init., I'm all for the great support that D gives to numerical
 work, but not at the expense of the great majority of developers who don't do a
 lot of "serious" numerical work day-in and day-out. Great numerics support
won't
 mean anything if the general developer community does not pick up the language
 because they don't like (what they may well see as) inconsistencies in the
 initialization of basic types. You mentioned that you recently got some good
 feedback on D from the NWCUG - I wonder what they would think of initializing
 with nan vs. 0 for integral types?
 
 I know that internal to the machine, integral and fp types are worlds apart.
But
 a 'typical' developer uses floating point because they need precision past the
 decimal point, but beyond that expects integrals and floats to pretty much act
 the same.
 
 IMO, the typical developer writing the typical program using some fp will end
up
 writing most of their code like this:
 
 double foo(...)
 {
 int i,j,k;
 double sum, dx, dy;
 //...
 for(...)
 {
 //...
 sum += dx + dy;

or maybe sum *= dx + dy;

In that case 0 is as bad as anything else. At least with nan, a developer
may decide to print sum to console and find out that it is nan. But what
if the formula is
sum = sum*dx + dy; -> this will cause subtle errors if you expected sum
to be 1 for example. You will get some numbers as a result, but they
are wrong even though you think they are all right, and the next thing
you know your D-driven spaceship is falling onto Mars too fast and you
are losing a lot of money and thinking: if only that sum was initialized
to nan! :)
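
A worked version of that formula, as a sketch. The helper accumulate and the input numbers are invented for illustration; the starting value is passed in explicitly so the three cases can be compared side by side.

```d
import std.math : isNaN;

// Hypothetical helper: applies sum = sum*dx + dy over the inputs from a given start.
double accumulate(double start, const double[] dx, const double[] dy)
{
    double sum = start;
    foreach (i; 0 .. dx.length)
        sum = sum * dx[i] + dy[i];
    return sum;
}

void main()
{
    const double[] dx = [2.0, 3.0];
    const double[] dy = [1.0, 1.0];

    double intended  = accumulate(1.0, dx, dy);         // (1*2 + 1)*3 + 1 = 10
    double zeroStart = accumulate(0.0, dx, dy);         // (0*2 + 1)*3 + 1 = 4
    double nanStart  = accumulate(double.nan, dx, dy);  // nan all the way through

    assert(intended == 10.0);
    assert(zeroStart == 4.0);    // plausible-looking, but not the intended 10
    assert(isNaN(nanStart));     // obviously wrong as soon as it is inspected
}
```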

PS
I was against float.init==nan at first but got used to it, and realized 
that it is a good thing.

The policy of D is: initialize things to error states if possible. That
unfortunately is impossible for ints but is possible for floats and chars.

Feb 19 2006

Anders F Björklund <afb algonet.se> writes:

Ivan Senji wrote:

 In that case 0 is as bad as anything else. At least with nan, a developer
 may decide to print sum to console and find out that it is nan. But what
 if the formula is
 sum = sum*dx + dy; -> this will cause subtle errors if you expected sum
 to be 1 for example. You will get some numbers as a result, but they
 are wrong even though you think they are all right, and the next thing
 you know your D-driven spaceship is falling onto Mars too fast and you
 are losing a lot of money and thinking: if only that sum was initialized
 to nan! :)

Then again, it's usually better to catch those bugs at compile time ?

Like in Java: (jikes compiler)
*** Semantic Error: The variable "sum" may be accessed here before 
having been definitely assigned a value.

Sometimes I think D has too many runtime errors, waiting to happen...

--anders

Feb 19 2006

Ivan Senji <ivan.senji_REMOVE_ _THIS__gmail.com> writes:

Anders F Björklund wrote:
 Ivan Senji wrote:
 
 In that case 0 is as bad as anything else. At least with nan, a
 developer may decide to print sum to console and find out that it is
 nan. But what if the formula is
 sum = sum*dx + dy; -> this will cause subtle errors if you expected
 sum to be 1 for example. You will get some numbers as a result, but
 they are wrong even though you think they are all right, and the next
 thing you know your D-driven spaceship is falling onto Mars too fast
 and you are losing a lot of money and thinking: if only that sum was
 initialized to nan! :)

 
 
 Then again, it's usually better to catch those bugs at compile time ?
 


Compile time would be best. C# compiler does the same thing, and
although it can be a little irritating it is useful. Maybe this could
be a warning? But it isn't that easy to detect.

Feb 19 2006

Anders F Björklund <afb algonet.se> writes:

Ivan Senji wrote:

 Then again, it's usually better to catch those bugs at compile time ?

 

 Although it can be a little irritating, it is useful. Maybe this could
 be a warning? But it isn't that easy to detect.

D doesn't do warnings, so run-time errors seem to be preferred.

--anders

Feb 19 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Anders F Björklund" <afb algonet.se> wrote in message
news:dtaf3q$2qlp$1 digitaldaemon.com...
 Then again, it's usually better to catch those bugs at compile time ?

 Like in Java: (jikes compiler)
 *** Semantic Error: The variable "sum" may be accessed here before having 
 been definitely assigned a value.

"may" be? That's why it isn't in D. Wishy-washy messages aren't a solution. 
Furthermore, they encourage programmers to introduce dead assignments to get 
rid of the message, leaving the mysterious assignment to confuse the 
maintenance programmer trying to figure out what the code does.

Good, clean code should have every statement be reachable, and every 
assignment mean something. Having the compiler force you to insert 
meaningless assignments and unreachable code is just not helpful.

Feb 19 2006

Anders F Björklund <afb algonet.se> writes:

Walter Bright wrote:

Like in Java: (jikes compiler)
*** Semantic Error: The variable "sum" may be accessed here before having 
been definitely assigned a value.

 
 "may" be? That's why it isn't in D. Wishy-washy messages aren't a solution. 

Jikes is always so polite about it, other compilers are more terse:
* javac: "variable sum might not have been initialized"
* gcc: "warning: `sum' might be used uninitialized in this function"

But you are right, as it's leaving the possibility that it is wrong...
(it's an error in Java, and an optional -Wuninitialized warning in GCC)

 Furthermore, they encourage programmers to introduce dead assignments to get 
 rid of the message, leaving the mysterious assignment to confuse the 
 maintenance programmer trying to figure out what the code does.

Yes, sometimes workarounds like that are needed to "silence" GCC...

The warning also only occurs during optimization/register candidates:
http://gcc.gnu.org/onlinedocs/gcc-3.3.6/gcc/Warning-Options.html#index-Wuninitialized-214

There are also special attributes and preprocessor tricks to "help" it.

 Good, clean code should have every statement be reachable, and every 
 assignment mean something. Having the compiler force you to insert 
 meaningless assignments and unreachable code is just not helpful. 

You're only "forced" if you want it to compile without warnings, though.

Having it lintfree/warningless takes some initial effort, no doubt about
that. But after that it is usually "effortless", and helps catch bugs...

I know what you think about warnings :-), so it is probably not for D.
(Having Phobos compile with the -w flag takes the same amount of workarounds)


The only thing that has me confused, is the D overview says that D is 
for people who like using lint and compile with all warnings as errors.

And activities such as the above "not helpful" ones are what I associate
with those two extra code analysis passes. Maybe I'm stuck in the past ?

--anders

Feb 19 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Anders F Björklund" <afb algonet.se> wrote in message
news:dtbs2u$15dl$1 digitaldaemon.com...
 The only thing that has me confused, is the D overview says that D is for 
 people who like using lint and compile with all warnings as errors.

No two compilers emit the same warnings. So a general statement about 
warnings does not mean I consider a particular warning to be worth having. 
DMC doesn't emit any wishy-washy warnings.

Feb 20 2006

Anders F Björklund <afb algonet.se> writes:

Walter Bright wrote:

 DMC doesn't emit any wishy-washy warnings. 

Haven't used DMC enough, I'm afraid, just GCC...
(something of a side effect from not using Win)

Nothing personal :-)
--anders


PS.
Doing some Windows now, so I will play with it.
(probably not enough to purchase the CD, though)

Feb 20 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Ivan Senji" <ivan.senji_REMOVE_ _THIS__gmail.com> wrote in message 
news:dtadui$2pt4$1 digitaldaemon.com...
 The policy of D is: initialize things to error states if possible. That 
 unfortunately is impossible for ints, but it is possible for floats and chars.

That's right. The default initialization is *not* about being convenient or 
a shorthand. It's about being an aid to writing bug free code.

If there was a nan value for ints, that would be the default initialization 
for that, too. I'd love it if you could set a bit for a memory address that 
is cleared when the address is written to, and generates a hardware fault if 
it is read with that bit set. But there is no such thing, and nan is the 
best we can do otherwise.
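
As a minimal sketch of the defaults described here, assuming a current D compiler with Phobos (`isNaN` is today's spelling in `std.math`, and the helper name `defaultsAreErrorStates` is made up for this illustration):

```d
import std.math : isNaN;

// Hypothetical helper: returns true when the built-in defaults follow
// the "error state where one exists" policy discussed in this thread.
bool defaultsAreErrorStates()
{
    double d;  // no initializer: starts as double.init, which is NaN
    char c;    // char.init is 0xFF, which is not a valid UTF-8 code unit
    int* p;    // pointers start out as null
    int i;     // int has no error value, so int.init is plain 0

    return isNaN(d) && c == 0xFF && p is null && i == 0;
}

void main()
{
    assert(defaultsAreErrorStates());
}
```

Only the `int` case has no way to flag a forgotten initializer, which is the gap being lamented above.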

Feb 19 2006

Anders F Björklund <afb algonet.se> writes:

Walter Bright wrote:

 That's right. The default initialization is *not* about being convenient or 
 a shorthand. It's about being an aid to writing bug free code.

So it's an error to use ints before they're initialized ?

I thought it was "OK" to assume they all started at zero...
You know, like in: http://www.digitalmars.com/d/wc.html ;-)

 If there was a nan value for ints, that would be the default initialization 
 for that, too. I'd love it if you could set a bit for a memory address that 
 is cleared when the address is written to, and generates a hardware fault if 
 it is read with that bit set. But there is no such thing, and nan is the 
 best we can do otherwise. 

So if the int.init is ever changed to something "nan"-ish, like
-1 or 0xDEADBEEF or something, it could stop working later on ?

Guess this means to start using an "= 0;" explicit init value.

--anders

Feb 19 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Anders F Björklund" <afb algonet.se> wrote in message
news:dtah5j$2snt$1 digitaldaemon.com...
 Walter Bright wrote:
 That's right. The default initialization is *not* about being convenient 
 or a shorthand. It's about being an aid to writing bug free code.

 So it's an error to use ints before they're initialized ?
 I thought it was "OK" to assume they all started at zero...
 You know, like in: http://www.digitalmars.com/d/wc.html ;-)

Sometimes I get lazy. :-) I'll say it's poor style, and yes, I'm guilty of 
it.

 If there was a nan value for ints, that would be the default 
 initialization for that, too. I'd love it if you could set a bit for a 
 memory address that is cleared when the address is written to, and 
 generates a hardware fault if it is read with that bit set. But there is 
 no such thing, and nan is the best we can do otherwise.

 So if the int.init is ever changed to something "nan"-ish, like
 -1 or 0xDEADBEEF or something, it could stop working later on ?

It isn't going to change, for the pragmatic reason it'll break too much 
code. There's too much water under that bridge.

 Guess this means to start using an "= 0;" explicit init value.

I think it's good style to let the maintainer know that one intended it to 
be 0.

Feb 19 2006

Anders F Björklund <afb algonet.se> writes:

Walter Bright wrote:

I thought it was "OK" to assume they all started at zero...
You know, like in: http://www.digitalmars.com/d/wc.html ;-)

 
 Sometimes I get lazy. :-) I'll say it's poor style, and yes, I'm guilty of 
 it.

As long as it is clear... (In some other languages, it's stylish)
But in D, the .init values are a debugging aid and not a shortcut.

Guess this means to start using an "= 0;" explicit init value.

 
 I think it's good style to let the maintainer know that one intended it to 
 be 0. 

Yeah, I usually do this anyway - just to avoid warnings in C:
"warning: `sum' might be used uninitialized in this function"

--anders

Feb 19 2006

"Derek Parnell" <derek psych.ward> writes:

I support the use of NaN as an initializer. I only wish it could also be  
done for all numerics.

I work with a language (Euphoria) in which all variables are set to  
uninitialized and it doesn't allow you to use an uninitialized variable.  
And it doesn't have syntax to initialize them at declaration time. This
strict enforcement has made me appreciate the importance of explicitly
initializing variables to improve readability and remove potential bugs.

(I just wish Euphoria would allow one to initialize at declaration time  
though <g>)

-- 
Derek Parnell
Melbourne, Australia

Feb 20 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Derek Parnell" <derek psych.ward> wrote in message 
news:op.s5ahr0v26b8z09 ginger.vic.bigpond.net.au...
 I work with a language (Euphoria) in which all variables are set to 
 uninitialized and it doesn't allow you to use an uninitialized variable.

That appears to implement in software my idea that memory should have a 
parallel set of bits saying whether each location is initialized or not. The 
trouble with doing it in software is it about halves the execution speed.

Feb 20 2006

Sean Kelly <sean f4.ca> writes:

Walter Bright wrote:
 "Derek Parnell" <derek psych.ward> wrote in message 
 news:op.s5ahr0v26b8z09 ginger.vic.bigpond.net.au...
 I work with a language (Euphoria) in which all variables are set to 
 uninitialized and it doesn't allow you to use an uninitialized variable.

 
 That appears to implement in software my idea that memory should have a 
 parallel set of bits saying whether each location is initialized or not. The 
 trouble with doing it in software is it about halves the execution speed. 

Could make it an optional/debug feature, but it might be a lot of work 
for something that a code analyzer could possibly accomplish as well?


Sean

Feb 20 2006

Derek Parnell <derek psych.ward> writes:

On Mon, 20 Feb 2006 19:39:50 -0800, Walter Bright wrote:

 "Derek Parnell" <derek psych.ward> wrote in message 
 news:op.s5ahr0v26b8z09 ginger.vic.bigpond.net.au...
 I work with a language (Euphoria) in which all variables are set to 
 uninitialized and it doesn't allow you to use an uninitialized variable.

 
 That appears to implement in software my idea that memory should have a 
 parallel set of bits saying whether each location is initialized or not. The 
 trouble with doing it in software is it about halves the execution speed.

Yes, I know that, and I wasn't suggesting that D adopt this strategy.
Euphoria is an interpreter so it's not trying to be as fast as a compiled
program, and thus the extra management overheads for this facility are not
a problem for it. I was just saying that my experience with working with
variables that must be explicitly initialized has been beneficial, and so I
was attempting to support NaN for floating point, etc...

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
21/02/2006 4:26:11 PM

Feb 20 2006

"Walter Bright" <newshound digitalmars.com> writes:

"Derek Parnell" <derek psych.ward> wrote in message 
news:13ffimd0poz2u$.i1ldoex0oqcq$.dlg 40tude.net...
 Yes, I know that, and I wasn't suggesting that D adopt this strategy.
 Euphoria is an interpreter so it's not trying to be as fast as a compiled
 program, and thus the extra management overheads for this facility are not
 a problem for it. I was just saying that my experience with working with
 variables that must be explicitly initialized has been beneficial, and so I
 was attempting to support NaN for floating point, etc...

Thanks for clarifying this.

Feb 20 2006

John Stoneham <captnjameskirk moc.oohay> writes:

Walter Bright wrote:
 I've also heard from people who do serious numerical work that, at last, D 
 is a language that cares about numerical analysis and its needs. Default 
 initializing to nan is part of that - it forces the user to *think* about 
 what he wants the initial value to be. Initializing it by default to 0 means 
 that it can easily be overlooked, and 0.0 can introduce undetected, subtle
 errors in the result.
 

I agree. I'm currently working on an involved combinatorial calculation, 
and having one of the doubles auto-initialized to NAN helped me find a bug
in one of the calculations which would have been very difficult to find 
otherwise.

I say keep it.
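
A minimal sketch of how that plays out, assuming a current D compiler with Phobos. The function `iterate` and the step values are made up for illustration; the recurrence is the `sum = sum*dx + dy` formula quoted earlier in the thread:

```d
import std.math : isNaN;

// Hypothetical recurrence from earlier in the thread: sum = sum*dx + dy.
double iterate(double sum, double dx, double dy, int steps)
{
    foreach (_; 0 .. steps)
        sum = sum * dx + dy;
    return sum;
}

void main()
{
    double forgotten;       // never assigned: double.init, i.e. NaN
    double intended = 1.0;  // the starting value the author meant to use

    // NaN survives every step, so the missing initializer shows up
    // in the final result.
    assert(isNaN(iterate(forgotten, 0.5, 2.0, 3)));

    // A zero default would yield a plausible-looking number that
    // differs from the intended one.
    assert(iterate(0.0, 0.5, 2.0, 3) != iterate(intended, 0.5, 2.0, 3));
}
```

With a zero default the result is a finite number that merely happens to be wrong, which is much harder to notice than a NaN.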


 There is a 'nan' value for pointers - null, a 'nan' value for UTF-8 chars - 
 0xFF - which is an illegal UTF-8 character. If there was a 'nan' value for 
 ints, D would use it as the default, too. 
 
 

There *is* a way to get this behavior, and it can be done at compile time:
raise an error when an int is assigned an initial value which cannot be 
calculated at compile time. This behavior could even be turned on with a 
command-line switch, -nan, or whatever.

For example:

int x = 1;
int y, z;
// initialization:
y = 7; // this is obviously ok
y = x; // this would also be ok
z = abs(y); // this would raise an error, requires runtime evaluation

Or, even better, raise an error when the initialization value isn't a 
numeric literal. This would probably be even more consistent with the 
floating-point NAN behavior.

Feb 19 2006

Sean Kelly <sean f4.ca> writes:

John Stoneham wrote:
 Walter Bright wrote:
 I've also heard from people who do serious numerical work that, at 
 last, D is a language that cares about numerical analysis and its 
 needs. Default initializing to nan is part of that - it forces the 
 user to *think* about what he wants the initial value to be. 
 Initializing it by default to 0 means that it can easily be
 overlooked, and 0.0 can introduce undetected, subtle errors in the 
 result.

 
 I agree. I'm currently working on an involved combinatorial calculation, 
 and having one of the doubles auto-initialized to NAN helped me find a bug
 in one of the calculations which would have been very difficult to find 
 otherwise.
 
 I say keep it.
 
 
 There is a 'nan' value for pointers - null, a 'nan' value for UTF-8 
 chars - 0xFF - which is an illegal UTF-8 character. If there was a 
 'nan' value for ints, D would use it as the default, too.

 
 There *is* a way to get this behavior, and it can be done at compile time:
 raise an error when an int is assigned an initial value which cannot be 
 calculated at compile time. This behavior could even be turned on with a 
 command-line switch, -nan, or whatever.

This would be nice.


Sean

Feb 19 2006

Sean Kelly <sean f4.ca> writes:

Sean Kelly wrote:
 John Stoneham wrote:
 Walter Bright wrote:
 I've also heard from people who do serious numerical work that, at 
 last, D is a language that cares about numerical analysis and its 
 needs. Default initializing to nan is part of that - it forces the 
 user to *think* about what he wants the initial value to be. 
 Initializing it by default to 0 means that it can easily be
 overlooked, and 0.0 can introduce undetected, subtle errors in the 
 result.

 I agree. I'm currently working on an involved combinatorial 
 calculation, and having one of the doubles auto-initialized to NAN 
 helped me find a bug in one of the calculations which would have been
 very difficult to find otherwise.

 I say keep it.


 There is a 'nan' value for pointers - null, a 'nan' value for UTF-8 
 chars - 0xFF - which is an illegal UTF-8 character. If there was a 
 'nan' value for ints, D would use it as the default, too.

 There *is* a way to get this behavior, and it can be done at compile
 time: raise an error when an int is assigned an initial value which 
 cannot be calculated at compile time. This behavior could even be 
 turned on with a command-line switch, -nan, or whatever.

 
 This would be nice.

I take it back:

struct S { int i; }


Sean

Feb 19 2006

Sean Kelly <sean f4.ca> writes:

Sean Kelly wrote:
 Sean Kelly wrote:
 John Stoneham wrote:
 Walter Bright wrote:
 I've also heard from people who do serious numerical work that, at 
 last, D is a language that cares about numerical analysis and its 
 needs. Default initializing to nan is part of that - it forces the 
 user to *think* about what he wants the initial value to be. 
 Initializing it by default to 0 means that it can easily be
 overlooked, and 0.0 can introduce undetected, subtle errors in the 
 result.

 I agree. I'm currently working on an involved combinatorial 
 calculation, and having one of the doubles auto-initialized to NAN 
 helped me find a bug in one of the calculations which would have been
 very difficult to find otherwise.

 I say keep it.


 There is a 'nan' value for pointers - null, a 'nan' value for UTF-8 
 chars - 0xFF - which is an illegal UTF-8 character. If there was a 
 'nan' value for ints, D would use it as the default, too.

 There *is* a way to get this behavior, and it can be done at compile
 time: raise an error when an int is assigned an initial value which 
 cannot be calculated at compile time. This behavior could even be 
 turned on with a command-line switch, -nan, or whatever.

 This would be nice.

 
 I take it back:
 
 struct S { int i; }

Actually, does this work?

struct S { int i = 5; }

Not quite as flexible as a ctor, but it would be sufficient for the above
suggestion.
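
For what it's worth, here is a minimal sketch of that struct, assuming a current D compiler with Phobos (the extra `double` field is added only for contrast and is not part of the question above):

```d
import std.math : isNaN;

struct S
{
    int i = 5;   // per-field default initializer
    double d;    // no initializer given: falls back to double.init (NaN)
}

void main()
{
    S s;                     // default-initialized from S.init
    assert(s.i == 5);
    assert(S.init.i == 5);   // the field default is part of S.init
    assert(isNaN(s.d));
}
```

The field initializer has to be a compile-time constant, which is why it is less flexible than a constructor.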

Sean

Feb 19 2006

"Ben Hinkle" <ben.hinkle gmail.com> writes:

"Dave" <Dave_member pathlink.com> wrote in message 
news:dt85ll$2vis$1 digitaldaemon.com...
 Rationale: nan will propagate through an entire calculation if the developer
 forgets to initialize a floating point variable.

[snip]
 Maybe I'm wrong.. Opinions?

 Thanks,

 - Dave

I'm with you that nan initializer is annoying. The archives have some 
discussions on the topic - with Walter's replies. My own approach would be
to use a 0 initializer, toss 'auto' and introduce the := operator that came
up recently and that I talk about every now and then. The benefit of := is
that *you* supply the initialization, so worrying about the default
initializer becomes much less common. For example, instead of
 double x;
and wondering what the initial value is (if any) one writes
 x := 0.0;
and you're done. Multiple variables are declared-and-inited using
 x := y := z := 0.0;

-Ben

ps. I hope people aren't annoyed by this plug but := is in Cx 
http://www.cxlang.org

Feb 19 2006
