digitalmars.D - Null references redux

Walter Bright (23/30) Sep 26 2009 I think he's wrong.

Andrei Alexandrescu (4/48) Sep 26 2009 My assessment: the chances of convincing Walter he's wrong are quite

Jeremie Pelletier (9/59) Sep 26 2009 I actually side with Walter here. I much prefer my programs to crash on

Andrei Alexandrescu (10/69) Sep 26 2009 But that's a false choice. You don't choose between a crashing program

Walter Bright (15/27) Sep 26 2009 If there was a reasonable way of having NaN values for ints, D would use...

language_fan (20/23) Sep 26 2009 Well typically if your type system supports algebraic types, you can

Walter Bright (6/38) Sep 26 2009 I don't see the improvement.

Denis Koroskin (13/16) Sep 26 2009 Oh, my! You don't even know what a non-null default is!

Walter Bright (30/50) Sep 26 2009 It's the black hole object. It prevents you from getting a seg fault,

Denis Koroskin (35/85) Sep 26 2009 I'm sorry but I can not continue discussion with you like that. You are ...
Jeremie Pelletier (11/75) Sep 26 2009 Haha that's a nice analogy, I myself was just unable to speak. I guess

Jarrett Billingsley (9/16) Sep 26 2009 You're missing the point. You wouldn't have "undefined behavior at

Jeremie Pelletier (8/29) Sep 26 2009 I don't want the language to force me to check nullable references

Jarrett Billingsley (15/23) Sep 26 2009 You don't design tight loops that dereference pointers with the

Christopher Wright (10/19) Sep 26 2009 This is not the proposal. The proposal was to codify in the type system
language_fan (11/15) Sep 26 2009 Basically if there is only one way the system can operate correctly, you...

Walter Bright (8/13) Sep 26 2009 I certainly agree that catching errors at compile time is preferable by

Jason House (7/22) Sep 26 2009 If you argued any cases other than there's no good default initializatio...

Walter Bright (44/61) Sep 27 2009 Well, we can't discuss this if we cannot agree on terms. The

Manfred_Nowak (5/7) Sep 27 2009 Except this sentence I applaud every thought.
Nick Sabalausky (20/32) Sep 27 2009 He keeps saying "safe", and every time he does you turn it into "memory

bearophile (4/9) Sep 27 2009 Likewise, I think that the name of SafeD modules is misleading, they are...
Lutger (14/24) Sep 27 2009 Not that I have an opinion on this either way, but if I understand Walte...

Jason House (7/13) Sep 27 2009 In reality, the issue becomes what will programmers do to bypass compile...
BCS (4/8) Sep 27 2009 If you can't trust the programmer to write good code, replace them with ...

Lutger (16/26) Sep 27 2009 Hi. I don't think this argument will work, for several reasons:

BCS (7/27) Sep 27 2009 Incompetent? No. But I wouldn't want to hire a programmer that *habituall...

Manfred_Nowak (10/12) Sep 28 2009 In the short time of an interview it's not possible to test for habits (o...

BCS (6/15) Sep 28 2009 Good point, I guess that all that is left would be to try and get a feel...

language_fan (9/26) Sep 28 2009 At least in the companies I have worked in they briefly teach you their

bearophile (5/8) Sep 27 2009 This is the nicest thing I've read this week. Thank you very much :-)

Jesse Phillips (10/28) Sep 27 2009 The thing is that memory safety is the only safety with code. In Walter'...

language_fan (11/21) Sep 27 2009 Have you ever used functional languages? When you develop in Haskell or

Jesse Phillips (3/16) Sep 28 2009 So isn't that the question? Does/can "default" (by human or machine) ini...

Steven Schveighoffer (46/56) Sep 28 2009 It creates an invalid, non-compiling program.

Jesse Phillips (17/34) Sep 28 2009 No it doesn't, I'm not referring to null as the invalid state.

Steven Schveighoffer (32/66) Sep 29 2009 I am not arguing for floats (or any value types) to be required to be

language_fan (21/46) Sep 28 2009 Value types can be incorrectly initialized and nobody notices. E.g.

language_fan (9/21) Sep 28 2009 For instance if I use the example given above, I write it like this in a...

Nick Sabalausky (27/48) Sep 28 2009 I'm not particularly accustomed to that sort of syntax. Am I correct in m...

language_fan (10/68) Sep 28 2009 Well to be honest, I thought I knew how to read D, but this is starting

Adam Burton (104/141) Sep 28 2009 I don't know if what I am about to rant about has already been discussed...
Andrei Alexandrescu (3/36) Sep 28 2009 You mean int.max :o).

Jeremie Pelletier (4/42) Sep 28 2009 He just proved how enforcing initializers can still cause errors! I
Derek Parnell (39/49) Sep 28 2009 if (list.length == 0)

Jeremie Pelletier (30/80) Sep 29 2009 But it doesn't have to follow the paranoid safety paradigm either. I

Rainer Deyke (5/10) Sep 29 2009 This only catches null errors at runtime. The whole point of a non-null

Jeremie Pelletier (8/18) Sep 29 2009 That's what flow analysis is for, since these are mostly uninitialized

bearophile (6/12) Sep 29 2009 I agree, but I think in a well designed system such situations are reall...
Rainer Deyke (12/19) Sep 29 2009 Nitpick: there are no uninitialized variables in D (unless you

bearophile (5/6) Sep 27 2009 Nope. For example in Delphi and C# you can have a runtime integer overfl...
Rainer Deyke (14/15) Sep 27 2009 That is such bullshit. For example, this:

Jesse Phillips (3/6) Sep 28 2009 I think that is what Walter is getting at, you're not dealing with memor...

Rainer Deyke (13/18) Sep 28 2009 Type errors and null pointer errors both belong to the same class of

BCS (14/17) Sep 27 2009 This whole thread is NOT about what to do on unknown states. It is about...

Walter Bright (10/22) Sep 27 2009 Nick Sabalausky wrote:

downs (7/38) Sep 27 2009 Okay, I'm gonna have to call you out on this one because it's simply inc...

Andrei Alexandrescu (3/45) Sep 27 2009 How did Jeremie do that?

Jeremie Pelletier (13/65) Sep 27 2009 A signal handler with the undocumented kernel parameters attaches the

Yigal Chripun (3/70) Sep 27 2009 Is this Linux specific? what about other *nix systems, like BSD and

Jeremie Pelletier (4/80) Sep 27 2009 Signal handlers are standard to most *nix platforms since they're part of...

Andrei Alexandrescu (28/34) Sep 27 2009 Let me write a message on behalf of Sean Kelly. He wrote that to Walter

Jeremie Pelletier (7/49) Sep 27 2009 Yes but the segfault signal handler is not made to design code that can

Denis Koroskin (5/46) Sep 28 2009 Isn't this reason alone strong enough to encourage use of non-null
Sean Kelly (19/63) Sep 29 2009 I don't think it's fair to compare Windows to Unix here because, as far ...

Sean Kelly (7/15) Sep 29 2009 I was right, it is illegal to throw an exception from a signal handler. ...

Jeremie Pelletier (11/27) Sep 29 2009 Weird, it works just fine for me. Maybe it's because the exception is

Sean Kelly (33/60) Sep 29 2009 I think in practice, the issue is simply that malloc and IO routines are...

Jeremie Pelletier (17/81) Sep 29 2009 I agree, I don't mind occasional crashes within the crash handler itself...

Jeremie Pelletier (21/85) Sep 29 2009 I haven't had any problems so far, the stack trace generated was always

downs (2/73) Sep 27 2009 Woah, nice. I stand corrected. Is this in druntime already?

Jeremie Pelletier (4/75) Sep 27 2009 Not yet, it's part of a custom runtime I'm working on and wish to release...

grauzone (3/82) Sep 27 2009 Some of this functionality is also in Tango (SVN version). Signals are

Leandro Lucarella (14/22) Sep 27 2009 I think this is a very bad idea. When the program receive a segfault

BCS (2/5) Sep 27 2009 Last I checked, throwing from a signal handler works on linux.

language_fan (10/13) Sep 27 2009 What I mean by safe is that no matter what you do, you cannot make the

Lionello Lunesu (13/26) Sep 27 2009 // t.d

Max Samukha (4/38) Sep 28 2009 That is a strong argument. If an object is big enough, modifying it via ...

Jeremie Pelletier (6/44) Sep 28 2009 How is that corruption? These pointers were purposely set to 0x00000002,...

Lionello Lunesu (10/54) Sep 28 2009 Uh? What pointer is being set to 0x00000002?

Andrei Alexandrescu (5/8) Sep 26 2009 This is the mistake. There would be no way to default initialize a non-null...

Walter Bright (19/27) Sep 26 2009 Sure, so the user just provides "0" as the argument to the non-default

Andrei Alexandrescu (29/65) Sep 26 2009 The problem is you keep on insisting on one case "I have a non-null

Tom S (6/33) Sep 26 2009 Quoted for truth.
bearophile (7/12) Sep 26 2009 Thank you Andrei for your good efforts in trying to add some light on th...

BCS (7/11) Sep 26 2009 They don't have a default. Their semantics would be such that the compil...

Jarrett Billingsley (6/13) Sep 26 2009 There is NO RUNTIME OVERHEAD in implementing nonnull reference types.

Jeremie Pelletier (16/33) Sep 26 2009 How would you do this then?

Denis Koroskin (29/56) Sep 26 2009 Let's consider the following example, first:

language_fan (10/26) Sep 26 2009 Having a functional switch() helps a lot. I write code like this every

Jarrett Billingsley (13/28) Sep 26 2009 Either use Object? (a nullable reference), or factor out the object

bearophile (18/35) Sep 26 2009 Using a separate function to initialize an nonnull reference is a possib...

language_fan (5/14) Sep 26 2009 I just LOVE to see questions like these ;) You still have SO much to
Yigal Chripun (28/64) Sep 26 2009 with current D syntax this can be implemented as:

language_fan (5/8) Sep 26 2009 Indeed, especially since in the case of D half of the userbase has a
Jeremie Pelletier (27/102) Sep 26 2009 This is something for the runtime or the debugger to deal with. My

Jarrett Billingsley (6/18) Sep 26 2009 You haven't read my reply to your post yet, have you.
bearophile (17/32) Sep 26 2009 That's life.
Daniel Keep (38/68) Sep 26 2009 See my long explanation that NPEs are only symptoms; very rarely do they

Jeremie Pelletier (34/117) Sep 26 2009 Happens to me on some issues too, I don't ask for a workaround in the

Christopher Wright (4/8) Sep 27 2009 You're complaining now because you'd try to cram 'null' down the throat

Walter Bright (5/7) Sep 26 2009 Initialize it to what?

grauzone (5/15) Sep 26 2009 You can allow a non-nullable reference to be null, just like you allow

Walter Bright (2/4) Sep 26 2009 See my reply to Denis Koroskin on that.

Jarrett Billingsley (6/14) Sep 26 2009 The point of using a nonnull type is that you *never expect it to be

Denis Koroskin (14/20) Sep 26 2009 What runtime overhead are you talking about here? Use of non-null pointe...

Walter Bright (7/10) Sep 26 2009 Should:

Denis Koroskin (29/39) Sep 26 2009 Functional languages don't distinguish between the two (reference or not...

Denis Koroskin (14/59) Sep 26 2009 One more:

Walter Bright (6/21) Sep 26 2009 It seems to me you've got null references there anyway?

Denis Koroskin (6/24) Sep 26 2009 Easy:

downs (8/40) Sep 27 2009 The case of a non-null array is, I think, worthy of some more considerat...

bearophile (5/8) Sep 27 2009 I agree.

Walter Bright (2/4) Sep 26 2009 Especially when I'm right!

grauzone (5/7) Sep 26 2009 On Linux, it just generates a segfault. And then you have no idea where

Walter Bright (4/13) Sep 26 2009 It's *still* far more useful than generating corrupt output and

grauzone (4/19) Sep 26 2009 Indeed. I was just commenting in how badly the current D implementation

Walter Bright (10/18) Sep 26 2009 It's implicit in the argument that some default should be used instead.

Andrei Alexandrescu (12/29) Sep 26 2009 I think you're starting to be wrong at the point where you don't realize...

language_fan (3/8) Sep 26 2009 Maybe Walter has not yet transitioned from the good olde Pascal/C style

Walter Bright (2/4) Sep 26 2009 Heh, there's still a Fortran influence in my code .

Jeremie Pelletier (16/21) Sep 26 2009 This may be a good time to ask about how these variables which can be

Walter Bright (3/21) Sep 27 2009 They are completely independent variables. One may get assigned to a

Jeremie Pelletier (3/26) Sep 27 2009 Ok, that's what I thought, so the good old C way of declaring variables

Rainer Deyke (6/12) Sep 27 2009 Strange how you can look at the evidence and arrive at exactly the wrong

Rainer Deyke (30/45) Sep 27 2009 OT, but declaring the variable at the top of the function increases

Walter Bright (5/37) Sep 27 2009 Not necessarily. The optimizer uses a technique called "live range

Rainer Deyke (18/36) Sep 27 2009 That's the optimization I was referring to. It works for ints, but not

bearophile (22/25) Sep 27 2009 LLVM has a good optimizer. If you try the LLVM demo on C code with LTO a...

Walter Bright (19/35) Sep 26 2009 The problem is it's worse to force people to provide an initializer.

Andrei Alexandrescu (4/17) Sep 26 2009 You're not forcing. You just change the default. Really, it's *exactly*
Christopher Wright (25/42) Sep 27 2009 You aren't forcing them. They decide for themselves. They determine

Yigal Chripun (13/26) Sep 26 2009 An exception trace is *far* better than a segfault and that does not

Walter Bright (2/4) Sep 26 2009 Seg faults are exceptions, too. You can even catch them (on windows)!

Jeremie Pelletier (9/14) Sep 26 2009 Walter, check the crash handler I submitted to D.announce, it has signal...

Andrei Alexandrescu (3/20) Sep 27 2009 I think that's great. Walter, Sean, please let's look into this.

Yigal Chripun (20/24) Sep 27 2009 No, segfaults are *NOT* exceptions. the setup you mention is windows

Denis Koroskin (10/44) Sep 26 2009 I don't understand you. You say you prefer 1, but describe the path D

Walter Bright (27/36) Sep 26 2009 d is initialized to the "invalid" unicode bit pattern of 0xFFFF. You'll

Denis Koroskin (14/41) Sep 26 2009 Change dchar to float or an int. It's still not initialized (well,

Jason House (5/28) Sep 26 2009 What do you define as a bug? Dereferencing a null pointer? Passing a nul...

Walter Bright (2/28) Sep 26 2009 The program doing something it was not deliberately programmed to do.

bearophile (6/20) Sep 26 2009 Today we think this design is not the best one, because the pilot sudden...

language_fan (8/12) Sep 26 2009 That is a really good suggestion. To me it seems that several known
Walter Bright (28/51) Sep 26 2009 I've never seen any suggestion that Boeing (or Airbus, or the FAA) has

Justin Johansson (11/78) Sep 26 2009 Walter, in the heat of this thread I hope you haven't missed the correla...

Walter Bright (4/6) Sep 26 2009 Thanks for pointing it out. The facilities in D enable one to construct

Justin Johansson (19/26) Sep 26 2009 What you just said made me think that much of this thread is talking at ...
Michel Fortin (13/16) Sep 26 2009 As far as I understand this thread, no one here is arguing that

Michel Fortin (36/51) Sep 26 2009 I just want to add: some people here are suggesting the compiler adds

Yigal Chripun (17/65) Sep 27 2009 If you refer to my posts then I want to clarify:
Andrei Alexandrescu (7/66) Sep 27 2009 I don't think this would fly. One good thing about nullable references

bearophile (4/9) Sep 27 2009 nonnullable references can also reduce the total amount of code a little...
Michel Fortin (17/80) Sep 27 2009 You want me to add wings? Please explain.

Andrei Alexandrescu (5/72) Sep 27 2009 I did explain. You suggest that we replace an automated, no-cost

Christopher Wright (13/29) Sep 27 2009 I dislike these forced checks.

Michel Fortin (17/31) Sep 27 2009 If the programmer knows a value isn't null, why not put the value in a

Jeremie Pelletier (11/44) Sep 27 2009 I much prefer explicit null checks than implicit ones I can't control.

Jarrett Billingsley (8/17) Sep 27 2009 Nonnull types do not create implicit null checks. Nonnull types DO NOT

Jeremie Pelletier (5/26) Sep 27 2009 Forcing checks on nullables is just as bad, not all nullables need to be...

Jarrett Billingsley (25/38) Sep 27 2009 You don't get it, do you. If you have a reference that doesn't need to

bearophile (6/9) Sep 27 2009 Asserts tend to vanish in release mode, so it may be better to use somet...

bearophile (4/8) Sep 27 2009 Are you willing to give your help to implement about 5-10% of this featu...

Jeremie Pelletier (5/16) Sep 27 2009 Sure, I would love to help implement flow analysis, I don't know enough

Andrei Alexandrescu (4/13) Sep 26 2009 Non-nullable references should be the default.

Jeremie Pelletier (16/33) Sep 26 2009 Like I said in another post of this thread, I believe the issue here is

Walter Bright (3/9) Sep 26 2009 The compiler, when -O is used, should remove nearly all the redundant

BCS (3/15) Sep 27 2009 Sweet, so you already have a bunch of the logic needed to check make sur...

Walter Bright (3/13) Sep 26 2009 Ack, I remember we talked about this, I guess I don't remember the

Andrei Alexandrescu (9/24) Sep 27 2009 The resolution was that the language will allow delete'ing the unwanted

Christopher Wright (6/22) Sep 27 2009 I looked into this slightly. You'd have to do mark non-nullable fields

Yigal Chripun (6/13) Sep 26 2009 No one was claiming that.

bearophile (5/5) Sep 26 2009 Walter, I can already see lot of confusion in this thread. Let's think w...
Daniel Keep (40/64) Sep 26 2009 *sigh* Walter, I really admire you as a programmer. But this is about

Walter Bright (25/46) Sep 26 2009 They do just that in Java because of the checked-exceptions thing. I

Ary Borenszweig (2/46) Sep 26 2009 Null pointer seg faults *not being able to happen* are much more safe. :...

Jeremie Pelletier (7/54) Sep 26 2009 There is no such thing as "not being able to happen" :)

Jarrett Billingsley (4/8) Sep 26 2009 Why the hell would the compiler allow that to begin with? Why bother

Jeremie Pelletier (6/18) Sep 26 2009 Because D is a practical language that lets the programmer do whatever he...

downs (2/24) Sep 27 2009 Sure, but if you set out to break it the compiler really can't (or shoul...

Tom S (9/68) Sep 26 2009 It's a systems programming language. You can screw up the type system if...
Ary Borenszweig (6/59) Sep 26 2009 Object is not-nullable, Object? (or whatever syntax you like) is

Jeremie Pelletier (17/80) Sep 26 2009 union A {

Ary Borenszweig (5/78) Sep 26 2009 Ah, nice one.
Christopher Wright (7/13) Sep 26 2009 It's a large improvement, but only for local variables. If your segfault...

Jeremie Pelletier (22/37) Sep 26 2009 But how would you enforce a nonnull type over an aggregate in the first

downs (5/52) Sep 27 2009 "Here are some cases you haven't mentioned yet. This proves that the com...

Jeremie Pelletier (12/64) Sep 27 2009 I allocate most structs on the gc, unless I need them only for the scope...

downs (6/81) Sep 27 2009 You're twisting my words.

Nick Sabalausky (12/23) Sep 28 2009 Unions are nothing more than an alternate syntax for a reinterpret cast....

Jeremie Pelletier (12/39) Sep 28 2009 Yet it's the only way I know of to do bitwise logic on floating points

Jari-Matti Mäkelä (7/16) Sep 28 2009 You could add built-in methods for those operations to the float type:

Jeremie Pelletier (9/28) Sep 28 2009 That would be so inefficient in some cases, you don't always want to

Jari-Matti Mäkelä (5/26) Sep 28 2009 It depends on the boolean representation. I see no reason why a built-in...

bearophile (4/8) Sep 28 2009 I agree. One of the best qualities of C++ is that it often allows the pr...

Yigal Chripun (18/57) Sep 28 2009 here's a type-safe alternative

Jeremie Pelletier (9/74) Sep 28 2009 Not always true.

bearophile (31/39) Sep 28 2009 I agree, I'm using D also because it offers unions. Sometimes they are u...

Christopher Wright (10/18) Sep 28 2009 Certainly agreed on virtual calls: on my machine, I timed a simple

Andrei Alexandrescu (6/34) Sep 28 2009 Thanks for posting these interesting numbers. I seem to recall that

bearophile (13/23) Sep 28 2009 The main problem of virtual calls in D are the missed inlining opportuni...

Michel Fortin (18/20) Sep 28 2009 If I recall correctly, implementing an interface adds a variable to an

Walter Bright (16/20) Sep 28 2009 No, it is done with one indirection.

bearophile (6/7) Sep 29 2009 If even Andrei, a quite intelligent person that has written big books on...

Jeremie Pelletier (24/36) Sep 29 2009 I agree, the ABI documentation on digitalmars.com is far from complete,
Walter Bright (9/12) Sep 29 2009 Not everyone is an expert on everything, and how vptrs and vtbl[]s and

Dejan Lekic (1/1) Oct 02 2009 Walter, is that article publicly available?

Don (2/3) Oct 02 2009 http://www.codeproject.com/KB/cpp/FastDelegate.aspx

Dejan Lekic (1/1) Oct 02 2009 Thanks Don! \o/

Saaa (5/16) Sep 30 2009 ?:)

Andrei Alexandrescu (5/26) Sep 30 2009 I wonder whether this would be a good topic for TDPL. Currently I'm

Jeremie Pelletier (7/36) Sep 30 2009 Maybe that's a topic for an appendix of the book. It is really useful to...
bearophile (6/9) Sep 30 2009 It's a very good topic for the book. Any good book about computer langua...
Saaa (6/10) Sep 30 2009 I'd really love to see more about implementations as it makes me twitch

Andrei Alexandrescu (6/19) Sep 30 2009 I do have the classic arrow drawings that illustrate how reference

Christopher Wright (37/41) Sep 29 2009 Such numbers are not interesting to me. On average, each class I write

Yigal Chripun (30/60) Sep 28 2009 what other use cases for unions exist that cannot be redesigned in a

bearophile (6/9) Sep 28 2009 Stalin accepts only a certain subset of Scheme, you can't use some of th...
Yigal Chripun (10/44) Sep 28 2009 I think you took my post to an extreme, I actually do agree with the

Jeremie Pelletier (8/72) Sep 29 2009 That wasn't what I said, I don't low level hand optimize everything, I

Yigal Chripun (5/7) Sep 29 2009 that is not what I said.

Jeremie Pelletier (18/28) Sep 30 2009 What's wrong with taking a risk? If you know what you're doing where is

Don (7/40) Sep 30 2009 Also, if you're using asm on something other than a small, simple loop,

Jeremie Pelletier (15/56) Sep 30 2009 That's also how I do it once I find the ideal algorithm, I've never had

language_fan (7/14) Sep 30 2009 Do you recommend writing larger algorithms like a hard real-time

Jeremie Pelletier (12/27) Sep 30 2009 Why does everyone associate complexity with assembly? You can write a

language_fan (9/40) Sep 30 2009 Well I meant that we can assume the algorithm choice is already optimal.

Jeremie Pelletier (8/49) Sep 30 2009 Yeah but I don't rate my code based on the number of lines I write, but

Don (10/25) Oct 01 2009 You deal with this by ensuring that you have a clear division between

Don (9/61) Oct 01 2009 By "riskier" I mean "more chance of containing an error".

Yigal Chripun (15/43) Sep 30 2009 When I said optimizing, I meant lowering the implementation level by

Jarrett Billingsley (5/10) Sep 26 2009 If you haven't crawled out from under your rock in the last twenty
bearophile (4/9) Sep 26 2009 There are some ways to reduce the number/probability of memory corruptio...
Daniel Keep (91/143) Sep 26 2009 Checked exceptions are a bad example: you can't not use them. No one is

Ary Borenszweig (2/32) Sep 26 2009 I like your analogies. :)

Jeremie Pelletier (7/42) Sep 26 2009 I also do, but try and picture a plane sophisticated to the point it can...

Ary Borenszweig (18/33) Sep 26 2009 Please, please, please, do some fun little project in Java or C# and

Ary Borenszweig (2/28) Sep 26 2009 I meant "spot"
bearophile (5/10) Sep 26 2009 Something similar happens in other fields too. I have had long discussio...
Jeremie Pelletier (6/45) Sep 26 2009 All null values are uninitialized, but not all initializers are null,

Ary Borenszweig (8/52) Sep 26 2009 I don't see your point here. "new Object()" is not a null intiializer

Jeremie Pelletier (12/67) Sep 26 2009 Nope, never got interested in these to tell the truth. I only did C,

language_fan (9/21) Sep 27 2009 So you only know imperative procedural programming + some features of

Jeremie Pelletier (17/40) Sep 27 2009 This is what I know best, yeah. I did a lot of work in functional

language_fan (24/46) Sep 27 2009 I must say I have not studied languages that much, only the concepts and...

Jeremie Pelletier (15/63) Sep 27 2009 I agree, Wikipedia is often the first source I check to learn on

Jason House (3/28) Sep 26 2009 Your example segfaults. A is null.
Chad J (77/78) Sep 26 2009 Admittedly I didn't read the whole thread. It is hueg liek xbox.
Steven Schveighoffer (89/102) Sep 27 2009 Analogies aside, we have 2 distinct problems here, with several solution...

bearophile (7/17) Sep 27 2009 To implement it well (and I think it has to be implemented well) it's no...

Yigal Chripun (12/38) Sep 27 2009 I don't accept this argument about nested if statements. D has a
Steven Schveighoffer (32/57) Sep 27 2009 I think you are referring to a combination of this solution and flow

bearophile (16/39) Sep 27 2009 I like to read many books, I have read about this in the chapter Cofee c...

BCS (10/19) Sep 27 2009 But this still assumes some degree of reliability of the code doing the ...

bearophile (5/7) Sep 27 2009 Fuzzy logic can also be "run" by hardware, fuzzy logic engine chips. Suc...

Don (17/35) Sep 29 2009 Let's go back a step. The problem being addressed is this: inadvertent

bearophile (7/12) Sep 29 2009 I like how you can see things a little more clearly than other people (l...

Jeremie Pelletier (14/30) Sep 29 2009 Which is what I said half a dozen times in this thread :)

bearophile (5/10) Sep 29 2009 You have probably missed them, but flow analysis in D was discussed a lo...

Jeremie Pelletier (4/20) Sep 29 2009 I'll try and hack at it in a few weeks when I get some free time. It's

bearophile (4/4) Sep 30 2009 If nonnull class references are added to D, then it can be good to add n...

Max Samukha (22/26) Sep 30 2009 Don't get confused by 'new' in struct initializers. Structs in C# are

Max Samukha (14/31) Sep 30 2009 Ok. I have rechecked this one and it appears that you don't have to
bearophile (5/7) Sep 30 2009 Yes, you are right.

Jarrett Billingsley (3/8) Sep 30 2009 I don't know why a struct pointer would be different than any other

Denis Koroskin (8/22) Sep 30 2009 Note that C stdlib (and other libraries/bindings) will need to be update...

Jarrett Billingsley (3/9) Sep 30 2009 Wonderful. Don't you love self-documenting code that forces you to use
Michel Fortin (21/29) Sep 30 2009 Which makes me think of this: pointers being non-nullable by default

bearophile (12/17) Sep 30 2009 I see.

Yigal Chripun (6/13) Sep 30 2009 why not just use references with structs?

Jeremie Pelletier (2/24) Sep 30 2009 Because sRef wouldn't be a reference but a copy.

Justin (25/37) Oct 02 2009 AYK, in C++ structs are just classes with public protection (for members...

Max Samukha (4/6) Sep 30 2009 I'll probably never learn to proof-read my opuses. It should have been

Walter Bright <newshound1 digitalmars.com> writes:

Denis Koroskin wrote:
 On Sat, 26 Sep 2009 22:30:58 +0400, Walter Bright
 <newshound1 digitalmars.com> wrote:
 D has borrowed ideas from many different languages. The trick is to
 take the good stuff and avoid their mistakes <g>.

 How about this one:
 http://sadekdrobi.com/2008/12/22/null-references-the-billion-dollar-mistake/

 :)

I think he's wrong.

Getting rid of null references is like solving the problem of dead 
canaries in the coal mines by replacing them with stuffed toys.

It all depends on what you prefer a program to do when it encounters a 
program bug:

1. Immediately stop and produce an indication that the program failed

2. Soldier on and silently produce garbage output

I prefer (1).

Consider the humble int. There is no invalid value such that referencing
the invalid value will cause a seg fault. In one case, an uninitialized
int is set to garbage, and erratic results follow. In another, (in D)
ints are default initialized to 0. 0 may or may not be what the logic
of the program requires, and if it isn't, again, silently bad results
follow.

Consider also the NaN value that floats are default initialized to. This
has the nice characteristic that you know your results are bad if they
are NaN. But it has the bad characteristic that you don't know where the
NaN came from. Don corrected this by submitting a patch that enables the
program to throw an exception upon trying to use a NaN. Then, you know
exactly where your program went wrong.

It is exactly analogous to a null pointer exception. And it's darned useful.
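
For illustration, the defaults described above can be sketched in D as
follows. This is a minimal example, assuming a current D compiler and
Phobos (where the check is spelled std.math.isNaN); the Widget class and
the variable names are hypothetical and not taken from the thread.

```d
import std.math : isNaN;

// Hypothetical class, used only to show the default value of a reference.
class Widget { int size; }

void main()
{
    int count;      // default initialized to 0
    double ratio;   // default initialized to NaN
    Widget w;       // class references default to null

    assert(count == 0);
    assert(isNaN(ratio));
    assert(w is null);

    // NaN propagates through arithmetic, so a bad result stays visibly bad.
    double total = ratio * 2.0 + 1.0;
    assert(isNaN(total));

    // An unintended 0 blends in and produces a plausible-looking number.
    int sum = count + 41;
    assert(sum == 41);
}
```

Dereferencing w here (for example, reading w.size) would fault at run
time, which is the "stop immediately" behavior the post argues for.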

Sep 26 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Walter Bright wrote:
 Denis Koroskin wrote:
  > On Sat, 26 Sep 2009 22:30:58 +0400, Walter Bright
  > <newshound1 digitalmars.com> wrote:
  >> D has borrowed ideas from many different languages. The trick is to
  >> take the good stuff and avoid their mistakes <g>.
  >
  > How about this one:
  > http://sadekdrobi.com/2008/12/22/null-references-the-billion-dollar-mistake/
  >
  > :)

 I think he's wrong.

 Getting rid of null references is like solving the problem of dead 
 canaries in the coal mines by replacing them with stuffed toys.

 It all depends on what you prefer a program to do when it encounters a 
 program bug:

 1. Immediately stop and produce an indication that the program failed

 2. Soldier on and silently produce garbage output

 I prefer (1).

 Consider the humble int. There is no invalid value such that referencing 
 the invalid value will cause a seg fault. One case is an uninitialized 
 int is set to garbage, and erratic results follow. Another is that (in 
 D) ints are default initialized to 0. 0 may or may not be what the logic 
 of the program requires, and if it isn't, again, silently bad results 
 follow.

 Consider also the NaN value that floats are default initialized to. This 
 has the nice characteristic of you know your results are bad if they are 
 NaN. But it has the bad characteristic that you don't know where the NaN 
 came from. Don corrected this by submitting a patch that enables the 
 program to throw an exception upon trying to use a NaN. Then, you know 
 exactly where your program went wrong.

 It is exactly analogous to a null pointer exception. And it's darned 
 useful.

My assessment: the chances of convincing Walter he's wrong are quite 
slim... Having a rationale for being wrong is very hard to overcome.

Andrei

Sep 26 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Andrei Alexandrescu wrote:
 Walter Bright wrote:
 Denis Koroskin wrote:
  > On Sat, 26 Sep 2009 22:30:58 +0400, Walter Bright
  > <newshound1 digitalmars.com> wrote:
  >> D has borrowed ideas from many different languages. The trick is to
  >> take the good stuff and avoid their mistakes <g>.
  >
  > How about this one:
  > http://sadekdrobi.com/2008/12/22/null-references-the-billion-dollar-mistake/
  >
  > :)

 I think he's wrong.

 Getting rid of null references is like solving the problem of dead 
 canaries in the coal mines by replacing them with stuffed toys.

 It all depends on what you prefer a program to do when it encounters a 
 program bug:

 1. Immediately stop and produce an indication that the program failed

 2. Soldier on and silently produce garbage output

 I prefer (1).

 Consider the humble int. There is no invalid value such that 
 referencing the invalid value will cause a seg fault. One case is an 
 uninitialized int is set to garbage, and erratic results follow. 
 Another is that (in D) ints are default initialized to 0. 0 may or may 
 not be what the logic of the program requires, and if it isn't, again, 
 silently bad results follow.

 Consider also the NaN value that floats are default initialized to. 
 This has the nice characteristic of you know your results are bad if 
 they are NaN. But it has the bad characteristic that you don't know 
 where the NaN came from. Don corrected this by submitting a patch that 
 enables the program to throw an exception upon trying to use a NaN. 
 Then, you know exactly where your program went wrong.

 It is exactly analogous to a null pointer exception. And it's darned 
 useful.

 My assessment: the chances of convincing Walter he's wrong are quite 
 slim... Having a rationale for being wrong is very hard to overcome.

 Andrei

I actually side with Walter here. I much prefer my programs to crash on 
using a null reference and fix the issue than add runtime overhead that 
does the same thing. In most cases a simple backtrace is enough to 
pinpoint the location of the bug.

Null references are useful to implement optional arguments without any 
overhead by an Optional!T wrapper. If you disallow null references what 
would "Object foo;" initialize to then?

Jeremie

Sep 26 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Jeremie Pelletier wrote:
 Andrei Alexandrescu wrote:
 Walter Bright wrote:
 Denis Koroskin wrote:
  > On Sat, 26 Sep 2009 22:30:58 +0400, Walter Bright
  > <newshound1 digitalmars.com> wrote:
  >> D has borrowed ideas from many different languages. The trick is to
  >> take the good stuff and avoid their mistakes <g>.
  >
  > How about this one:
  > 
 http://sadekdrobi.com/2008/12/22/null-references-the-billion-dollar-mistake/ 

  >
  >
  > :)

 I think he's wrong.

 Getting rid of null references is like solving the problem of dead 
 canaries in the coal mines by replacing them with stuffed toys.

 It all depends on what you prefer a program to do when it encounters 
 a program bug:

 1. Immediately stop and produce an indication that the program failed

 2. Soldier on and silently produce garbage output

 I prefer (1).

 Consider the humble int. There is no invalid value such that 
 referencing the invalid value will cause a seg fault. One case is an 
 uninitialized int is set to garbage, and erratic results follow. 
 Another is that (in D) ints are default initialized to 0. 0 may or 
 may not be what the logic of the program requires, and if it isn't, 
 again, silently bad results follow.

 Consider also the NaN value that floats are default initialized to. 
 This has the nice characteristic of you know your results are bad if 
 they are NaN. But it has the bad characteristic that you don't know 
 where the NaN came from. Don corrected this by submitting a patch 
 that enables the program to throw an exception upon trying to use a 
 NaN. Then, you know exactly where your program went wrong.

 It is exactly analogous to a null pointer exception. And it's darned 
 useful.

 My assessment: the chances of convincing Walter he's wrong are quite 
 slim... Having a rationale for being wrong is very hard to overcome.

 Andrei

 I actually side with Walter here. I much prefer my programs to crash on 
 using a null reference and fix the issue than add runtime overhead that 
 does the same thing. In most cases a simple backtrace is enough to 
 pinpoint the location of the bug.

But that's a false choice. You don't choose between a crashing program 
and an out-of-control program. This is the fallacy. The problem is that the
way Walter puts it, it's darn appealing. Who would want a subtly
incorrect program?

 Null references are useful to implement optional arguments without any 
 overhead by an Optional!T wrapper. If you disallow null references what 
 would "Object foo;" initialize to then?

The default should be non-nullable references. You can define nullable 
references if you so wish. The problem is, Walter doesn't realize that 
the default initialization scheme and the optional lack thereof by using 
"= void" goes straight against his reasoning about null objects.

Andrei

Sep 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

Andrei Alexandrescu wrote:
 But that's a false choice. You don't choose between a crashing program 
 and an out-of-control program. This is the fallacy. The problem is that the
 way Walter puts it, it's darn appealing. Who would want a subtly
 incorrect program?

Oh, I've heard people argue for them.

 Null references are useful to implement optional arguments without any 
 overhead by an Optional!T wrapper. If you disallow null references 
 what would "Object foo;" initialize to then?

 
 The default should be non-nullable references. You can define nullable 
 references if you so wish. The problem is, Walter doesn't realize that 
 the default initialization scheme and the optional lack thereof by using 
 "= void" goes straight against his reasoning about null objects.

If there was a reasonable way of having NaN values for ints, D would use 
them. So we're stuck with a less than ideal solution, which is default 
initializing them to 0. At least you get repeatable results from that, 
rather than randomly wrong ones.

"=void" is justifiable in certain optimization cases. D is, after all, a 
systems programming language with back doors there when you need them.
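
For readers who want to see the defaults under discussion, here is a minimal sketch. It assumes a reasonably current D compiler; the function name sumFilled is made up for illustration, and isNaN is the Phobos function from std.math.

```d
import std.math : isNaN;

/// Sums a buffer that skips default initialization and fills every slot by hand.
int sumFilled()
{
    int[4] buf = void;            // no default initialization: contents are garbage here
    foreach (i, ref slot; buf)
        slot = cast(int) i + 1;   // every element is written before it is read
    int total = 0;
    foreach (v; buf)
        total += v;
    return total;
}

void main()
{
    int i;      // default-initialized to int.init, which is 0
    float f;    // default-initialized to float.init, which is NaN

    assert(i == 0);            // repeatable, but 0 may or may not be what the logic wanted
    assert(isNaN(f));          // a forgotten initialization is visible in the value itself
    assert(isNaN(f * 2 + 1));  // and it propagates through later arithmetic
    assert(sumFilled() == 10);
}
```

The "= void" case only stays well behaved because every element is assigned before it is read; that guarantee is the programmer's job, not the compiler's.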

The problem with non-nullable references is what do they default to? 
Some "nan" object? When you use a "nan" object, what should it do? Throw 
an exception?

The problem with null references is not the canary dying, it's that 
there's a logic error in the user's code. Putting a gas mask on the 
canary keeps it from dying, but the gas is still seeping in, you just 
don't know it.

Sep 26 2009

language_fan <foo bar.com.invalid> writes:

Sat, 26 Sep 2009 14:49:45 -0700, Walter Bright thusly wrote:

 The problem with non-nullable references is what do they default to?
 Some "nan" object? When you use a "nan" object, what should it do?
 Throw an exception?

Well typically if your type system supports algebraic types, you can 
define a higher order Optional type as follows:

  type Optional T = Some T | Nothing

Now a safe nullable reference type would look like

  Optional (T*)

The whole point is to make null pointer tests explicit. You can pass 
around the optional type freely, and only on the actual use site you need 
to pattern match it to see if it's a null pointer:

  void foo(SafeRef[int] a) {
    match(a) {
      case Nothing => // handle null pointer
      case Some(b) => return b + 2;
    }
  }

The default initialization of this type is Nothing.

Some data structures can be initialized in a way that null pointers don't 
exist. In these cases you can use a type that does not have the 'Nothing' 
form. This can lead to nice optimizations. There is no default value,
because default initialization can never occur.
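
The pseudocode above is not D. A rough D rendering might look like the sketch below, assuming a hand-rolled struct rather than any library type. The names Optional, some, match and addTwo are invented for illustration, and returning 0 in the Nothing branch is only a placeholder for "handle null pointer".

```d
/// Hand-rolled optional value: either holds a T ("Some") or nothing ("Nothing").
struct Optional(T)
{
    private T value;
    private bool present = false;   // Optional!T.init is the "Nothing" case

    /// Wraps a value in the "Some" case.
    static Optional some(T v)
    {
        Optional o;
        o.value = v;
        o.present = true;
        return o;
    }

    /// The only way to reach the payload: the caller must supply both branches.
    R match(R)(scope R delegate(T) onSome, scope R delegate() onNothing)
    {
        return present ? onSome(value) : onNothing();
    }
}

/// D counterpart of foo() above.
int addTwo(Optional!int a)
{
    return a.match!int(
        (int b) => b + 2,   // case Some(b)
        () => 0             // case Nothing: placeholder handling
    );
}

void main()
{
    Optional!int nothing;                         // default initialization yields Nothing
    assert(addTwo(nothing) == 0);
    assert(addTwo(Optional!int.some(40)) == 42);
}
```

Because the payload is private and match needs both delegates, forgetting the Nothing case is a compile error rather than a runtime surprise.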

Sep 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

language_fan wrote:
 Sat, 26 Sep 2009 14:49:45 -0700, Walter Bright thusly wrote:
 
 The problem with non-nullable references is what do they default to?
 Some "nan" object? When you use a "nan" object, what should it do?
 Throw an exception?

 
 Well typically if your type system supports algebraic types, you can 
 define a higher order Optional type as follows:
 
   type Optional T = Some T | Nothing
 
 Now a safe nullable reference type would look like
 
   Optional (T*)
 
 The whole point is to make null pointer tests explicit.

But people are objecting to having to test for null pointers.

 You can pass 
 around the optional type freely, and only on the actual use site you need 
 to pattern match it to see if it's a null pointer:
 
   void foo(SafeRef[int] a) {
     match(a) {
       case Nothing => // handle null pointer
       case Some(b) => return b + 2;
     }
   }
 
 The default initialization of this type is Nothing.

I don't see the improvement.

 Some data structures can be initialized in a way that null pointers don't 
 exist. In these cases you can use a type that does not have the 'Nothing' 
 form. This can lead to nice optimizations. There is no default value,
 because default initialization can never occur.

Seems like a large restriction on data structures to build that 
requirement into the language. It would also make it difficult to 
transfer code from Java or C++ to D.

Sep 26 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Sun, 27 Sep 2009 01:49:45 +0400, Walter Bright  
<newshound1 digitalmars.com> wrote:

 The problem with non-nullable references is what do they default to?  
 Some "nan" object? When you use a "nan" object, what should it do? Throw  
 an exception?

Oh, my! You don't even know what a non-null default is!

There is a Null Object pattern  
(http://en.wikipedia.org/wiki/Null_Object_pattern) - I guess that's what  
you are talking about, when you mean "nan object" - but it has little to  
do with non-null references.

With non-null references, you don't have "wrong values", that throw an  
exception upon use (although it's clearly possible), you get a correct  
value.

If an object may or may not have a valid value, you mark it as nullable.  
All the difference is that it's a non-default behavior, that's it. And a  
user is now warned, that an object may be not initialized.

Sep 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

Denis Koroskin wrote:
 On Sun, 27 Sep 2009 01:49:45 +0400, Walter Bright 
 <newshound1 digitalmars.com> wrote:
 
 The problem with non-nullable references is what do they default to? 
 Some "nan" object? When you use a "nan" object, what should it do? 
 Throw an exception?

 
 Oh, my! You don't even know what a non-null default is!
 
 There is a Null Object pattern 
 (http://en.wikipedia.org/wiki/Null_Object_pattern) - I guess that's what 
 you are talking about, when you mean "nan object" - but it has little to 
 do with non-null references.

It's the black hole object. It prevents you from getting a seg fault, 
but I see no rationale for expecting that an unexpected null object 
always returning "I succeeded" means your program will operate correctly.

The white hole object, of course, always throws an exception when it is 
accessed. At least you know something went wrong - but you already have 
that with null.


 With non-null references, you don't have "wrong values", that throw an 
 exception upon use (although it's clearly possible), you get a correct 
 value.

You're not getting a correct value, you're getting another default 
value. If the logic of your program is expecting a prime number > 8, and 
the null object returns 0, now what?
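
To make the two "hole" objects concrete, here is a minimal sketch. Every name in it (PrimeSource, RealSource, BlackHoleSource, WhiteHoleSource, doubled, failsLoudly) is invented for illustration; none of it comes from an actual proposal in this thread.

```d
interface PrimeSource
{
    int nextPrime();
}

/// A real implementation: hands out a prime greater than 8.
class RealSource : PrimeSource
{
    int nextPrime() { return 11; }
}

/// "Black hole": accepts every call and answers with a default value.
class BlackHoleSource : PrimeSource
{
    int nextPrime() { return 0; }
}

/// "White hole": every call fails loudly.
class WhiteHoleSource : PrimeSource
{
    int nextPrime() { throw new Exception("no prime source was configured"); }
}

/// Client code that silently assumes it was given a usable source.
int doubled(PrimeSource src)
{
    return src.nextPrime() * 2;
}

/// Reports whether using the source raises an error.
bool failsLoudly(PrimeSource src)
{
    try
    {
        doubled(src);
        return false;
    }
    catch (Exception e)
    {
        return true;
    }
}

void main()
{
    assert(doubled(new RealSource) == 22);
    assert(doubled(new BlackHoleSource) == 0);   // wrong answer, and nothing complains
    assert(failsLoudly(new WhiteHoleSource));    // the failure is reported where it is used
    assert(!failsLoudly(new BlackHoleSource));
}
```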

 If an object may or may not have a valid value, you mark it as nullable. 
 All the difference is that it's a non-default behavior, that's it. And a 
 user is now warned, that an object may be not initialized.

He isn't warned, that's just the problem. The null object happily says 
"I succeeded" for all input and returns more default values and null 
objects.

What happens if the output of your program then becomes a null object? 
How are you going to go about tracing that back to its source? That's a 
lot harder than working backwards from where a null exception originated.

I used to work at Boeing designing critical flight systems. Absolutely 
the WRONG failure mode is to pretend nothing went wrong and happily 
return default values and show lovely green lights on the instrument 
panel. The right thing is to immediately inform the pilot that something 
went wrong and INSTANTLY SHUT THE BAD SYSTEM DOWN before it does 
something really, really bad, because now it is in an unknown state. The 
pilot then follows the procedure he's trained to, such as engage the backup.

A null pointer exception fits right in with that philosophy.

You could think of null exceptions like pain - sure it's unpleasant, but 
people who feel no pain constantly injure themselves and don't live very 
long. When I went to the dentist as a kid for the first time, he shot my 
cheek full of novocaine. After the dental work, I went back to school. I
found to my amusement that if I chewed on my cheek, it didn't hurt.

Boy was I sorry about that later <g>.

Sep 26 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Sun, 27 Sep 2009 02:49:06 +0400, Walter Bright  
<newshound1 digitalmars.com> wrote:

 Denis Koroskin wrote:
 On Sun, 27 Sep 2009 01:49:45 +0400, Walter Bright  
 <newshound1 digitalmars.com> wrote:

 The problem with non-nullable references is what do they default to?  
 Some "nan" object? When you use a "nan" object, what should it do?  
 Throw an exception?

  Oh, my! You don't even know what a non-null default is!
  There is a Null Object pattern  
 (http://en.wikipedia.org/wiki/Null_Object_pattern) - I guess that's  
 what you are talking about, when you mean "nan object" - but it has  
 little to do with non-null references.

 It's the black hole object. It prevents you from getting a seg fault,  
 but I see no rationale for expecting that an unexpected null object  
 always returning "I succeeded" means your program will operate correctly.

 The white hole object, of course, always throws an exception when it is  
 accessed. At least you know something went wrong - but you already have  
 that with null.


 With non-null references, you don't have "wrong values", that throw an  
 exception upon use (although it's clearly possible), you get a correct  
 value.

 You're not getting a correct value, you're getting another default value.


I'm sorry, but I cannot continue the discussion with you like that. You are
not listening! You are not even trying to understand what I say. We are  
talking *completely* different things here.

Who the hell cares whether it's black or white as long as it is a hole
object? I tell you that no one is gonna use it, because it's *much* easier
to do everything right (i.e. initialize a reference to a proper value) than
to create Hole classes for each class or interface. I can't even imagine
anyone writing code like this:

T someFunction(Args someArgs) {
     // let's initialize that variable to some dumb value just to make
     // the compiler happy
     ISomeInterface someInterface = new BlackHoleOfSomeInterfaceThatAlwaysThrows();
     // rest of the method body
}

Novices, maybe. But professionals would never commit that sin, for sure.

*Please* let's go past that pattern, it really has nothing to do with  
proposed non-null by default references.

 If the logic of your program is expecting a prime number > 8, and the  
 null object returns 0, now what?

You are talking heresy here. I'm afraid you don't even know what you are  
talking about.

A) There is no such thing as a null object. That's bullsh*t! No one ever
proposed using those; you did. And now you argue against them and discuss
how bad they are.
B) You can't call a function that accepts non-null T if you don't have a  
valid (i.e. non-null) T reference. End of story.

 If an object may or may not have a valid value, you mark it as  
 nullable. All the difference is that it's a non-default behavior,  
 that's it. And a user is now warned, that an object may be not  
 initialized.

 He isn't warned, that's just the problem. The null object happily says  
 "I succeeded" for all input and returns more default values and null  
 objects.

Please stop that "null object" pattern propaganda. Did you even read what  
I wrote?

I wrote: if a variable may not be initialized - no problem, make it
nullable and assign null! A user is now forced to check that variable
against null before dereferencing it and must take appropriate action if it
is null.

 What happens if the output of your program then becomes a null object?  
 How are you going to go about tracing that back to its source? That's a  
 lot harder than working backwards from where a null exception originated.

 I used to work at Boeing designing critical flight systems. Absolutely  
 the WRONG failure mode is to pretend nothing went wrong and happily  
 return default values and show lovely green lights on the instrument  
 panel. The right thing is to immediately inform the pilot that something  
 went wrong and INSTANTLY SHUT THE BAD SYSTEM DOWN before it does  
 something really, really bad, because now it is in an unknown state. The  
 pilot then follows the procedure he's trained to, such as engage the  
 backup.

 A null pointer exception fits right in with that philosophy.

 You could think of null exceptions like pain - sure it's unpleasant, but  
 people who feel no pain constantly injure themselves and don't live very  
 long. When I went to the dentist as a kid for the first time, he shot my  
 cheek full of novocaine. After the dental work, I went back to school. I
 found to my amusement that if I chewed on my cheek, it didn't hurt.

 Boy was I sorry about that later <g>.

That's trolling, Walter. I'm sorry, but you are talking nonsense here.
Once again, no one ever proposed using the null object pattern. You imagined
it and are now arguing against its use.

Sep 26 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Walter Bright wrote:
 Denis Koroskin wrote:
 On Sun, 27 Sep 2009 01:49:45 +0400, Walter Bright 
 <newshound1 digitalmars.com> wrote:

 The problem with non-nullable references is what do they default to? 
 Some "nan" object? When you use a "nan" object, what should it do? 
 Throw an exception?

 Oh, my! You don't even know what a non-null default is!

 There is a Null Object pattern 
 (http://en.wikipedia.org/wiki/Null_Object_pattern) - I guess that's 
 what you are talking about, when you mean "nan object" - but it has 
 little to do with non-null references.

 
 It's the black hole object. It prevents you from getting a seg fault, 
 but I see no rationale for expecting that an unexpected null object 
 always returning "I succeeded" means your program will operate correctly.
 
 The white hole object, of course, always throws an exception when it is 
 accessed. At least you know something went wrong - but you already have 
 that with null.
 
 
 With non-null references, you don't have "wrong values", that throw an 
 exception upon use (although it's clearly possible), you get a correct 
 value.

 
 You're not getting a correct value, you're getting another default 
 value. If the logic of your program is expecting a prime number > 8, and 
 the null object returns 0, now what?
 
 If an object may or may not have a valid value, you mark it as 
 nullable. All the difference is that it's a non-default behavior, 
 that's it. And a user is now warned, that an object may be not 
 initialized.

 
 He isn't warned, that's just the problem. The null object happily says 
 "I succeeded" for all input and returns more default values and null 
 objects.
 
 What happens if the output of your program then becomes a null object? 
 How are you going to go about tracing that back to its source? That's a 
 lot harder than working backwards from where a null exception originated.
 
 I used to work at Boeing designing critical flight systems. Absolutely 
 the WRONG failure mode is to pretend nothing went wrong and happily 
 return default values and show lovely green lights on the instrument 
 panel. The right thing is to immediately inform the pilot that something 
 went wrong and INSTANTLY SHUT THE BAD SYSTEM DOWN before it does 
 something really, really bad, because now it is in an unknown state. The 
 pilot then follows the procedure he's trained to, such as engage the 
 backup.
 
 A null pointer exception fits right in with that philosophy.
 
 You could think of null exceptions like pain - sure it's unpleasant, but 
 people who feel no pain constantly injure themselves and don't live very 
 long. When I went to the dentist as a kid for the first time, he shot my 
 cheek full of novocaine. After the dental work, I went back to school. I
 found to my amusement that if I chewed on my cheek, it didn't hurt.
 
 Boy was I sorry about that later <g>.

Haha that's a nice analogy, I myself was just unable to speak. I guess 
that's what you call undefined behavior!

That's exactly the point with nonnull references, they turn access 
violations or segfaults into undefined behavior, or worse into generic 
behavior that's much harder to track to its source.

I think nonnull references are a nice concept for languages that have a 
higher level than D. If I expect references to never be null I just 
don't check for null before using them, and let the code crash which 
gives me a nice crash window with a backtrace in my runtime.

Jeremie

Sep 26 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Sat, Sep 26, 2009 at 7:21 PM, Jeremie Pelletier <jeremiep gmail.com> wrote:
 That's exactly the point with nonnull references, they turn access
 violations or segfaults into undefined behavior, or worse into generic
 behavior that's much harder to track to its source.

 I think nonnull references are a nice concept for languages that have a
 higher level than D. If I expect references to never be null I just don't
 check for null before using them, and let the code crash which gives me a
 nice crash window with a backtrace in my runtime.

You're missing the point. You wouldn't have "undefined behavior at
runtime" with nonnull references because there would be NO POINT in
having nonnull references without ALSO having nullable references.

Could your reference be null? Use a nullable reference.

Is your reference never supposed to be null? Use a nonnull reference.

End of problem. You do not create "null objects" and store them in a
nonnull reference which you then check at runtime. You use a nullable
reference which the language *forces* you to check before use.
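
D has no such split built into the language, so the following is only a library-level approximation of the idea, written against a current compiler. NonNull, Logger, report and reportIfPresent are hypothetical names. The null test happens once at runtime when the wrapper is built, whereas the proposal would have the compiler reject the unchecked use outright.

```d
import std.exception : enforce;

class Logger
{
    int lines;
    void write(string msg) { ++lines; }
}

/// Library stand-in for a non-nullable reference to a class instance.
struct NonNull(T) if (is(T == class))
{
    private T payload;

    @disable this();   // "NonNull!Logger x;" without an initializer does not compile

    this(T value)
    {
        enforce(value !is null, "NonNull built from a null reference");
        payload = value;
    }

    T get() { return payload; }
    alias get this;    // lets the wrapper be used like the underlying reference
}

// Default construction is rejected at compile time.
static assert(!__traits(compiles, { NonNull!Logger x; }));

/// Never supposed to be null: take a NonNull and skip the check.
void report(NonNull!Logger sink)
{
    sink.write("ok");
}

/// Could be null: a plain D class reference plays the nullable role,
/// and it is tested exactly once before being handed on.
bool reportIfPresent(Logger maybeSink)
{
    if (maybeSink is null)
        return false;
    report(NonNull!Logger(maybeSink));
    return true;
}

void main()
{
    auto sink = new Logger;
    assert(reportIfPresent(sink));
    assert(sink.lines == 1);
    assert(!reportIfPresent(null));
}
```

This sketch is weaker than a language feature: NonNull!Logger.init still exists and holds null, which is the kind of loophole a built-in type would have to close.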

Sep 26 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Jarrett Billingsley wrote:
 On Sat, Sep 26, 2009 at 7:21 PM, Jeremie Pelletier <jeremiep gmail.com> wrote:
 That's exactly the point with nonnull references, they turn access
 violations or segfaults into undefined behavior, or worse into generic
 behavior that's much harder to track to its source.

 I think nonnull references are a nice concept for languages that have a
 higher level than D. If I expect references to never be null I just don't
 check for null before using them, and let the code crash which gives me a
 nice crash window with a backtrace in my runtime.

 
 You're missing the point. You wouldn't have "undefined behavior at
 runtime" with nonnull references because there would be NO POINT in
 having nonnull references without ALSO having nullable references.
 
 Could your reference be null? Use a nullable reference.
 
 Is your reference never supposed to be null? Use a nonnull reference.
 
 End of problem. You do not create "null objects" and store them in a
 nonnull reference which you then check at runtime. You use a nullable
 reference which the language *forces* you to check before use.

I don't want the language to force me to check nullable references 
before using them, that just takes away a lot of optimization cases.

You could just use the casting system to sneak null into a nonnull 
reference and bam, undefined behavior. And you could have nullables 
which are always nonnull at some point in time within a scope but your 
only way out of the compiler errors about using a nullable without first 
testing it for nullity is to use excessive casting.

Sep 26 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Sat, Sep 26, 2009 at 11:06 PM, Jeremie Pelletier <jeremiep gmail.com> wrote:
 I don't want the language to force me to check nullable references before
 using them, that just takes away a lot of optimization cases.

You don't design tight loops that dereference pointers with the
intention that those pointers will ever be null. Those loops always
expect nonnull pointers, and therefore you'd use a nonnull reference.

The number of null references in your program is far smaller than you'd
think. For those that really could be legitimately null (like an
optional callback or something), you have to check for null at runtime
anyway. Most of your code wouldn't really change. You'd instead just
get more errors at compile time for things that are obviously illegal
or just very potentially dangerous.
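
As an illustration of the optional-callback case, here is a small sketch in today's D, where a delegate parameter may simply default to null. The names sumAll and countCallbacks are invented for the example.

```d
/// Sums the data; onItem is optional and may legitimately be null.
int sumAll(const int[] data, void delegate(int) onItem = null)
{
    int total = 0;
    foreach (v; data)
    {
        total += v;
        if (onItem !is null)   // the one place where the nullable value is tested
            onItem(v);
    }
    return total;
}

/// Counts how often the callback fires, to show the non-null path.
int countCallbacks(const int[] data)
{
    int calls = 0;
    sumAll(data, (int v) { ++calls; });
    return calls;
}

void main()
{
    assert(sumAll([1, 2, 3]) == 6);          // no callback supplied
    assert(countCallbacks([1, 2, 3]) == 3);  // callback supplied and invoked per element
}
```

The null test in sumAll is needed with or without nonnull types; what would change is that the compiler, rather than a code review, would insist on it.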

 You could just use the casting system to sneak null into a nonnull reference
 and bam, undefined behavior.

No, you couldn't. That would be a pretty shitty nonnull reference type
if the compiler let you put null in it.

 And you could have nullables which are always
 nonnull at some point in time within a scope but your only way out of the
 compiler errors about using a nullable without first testing it for nullity
 is to use excessive casting.

The argument of verbosity that comes up with nonnull references holds
some weight, but it's far more a matter of designing your code not to
do something like that.

Sep 26 2009

Christopher Wright <dhasenan gmail.com> writes:

Walter Bright wrote:
 Denis Koroskin wrote:
 If an object may or may not have a valid value, you mark it as 
 nullable. All the difference is that it's a non-default behavior, 
 that's it. And a user is now warned, that an object may be not 
 initialized.

 
 He isn't warned, that's just the problem. The null object happily says 
 "I succeeded" for all input and returns more default values and null 
 objects.

This is not the proposal. The proposal was to codify in the type system 
whether a particular object has "null" as a valid value. If a variable 
that cannot be null is not initialized to a non-null value before use, 
that is an error.

It's entirely equivalent to using the current type system with a ton of 
manual contracts requiring that variables not be null. Except the 
contracts are enforced at compile time, not runtime.
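
For comparison, this is roughly what one of those manual contracts looks like in D as it stands. Account and balanceOf are hypothetical names, and the short "in (...)" form assumes a newer compiler; older ones need the block form with an explicit assert.

```d
class Account
{
    int balance;
    this(int balance) { this.balance = balance; }
}

/// Runtime contract: checked on each call while contracts are compiled in.
int balanceOf(Account acct)
in (acct !is null, "acct must not be null")
{
    return acct.balance;
}

void main()
{
    assert(balanceOf(new Account(25)) == 25);
    // balanceOf(null) compiles fine today; the contract only fires at runtime.
}
```

Under the proposal, the "must not be null" requirement would move from the contract into the parameter's type, so a call that passes a possibly-null reference would be rejected before the program ever runs.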

A similar concept would be range-bounded integer types, or floating 
point types that cannot be NaN or infinity.
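
A range-bounded integer can be approximated in library code along the same lines. The sketch below is an assumption-laden illustration, not an existing Phobos type: Bounded, Percent, remaining and rejects are invented names, and the range check still runs at construction time rather than at compile time.

```d
import std.exception : enforce;

/// Integer restricted to the inclusive range [lo, hi].
struct Bounded(int lo, int hi)
{
    static assert(lo <= hi, "empty range");

    private int value = lo;   // the default value is itself inside the range

    this(int v)
    {
        enforce(lo <= v && v <= hi, "value out of range");
        value = v;
    }

    int get() const { return value; }
}

alias Percent = Bounded!(0, 100);

/// Needs no range check of its own: the type already guarantees 0..100.
int remaining(Percent done)
{
    return 100 - done.get;
}

/// Reports whether constructing a Percent from v is refused.
bool rejects(int v)
{
    try
    {
        auto p = Percent(v);
        return false;
    }
    catch (Exception e)
    {
        return true;
    }
}

void main()
{
    assert(remaining(Percent(30)) == 70);
    assert(Percent.init.get == 0);
    assert(rejects(101));
    assert(!rejects(100));
}
```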

Sep 26 2009

language_fan <foo bar.com.invalid> writes:

Sat, 26 Sep 2009 15:49:06 -0700, Walter Bright thusly wrote:

 I used to work at Boeing designing critical flight systems. Absolutely
 the WRONG failure mode is to pretend nothing went wrong and happily
 return default values and show lovely green lights on the instrument
 panel.

Basically if there is only one way the system can operate correctly, your
approach is to catch errors at runtime (segfaults) until a later
iteration of the program development turns out to work correctly or well
enough. Meanwhile there are several buggy revisions of the program in the
development process.

The idea behind non-nullable types and other contracts is to catch these
errors at compile time. Sure, the code is a bit harder to write, but it
is safe and never segfaults. The idea is to minimize the number of
runtime errors of all sorts. That's also how other features of statically
typed languages work.

Sep 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

language_fan wrote:
 The idea behind non-nullable types and other contracts is to catch these
 errors at compile time. Sure, the code is a bit harder to write, but it
 is safe and never segfaults. The idea is to minimize the number of
 runtime errors of all sorts. That's also how other features of statically
 typed languages work.


I certainly agree that catching errors at compile time is preferable by 
far. Where I disagree is the notion that non-nullable types achieve 
this. I've argued extensively here that they hide errors, not fix them.

Also, by "safe" I presume you mean "memory safe" which means free of 
memory corruption. Null pointer exceptions are memory safe. A null 
pointer could be caused by memory corruption, but it cannot *cause* 
memory corruption.

Sep 26 2009

Jason House <jason.james.house gmail.com> writes:

Walter Bright Wrote:

 language_fan wrote:
 The idea behind non-nullable types and other contracts is to catch these
 errors at compile time. Sure, the code is a bit harder to write, but it
 is safe and never segfaults. The idea is to minimize the number of
 runtime errors of all sorts. That's also how other features of statically
 typed languages work.

 
 
 I certainly agree that catching errors at compile time is preferable by 
 far. Where I disagree is the notion that non-nullable types achieve 
 this. I've argued extensively here that they hide errors, not fix them.

If you argued any cases other than there's no good default initialization, I
missed it. I reject the default initialization argument. I find code that

uninitialized variables highly useful. I've also never had a bug that Don's
signalling NaNs would help with. I've had NaN bugs that cropped up later,
though... On top of that, uses of a null variable because it was never set are
typically easy to find.
 
 Also, by "safe" I presume you mean "memory safe" which means free of 
 memory corruption. Null pointer exceptions are memory safe. A null 
 pointer could be caused by memory corruption, but it cannot *cause* 
 memory corruption.

I reject this argument too :(
To me, code isn't safe if it crashes. Did Boeing avoid checking for fault modes
that were easily and reliably detectable? It seems stupid to argue that it's ok
for an altimeter to send bogus data as long as it's easy to detect. All you
have to do is turn off autopilot. Who cares, right?

Why should I use D for production code if it's designed to segfault? Software
isn't used for important things like autopilot, controlling the brakes in my
car, or dispensing medicine in hospitals. There's no problem allowing that
stuff to crash. You can always recover the core file, and it's always trivial
to reproduce the scenario...

Mix in other things like malfunctioning debug data, and I wonder why I even use
D.

Sep 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

Jason House wrote:
 Also, by "safe" I presume you mean "memory safe" which means free
 of memory corruption. Null pointer exceptions are memory safe. A
 null pointer could be caused by memory corruption, but it cannot
 *cause* memory corruption.

 
 I reject this argument too :( To me, code isn't safe if it crashes.

Well, we can't discuss this if we cannot agree on terms. The 
conventional definition of memory safe means no memory corruption. A 
null pointer dereference is not memory corruption. You can call it 
something else, but if you call it "unsafe" then people will 
misunderstand you.


 Did Boeing avoid checking for fault modes that were easily and
 reliably detectable? It seems stupid to argue that it's ok for an
 altimeter to send bogus data as long as it's easy to detect. All you
 have to do is turn off autopilot. Who cares, right?

Errors in incorrectly initialized data are not easily and reliably 
detectable. A null pointer, on the other hand, *is* reliably detectable 
by the hardware.

Boeing's philosophy is that if the airplane cannot tolerate a particular 
system failing abruptly and completely, then the design is faulty. 
That's also the FAA regulations. Safety is achieved NOT by designing 
systems that cannot fail, but by designing systems that can survive failure.

In particular, if the airplane cannot handle turning off the autopilot, 
it will be rejected by both Boeing and the FAA. Name any single part or 
system on a Boeing airliner, and if it vanishes abruptly in a puff of 
smoke, the airliner will survive it.

There is no "the autopilot is receiving corrupted data, but what the 
hell, we'll keep it turned on anyway". It's inconceivable.

The only reasonable thing a program can do if it discovers it is in an 
unknown state is to stop immediately. The only reasonable way to use a 
program is to be able to tolerate its complete failure.


 Why should I use D for production code if it's designed to segfault?
 Software isn't used for important things like autopilot, controlling
 the brakes in my car, or dispensing medicine in hospitals. There's no
 problem allowing that stuff to crash. You can always recover the core
 file, and it's always trivial to reproduce the scenario...

It's not designed to segfault. It's designed to expose errors, not hide 
them. The system that uses the autopilot is designed to survive total 
failure of the autopilot. The same for your brakes in your car (ever 
wonder why there are dual brake systems, and if your power assist fails 
you can still use the brakes?). I don't know how the ABS works, but I 
would bet you plenty that if the computer controlling it fails, the 
brakes will still function. And you bet your life (literally) that a
computer dispensing radiation or medicine into your body had better stop
immediately if it detects it is in an unknown state.

Do you *really* want the radiation machine to continue operating if it 
has self-detected a program bug? Do you really want to BET YOUR LIFE 
that the software in it is perfect? Do you think that requiring the 
software be literally perfect is a reasonable, achievable, and safe 
requirement?

I don't. Not for a minute. And NOTHING Boeing designs relies on 
perfection for safety, either. In fact, the opposite is true: the 
designs are all based on "what if this fails?" If the answer is "people 
die" then the engineers are sent back to the trenches.

Hospitals are way, way behind on this approach. Even adding simple 
checklists (pilots started using them 70 years ago) has reduced 
accidental deaths in hospitals by 30%, a staggering improvement.

 Mix in other things like malfunctioning debug data, and I wonder why
 I even use D.

The debug data is a serious problem, and I think I've got it corrected now.

Sep 27 2009

"Manfred_Nowak" <svv1999 hotmail.com> writes:

Walter Bright wrote:

 Name any single part or system on a Boeing airliner, and if it
 vanishes abruptly in a puff of smoke, the airliner will survive it. 

Except this sentence I applaud every thought.

If "single part" includes the passenger area, the meaning of this sentence 
is downright ridiculous.

-manfred

Sep 27 2009

"Nick Sabalausky" <a a.a> writes:

"Walter Bright" <newshound1 digitalmars.com> wrote in message 
news:h9n3k5$2eu9$1 digitalmars.com...
 Jason House wrote:
 Also, by "safe" I presume you mean "memory safe" which means free
 of memory corruption. Null pointer exceptions are memory safe. A
 null pointer could be caused by memory corruption, but it cannot
 *cause* memory corruption.

 I reject this argument too :( To me, code isn't safe if it crashes.

 Well, we can't discuss this if we cannot agree on terms. The conventional 
 definition of memory safe means no memory corruption.

He keeps saying "safe", and every time he does you turn it into "memory 
safe". If he meant "memory safe" he probably would have said something like 
"memory safe". He already made it perfectly clear he's talking about 
crashes, so continuing to put the words "memory safe" into his mouth doesn't 
help the discussion.

 Boeing, Boeing, Boeing, Boeing, Boeing...

Straw man. No one's arguing against designing systems to survive failure, 
and no one's arguing against forcing errors to be exposed.

Your point seems to be: A good system is designed to handle a crash/failure 
without corruption, so let's allow things to crash/fail all they want.

Our point is: A good system is designed to handle a crash/failure without 
corruption, but let's also do what we can to minimize the number of 
crashes/failures in the first place.

You're acting as if handling failures safely and minimizing failures were 
mutually exclusive.

 It's not designed to segfault. It's designed to expose errors, not hide 
 them.

Right. And some of these errors can be exposed at compile time...and you 
want to just leave them as runtime segfaults instead? And you want this 
because exposing an error at compile time somehow causes it to become a 
hidden error?

Sep 27 2009

bearophile <bearophileHUGS lycos.com> writes:

Nick Sabalausky:

 He keeps saying "safe", and every time he does you turn it into "memory 
 safe". If he meant "memory safe" he probably would have said something like 
 "memory safe". He already made it perfectly clear he's talking about 
 crashes, so continuing to put the words "memory safe" into his mouth doesn't 
 help the discussion.

Likewise, I think that the name of SafeD modules is misleading, they are
MemorySafeD :-)

Bye,
bearophile

Sep 27 2009

Lutger <lutger.blijdestijn gmail.com> writes:

Nick Sabalausky wrote:

 "Walter Bright" <newshound1 digitalmars.com> wrote in message

...
 You're acting as if handling failures safely and minimizing failures were
 mutually exclusive.

Not that I have an opinion on this either way, but if I understand Walter 
right that is exactly his point (although you exaggerate it a bit), see 
below.
 
 It's not designed to segfault. It's designed to expose errors, not hide
 them.

 
 Right. And some of these errors can be exposed at compile time...and you
 want to just leave them as runtime segfaults instead? And you want this
 because exposing an error at compile time somehow causes it to become a
 hidden error?

somehow -> encourages a practice where programmers get annoyed by the 
'exposing of errors' to the point that they hide them

This is what it's about, I think: are non-nullable references *by default* 
so annoying as to cause programmers to initialize them with wrong values (or 
circumvent them in other ways)? 
The answer may depend on the details of the feature, quality of 
implementation and on the habits of the 'programmers' in question, I don't 
know.

Sep 27 2009

Jason House <jason.james.house gmail.com> writes:

Lutger Wrote:

 This is what it's about, I think: are non-nullable references *by default* 
 so annoying as to cause programmers to initialize them with wrong values (or 
 circumvent them in other ways)? 
 The answer may depend on the details of the feature, quality of 
 implementation and on the habits of the 'programmers' in question, I don't 
 know. 

In reality, the issue becomes what will programmers do to bypass compiler
errors. This is one area where syntactic sugar is worth its weight in gold. I'm [...]


SomeType x; // Not nullable
SomeType? y; // Nullable

If the developer is too lazy to add the question mark and prefers to do
SomeType x = cast(SomeType) null;
Then it's their own fault when they get a runtime segfault to replace a
compile-time error.
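
A minimal sketch of how the two declarations above could be approximated in present-day D. The `T?` syntax is only a proposal in this thread; the `NonNull` wrapper below is a hypothetical name, not part of Phobos, and it relies on `@disable this();`, which postdates this discussion. Plain class references stay nullable, so they play the role of `SomeType? y`.

```d
import std.exception : enforce;

// Hypothetical wrapper: a class reference that must be given a value.
struct NonNull(T) if (is(T == class))
{
    private T payload;

    // Forbid "NonNull!T x;" so a forgotten initializer is a compile error.
    @disable this();

    this(T value)
    {
        // Runtime guard for values whose nullness is not known statically.
        enforce(value !is null, "NonNull was given a null reference");
        payload = value;
    }

    T get() { return payload; }
}

// Placeholder class standing in for the SomeType of the post.
class SomeType {}
```

With this sketch, `NonNull!SomeType x;` is rejected by the compiler, while `auto x = NonNull!SomeType(new SomeType);` is accepted. It is weaker than a language-level feature: the check on the constructor argument happens at run time, and the wrapper can still be bypassed, for example through `.init`.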

Sep 27 2009

BCS <none anon.com> writes:

Hello Lutger,

 The answer may
 depend on [...]
 the habits of the 'programmers' in question, I don't know.
 

If you can't trust the programmer to write good code, replace them with someone 
you can trust. There will never be a usable language that can take in garbage 
and spit out correct programs.

Sep 27 2009

Lutger <lutger.blijdestijn gmail.com> writes:

BCS wrote:

 Hello Lutger,
 
 The answer may
 depend on [...]
 the habits of the 'programmers' in question, I don't know.
 

 
 If you can't trust the programmer to write good code, replace them with
 someone you can trust. There will never be a usable language that can take
 in garbage and spit out correct programs.

Hi. I don't think this argument will work, for several reasons:

First, there is a huge demand for programmers, so much that even I got hired 
in this time of crisis ;) Good programmers don't suddenly fall from the 
skies apparently. 
Second, there are lots of tasks doable by programmers with less skill than 
others using tools that trade safety for performance / expressiveness / 
whatever. 
Finally, programmers are humans, humans make mistakes, have quirks and bad 
days. All of them. What it comes down to is that languages are made in order 
to service and adapt to the programmers, not the other way around.

Do you maintain that a programmer who can't deal with non-nullable 
references without hacking them away is unusually incompetent? I don't know 
about this. Actually I suspect non-nullable references by default are in the 
end safer (whatever that means), but only if they don't complicate the use 
of nullable references.

Sep 27 2009

BCS <none anon.com> writes:

Hello Lutger,

 BCS wrote:
 
 Hello Lutger,
 
 The answer may
 depend on [...]
 the habits of the 'programmers' in question, I don't know.

 If you can't trust the programmer to write good code, replace them
 with someone you can trust. There will never be a usable language
 that can take in garbage and spit out correct programs.
 

 Hi. I don't think this argument will work, for several reasons:
 

[...]
 
 Do you maintain that a programmer who can't deal with non-nullable
 references without hacking them away is unusually incompetent?

Incompetent? No. But I wouldn't want to hire a programmer that *habitually* 
(and unnecessarily) hacks past a feature designed to prevent bugs. The best 
race car driver in the world is clearly not incompetent but would still get 
a ticket on public roads for speeding or following too close.

 I don't
 know about this. Actually I suspect non-nullable references by default
 are in the end safer (whatever that means), but only if they don't
 complicate the use of nullable references.

I'll second that.

Sep 27 2009

"Manfred_Nowak" <svv1999 hotmail.com> writes:

BCS wrote:

[...]
 I wouldn't want to hire a programmer that *habitually* (and
 unnecessarily) hacks past a feature designed to prevent bugs.

In the short time of an interview it's not possible to test for habits (or 
necessity) to hack past a feature designed to prevent bugs.

Therefore the only measures of code quality are the number of bugs 
detected by the users---or the number of WTF's exclaimed during a code 
review.

Are you able to give an upper limit for the number of WTF's during a code 
review for which the coder is not fired?

-manfred

Sep 28 2009

BCS <none anon.com> writes:

Hello Manfred_Nowak,

 BCS wrote:
 
 [...]
 
 I wouldn't want to hire a programmer that *habitually* (and
 unnecessarily) hacks past a feature designed to prevent bugs.
 

 In the short time of an interview it's not possible to test for habits
 (or necessity) to hack past a feature designed to prevent bugs.

Good point, I guess that all that is left would be to try and get a feel 
for what they think of that kind of practice (give them something ugly that 
works and ask "what do you think of this code?"). If they indicate they think 
that kind of hacking a bad idea, then at least you can say they lied if you 
have to get rid of them for that kind of thing.

Sep 28 2009

language_fan <foo bar.com.invalid> writes:

Mon, 28 Sep 2009 20:34:44 +0000, BCS thusly wrote:

 Hello Manfred_Nowak,
 
 BCS wrote:
 
 [...]
 
 I wouldn't want to hire a programmer that *habitually* (and
 unnecessarily) hacks past a feature designed to prevent bugs.
 

 In the short time of an interview it's not possible to test for habits
 (or necessity) to hack past a feature designed to prevent bugs.

 
 Good point, I guess that all that is left would be to try and get a feel
 for what they think of that kind of practice (give them something ugly
 that works and ask "what do you think of this code?"). If they indicate
 they think that kind of hacking a bad idea, then at least you can say
 they lied if you have to get rid of them for that kind of thing.

At least in the companies I have worked in they briefly teach you their 
stuff in 1-7 days and want to see some preliminary results. If you have 
trouble writing any code, you have lost the job (there is a 6 month test 
period or something similar so it is perfectly legal to kick him out if 
he fails). Usually the schedules are tight so hiring a lazy bastard is 
not worth the effort. Other ways to control the learning are working with 
a more experienced pair (pair programming - ever heard of it?) and weekly 
meetings.

Sep 28 2009

bearophile <bearophileHUGS lycos.com> writes:

Lutger:

 First, there is a huge demand for programmers, so much that even I got hired 
 in this time of crisis ;) Good programmers don't suddenly fall from the 
 skies apparently. 

This is the nicest thing I've read this week. Thank you very much :-)
Biologists aren't that lucky, apparently.

Bye,
bearophile

Sep 27 2009

Jesse Phillips <jessekphillips gmail.com> writes:

On Sun, 27 Sep 2009 10:10:19 -0400, Nick Sabalausky wrote:

 "Walter Bright" <newshound1 digitalmars.com> wrote in message
 news:h9n3k5$2eu9$1 digitalmars.com...
 Jason House wrote:
 Also, by "safe" I presume you mean "memory safe" which means free of
 memory corruption. Null pointer exceptions are memory safe. A null
 pointer could be caused by memory corruption, but it cannot *cause*
 memory corruption.

 I reject this argument too :( To me, code isn't safe if it crashes.

 Well, we can't discuss this if we cannot agree on terms. The
 conventional definition of memory safe means no memory corruption.

 
 He keeps saying "safe", and every time he does you turn it into "memory
 safe". If he meant "memory safe" he probably would have said something
 like "memory safe". He already made it perfectly clear he's talking
 about crashes, so continuing to put the words "memory safe" into his
 mouth doesn't help the discussion.

The thing is that memory safety is the only safety with code. In Walter's 
examples he very clearly showed that a crash is not unsafe, but operating 
with incorrect values is. He has pointed out that if initialization is 
enforced, whether with a default or by coder, there is a good chance it 
will be initialized to the wrong value.

Now if you really want to throw some sticks into the spokes, you would 
say that if the program crashes due to a null pointer, it is still likely 
that the programmer will just initialize/set the value to a "default" 
that still isn't valid just to get the program to continue to run.

Sep 27 2009

language_fan <foo bar.com.invalid> writes:

Sun, 27 Sep 2009 16:47:51 +0000, Jesse Phillips thusly wrote:

 The thing is that memory safety is the only safety with code. In
 Walter's examples he very clearly showed that a crash is not unsafe, but
 operating with incorrect values is. He has pointed out that if
 initialization is enforced, whether with a default or by coder, there is
 a good chance it will be initialized to the wrong value.

Have you ever used functional languages? When you develop in Haskell or 
SML, how often do you feel there is a good chance something will be 
initialized to the wrong value? Can you show some statistics that show 
how unsafe this practice is?

When the non-nullability is made optional, you *only* use it when you 
really know the initialization has a sane value, ok? Otherwise you can 
use the good old nullable references, right?


 Now if you really want to throw some sticks into the spokes, you would
 say that if the program crashes due to a null pointer, it is still
 likely that the programmer will just initialize/set the value to a
 "default" that still isn't valid just to get the program to continue to
 run.

Why should it crash in the first place? I hate crashes. You like them? I 
can prove by structural induction that you do not like them when you can 
avoid crashes with static checking.

Sep 27 2009

Jesse Phillips <jesse.k.phillips+d gmail.com> writes:

language_fan Wrote:

 Now if you really want to throw some sticks into the spokes, you would
 say that if the program crashes due to a null pointer, it is still
 likely that the programmer will just initialize/set the value to a
 "default" that still isn't valid just to get the program to continue to
 run.

 
 Why should it crash in the first place? I hate crashes. You like them? I 
 can prove by structural induction that you do not like them when you can 
 avoid crashes with static checking.

No one likes programs that crash, doesn't that mean it is an incorrect behavior
though?

 Have you ever used functional languages? When you develop in Haskell or 
 SML, how often do you feel there is a good chance something will be 
 initialized to the wrong value? Can you show some statistics that show 
 how unsafe this practice is?

So isn't that the question? Does/can "default" (by human or machine)
initialization create an incorrect state? If it does, do we continue to work as
if nothing was wrong or crash? I don't know how often the initialization would
be incorrect, but I don't think Walter is concerned with its frequency, but
that it is possible.

Sep 28 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 28 Sep 2009 15:35:07 -0400, Jesse Phillips  
<jesse.k.phillips+d gmail.com> wrote:

 language_fan Wrote:

 Have you ever used functional languages? When you develop in Haskell or
 SML, how often do you feel there is a good chance something will be
 initialized to the wrong value? Can you show some statistics that show
 how unsafe this practice is?

 So isn't that the question? Does/can "default" (by human or machine)  
 initialization create an incorrect state? If it does, do we continue to  
 work as if nothing was wrong or crash? I don't know how often the  
 initialization would be incorrect, but I don't think Walter is concerned  
 with its frequency, but that it is possible.

It creates an invalid, non-compiling program.

It's simple:

If initialization doesn't make sense, don't use non-nullable type.

If initialization makes sense, use non-nullable type, initialize with the  
correct value.

In case 1, we are back to current behavior, no problem (in Walter's eyes).

In case 2, we eliminate any possible crash due to non-initialization.

The subtle difference is the *default*.  If non-null is the default, then  
you haphazardly write code like this:

Object o;

And you get a compile error "error, please initialize o or declare as  
Object? o".  It makes you look at the line and say "hm... does it make  
sense to initialize there?" and you either put an initializer or you  
change it to

Object? o;

And move on.

90% of the time, you write something like:

auto x = new Object();

and you don't even have to think about it.  The compiler tells you when  
you got it wrong, and usually you then get it right after a moment of  
thought.


[...] analysis, not non-nullable defaults).  And I very seldom have null
[...] nice stack trace).  Compare that to D, where I build my program and get:


Segmentation fault.


I'd rather spend an extra 5 minutes having D compiler complain about  
initialization than face the Segmentation fault error search.

The thing is, I don't want D to cater to the moronic programmers that say  
"what? I need to initialize, ok, um.. here's a dummy object".  I want it  
to cater to *me* and prevent *me* from making simple errors where I  
obviously should have known better, but accidentally left out the  
initializer.

It's like the whole allowing object == null problem (coincidentally,  
resulting in the same dreaded error).  Once Walter implemented the  
compiler that flagged them all, he discovered Phobos had several of those  
*obviously incorrect* statements.  Hm... maybe he should do the same with  
this...  Maybe someone who can hack the compiler can do it for him!  Any  
takers?

-Steve

P.S.  I never make the object == null mistake anymore.  The compiler has  
trained me :)

Sep 28 2009

Jesse Phillips <jessekphillips gmail.com> writes:

On Mon, 28 Sep 2009 16:01:10 -0400, Steven Schveighoffer wrote:

 On Mon, 28 Sep 2009 15:35:07 -0400, Jesse Phillips
 <jesse.k.phillips+d gmail.com> wrote:
 
 language_fan Wrote:

 Have you ever used functional languages? When you develop in Haskell
 or SML, how often do you feel there is a good chance something will be
 initialized to the wrong value? Can you show some statistics that show
 how unsafe this practice is?

 So isn't that the question? Does/can "default" (by human or machine)
 initialization create an incorrect state? If it does, do we continue to
 work as if nothing was wrong or crash? I don't know how often the
 initialization would be incorrect, but I don't think Walter is
 concerned with its frequency, but that it is possible.

 
 It creates an invalid, non-compiling program.

No it doesn't, I'm not referring to null as the invalid state.

float a;

In this program it is invalid for 'a' to equal zero. If the compiler 
complains it is not initialized the programmer could fulfill the 
requirements.

float a = 0;

Hopefully the programmer knows that it shouldn't be 0, but a correction 
like this is still possible, the compiler won't complain and the program 
won't crash. Depending on what 'a' is controlling this could be very bad.
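
A minimal sketch of the difference in D, where a `float` declared without an initializer starts out as NaN rather than 0. The function names `unsetGain`, `dummyGain` and `apply` are hypothetical, chosen only to illustrate the point:

```d
import std.math : isNaN;

// Hypothetical: a value the programmer never meaningfully set.
float unsetGain()
{
    float a;        // D's default initializer for float is NaN
    return a;
}

// Hypothetical: the "fix" that merely satisfies an initialization rule.
float dummyGain()
{
    float a = 0;    // compiles, but 0 is assumed invalid for this program
    return a;
}

// Hypothetical consumer of the value.
float apply(float gain, float reading)
{
    return gain * reading;
}
```

Here `apply(unsetGain(), 10.0f)` yields NaN, which propagates and can be caught later with `isNaN`, whereas `apply(dummyGain(), 10.0f)` yields a plausible-looking 0 that nothing flags as wrong.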

I'm really not arguing either way, I'm trying to make it clear since no 
one seems to be getting Walter's position.

BTW, what is it with people writing

SomeObject foo;

If they believe the compiler should enforce explicit initialization? If 
you think an object should always be initialized at declaration don't 
write a statement that only declares and don't set a reference to null.

Sep 28 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 28 Sep 2009 21:43:20 -0400, Jesse Phillips  
<jessekphillips gmail.com> wrote:

 On Mon, 28 Sep 2009 16:01:10 -0400, Steven Schveighoffer wrote:

 On Mon, 28 Sep 2009 15:35:07 -0400, Jesse Phillips
 <jesse.k.phillips+d gmail.com> wrote:

 language_fan Wrote:

 Have you ever used functional languages? When you develop in Haskell
 or SML, how often do you feel there is a good chance something will be
 initialized to the wrong value? Can you show some statistics that show
 how unsafe this practice is?

 So isn't that the question? Does/can "default" (by human or machine)
 initialization create an incorrect state? If it does, do we continue to
 work as if nothing was wrong or crash? I don't know how often the
 initialization would be incorrect, but I don't think Walter is
 concerned with its frequency, but that it is possible.

 It creates an invalid, non-compiling program.

 No it doesn't, I'm not referring to null as the invalid state.

 float a;

 In this program it is invalid for 'a' to equal zero. If the compiler
 complains it is not initialized the programmer could fulfill the
 requirements.

I am not arguing for floats (or any value types) to be required to be  
initialized.

 float a = 0;

 Hopefully the programmer knows that it shouldn't be 0, but a correction
 like this is still possible, the compiler won't complain and the program
 won't crash. Depending on what 'a' is controlling this could be very bad.

 I'm really not arguing either way, I'm trying to make it clear since no
 one seems to be getting Walter's position.

I get his arguments, but I think they are based on a non-analogous  
situation.  I think his arguments are based on his experience with  
compilers or corporate rules requiring what you were saying --  
initializing all variables.  We don't want that, we just want the  
developer to clarify "this variable is initialized" or "this variable is  
ok to be uninitialized".

 BTW, what is it with people writing

 SomeObject foo;

 If they believe the compiler should enforce explicit initialization? If
 you think an object should always be initialized at declaration don't
 write a statement that only declares and don't set a reference to null.

It's more complicated than that.  For example, you *have* to write this  
for objects that are a part of aggregates:

class SomeOtherObject
{
   // can't initialize here, because you need to use the heap, and
   // compiler only allows CTFE initialization.
   SomeObject foo;

   this()
   {
      foo = new SomeObject(); // here is where the initialization sits.
   }
}

This is ok, but what if the initialization is buried, or you add another  
variable to a large class and forgot to add the initializer to the  
constructor?

And there *are* cases where you *don't* want to initialize, that should  
also be possible:

SomeObject? foo;

If this wasn't part of the proposal, I'd agree with Walter 100%, but it  
gives the lazy programmer an easy way to default to the current behavior  
(easier than building some dummy object), so given the lazy nature of said  
programmer, they are more likely to do this than assign a dummy value.

-Steve

Sep 29 2009

language_fan <foo bar.com.invalid> writes:

Mon, 28 Sep 2009 15:35:07 -0400, Jesse Phillips thusly wrote:

 language_fan Wrote:
 
 Now if you really want to throw some sticks into the spokes, you
 would say that if the program crashes due to a null pointer, it is
 still likely that the programmer will just initialize/set the value
 to a "default" that still isn't valid just to get the program to
 continue to run.

 
 Why should it crash in the first place? I hate crashes. You like them?
 I can prove by structural induction that you do not like them when you
 can avoid crashes with static checking.

 
 No one likes programs that crash, doesn't that mean it is an incorrect
 behavior though?
 
 Have you ever used functional languages? When you develop in Haskell or
 SML, how often do you feel there is a good chance something will be
 initialized to the wrong value? Can you show some statistics that show
 how unsafe this practice is?

 
 So isn't that the question? Does/can "default" (by human or machine)
 initialization create an incorrect state? If it does, do we continue to
 work as if nothing was wrong or crash? I don't know how often the
 initialization would be incorrect, but I don't think Walter is concerned
 with its frequency, but that it is possible.

Value types can be incorrectly initialized and nobody notices. E.g.

  int min;

  foreach(int value; list)
    if (value < min) min = value;

Oops, you forgot to define a flag variable or initialize to int.max (if 
that is what you want). Even Java IDEs spot this error, but not D. The 
flow analysis helps me in tremendous ways - I can fix the error 
statically and boom, the software is suddenly again error free.
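
A small sketch of the problem in D, assuming `list` is an array of `int`. The function names `minBuggy` and `minSeeded` are hypothetical. In D an `int` declared without an initializer is 0, so the loop above silently returns 0 whenever every element is positive:

```d
// The loop from the post: 'min' starts at int.init, which is 0 in D.
int minBuggy(const(int)[] list)
{
    int min;
    foreach (int value; list)
        if (value < min) min = value;
    return min;
}

// One possible fix: seed with the largest int so any element replaces it.
// An empty list then returns int.max, which the caller has to know about.
int minSeeded(const(int)[] list)
{
    int min = int.max;
    foreach (int value; list)
        if (value < min) min = value;
    return min;
}
```

For `[5, 3, 8]` the first version returns 0 and the second returns 3. Neither version crashes, which is why this class of mistake goes unnoticed without flow analysis or tests.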

Now I can tell you, in functional languages there is no other way. All 
initializations have to be correct, they are final, they are constants 
and they can be initialized incorrectly. But there are some tools that 
help in this. Functions can be automatically tested. Invariants, pre- and 
post-conditions can be set. Still, I can even bet they are much safer 
than D in every possible way. How is this possible?

It really depends on your subjective opinion whether you want a program 
to segfault or spot a set of errors statically, and have illegally 
behaving non-crashing programs. I say FFFFFFFFFFUUUUUUUUUUU every time I 
experience a segfault. My hobby programs at home are not that critical, 
and at work the critical code is *proven* to be correct so no need to 
worry there.

Sep 28 2009

language_fan <foo bar.com.invalid> writes:

Mon, 28 Sep 2009 22:33:26 +0000, language_fan thusly wrote:

 Value types can be incorrectly initialized and nobody notices. E.g.
 
   int min;
 
   foreach(int value; list)
     if (value < min) min = value;

 Now I can tell you, in functional languages there is no other way. All
 initializations have to be correct, they are final, they are constants
 and they can be initialized incorrectly. But there are some tools that
 help in this. Functions can be automatically tested. Invariants, pre-
 and post-conditions can be set. Still, I can even bet they are much
 safer than D in every possible way. How is this possible?

For instance if I use the example given above, I write it like this in a 
functional language:

find_min:: Ord a => [a] -> Maybe a
find_min [] = Nothing
find_min (h:t) = Just $ foldl min h t

You can then use quickcheck to verify the result in some fancy way.

I just cannot think of any way how you could crash programs written in 
this way. They are solid as a rock.

Sep 28 2009

"Nick Sabalausky" <a a.a> writes:

"language_fan" <foo bar.com.invalid> wrote in message 
news:h9relp$1ebg$4 digitalmars.com...
 Mon, 28 Sep 2009 22:33:26 +0000, language_fan thusly wrote:

 Value types can be incorrectly initialized and nobody notices. E.g.

   int min;

   foreach(int value; list)
     if (value < min) min = value;

 Now I can tell you, in functional languages there is no other way. All
 initializations have to be correct, they are final, they are constants
 and they can be initialized incorrectly. But there are some tools that
 help in this. Functions can be automatically tested. Invariants, pre-
 and post-conditions can be set. Still, I can even bet they are much
 safer than D in every possible way. How is this possible?

 For instance if I use the example given above, I write it like this in a
 functional language:

 find_min:: Ord a => [a] -> Maybe a
 find_min [] = Nothing
 find_min (h:t) = Just $ foldl min h t

 You can then use quickcheck to verify the result in some fancy way.

 I just cannot think of any way how you could crash programs written in
 this way. They are solid as a rock.

I'm not particularly accustomed to that sort of syntax. Am I correct in my 
analysis that that essentially does something like this?:

// Assuming that:
// 1. Variables of type void could be declared and had value 'void'.
// 2. 'any(T,U,V)' was a "supertype" that can and must be one (and only one)
//    of T, U, or V.

immutable any(int,void) min(immutable any(int,void) a, immutable 
any(int,void) b)
{
    static if( is(typeof(a) == void) && is(typeof(b) == void) )
        return void;
    else static if( is(typeof(a) == int) && is(typeof(b) == void) )
        return a;
    else static if( is(typeof(a) == void) && is(typeof(b) == int) )
        return b;
    else
        return a<b? a : b;
}

immutable any(int,void) findMin(immutable int[] list)
{
    static if(list.length == 0)
        return void;
    else
        return reduce!("min(a,b)")(list); // 'reduce' from phobos2
}
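
For comparison, a hedged sketch of the same shape in present-day D using `std.typecons.Nullable` in place of the hypothetical `any(int,void)` type. This is one possible rendering, not the code either poster had in mind, and it uses an ordinary runtime `if` because the array length is not known at compile time:

```d
import std.typecons : Nullable, nullable;

// Returns an empty Nullable for an empty list, otherwise the smallest element.
// Works for any element type T that supports '<' and can be copied.
Nullable!T findMin(T)(const(T)[] list)
{
    if (list.length == 0)
        return Nullable!T.init;   // plays the role of Nothing

    T best = list[0];
    foreach (value; list[1 .. $])
        if (value < best)
            best = value;
    return nullable(best);        // plays the role of Just
}
```

A caller has to check `isNull` before calling `get`, so the empty case cannot be ignored silently the way an uninitialized `int min` can.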

Sep 28 2009

language_fan <foo bar.com.invalid> writes:

Mon, 28 Sep 2009 20:17:54 -0400, Nick Sabalausky thusly wrote:

 "language_fan" <foo bar.com.invalid> wrote in message
 news:h9relp$1ebg$4 digitalmars.com...
 Mon, 28 Sep 2009 22:33:26 +0000, language_fan thusly wrote:

 Value types can be incorrectly initialized and nobody notices. E.g.

   int min;

   foreach(int value; list)
     if (value < min) min = value;

 Now I can tell you, in functional languages there is no other way. All
 initializations have to be correct, they are final, they are constants
 and they can be initialized incorrectly. But there are some tools that
 help in this. Functions can be automatically tested. Invariants, pre-
 and post-conditions can be set. Still, I can even bet they are much
 safer than D in every possible way. How is this possible?

 For instance if I use the example given above, I write it like this in
 a functional language:

 find_min:: Ord a => [a] -> Maybe a
 find_min [] = Nothing
 find_min (h:t) = Just $ foldl min h t

 You can then use quickcheck to verify the result in some fancy way.

 I just cannot think of any way how you could crash programs written in
 this way. They are solid as a rock.

 
 I'm not particularly accustomed to that sort of syntax. Am I correct in
 my analysis that that essentially does something like this?:
 
 // Assuming that:
 // 1. Variables of type void could be declared and had value 'void'.
 // 2. 'any(T,U,V)' was a "supertype" that can and must be one (and only
 //    one) of T, U, or V.
 
 immutable any(int,void) min(immutable any(int,void) a, immutable
 any(int,void) b)
 {
     static if( is(typeof(a) == void) && is(typeof(b) == void) )
         return void;
     else static if( is(typeof(a) == int) && is(typeof(b) == void) )
         return a;
     else static if( is(typeof(a) == void) && is(typeof(b) == int) )
         return b;
     else
         return a<b? a : b;
 }
 
 immutable any(int,void) findMin(immutable int[] list) {
     static if(list.length == 0)
         return void;
     else
         return reduce!("min(a,b)")(list); // 'reduce' from phobos2
 }

Well to be honest, I thought I knew how to read D, but this is starting 
to look a bit scary. It looks like it does almost the same. I just used 
lists instead of arrays since they are the basic data type in functional 
code. Second, the find_min accepted any type that implements the 'Ord' 
class, i.e. supports  the '<' relation, not only ints. I guess it could 
be solved by changing some pieces of code to look like this:

 immutable any(T,void) findMin(T)(immutable T[] list) {

My original idea was to just show that it is much harder to make similar 
kinds of errors with algebraic data types. I should have made a less 
generic example :-)

Sep 28 2009

Adam Burton <adz21c googlemail.com> writes:

I don't know if what I am about to rant about has already been discussed and 
I haven't noticed, but sometimes I feel like sticking my opinions in and 
this seems to be one of those times :-) so bear with me and we'll see if I am 
a crazy man blabbing on about crap or not :-).

language_fan wrote:

 Mon, 28 Sep 2009 15:35:07 -0400, Jesse Phillips thusly wrote:
 
 language_fan Wrote:
 
 Now if you really want to throw some sticks into the spokes, you
 would say that if the program crashes due to a null pointer, it is
 still likely that the programmer will just initialize/set the value
 to a "default" that still isn't valid just to get the program to
 continue to run.

 
 Why should it crash in the first place? I hate crashes. You liek them?
 I can prove by structural induction that you do not like them when you
 can avoid crashes with static checking.

 
 No one likes programs that crash, doesn't that mean it is an incorrect
 behavior though?
 
 Have you ever used functional languages? When you develop in Haskell or
 SML, how often do you feel there is a good chance something will be
 initialized to the wrong value? Can you show some statistics that show
 how unsafe this practice is?

 
 So isn't that the question? Does/can "default" (by human or machine)
 initialization create an incorrect state?


Yes, but this is possible now anyway. Consider:

Foo obj;  // Machine default of null, right?
obj.bar(); // Null pointer exception, because null is a bad state for the app

Now in steps the moron programmer, who would put garbage data into non-nullable 
vars to init them, to fix the issue:

// Fix to avoid the null pointer exception (and yes, I have seen people do this)
Foo obj = new Foo("bleh");
obj.bar(); // Logic error, but the application soldiers on.

 If it does, do we continue to
 work as if nothing was wrong or crash?


Depends on the application's specification/purpose/design? See what I mean in 
a few lines, but I don't see how this is pertinent to the discussion. [1]

 I don't know how often the
 initialization would be incorrect, but I don't think Walter is concerned
 with its frequency, but that it is possible.


Not sure what you're getting at, but with my moron programmer example I have 
shown it's possible to insert garbage into classes; maybe we should drop them 
too? Also, the default machine initialization seemed to screw up too. I think 
there's a point where you have to trust the human factor to do its job 
correctly. If the feature were so ridiculously complex (like depending on 
planetary alignment) that it forced the programmer into stupid practices, 
then fair enough: even if most are likely to get it right, that's a problem 
with the feature, not the programmer (although I would personally say this 
isn't the case, provided I have understood the feature correctly :-P), if 
that makes sense. Any technical issues (e.g. I believe someone mentioned 
enforcing it in structs allocated with malloc) are good points that I am 
just not technical enough to comment on; consider me the casual hobby reader 
who has an interest, but not a good background, in systems languages. 
However, the previous discussions, as I remember them, seem to assert that 
the programmer is an idiot who will initialize with crap, which I think is 
just out of the language's control.
 ...
 
 It really depends on your subjective opinion whether you want a program
 to segfault or spot a set of errors statically, and have illegally
 behaving non-crashing programs. I say FFFFFFFFFFUUUUUUUUUUU every time I
 experience a segfault. My hobby programs at home are not that critical,
 and at work the critical code is *proven* to be correct so no need to
 worry there.

[1] I think the above touches on an important point when it comes to whether 
it should crash or continue: without more in-depth knowledge it's hard to 
say. In some applications it may be possible to crash a process within itself 
(so just throw an exception) and return the application to a reasonable state 
from which it may continue (like crashing back to the application's main menu 
and letting you start again). Other apps you may want to kill there and then 
(but have them die gracefully, so roll back transactions etc.) before they do 
more harm.

Regardless, the above two arguments about crashing vs continuing and the 
incompetence of some developers seem to have no bearing on non-nullables. 
Ignoring the fact that a function with all non-nullable variables could still 
crash with something other than a null pointer exception, it seems to me 
that, if anything, non-nullables just make the application crash earlier when 
it receives a null where one is not expected. Consider the code below, 
implemented "normally".

void FuncOne(Foo foo)
{
   ....
   foo = null;   // The bug
   ....
   FuncTwo(foo);
   ....
}

// Does not expect null
void FuncTwo(Foo foo)
{
   foo.bar();   //null pointer exception
}

Trivial example but consider there are chunks of code you can't see that may 
also use foo that you would need to investigate to see if they set it to 
null, so plenty of code paths to search. Now consider with non-nullables.

void FuncOne(Foo? foo)
{
   ....
   foo = null;   // The bug
   ....
   FuncTwo(enforce(foo));   // [2]
   ....
}

// Does not expect null
void FuncTwo(Foo foo)
{
   foo.bar();
}

[2] Here I am guessing at what people mean by enforce. My assumption is that 
it checks whether foo is null and throws a null pointer exception if so. 
Otherwise it lets the application continue executing and also skips the 
compiler check that we are passing a nullable into a non-nullable.
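
As a rough illustration of the runtime half of that guess: current Phobos has `std.exception.enforce`, which throws an `Exception` when its argument is null (or otherwise false) and returns the argument unchanged when it is not. The sketch below uses it with ordinary, nullable D class references. The `Foo?` syntax and the compile-time check it would bypass are part of the proposal, not of D, so they do not appear here. `Foo`, `funcOne` and `funcTwo` are placeholder names.

```d
import std.exception : enforce, assertThrown;

class Foo
{
    int bar() { return 42; }
}

// Does not expect null.
int funcTwo(Foo foo)
{
    return foo.bar();
}

int funcOne(Foo foo)
{
    // Fails here, at the boundary, instead of somewhere inside funcTwo.
    return funcTwo(enforce(foo, "foo must not be null at this point"));
}

void main()
{
    assert(funcOne(new Foo) == 42);
    assertThrown!Exception(funcOne(null));   // caught at the enforce call
}
```

The exception carries the file and line of the `enforce` call, which is what narrows the search compared with a null dereference deep inside `funcTwo`.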

So first off, without enforce, [2] would have been a compiler error that 
would have made me investigate this potential bug anyway, whether that meant 
an alternate code path or a more in-depth look at the design. But let's 
assume I think it should never get to that state, because it's not valid for 
foo to be null at [2] (although it is elsewhere in FuncOne). On execution we 
then get an exception at [2], so we die earlier than we did in the nullable 
form. Not only do we have less to search (a lot less, because the trace is 
shorter and any other functions that only take non-nullables drop out of the 
code paths to check, which sounds productive), but we also kill the 
application earlier, before it does even more damage (like putting a plane 
into a dive, maybe?).

I wanted to point that out because I am sure Walter noticed that it moves the 
error from one place to another, but I don't think anyone has pointed out 
that it is a way to identify bad application state earlier rather than later, 
which seems to be the focus of one argument. Code paths that allow nulls 
sooner or later turn into ones that don't, because otherwise the variables 
are pointless. By telling the compiler where a path turns non-null, you get 
it to trigger the exceptions early, which I would think is in line with 
crashing the application when there is bad state.

I also see non-nullables helping track down potential bugs when changing 
variables to non-nullable and removing unnecessary code for vice versa.

Seems to me non-nullable is in the same sort of area as const/immutable. 
Whereas immutable or const prevents data changes in order to stop bad 
application state, non-nullables prevent null from going where it's not 
allowed, for the same reason, with numerous other productivity benefits.

Are these the ramblings of a sleep deprived mad man getting involved with 
things he doesn't understand? you decide :-).

Sep 28 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

language_fan wrote:
 Mon, 28 Sep 2009 15:35:07 -0400, Jesse Phillips thusly wrote:
 
 language_fan Wrote:

 Now if you really want to throw some sticks into the spokes, you
 would say that if the program crashes due to a null pointer, it is
 still likely that the programmer will just initialize/set the value
 to a "default" that still isn't valid just to get the program to
 continue to run.

 Why should it crash in the first place? I hate crashes. You liek them?
 I can prove by structural induction that you do not like them when you
 can avoid crashes with static checking.

 No one likes programs that crash, doesn't that mean it is an incorrect
 behavior though?

 Have you ever used functional languages? When you develop in Haskell or
 SML, how often do you feel there is a good chance something will be
 initialized to the wrong value? Can you show some statistics that show
 how unsafe this practice is?

 So isn't that the question? Does/can "default" (by human or machine)
 initialization create an incorrect state? If it does, do we continue to
 work as if nothing was wrong or crash? I don't know how often the
 initialization would be incorrect, but I don't think Walter is concerned
 with its frequency, but that it is possible.

 
 Value types can be incorrectly initialized and nobody notices. E.g.
 
   int min;
 
   foreach(int value; list)
     if (value < min) min = value;
 
 Oops, you forgot to define a flag variable or initialize to int.min

You mean int.max :o).
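
To spell the correction out: a D `int` declared without an initializer starts at `int.init`, which is 0, so the quoted loop can never report a minimum above 0. Seeding with `int.min` is worse still, since nothing is smaller. A small sketch, with `buggyMin` and `fixedMin` as made-up names for the two variants:

```d
// Default-initialized accumulator: starts at int.init, i.e. 0.
int buggyMin(const int[] list)
{
    int lowest;
    foreach (value; list)
        if (value < lowest) lowest = value;
    return lowest;
}

// Accumulator seeded with the largest int, as the correction suggests.
int fixedMin(const int[] list)
{
    int lowest = int.max;
    foreach (value; list)
        if (value < lowest) lowest = value;
    return lowest;
}

void main()
{
    assert(buggyMin([5, 3, 8]) == 0);   // 0 is not even in the list
    assert(fixedMin([5, 3, 8]) == 3);
}
```

Note that `fixedMin` still returns `int.max` for an empty list, so the "no minimum" case has to be handled separately.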

Andrei

Sep 28 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Andrei Alexandrescu wrote:
 language_fan wrote:
 Mon, 28 Sep 2009 15:35:07 -0400, Jesse Phillips thusly wrote:

 language_fan Wrote:

 Now if you really want to throw some sticks into the spokes, you
 would say that if the program crashes due to a null pointer, it is
 still likely that the programmer will just initialize/set the value
 to a "default" that still isn't valid just to get the program to
 continue to run.

 Why should it crash in the first place? I hate crashes. You liek them?
 I can prove by structural induction that you do not like them when you
 can avoid crashes with static checking.

 No one likes programs that crash, doesn't that mean it is an incorrect
 behavior though?

 Have you ever used functional languages? When you develop in Haskell or
 SML, how often do you feel there is a good chance something will be
 initialized to the wrong value? Can you show some statistics that show
 how unsafe this practice is?

 So isn't that the question? Does/can "default" (by human or machine)
 initialization create an incorrect state? If it does, do we continue to
 work as if nothing was wrong or crash? I don't know how often the
 initialization would be incorrect, but I don't think Walter is concerned
 with its frequency, but that it is possible.

 Value types can be incorrectly initialized and nobody notices. E.g.

   int min;

   foreach(int value; list)
     if (value < min) min = value;

 Oops, you forgot to define a flag variable or initialize to int.min

 
 You mean int.max :o).
 
 Andrei

He just proved how enforcing initializers can still cause errors! I 
didn't even think of that one!

:o)

Sep 28 2009

Derek Parnell <derek psych.ward> writes:

On Mon, 28 Sep 2009 19:27:03 -0500, Andrei Alexandrescu wrote:

 language_fan wrote:
 
   int min;
 
   foreach(int value; list)
     if (value < min) min = value;
 
 Oops, you forgot to define a flag variable or initialize to int.min

 
 You mean int.max :o).

  if (list.length == 0)
     throw new Exception("empty list"); // An empty or null list has no minimum
  int min = list[0];
  foreach(int value; list[1..$])
    if (value < min) min = value;


I'm still surprised by Walter's stance.

For the purposes of this discussion...
* Null only applies to the memory address portion of reference types and
not to value types. The discussion is not about non-nullable value types.
* There are two types of reference types:
  (1) Those that can be initialized on declaration because the coder knows
what to initialize them to; a.k.a. non-nullable. If the coder does not know
what to initialize them to at declaration time, then either the design is
wrong, the coder doesn't understand the algorithm or application, or it is
truly a complex run-time decision.
  (2) Those that aren't in set (1); a.k.a. nullable.
* The standard declaration should imply non-nullable. And if not
initialized the compiler should complain. This encourages protection, but
does not guarantee it, of course.
* To declare a nullable type, use a special syntax to denote that the coder
is deliberately choosing to declare a nullable reference.
* The compiler will prevent non-nullable types being simply set to null. As
D is a system language too, there will be rare cases that need to subvert
this compiler protection, so there will need to be a method to explicitly
set a non-nullable type to null. The point is that such a method should
be a visible warning beacon to maintenance coders.

Priority should be given to coders that prefer safe coding. If a coder, for
whatever reason, chooses to use nullable references or initialize
non-nullable reference to rubbish data, then the responsibility is on them
to ensure safe applications. Safe coding practices should not be penalized.

The C/C++ programming language is inherently "unsafe" in this regard, and
that is not news to anyone. The D programming language does not have to
follow this paradigm.

I'm still not ready to use D for anything, but I watch it in hope.

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell

Sep 28 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Derek Parnell wrote:
 On Mon, 28 Sep 2009 19:27:03 -0500, Andrei Alexandrescu wrote:
 
 language_fan wrote:
   int min;

   foreach(int value; list)
     if (value < min) min = value;

 Oops, you forgot to define a flag variable or initialize to int.min

 You mean int.max :o).

 
   if (list.length == 0)
      throw new Exception("empty list"); // An empty or null list has no minimum
   int min = list[0];
   foreach(int value; list[1..$])
     if (value < min) min = value;
 
 
 I'm still surprised by Walter's stance.
 
 For the purposes of this discussion...
 * Null only applies to the memory address portion of reference types and
 not to value types. The discussion is not about non-nullable value types.
 * There are two types of reference types:
   (1) Those that can be initialized on declaration because the coder knows
 what to initialize them to; a.k.a. non-nullable. If the coder does not know
 what to initialize them to at declaration time, then either the design is
 wrong, the coder doesn't understand the algorithm or application, or it is
 truly a complex run-time decision.
   (2) Those that aren't in set (1); a.k.a. nullable.
 * The standard declaration should imply non-nullable. And if not
 initialized the compiler should complain. This encourages protection, but
 does not guarantee it, of course.
 * To declare a nullable type, use a special syntax to denote that the coder
 is deliberately choosing to declare a nullable reference.
 * The compiler will prevent non-nullable types being simply set to null. As
 D is a system language too, there will be rare cases that need to subvert
 this compiler protection, so there will need to be a method to explicitly
 set a non-nullable type to null. The point is that such a method should
 be a visible warning beacon to maintenance coders.
 
 Priority should be given to coders that prefer safe coding. If a coder, for
 whatever reason, chooses to use nullable references or initialize
 non-nullable reference to rubbish data, then the responsibility is on them
 to ensure safe applications. Safe coding practices should not be penalized.
 
 The C/C++ programming language is inherently "unsafe" in this regard, and
 that is not news to anyone. The D programming language does not have to
 follow this paradigm.

But it doesn't have to follow the paranoid safety paradigm either. I 
wouldn't like two reference types and casting between the two when 
they're essentially the same, with one having a single value that can't 
be set out of 4 billion possibilities. Seems like a waste to me, 
especially since 3 billion of these possibilities will result in the 
same segfault crash as the one you're trying to make illegal on 
nonnull types.

 I'm still not ready to use D for anything, but I watch it in hope.

I'm already using D quite a lot, and I don't find null vs nonnull references 
all that meaningful. Like Walter said, you can just make your own 
nonnull invariant.

Here's a very, very simple wrapper, took 10 seconds to write:

struct NonNull(C) if(is(C == class)) {
	C obj; // 'ref' is a keyword in D, so the field needs another name
	invariant() { assert(obj !is null); }
	C opDot() { return obj; } // returns C; no 'T' is declared here
}
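
A variant of this wrapper that should build with a current D compiler is sketched below. It is an adaptation rather than the code as posted: it assumes `alias this` for forwarding instead of `opDot` (which, as far as I know, later compilers dropped) and adds `@disable this();` so that a declaration without an initializer is rejected. The field name `payload` and the class `Foo` are placeholders.

```d
class Foo
{
    int bar() { return 42; }
}

struct NonNull(C) if (is(C == class))
{
    private C payload;

    @disable this();   // "NonNull!Foo x;" without an initializer is an error

    this(C value)
    {
        assert(value !is null, "NonNull built from a null reference");
        payload = value;
    }

    // Checked around public member calls when contracts are enabled.
    invariant() { assert(payload !is null); }

    inout(C) get() inout { return payload; }

    alias get this;    // lets foo.bar() forward to the wrapped object
}

void main()
{
    auto foo = NonNull!Foo(new Foo);
    assert(foo.bar() == 42);

    // Default construction does not compile.
    static assert(!__traits(compiles, { NonNull!Foo unset; }));
}
```

The null check itself is still a runtime assert, so it disappears in release builds, and `NonNull!Foo.init` can still produce a wrapper around null. Only the missing-initializer case is caught at compile time.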

C++ has all sorts of pointer wrappers like this one, yet you don't see a 
smart pointer feature in the C++ language itself merely because such 
wrappers are widely used and safer. In fact, leaving the semantics of these 
pointers up to libraries allows any project to write its own custom ones, 
and quite a lot do.

It should be the same for D. I believe it's better to implement flow 
analysis and let the compiler warn you of uninitialized variables (which 
will solve most null references, the rest being handled by 
NonNull!Object fields). The compiler could also provide better tools to 
build smart wrapper types upon (like forcing initialization or preventing 
void initialization, heck, even providing a tuple of valid initializers) 
and let libraries write their own.

Jeremie

Sep 29 2009

Rainer Deyke <rainerd eldwood.com> writes:

Jeremie Pelletier wrote:
 struct NonNull(C) if(is(C == class)) {
     C obj;
     invariant() { assert(obj !is null); }
     C opDot() { return obj; }
 }

This only catches null errors at runtime.  The whole point of a non-null
type is to catch null errors at compile time.


-- 
Rainer Deyke - rainerd eldwood.com

Sep 29 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Rainer Deyke wrote:
 Jeremie Pelletier wrote:
 struct NonNull(C) if(is(C == class)) {
     C obj;
     invariant() { assert(obj !is null); }
     C opDot() { return obj; }
 }

 
 This only catches null errors at runtime.  The whole point of a non-null
 type is to catch null errors at compile time.
 

That's what flow analysis is for, since these are mostly uninitialized 
variables rather than null ones.

It's dead easy to insert null into a nonnull reference, and since you 
expect the type to never be null, it's the last thing you're gonna check. 
If variables are properly initialized, you'll never get null where you 
don't expect it; that is checked at compile time too, and works on 
every type.

Sep 29 2009

bearophile <bearophileHUGS lycos.com> writes:

Jeremie Pelletier:

 It's dead easy to insert null into a nonnull reference,

If it's easy to put a null into a nonnull by *mistake*, then that system needs
to be designed better.


 and since you 
 expect the type to never be null, it's the last thing you're gonna check.

I agree, but I think in a well designed system such situations are really
uncommon.


 If variables are properly initialized, you'll never get null where you 
 don't expect it, and those are checked at compile time too, and work on 
 every type.

Cyclone is an example of language where there is both flow analysis (in a very
C-like language that allows some kinds of gotos too, maybe someone here may
read their source code and adapt it to D. [One of the weirder characteristics
of open source programs is that hardly anyone ever reads/copies code/solutions
from other open source projects; and I don't think those stupid/idiotic
differences in OSS licences are enough to justify such behaviours. I think
there's also a strong amount of NIH syndrome. So I don't hold my breath for the

be tuned and used for both such open source languages/implementations, that
have different but not totally different GC needs]) and optional nonnull
references (well, pointers). I think Cyclone shows how to design a safer C-like
language. And making D safer is simpler than making C safer, despite D being
more complex than C.

Bye,
bearophile

Sep 29 2009

Rainer Deyke <rainerd eldwood.com> writes:

Jeremie Pelletier wrote:
 Rainer Deyke wrote:
 This only catches null errors at runtime.  The whole point of a non-null
 type is to catch null errors at compile time.

 
 That's what flow analysis is for, since these are mostly uninitialized
 variables rather than null ones.

Nitpick: there are no uninitialized variables in D (unless you
especially request them).  There are explicitly initialized variables
and default-initialized variables.

I can see the argument for disabling default initialization and
requiring explicit initialization.  You don't even need flow analysis
for that.  However, that doesn't address the problem that non-null
references are intended to solve.  It's still possible to explicitly
store a null value in a non-null reference without the problem being
detected at compile time.


-- 
Rainer Deyke - rainerd eldwood.com

Sep 29 2009

bearophile <bearophileHUGS lycos.com> writes:

Jesse Phillips:

 The thing is that memory safety is the only safety with code.


errors. That's another kind of safety.
If you look at safety-critical code, the kind Walter was talking about, you see
people test code (at compile time too) very well, looking for an enormous number
of possible errors. Doing this increases code safety. So you can have ABS
brakes, CT scanners in hospitals, autopilots and so on.

Bye,
bearophile

Sep 27 2009

Rainer Deyke <rainerd eldwood.com> writes:

Jesse Phillips wrote:
 The thing is that memory safety is the only safety with code.

That is such bullshit.  For example, this:

  class A {
  }

  class B {
  }

  A x = new B;

No memory access violation (yet).  Clearly incorrect.  Detecting this at
compile time is clearly a safety feature, and a good one.

You could argue that assigning a 'B' to a variable that is declared to
hold an 'A' is already a memory safety violation.  If so, then the exact
same argument also applies to assigning 'null' to the same variable.
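
The asymmetry described here can be checked directly with `__traits(compiles, ...)`, which reports whether a piece of code would be accepted. This is only a sketch of the current language rules: the mismatched class assignment is rejected at compile time, while the null assignment is accepted and left to fail at run time.

```d
class A {}
class B {}

void main()
{
    // Unrelated class types: rejected by the compiler.
    static assert(!__traits(compiles, { A x = new B; }));

    // Null in a variable declared to hold an A: accepted.
    static assert(__traits(compiles, { A x = null; }));
}
```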


-- 
Rainer Deyke - rainerd eldwood.com

Sep 27 2009

Jesse Phillips <jesse.k.phillips+d gmail.com> writes:

Rainer Deyke Wrote:

 You could argue that assigning a 'B' to a variable that is declared to
 hold an 'A' is already a memory safety violation.

Yeah, it was brought to my attention by a friend that "type safety" could be
another form. bearophile also brings up a good example.

 If so, then the exact argument also applies to assigning 'null' to the same
 variable.

I think that is what Walter is getting at, you're not dealing with memory that
is correct, when this happens the program should halt and be dealt with from
outside the program.

Sep 28 2009

Rainer Deyke <rainerd eldwood.com> writes:

Jesse Phillips wrote:
 Yeah, it was brought to my attention by a friend that "type safety"
 could be another form. bearophile also brings up a good example.

<snip>

 I think that is what Walter is getting at, you're not dealing with
 memory that is correct, when this happens the program should halt and
 be dealt with from outside the program.

Type errors and null pointer errors both belong to the same class of
errors, namely variables containing bogus contents.  Some languages like
Python detect both at runtime.  That's fine for those languages.
However, I prefer to detect as many errors as possible at compile time,
especially for larger projects.

Nullable types turn compile time errors into runtime errors which may or
may not be detected during testing.  In the worst case, nullable types
lead to silent data corruption.  Consider what happens when a bogus null
field is serialized.


-- 
Rainer Deyke - rainerd eldwood.com

Sep 28 2009

BCS <none anon.com> writes:

Hello Walter,

 The only reasonable thing a program can do if it discovers it is in an
 unknown state is to stop immediately.
 

This whole thread is NOT about what to do on unknown states. It is about 
using the compiler to statically remove the possibility of one type of unknown 
state ever happening.

If D were to get non-null by default, with optional nullable, then without 
ASM/union hacks or the like, you can only get a seg-v when you use the 
non-default nullable type.

Given the above (and assuming memory safety), the only possible 
wrong-data-error left would be where the programmer explicitly places the 
wrong value in a variable. In my book, that is a non-starter because 1) it 
can happen now 2) it can happen anywhere, not just at initialization 3) it 
can't be detected and 4) (assuming a well done syntax) in the cases where 
the compiler can't validate the code, the lazy thing to do and the correct 
thing to do (use a nullable type) will be the same.

Sep 27 2009

Walter Bright <newshound1 digitalmars.com> writes:

Nick Sabalausky wrote:

I agree with you that if the compiler can detect null dereferences at 
compile time, it should.


 Also, by "safe" I presume you mean "memory safe" which means free of 
 memory corruption. Null pointer exceptions are memory safe. A null pointer 
 could be caused by memory corruption, but it cannot *cause* memory 
 corruption.

 
 No, he's using the real meaning of "safe", not the misleadingly-limited 
 "SafeD" version of "safe" (which I'm still convinced is going to get some 
 poor soul into serious trouble from mistakenly thinking their SafeD program 
 is much safer than it really is). Out here in reality, "safe" also means a 
 lack of ability to crash, or at least some level of protection against it. 

Memory safety is something that can be guaranteed (presuming the 
compiler is correctly implemented). There is no way to guarantee that a 
non-trivial program cannot crash. It's the old halting problem.

 You seem to be under the impression that nothing can be made uncrashable 
 without introducing the possibility of corrupted state. That's hogwash.

I read that statement several times and I still don't understand what it 
means.

BTW, hardware null pointer checking is a safety feature, just like array 
bounds checking is.

Sep 27 2009

downs <default_357-line yahoo.de> writes:

Walter Bright wrote:
 Nick Sabalausky wrote:
 
 I agree with you that if the compiler can detect null dereferences at
 compile time, it should.
 
 
 Also, by "safe" I presume you mean "memory safe" which means free of
 memory corruption. Null pointer exceptions are memory safe. A null
 pointer could be caused by memory corruption, but it cannot *cause*
 memory corruption.

 No, he's using the real meaning of "safe", not the
 misleadingly-limited "SafeD" version of "safe" (which I'm still
 convinced is going to get some poor soul into serious trouble from
 mistakenly thinking their SafeD program is much safer than it really
 is). Out here in reality, "safe" also means a lack of ability to
 crash, or at least some level of protection against it. 

 
 Memory safety is something that can be guaranteed (presuming the
 compiler is correctly implemented). There is no way to guarantee that a
 non-trivial program cannot crash. It's the old halting problem.
 

Okay, I'm gonna have to call you out on this one because it's simply incorrect.

The halting problem deals with a valid program state - halting.

We cannot check if every program halts because halting is an instruction that
must be allowed at almost any point in the program.

Why do crashes have to be allowed? They're not an allowed instruction!

A compiler can be Turing complete and still not allow crashes. There is nothing
wrong with this, and it has *nothing* to do with the halting problem.

 You seem to be under the impression that nothing can be made
 uncrashable without introducing the possibility of corrupted state.
 That's hogwash.

 
 I read that statement several times and I still don't understand what it
 means.
 
 BTW, hardware null pointer checking is a safety feature, just like array
 bounds checking is.

PS: You can't convert segfaults into exceptions under Linux, as far as I know.

Sep 27 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

downs wrote:
 Walter Bright wrote:
 Nick Sabalausky wrote:

 I agree with you that if the compiler can detect null dereferences at
 compile time, it should.


 Also, by "safe" I presume you mean "memory safe" which means free of
 memory corruption. Null pointer exceptions are memory safe. A null
 pointer could be caused by memory corruption, but it cannot *cause*
 memory corruption.

 No, he's using the real meaning of "safe", not the
 misleadingly-limited "SafeD" version of "safe" (which I'm still
 convinced is going to get some poor soul into serious trouble from
 mistakenly thinking their SafeD program is much safer than it really
 is). Out here in reality, "safe" also means a lack of ability to
 crash, or at least some level of protection against it. 

 Memory safety is something that can be guaranteed (presuming the
 compiler is correctly implemented). There is no way to guarantee that a
 non-trivial program cannot crash. It's the old halting problem.

 
 Okay, I'm gonna have to call you out on this one because it's simply incorrect.
 
 The halting problem deals with a valid program state - halting.
 
 We cannot check if every program halts because halting is an instruction that
 must be allowed at almost any point in the program.
 
 Why do crashes have to be allowed? They're not an allowed instruction!
 
 A compiler can be turing complete and still not allow crashes. There is
 nothing wrong with this, and it has *nothing* to do with the halting problem.
 
 You seem to be under the impression that nothing can be made
 uncrashable without introducing the possibility of corrupted state.
 That's hogwash.

 I read that statement several times and I still don't understand what it
 means.

 BTW, hardware null pointer checking is a safety feature, just like array
 bounds checking is.

 
 PS: You can't convert segfaults into exceptions under Linux, as far as I know.

How did Jeremie do that?

Andrei

Sep 27 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Andrei Alexandrescu wrote:
 downs wrote:
 Walter Bright wrote:
 Nick Sabalausky wrote:

 I agree with you that if the compiler can detect null dereferences at
 compile time, it should.


 Also, by "safe" I presume you mean "memory safe" which means free of
 memory corruption. Null pointer exceptions are memory safe. A null
 pointer could be caused by memory corruption, but it cannot *cause*
 memory corruption.

 No, he's using the real meaning of "safe", not the
 misleadingly-limited "SafeD" version of "safe" (which I'm still
 convinced is going to get some poor soul into serious trouble from
 mistakenly thinking their SafeD program is much safer than it really
 is). Out here in reality, "safe" also means a lack of ability to
 crash, or at least some level of protection against it. 

 Memory safety is something that can be guaranteed (presuming the
 compiler is correctly implemented). There is no way to guarantee that a
 non-trivial program cannot crash. It's the old halting problem.

 Okay, I'm gonna have to call you out on this one because it's simply 
 incorrect.

 The halting problem deals with a valid program state - halting.

 We cannot check if every program halts because halting is an 
 instruction that must be allowed at almost any point in the program.

 Why do crashes have to be allowed? They're not an allowed instruction!

 A compiler can be turing complete and still not allow crashes. There 
 is nothing wrong with this, and it has *nothing* to do with the 
 halting problem.

 You seem to be under the impression that nothing can be made
 uncrashable without introducing the possibility of corrupted state.
 That's hogwash.

 I read that statement several times and I still don't understand what it
 means.

 BTW, hardware null pointer checking is a safety feature, just like array
 bounds checking is.

 PS: You can't convert segfaults into exceptions under Linux, as far as 
 I know.

 
 How did Jeremie do that?
 
 Andrei

A signal handler with the undocumented kernel parameters attaches the 
signal context to the exception object, repairs the stack frame forged 
by the kernel to make us believe we called the handler ourselves, does a 
backtrace right away and attaches it to the exception object, and then 
throw it.

The error handling code then unwinds down to the runtime's main(), where a 
catch clause is waiting for any Throwable and passes it to the unhandled 
exception handler. By then all finally blocks have executed; a crash 
window appears with the backtrace and the program shuts down gracefully.

All I still need to write is an ELF/DWARF reader to extract symbolic 
debug info under Linux; it's already working for PE/CodeView on Windows.

Jeremie

Sep 27 2009

Yigal Chripun <yigal100 gmail.com> writes:

On 27/09/2009 19:29, Jeremie Pelletier wrote:
 Andrei Alexandrescu wrote:
 downs wrote:
 Walter Bright wrote:
 Nick Sabalausky wrote:

 I agree with you that if the compiler can detect null dereferences at
 compile time, it should.


 Also, by "safe" I presume you mean "memory safe" which means free of
 memory corruption. Null pointer exceptions are memory safe. A null
 pointer could be caused by memory corruption, but it cannot *cause*
 memory corruption.

 No, he's using the real meaning of "safe", not the
 misleadingly-limited "SafeD" version of "safe" (which I'm still
 convinced is going to get some poor soul into serious trouble from
 mistakingly thinking their SafeD program is much safer than it really
 is). Out here in reality, "safe" also means a lack of ability to
 crash, or at least some level of protection against it.

 Memory safety is something that can be guaranteed (presuming the
 compiler is correctly implemented). There is no way to guarantee that a
 non-trivial program cannot crash. It's the old halting problem.

 Okay, I'm gonna have to call you out on this one because it's simply
 incorrect.

 The halting problem deals with a valid program state - halting.

 We cannot check if every program halts because halting is an
 instruction that must be allowed at almost any point in the program.

 Why do crashes have to be allowed? They're not an allowed instruction!

 A compiler can be turing complete and still not allow crashes. There
 is nothing wrong with this, and it has *nothing* to do with the
 halting problem.

 You seem to be under the impression that nothing can be made
 uncrashable without introducing the possibility of corrupted state.
 That's hogwash.

 I read that statement several times and I still don't understand
 what it
 means.

 BTW, hardware null pointer checking is a safety feature, just like
 array
 bounds checking is.

 PS: You can't convert segfaults into exceptions under Linux, as far
 as I know.

 How did Jeremie do that?

 Andrei

 A signal handler with the undocumented kernel parameters attaches the
 signal context to the exception object, repairs the stack frame forged
 by the kernel to make us believe we called the handler ourselves, does a
 backtrace right away and attaches it to the exception object, and then
 throw it.

 The error handling code will unwind down to the runtime's main() where a
 catch clause is waiting for any Throwables, sending them back into the
 unhandled exception handler, and a crash window appears with the
 backtrace, all finally blocks executed, and gracefully shutting down.

 All I need to do is an ELF/DWARF reader to extract symbolic debug info
 under linux, its already working for PE/CodeView on windows.

 Jeremie

Is this Linux-specific? What about other *nix systems, like BSD and 
Solaris?

Sep 27 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Yigal Chripun wrote:
 On 27/09/2009 19:29, Jeremie Pelletier wrote:
 Andrei Alexandrescu wrote:
 downs wrote:
 Walter Bright wrote:
 Nick Sabalausky wrote:

 I agree with you that if the compiler can detect null dereferences at
 compile time, it should.


 Also, by "safe" I presume you mean "memory safe" which means free of
 memory corruption. Null pointer exceptions are memory safe. A null
 pointer could be caused by memory corruption, but it cannot *cause*
 memory corruption.

 No, he's using the real meaning of "safe", not the
 misleadingly-limited "SafeD" version of "safe" (which I'm still
 convinced is going to get some poor soul into serious trouble from
 mistakingly thinking their SafeD program is much safer than it really
 is). Out here in reality, "safe" also means a lack of ability to
 crash, or at least some level of protection against it.

 Memory safety is something that can be guaranteed (presuming the
 compiler is correctly implemented). There is no way to guarantee 
 that a
 non-trivial program cannot crash. It's the old halting problem.

 Okay, I'm gonna have to call you out on this one because it's simply
 incorrect.

 The halting problem deals with a valid program state - halting.

 We cannot check if every program halts because halting is an
 instruction that must be allowed at almost any point in the program.

 Why do crashes have to be allowed? They're not an allowed instruction!

 A compiler can be turing complete and still not allow crashes. There
 is nothing wrong with this, and it has *nothing* to do with the
 halting problem.

 You seem to be under the impression that nothing can be made
 uncrashable without introducing the possibility of corrupted state.
 That's hogwash.

 I read that statement several times and I still don't understand
 what it
 means.

 BTW, hardware null pointer checking is a safety feature, just like
 array
 bounds checking is.

 PS: You can't convert segfaults into exceptions under Linux, as far
 as I know.

 How did Jeremie do that?

 Andrei

 A signal handler with the undocumented kernel parameters attaches the
 signal context to the exception object, repairs the stack frame forged
 by the kernel to make us believe we called the handler ourselves, does a
 backtrace right away and attaches it to the exception object, and then
 throw it.

 The error handling code will unwind down to the runtime's main() where a
 catch clause is waiting for any Throwables, sending them back into the
 unhandled exception handler, and a crash window appears with the
 backtrace, all finally blocks executed, and gracefully shutting down.

 All I need to do is an ELF/DWARF reader to extract symbolic debug info
 under linux, its already working for PE/CodeView on windows.

 Jeremie

 
 Is this Linux specific? what about other *nix systems, like BSD and 
 solaris?

Signal handlers are standard on most *nix platforms since they're part of 
the POSIX C standard libraries. Some platforms may require special 
handling, but nothing impossible to do.

Sep 27 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Jeremie Pelletier wrote:
 Is this Linux specific? what about other *nix systems, like BSD and 
 solaris?

 
 Signal handler are standard to most *nix platforms since they're part of 
 the posix C standard libraries, maybe some platforms will require a 
 special handling but nothing impossible to do.

Let me post a message on behalf of Sean Kelly. He sent it to Walter 
and myself this morning; I suggested that he post it, but he is probably 
off email for a short while. Hopefully the community will find a 
solution to the issue he's raising. Here it is:

===================
Sean Kelly wrote:

There's one minor problem with his code.  It's not safe to throw an 
exception from a signal handler.  Here's a quote from the POSIX spec at 
opengroup.org:

"In order to prevent errors arising from interrupting non-reentrant 
function calls, applications should protect calls to these functions 
either by blocking the appropriate signals or through the use of some 
programmatic semaphore (see semget() , sem_init() , sem_open() , and so 
on). Note in particular that even the "safe" functions may modify errno; 
the signal-catching function, if not executing as an independent thread, 
may want to save and restore its value. Naturally, the same principles 
apply to the reentrancy of application routines and asynchronous data 
access. Note that longjmp() and siglongjmp() are not in the list of 
reentrant functions. This is because the code executing after longjmp() 
and siglongjmp() can call any unsafe functions with the same danger as 
calling those unsafe functions directly from the signal handler. 
Applications that use longjmp() and siglongjmp() from within signal 
handlers require rigorous protection in order to be portable."

If this were an acceptable approach it would have been in druntime ages 
ago :-)
===================
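
The errno advice in the POSIX text quoted above can be sketched in C as follows. This is only an illustration under stated assumptions: log_fd, got_signal and install_errno_preserving_handler are hypothetical names, and the handler restricts itself to write(), which POSIX lists as async-signal-safe.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical knobs for the sketch: where to log, and a "handler ran" flag. */
int log_fd = STDERR_FILENO;
volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    static const char msg[] = "signal caught\n";
    int saved_errno = errno;   /* write() may overwrite errno on failure */
    ssize_t n;

    (void)signo;
    n = write(log_fd, msg, sizeof msg - 1);  /* write() is async-signal-safe */
    (void)n;                   /* nothing safe to do about a failed write here */
    got_signal = 1;
    errno = saved_errno;       /* interrupted code sees its own errno again */
}

/* Installs on_signal for `signo`; returns 0 on success, -1 on failure. */
int install_errno_preserving_handler(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

Without the save and restore, code interrupted between a failing library call and its errno check could see the value left behind by the handler instead of its own.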



Andrei

Sep 27 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Andrei Alexandrescu wrote:
 Jeremie Pelletier wrote:
 Is this Linux specific? what about other *nix systems, like BSD and 
 solaris?

 Signal handler are standard to most *nix platforms since they're part 
 of the posix C standard libraries, maybe some platforms will require a 
 special handling but nothing impossible to do.

 
 Let me write a message on behalf of Sean Kelly. He wrote that to Walter 
 and myself this morning, then I suggested him to post it but probably he 
 is off email for a short while. Hopefully the community will find a 
 solution to the issue he's raising. Let me post this:
 
 ===================
 Sean Kelly wrote:
 
 There's one minor problem with his code.  It's not safe to throw an 
 exception from a signal handler.  Here's a quote from the POSIX spec at 
 opengroup.org:
 
 "In order to prevent errors arising from interrupting non-reentrant 
 function calls, applications should protect calls to these functions 
 either by blocking the appropriate signals or through the use of some 
 programmatic semaphore (see semget() , sem_init() , sem_open() , and so 
 on). Note in particular that even the "safe" functions may modify errno; 
 the signal-catching function, if not executing as an independent thread, 
 may want to save and restore its value. Naturally, the same principles 
 apply to the reentrancy of application routines and asynchronous data 
 access. Note that longjmp() and siglongjmp() are not in the list of 
 reentrant functions. This is because the code executing after longjmp() 
 and siglongjmp() can call any unsafe functions with the same danger as 
 calling those unsafe functions directly from the signal handler. 
 Applications that use longjmp() and siglongjmp() from within signal 
 handlers require rigorous protection in order to be portable."
 
 If this were an acceptable approach it would have been in druntime ages 
 ago :-)
 ===================
 
 
 
 Andrei

Yes, but the segfault signal handler is not meant for designing code that 
can live with these exceptions; it's just a feature that lets segfaults be 
sent to the crash handler to get a backtrace dump. Even on Windows, where 
you can recover from access violations, it's generally a bad idea to 
allow bugs to be turned into features.

Jeremie

Sep 27 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Mon, 28 Sep 2009 01:31:44 +0400, Jeremie Pelletier <jeremiep gmail.com>  
wrote:

 Andrei Alexandrescu wrote:
 Jeremie Pelletier wrote:
 Is this Linux specific? what about other *nix systems, like BSD and  
 solaris?

 Signal handler are standard to most *nix platforms since they're part  
 of the posix C standard libraries, maybe some platforms will require a  
 special handling but nothing impossible to do.

  Let me write a message on behalf of Sean Kelly. He wrote that to  
 Walter and myself this morning, then I suggested him to post it but  
 probably he is off email for a short while. Hopefully the community  
 will find a solution to the issue he's raising. Let me post this:
  ===================
 Sean Kelly wrote:
  There's one minor problem with his code.  It's not safe to throw an  
 exception from a signal handler.  Here's a quote from the POSIX spec at  
 opengroup.org:
  "In order to prevent errors arising from interrupting non-reentrant  
 function calls, applications should protect calls to these functions  
 either by blocking the appropriate signals or through the use of some  
 programmatic semaphore (see semget() , sem_init() , sem_open() , and so  
 on). Note in particular that even the "safe" functions may modify  
 errno; the signal-catching function, if not executing as an independent  
 thread, may want to save and restore its value. Naturally, the same  
 principles apply to the reentrancy of application routines and  
 asynchronous data access. Note that longjmp() and siglongjmp() are not  
 in the list of reentrant functions. This is because the code executing  
 after longjmp() and siglongjmp() can call any unsafe functions with the  
 same danger as calling those unsafe functions directly from the signal  
 handler. Applications that use longjmp() and siglongjmp() from within  
 signal handlers require rigorous protection in order to be portable."
  If this were an acceptable approach it would have been in druntime  
 ages ago :-)
 ===================
    Andrei

 Yes but the segfault signal handler is not made to design code that can  
 live with these exceptions, its just a feature to allow segfaults to be  
 sent to the crash handler to get a backtrace dump. Even on windows while  
 you can recover from access violations, its generally a bad idea to  
 allow for bugs to be turned into features.

 Jeremie

Isn't this reason alone strong enough to encourage the use of non-null  
references?
And to implement them, since we don't have the feature currently.

Sep 28 2009

Sean Kelly <sean invisibleduck.org> writes:

== Quote from Jeremie Pelletier (jeremiep gmail.com)'s article
 Andrei Alexandrescu wrote:
 Jeremie Pelletier wrote:
 Is this Linux specific? what about other *nix systems, like BSD and
 solaris?

 Signal handler are standard to most *nix platforms since they're part
 of the posix C standard libraries, maybe some platforms will require a
 special handling but nothing impossible to do.

 Let me write a message on behalf of Sean Kelly. He wrote that to Walter
 and myself this morning, then I suggested him to post it but probably he
 is off email for a short while. Hopefully the community will find a
 solution to the issue he's raising. Let me post this:

 ===================
 Sean Kelly wrote:

 There's one minor problem with his code.  It's not safe to throw an
 exception from a signal handler.  Here's a quote from the POSIX spec at
 opengroup.org:

 "In order to prevent errors arising from interrupting non-reentrant
 function calls, applications should protect calls to these functions
 either by blocking the appropriate signals or through the use of some
 programmatic semaphore (see semget() , sem_init() , sem_open() , and so
 on). Note in particular that even the "safe" functions may modify errno;
 the signal-catching function, if not executing as an independent thread,
 may want to save and restore its value. Naturally, the same principles
 apply to the reentrancy of application routines and asynchronous data
 access. Note that longjmp() and siglongjmp() are not in the list of
 reentrant functions. This is because the code executing after longjmp()
 and siglongjmp() can call any unsafe functions with the same danger as
 calling those unsafe functions directly from the signal handler.
 Applications that use longjmp() and siglongjmp() from within signal
 handlers require rigorous protection in order to be portable."

 If this were an acceptable approach it would have been in druntime ages
 ago :-)
 ===================

 Yes but the segfault signal handler is not made to design code that can
 live with these exceptions, its just a feature to allow segfaults to be
 sent to the crash handler to get a backtrace dump. Even on windows while
 you can recover from access violations, its generally a bad idea to
 allow for bugs to be turned into features.

I don't think it's fair to compare Windows to Unix here because, as far as
I know, Windows (i.e. Win32, etc.) was built with exceptions in mind (thanks to
SEH), while Unix was not.  So while the Windows kernel may theoretically be fine
with an exception being thrown from within kernel code, this isn't true of Unix.

It's true that as long as only Errors are thrown (and thus that the app intends
to terminate), things aren't as bad as they could be.  Worst case, some mutex
in libc is left locked or in some weird state and code executed during stack
unwinding or when trying to report the error causes the app to hang instead
of terminating.  And this risk is somewhat mitigated because I'd expect most
of these errors to occur within user code anyway.

One thing I'm not entirely sure about is whether the signal handler will always
have a valid, C-style call stack tracing back into user code.  These errors are
triggered by hardware, and I really don't know what kind of tricks are common
at that level of OS code.  longjmp() doesn't have this problem because it doesn't
care about the call stack--it just swaps some registers and executes a JMP.  I
don't suppose anyone here knows more about the feasibility of throwing
exceptions from signal handlers at all?  I'll ask around some OS groups and
see what people say.

Sep 29 2009

Sean Kelly <sean invisibleduck.org> writes:

== Quote from Sean Kelly (sean invisibleduck.org)'s article
 One thing I'm not entirely sure about is whether the signal handler will always
 have a valid, C-style call stack tracing back into user code.  These errors are
 triggered by hardware, and I really don't know what kind of tricks are common
 at that level of OS code.  longjmp() doesn't have this problem because it
doesn't
 care about the call stack--it just swaps some registers and executes a JMP.  I
 don't suppose anyone here knows more about the feasibility of throwing
 exceptions from signal handlers at all?  I'll ask around some OS groups and
 see what people say.

I was right: it is illegal to throw an exception from a signal handler.  And
worse, it's illegal to call malloc from a signal handler, so you can't safely
create an exception object anyway.  Heck, I'm not sure it's even safe to perform
IO from a signal handler, so tracing directly from within the handler won't even
work reliably.  In short, while I'm totally fine with people using this in their
own code, it's too unreliable to make an "official" solution by adding it to
Druntime.
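
On the IO question: POSIX lists write() and _exit() among the async-signal-safe functions, while malloc and the stdio routines are not on that list. The C sketch below shows one way a handler could report a fault address using only a stack buffer and write(). The names format_hex, on_segv and install_segv_reporter, the message text and the exit code convention are assumptions made for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Writes "0x" followed by the hex digits of `value` into `buf`.
 * Returns the number of characters written (no NUL terminator),
 * or 0 if `cap` is too small. Makes no library calls. */
size_t format_hex(uintptr_t value, char *buf, size_t cap)
{
    static const char digits[] = "0123456789abcdef";
    char tmp[2 * sizeof value];      /* two hex digits per byte */
    size_t n = 0, len = 0;

    do {                             /* collect digits, least significant first */
        tmp[n++] = digits[value & 0xf];
        value >>= 4;
    } while (value != 0);

    if (cap < n + 2)
        return 0;
    buf[len++] = '0';
    buf[len++] = 'x';
    while (n > 0)                    /* emit most significant digit first */
        buf[len++] = tmp[--n];
    return len;
}

static void on_segv(int signo, siginfo_t *info, void *ucontext)
{
    static const char prefix[] = "fatal: invalid memory access at ";
    char hex[2 + 2 * sizeof(uintptr_t)];
    size_t len = format_hex((uintptr_t)info->si_addr, hex, sizeof hex);
    ssize_t n;

    (void)ucontext;
    n = write(STDERR_FILENO, prefix, sizeof prefix - 1);
    n = write(STDERR_FILENO, hex, len);
    n = write(STDERR_FILENO, "\n", 1);
    (void)n;
    _exit(128 + signo);              /* _exit() skips atexit handlers and stdio flushing */
}

/* Installs on_segv for SIGSEGV; returns 0 on success, -1 on failure. */
int install_segv_reporter(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags     = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

This only covers a one-line report. A symbolic backtrace needs much more machinery, and whether that machinery is safe inside a handler is exactly the open question here.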

Sep 29 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Sean Kelly wrote:
 == Quote from Sean Kelly (sean invisibleduck.org)'s article
 One thing I'm not entirely sure about is whether the signal handler will always
 have a valid, C-style call stack tracing back into user code.  These errors are
 triggered by hardware, and I really don't know what kind of tricks are common
 at that level of OS code.  longjmp() doesn't have this problem because it
doesn't
 care about the call stack--it just swaps some registers and executes a JMP.  I
 don't suppose anyone here knows more about the feasibility of throwing
 exceptions from signal handlers at all?  I'll ask around some OS groups and
 see what people say.

 
 I was right, it is illegal to throw an exception from a signal handler.  And
worse,
 it's illegal to call malloc from a signal handler, so you can't safely create
an
 exception object anyway.  Heck, I'm not sure it's even safe to perform IO from
 a signal handler, so tracing directly from within the handler won't even work
 reliably.  In short, while I'm totally fine with people using this in their own
 code, it's too unreliable to make an "official" solution by adding it to
Druntime.

Weird, it works just fine for me. Maybe it's because the exception is 
always caught in the thread's entry point; I never tried to let such an 
exception unwind past the entry point. I haven't tried malloc or any I/O 
either.

There still should be a way to grab the backtrace and context data from 
the hidden ucontext_t* parameter and do something with it after returning 
from the signal handler.
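
One possible shape for this, sketched in C under assumptions: the handler only stores a few values in static storage and then leaves through siglongjmp(), so that the same thread can inspect the data outside the handler. All names here (run_guarded, install_guard, caught_addr and so on) are hypothetical. The POSIX caution about siglongjmp() still applies: if the fault interrupted a non-reentrant function, the code that runs after the jump is exposed to the same dangers as the handler itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static sigjmp_buf recovery_point;
static volatile sig_atomic_t recovery_armed = 0;
static volatile sig_atomic_t caught_signo = 0;
void *volatile caught_addr = NULL;   /* si_addr recorded by the handler */

static void on_fault(int signo, siginfo_t *info, void *ucontext)
{
    (void)ucontext;
    if (!recovery_armed) {
        /* No recovery point: restore the default action so that a real
         * fault terminates the process when it is re-triggered. */
        signal(signo, SIG_DFL);
        return;
    }
    recovery_armed = 0;
    caught_signo = signo;
    caught_addr = info->si_addr;
    siglongjmp(recovery_point, 1);   /* leave the handler, restoring the signal mask */
}

/* Installs on_fault for `signo`; returns 0 on success, -1 on failure. */
int install_guard(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags     = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Runs `work`. Returns 0 if it completed, or the signal number if a
 * guarded signal interrupted it. */
int run_guarded(void (*work)(void))
{
    if (sigsetjmp(recovery_point, 1) != 0) {
        /* Same thread, outside the handler: the cached data can be
         * processed (logged, reported) from here. */
        return (int)caught_signo;
    }
    recovery_armed = 1;
    work();
    recovery_armed = 0;
    return 0;
}

/* Stand-ins for application code, used only to exercise the sketch. */
void no_fault(void) { }
void simulated_fault(void) { raise(SIGSEGV); }
```

In this sketch run_guarded(simulated_fault) returns SIGSEGV and run_guarded(no_fault) returns 0. It illustrates the control flow only and does not make recovery from a genuine memory fault safe.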

The whole idea of a crash handler is to limit the number of times you 
need to do postmortem debugging after a crash, or launch the process 
again within the debugger.

Sep 29 2009

Sean Kelly <sean invisibleduck.org> writes:

== Quote from Jeremie Pelletier (jeremiep gmail.com)'s article
 Sean Kelly wrote:
 == Quote from Sean Kelly (sean invisibleduck.org)'s article
 One thing I'm not entirely sure about is whether the signal handler will always
 have a valid, C-style call stack tracing back into user code.  These errors are
 triggered by hardware, and I really don't know what kind of tricks are common
 at that level of OS code.  longjmp() doesn't have this problem because it
doesn't
 care about the call stack--it just swaps some registers and executes a JMP.  I
 don't suppose anyone here knows more about the feasibility of throwing
 exceptions from signal handlers at all?  I'll ask around some OS groups and
 see what people say.

 I was right, it is illegal to throw an exception from a signal handler.  And
worse,
 it's illegal to call malloc from a signal handler, so you can't safely create
an
 exception object anyway.  Heck, I'm not sure it's even safe to perform IO from
 a signal handler, so tracing directly from within the handler won't even work
 reliably.  In short, while I'm totally fine with people using this in their own
 code, it's too unreliable to make an "official" solution by adding it to
Druntime.

 Weird, it works just fine for me. Maybe its because the exception is
 always caught in the thread's entry point, i never tried to let such an
 exception unwind past the entry point. I haven't tried malloc or any I/O
 either.

I think in practice, the issue is simply that malloc and IO routines aren't on
the list of reentrant functions, so if a signal is delivered while a thread is
inside one of these routines, then the signal handler trying to call the same
routine could cause Bad Things to happen.  This actually comes up in our GC
code on Linux
because threads are suspended for the collection via signals.  If one of
these threads is suspended within a non-reentrant library routine and the
GC code calls the same routine it can crash or deadlock on an internal
mutex (the latter actually happened on OSX until I changed how GC works
there).  This is kind of a weird issue, since in this case any thread can screw
with the GC thread, even though the GC thread itself never enters a signal
handler.  This is something that never occurred to me before--it was Fawzi
that figured out why OSX apps were deadlocking for no reason whatsoever
(I *think* this was pre-Druntime, though I can't recall precisely).

In short, you may never actually run into a problem using these functions,
and if they work for you then that's all that matters.  I'm just hesitant to
roll something into Druntime that is "undefined" according to a spec and
has only been verified to work through experimentation by a subset of
D users.  That is, I'd rather Druntime be a tad gimped and always work than
be super fancy and not work for some people.  YMMV.

 There still should be a way to grab the backtrace and context data from
 the hidden ucontext_* parameter and do something with it after returning
 from the signal handler.

Yeah, I saw one suggestion that you could have a thread blocked waiting
for (in this case) backtrace data.  So another thread could do the trace
and no worries about signal handler limitations.  Still, this seems like a
pretty heavyweight approach.
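
A rough C sketch of that suggestion, using the classic self-pipe technique as a stand-in for whatever hand-off the original suggestion had in mind: the handler writes one byte to a pipe with write(), and a waiting thread blocks in read() on the other end. The names wake_pipe, start_signal_channel and wait_for_signal are hypothetical, and thread creation is omitted; wait_for_signal() is meant to be run in a dedicated thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

static int wake_pipe[2] = { -1, -1 };   /* [0] = read end, [1] = write end */

static void on_signal(int signo)
{
    unsigned char byte = (unsigned char)signo;
    int saved_errno = errno;
    ssize_t n = write(wake_pipe[1], &byte, 1);  /* async-signal-safe hand-off */
    (void)n;
    errno = saved_errno;
}

/* Creates the pipe and installs the handler; returns 0 on success. */
int start_signal_channel(int signo)
{
    struct sigaction sa;
    if (pipe(wake_pipe) != 0)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Blocks until the handler reports a signal; returns its number,
 * or -1 on error. Intended to run in a dedicated waiting thread. */
int wait_for_signal(void)
{
    unsigned char byte;
    ssize_t n;
    do {
        n = read(wake_pipe[0], &byte, 1);
    } while (n < 0 && errno == EINTR);
    return n == 1 ? (int)byte : -1;
}
```

For a genuine fault the faulting thread cannot simply return from its handler, so it would have to wait there (or exit) while the other thread produces the trace, which is part of why the approach feels heavyweight.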

If there were some way to cache the trace data and then have the same
thread process it I'd love to know how.  I ran into this "can't throw
exceptions from a signal handler" issue at a previous job, and finally
gave up on the idea in frustration after not being able to come up with
a decent workaround.

 The whole idea of a crash handler is to limit the number of times you
 need to do postmortem debugging after a crash, or launch the process
 again within the debugger.

Yup.  And as a server programmer, I think getting backtraces within a log
file is totally awesome, since dealing with a core dump is difficult at best
for such apps.  In fact I'd probably use your approach within my own code,
since it seems to work.

Sep 29 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Sean Kelly wrote:
 == Quote from Jeremie Pelletier (jeremiep gmail.com)'s article
 Sean Kelly wrote:
 == Quote from Sean Kelly (sean invisibleduck.org)'s article
 One thing I'm not entirely sure about is whether the signal handler will always
 have a valid, C-style call stack tracing back into user code.  These errors are
 triggered by hardware, and I really don't know what kind of tricks are common
 at that level of OS code.  longjmp() doesn't have this problem because it
doesn't
 care about the call stack--it just swaps some registers and executes a JMP.  I
 don't suppose anyone here knows more about the feasibility of throwing
 exceptions from signal handlers at all?  I'll ask around some OS groups and
 see what people say.

 I was right, it is illegal to throw an exception from a signal handler.  And
worse,
 it's illegal to call malloc from a signal handler, so you can't safely create
an
 exception object anyway.  Heck, I'm not sure it's even safe to perform IO from
 a signal handler, so tracing directly from within the handler won't even work
 reliably.  In short, while I'm totally fine with people using this in their own
 code, it's too unreliable to make an "official" solution by adding it to
Druntime.

 Weird, it works just fine for me. Maybe its because the exception is
 always caught in the thread's entry point, i never tried to let such an
 exception unwind past the entry point. I haven't tried malloc or any I/O
 either.

 
 I think in practice, the issue is simply that malloc and IO routines aren't on
 the list of reentrant functions, so if a signal is called from within one of
these
 routines then the signal handler trying to call the same routine could cause
 Bad Things to happen.  This actually comes up in our GC code on Linux
 because threads are suspended for the collection via signals.  If one of
 these threads is suspended within a non-reentrant library routine and the
 GC code calls the same routine it can crash or deadlock on an internal
 mutex (the latter actually happened on OSX until I changed how GC works
 there).  This is kind of a weird issue, since in this case any thread can screw
 with the GC thread, even though the GC thread itself never enters a signal
 handler.  This is something that never occurred to me before--it was Fawzi
 that figured out why OSX apps were deadlocking for no reason whatsoever
 (I *think* this was pre-Druntime, though I can't recall precisely).
 
 In short, you may never actually run into a problem using these functions,
 and if they work for you then that's all that matters.  I'm just hesitant to
 roll something into Druntime that is "undefined" according to a spec and
 has only been verified to work through experimentation by a subset of
 D users.  ie. I'd rather Druntime be a tad gimped and always work than
 be super fancy and not work for some people.  YMMV.

I agree. I don't mind occasional crashes within the crash handler itself 
if it ever comes to that; at this point things are already going pretty 
badly anyway and the process is going to exit soon enough. It 
could be confusing as hell to library users if they don't know this 
might happen in rare cases, so I understand keeping it away from 
Druntime until a proven solution is found.

 There still should be a way to grab the backtrace and context data from
 the hidden ucontext_* parameter and do something with it after returning
 from the signal handler.

 
 Yeah, I saw one suggestion that you could have a thread blocked waiting
 for (in this case) backtrace data.  So another thread could do the trace
 and no worries about signal handler limitations.  Still, this seems like a
 pretty heavyweight approach.

Eh, I'm not going that way either :) Maybe spawn another process with 
some basic info collected by the signal handler (i.e. registers, loaded 
modules and backtrace) and let that other process deal with generating a 
crash window while we gracefully shut down with a core dump. That's also 
a heavyweight idea, but it only happens after a crash, not while 
waiting for one.
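
A very reduced C sketch of the "spawn another process" idea, with several assumptions: the handler calls fork(), which POSIX.1-2008 lists as async-signal-safe (as far as I know later revisions of the standard are more cautious about it), and the child stands in for a real reporter by writing one line and passing the signal number back as its exit status. The names on_crash, install_crash_reporter and wait_for_reporter are hypothetical. A real reporter would more likely execve() a separate program and receive registers and a backtrace.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t reporter_pid = 0;

static void on_crash(int signo)
{
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: a snapshot of the crashed process. Only
         * async-signal-safe calls are used here. */
        static const char msg[] = "crash reporter: signal received\n";
        ssize_t n = write(STDERR_FILENO, msg, sizeof msg - 1);
        (void)n;
        _exit(signo);   /* assumption: exit status carries the signal number */
    }
    reporter_pid = (sig_atomic_t)pid;   /* -1 if fork() failed */
}

/* Installs on_crash for `signo`; returns 0 on success, -1 on failure. */
int install_crash_reporter(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_crash;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Waits for the reporter child; returns its exit status, or -1 on failure. */
int wait_for_reporter(void)
{
    int status;
    pid_t pid = (pid_t)reporter_pid;
    if (pid <= 0)
        return -1;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

After the fork, the crashing process could restore the default action and re-raise the signal to get its core dump; that part is not shown.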

 If there were some way to cache the trace data and then have the same
 thread process it I'd love to know how.  I ran into this "can't throw
 exceptions from a signal handler" issue at a previous job, and finally
 gave up on the idea in frustration after not being able to come up with
 a decent workaround.
 
 The whole idea of a crash handler is to limit the number of times you
 need to do postmortem debugging after a crash, or launch the process
 again within the debugger.

 
 Yup.  And as a server programmer, I think getting backtraces within a log
 file is totally awesome, since dealing with a core dump is difficult at best
 for such apps.  In fact I'd probably use your approach within my own code,
 since it seems to work.

Yeah, I'm not much into post-mortem debugging either; I like running 
within the debugger or having a convenient crash window. It's also a neat 
thing to use when you distribute your executable, since you can implement 
an SMTP mailer for the crash reports instead of the crash window.

Sep 29 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Sean Kelly wrote:
 == Quote from Jeremie Pelletier (jeremiep gmail.com)'s article
 Andrei Alexandrescu wrote:
 Jeremie Pelletier wrote:
 Is this Linux specific? what about other *nix systems, like BSD and
 solaris?

 Signal handler are standard to most *nix platforms since they're part
 of the posix C standard libraries, maybe some platforms will require a
 special handling but nothing impossible to do.

 Let me write a message on behalf of Sean Kelly. He wrote that to Walter
 and myself this morning, then I suggested him to post it but probably he
 is off email for a short while. Hopefully the community will find a
 solution to the issue he's raising. Let me post this:

 ===================
 Sean Kelly wrote:

 There's one minor problem with his code.  It's not safe to throw an
 exception from a signal handler.  Here's a quote from the POSIX spec at
 opengroup.org:

 "In order to prevent errors arising from interrupting non-reentrant
 function calls, applications should protect calls to these functions
 either by blocking the appropriate signals or through the use of some
 programmatic semaphore (see semget() , sem_init() , sem_open() , and so
 on). Note in particular that even the "safe" functions may modify errno;
 the signal-catching function, if not executing as an independent thread,
 may want to save and restore its value. Naturally, the same principles
 apply to the reentrancy of application routines and asynchronous data
 access. Note that longjmp() and siglongjmp() are not in the list of
 reentrant functions. This is because the code executing after longjmp()
 and siglongjmp() can call any unsafe functions with the same danger as
 calling those unsafe functions directly from the signal handler.
 Applications that use longjmp() and siglongjmp() from within signal
 handlers require rigorous protection in order to be portable."

 If this were an acceptable approach it would have been in druntime ages
 ago :-)
 ===================

 Yes but the segfault signal handler is not made to design code that can
 live with these exceptions, its just a feature to allow segfaults to be
 sent to the crash handler to get a backtrace dump. Even on windows while
 you can recover from access violations, its generally a bad idea to
 allow for bugs to be turned into features.

 
 I don't think it's fair to compare Windows to Unix here because, as far as
 I know, Windows (ie. Win32, etc) was built with exceptions in mind (thanks to
 SEH), while Unix was not.  So while the Windows kernel may theoretically be
fine
 with an exception being thrown from within kernel code, this isn't true of
Unix.
 
 It's true that as long as only Errors are thrown (and thus that the app intends
 to terminate), things aren't as bad as they could be.  Worst case, some mutex
 in libc is left locked or in some weird state and code executed during stack
 unwinding or when trying to report the error causes the app to hang instead
 of terminate.  And this risk is somewhat mitigated because I'd expect most
 of these errors to occur within user code anyway.
 
 One thing I'm not entirely sure about is whether the signal handler will always
 have a valid, C-style call stack tracing back into user code.  These errors are
 triggered by hardware, and I really don't know what kind of tricks are common
 at that level of OS code.  longjmp() doesn't have this problem because it doesn't
 care about the call stack--it just swaps some registers and executes a JMP.  I
 don't suppose anyone here knows more about the feasibility of throwing
 exceptions from signal handlers at all?  I'll ask around some OS groups and
 see what people say.

I haven't had any problems so far, the stack trace generated was always 
valid and similar to what gdb would output. But I agree that trying to 
recover from these exceptions is a *bad* idea in so many ways.

 From what I know, the kernel alters the stack frame of the signal 
handler to make us believe we called it ourselves. Returning from the 
signal handler therefore jumps to the routine from which the signal was 
originally raised, without the kernel being aware of it.

This is a bit different from how SEH is handled, but has a lot in common 
with it:

 From the research I did about SEH internals, it's just built on top of 
interrupt handlers. The hardware raises an exception (access violation, 
etc.) and jumps into a kernel handler for the corresponding interrupt. That 
handler looks at the base of the stack for a pointer to a struct 
containing a handler function and a handler table, which is set and 
restored by try blocks, and calls the exception handler (_d_framehandler 
in our case) with the appropriate parameters. From there the kernel 
decides what to do based on the return code of the framehandler.

The signal handler model is therefore quite acceptable to build 
exception handling on top of. We just may want to also manually generate 
a core dump before throwing the exception to support postmortem debugging.
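
As a rough illustration of the first step described above (getting control when the signal arrives), here is a minimal sketch assuming a POSIX system and druntime's core.sys.posix.signal bindings. The names onSignal, install and lastSignal are made up for this example. The handler only records the signal number; repairing the stack frame and throwing from the handler, as described in the post, is platform-specific and is not shown. The signal is delivered with raise() rather than by a real invalid access, so returning from the handler is harmless here.

```d
import core.stdc.signal : raise;
import core.sys.posix.signal;

// Written by the handler, read by main. Hypothetical name.
__gshared int lastSignal;

// Three-argument handler form selected by SA_SIGINFO. The third argument
// is the platform-specific context a backtrace would start from; it is
// left untouched in this sketch.
extern (C) void onSignal(int signo, siginfo_t* info, void* context) nothrow @nogc
{
    lastSignal = signo;
}

// Installs onSignal for the given signal; returns true on success.
bool install(int signo)
{
    sigaction_t action;
    action.sa_sigaction = &onSignal;
    action.sa_flags = SA_SIGINFO;
    sigemptyset(&action.sa_mask);
    return sigaction(signo, &action, null) == 0;
}

void main()
{
    assert(install(SIGSEGV));
    raise(SIGSEGV);                 // software-generated, not a real fault
    assert(lastSignal == SIGSEGV);  // the handler ran and returned
}
```

A real crash handler would do its reporting inside onSignal (or hand off to code that does) instead of returning, because returning after a genuine hardware fault re-executes the faulting instruction.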

Sep 29 2009

downs <default_357-line yahoo.de> writes:

Jeremie Pelletier wrote:
 Andrei Alexandrescu wrote:
 downs wrote:
 Walter Bright wrote:
 Nick Sabalausky wrote:

 I agree with you that if the compiler can detect null dereferences at
 compile time, it should.


 Also, by "safe" I presume you mean "memory safe" which means free of
 memory corruption. Null pointer exceptions are memory safe. A null
 pointer could be caused by memory corruption, but it cannot *cause*
 memory corruption.

 No, he's using the real meaning of "safe", not the
 misleadingly-limited "SafeD" version of "safe" (which I'm still
 convinced is going to get some poor soul into serious trouble from
 mistakingly thinking their SafeD program is much safer than it really
 is). Out here in reality, "safe" also means a lack of ability to
 crash, or at least some level of protection against it. 

 Memory safety is something that can be guaranteed (presuming the
 compiler is correctly implemented). There is no way to guarantee that a
 non-trivial program cannot crash. It's the old halting problem.

 Okay, I'm gonna have to call you out on this one because it's simply
 incorrect.

 The halting problem deals with a valid program state - halting.

 We cannot check if every program halts because halting is an
 instruction that must be allowed at almost any point in the program.

 Why do crashes have to be allowed? They're not an allowed instruction!

 A compiler can be turing complete and still not allow crashes. There
 is nothing wrong with this, and it has *nothing* to do with the
 halting problem.

 You seem to be under the impression that nothing can be made
 uncrashable without introducing the possibility of corrupted state.
 That's hogwash.

 I read that statement several times and I still don't understand
 what it
 means.

 BTW, hardware null pointer checking is a safety feature, just like
 array
 bounds checking is.

 PS: You can't convert segfaults into exceptions under Linux, as far
 as I know.

 How did Jeremie do that?

 Andrei

 
 A signal handler with the undocumented kernel parameters attaches the
 signal context to the exception object, repairs the stack frame forged
 by the kernel to make us believe we called the handler ourselves, does a
 backtrace right away and attaches it to the exception object, and then
 throw it.
 
 The error handling code will unwind down to the runtime's main() where a
 catch clause is waiting for any Throwables, sending them back into the
 unhandled exception handler, and a crash window appears with the
 backtrace, all finally blocks executed, and gracefully shutting down.
 
 All I need to do is an ELF/DWARF reader to extract symbolic debug info
 under linux, its already working for PE/CodeView on windows.
 
 Jeremie


Woah, nice. I stand corrected. Is this in druntime already?

Sep 27 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

downs wrote:
 Jeremie Pelletier wrote:
 Andrei Alexandrescu wrote:
 downs wrote:
 Walter Bright wrote:
 Nick Sabalausky wrote:

 I agree with you that if the compiler can detect null dereferences at
 compile time, it should.


 Also, by "safe" I presume you mean "memory safe" which means free of
 memory corruption. Null pointer exceptions are memory safe. A null
 pointer could be caused by memory corruption, but it cannot *cause*
 memory corruption.

 No, he's using the real meaning of "safe", not the
 misleadingly-limited "SafeD" version of "safe" (which I'm still
 convinced is going to get some poor soul into serious trouble from
 mistakingly thinking their SafeD program is much safer than it really
 is). Out here in reality, "safe" also means a lack of ability to
 crash, or at least some level of protection against it. 

 Memory safety is something that can be guaranteed (presuming the
 compiler is correctly implemented). There is no way to guarantee that a
 non-trivial program cannot crash. It's the old halting problem.

 Okay, I'm gonna have to call you out on this one because it's simply
 incorrect.

 The halting problem deals with a valid program state - halting.

 We cannot check if every program halts because halting is an
 instruction that must be allowed at almost any point in the program.

 Why do crashes have to be allowed? They're not an allowed instruction!

 A compiler can be turing complete and still not allow crashes. There
 is nothing wrong with this, and it has *nothing* to do with the
 halting problem.

 You seem to be under the impression that nothing can be made
 uncrashable without introducing the possibility of corrupted state.
 That's hogwash.

 I read that statement several times and I still don't understand
 what it
 means.

 BTW, hardware null pointer checking is a safety feature, just like
 array
 bounds checking is.

 PS: You can't convert segfaults into exceptions under Linux, as far
 as I know.

 How did Jeremie do that?

 Andrei

 A signal handler with the undocumented kernel parameters attaches the
 signal context to the exception object, repairs the stack frame forged
 by the kernel to make us believe we called the handler ourselves, does a
 backtrace right away and attaches it to the exception object, and then
 throw it.

 The error handling code will unwind down to the runtime's main() where a
 catch clause is waiting for any Throwables, sending them back into the
 unhandled exception handler, and a crash window appears with the
 backtrace, all finally blocks executed, and gracefully shutting down.

 All I need to do is an ELF/DWARF reader to extract symbolic debug info
 under linux, its already working for PE/CodeView on windows.

 Jeremie

 
 
 Woah, nice. I stand corrected. Is this in druntime already?

Not yet, it's part of a custom runtime I'm working on and wish to release 
under a public domain license when I get the time. The code is linked 
from a thread in D.announce.

Sep 27 2009

grauzone <none example.net> writes:

Jeremie Pelletier wrote:
 downs wrote:
 Jeremie Pelletier wrote:
 Andrei Alexandrescu wrote:
 downs wrote:
 Walter Bright wrote:
 Nick Sabalausky wrote:

 I agree with you that if the compiler can detect null dereferences at
 compile time, it should.


 Also, by "safe" I presume you mean "memory safe" which means 
 free of
 memory corruption. Null pointer exceptions are memory safe. A null
 pointer could be caused by memory corruption, but it cannot *cause*
 memory corruption.

 No, he's using the real meaning of "safe", not the
 misleadingly-limited "SafeD" version of "safe" (which I'm still
 convinced is going to get some poor soul into serious trouble from
 mistakingly thinking their SafeD program is much safer than it 
 really
 is). Out here in reality, "safe" also means a lack of ability to
 crash, or at least some level of protection against it. 

 Memory safety is something that can be guaranteed (presuming the
 compiler is correctly implemented). There is no way to guarantee 
 that a
 non-trivial program cannot crash. It's the old halting problem.

 Okay, I'm gonna have to call you out on this one because it's simply
 incorrect.

 The halting problem deals with a valid program state - halting.

 We cannot check if every program halts because halting is an
 instruction that must be allowed at almost any point in the program.

 Why do crashes have to be allowed? They're not an allowed instruction!

 A compiler can be turing complete and still not allow crashes. There
 is nothing wrong with this, and it has *nothing* to do with the
 halting problem.

 You seem to be under the impression that nothing can be made
 uncrashable without introducing the possibility of corrupted state.
 That's hogwash.

 I read that statement several times and I still don't understand
 what it
 means.

 BTW, hardware null pointer checking is a safety feature, just like
 array
 bounds checking is.

 PS: You can't convert segfaults into exceptions under Linux, as far
 as I know.

 How did Jeremie do that?

 Andrei

 A signal handler with the undocumented kernel parameters attaches the
 signal context to the exception object, repairs the stack frame forged
 by the kernel to make us believe we called the handler ourselves, does a
 backtrace right away and attaches it to the exception object, and then
 throw it.

 The error handling code will unwind down to the runtime's main() where a
 catch clause is waiting for any Throwables, sending them back into the
 unhandled exception handler, and a crash window appears with the
 backtrace, all finally blocks executed, and gracefully shutting down.

 All I need to do is an ELF/DWARF reader to extract symbolic debug info
 under linux, its already working for PE/CodeView on windows.

 Jeremie


 Woah, nice. I stand corrected. Is this in druntime already?

 
 Not yet, its part of a custom runtime I'm working on and wish to release 
 under a public domain license when I get the time. The code is linked 
 from a thread in D.announce.

Some of this functionality is also in Tango (SVN version). Signals are 
caught only to print a backtrace.

Sep 27 2009

Leandro Lucarella <llucax gmail.com> writes:

grauzone, on September 27 at 22:31 you wrote to me:
Woah, nice. I stand corrected. Is this in druntime already?

Not yet, its part of a custom runtime I'm working on and wish to
release under a public domain license when I get the time. The
code is linked from a thread in D.announce.

 
 Some of this functionality is also in Tango (SVN version). Signals
 are catched only to print a backtrace.

I think this is a very bad idea. When the program receives a segfault
I want my lovely core dumped. A core dump is way more useful than any
possible backtrace.

I really don't see any use for it except if an uncaught exception could
generate a core dump (just as GCC does for C++ code). But I *really*
*really* want my core dump, so I can open my debugger and inspect the dead
program exactly in the point where it failed.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
The average person laughs 13 times a day

Sep 27 2009

BCS <none anon.com> writes:

Hello downs,

 PS: You can't convert segfaults into exceptions under Linux, as far as
 I know.
 

Last I checked, throwing from a signal handler works on linux.

Sep 27 2009

language_fan <foo bar.com.invalid> writes:

Sun, 27 Sep 2009 00:27:14 -0700, Walter Bright thusly wrote:

 You seem to be under the impression that nothing can be made
 uncrashable without introducing the possibility of corrupted state.
 That's hogwash.


What I mean by safe is that no matter what you do, you cannot make the 
program crash or cause memory corruption. If you look at typical 
functional languages, unless FFI is used, the only ways the program may 
fail are a) no more stack memory b) no more heap memory c) the program halts 
(halting problem) d) developer explicitly kills the program e.g. with the 
Error type. Note that if your language is simple enough, say simply typed 
lambda calculus, you do not have the third problem anymore. All of these 
errors can also happen in D, but none of the D's other problems happen in 
those languages.

Sep 27 2009

Lionello Lunesu <lio lunesu.remove.com> writes:

On 27-9-2009 9:20, Walter Bright wrote:
 language_fan wrote:
 The idea behind non-nullable types and other contracts is to catch
 these errors on compile time. Sure, the code is a bit harder to write,
 but it is safe and never segfaults. The idea is to minimize the amount
 of runtime errors of all sorts. That's also how other features of
 statically typed languages work.


 I certainly agree that catching errors at compile time is preferable by
 far. Where I disagree is the notion that non-nullable types achieve
 this. I've argued extensively here that they hide errors, not fix them.

 Also, by "safe" I presume you mean "memory safe" which means free of
 memory corruption. Null pointer exceptions are memory safe. A null
 pointer could be caused by memory corruption, but it cannot *cause*
 memory corruption.

// t.d
void main()
{
    int* a;
    a[20000] = 2;
}

[C:\Users\Lionello] dmd -run t.d

[C:\Users\Lionello]

This code passes on Vista. Granted, needs a big enough offset and some 
luck, but indexing null will never be secure in the current flat memory 
models.

L.

Sep 27 2009

Max Samukha <spambox d-coding.com> writes:

Lionello Lunesu wrote:

 On 27-9-2009 9:20, Walter Bright wrote:
 language_fan wrote:
 The idea behind non-nullable types and other contracts is to catch
 these errors on compile time. Sure, the code is a bit harder to write,
 but it is safe and never segfaults. The idea is to minimize the amount
 of runtime errors of all sorts. That's also how other features of
 statically typed languages work.


 I certainly agree that catching errors at compile time is preferable by
 far. Where I disagree is the notion that non-nullable types achieve
 this. I've argued extensively here that they hide errors, not fix them.

 Also, by "safe" I presume you mean "memory safe" which means free of
 memory corruption. Null pointer exceptions are memory safe. A null
 pointer could be caused by memory corruption, but it cannot *cause*
 memory corruption.

 
 // t.d
 void main()
 {
     int* a;
     a[20000] = 2;
 }
 
 [C:\Users\Lionello] dmd -run t.d
 
 [C:\Users\Lionello]
 
 This code passes on Vista. Granted, needs a big enough offset and some
 luck, but indexing null will never be secure in the current flat memory
 models.
 
 L.

That is a strong argument. If an object is big enough, modifying it via a 
null reference may still cause memory corruption. Initializing references to 
null does not guarantee memory safety.

Sep 28 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Max Samukha wrote:
 Lionello Lunesu wrote:
 
 On 27-9-2009 9:20, Walter Bright wrote:
 language_fan wrote:
 The idea behind non-nullable types and other contracts is to catch
 these errors on compile time. Sure, the code is a bit harder to write,
 but it is safe and never segfaults. The idea is to minimize the amount
 of runtime errors of all sorts. That's also how other features of
 statically typed languages work.

 I certainly agree that catching errors at compile time is preferable by
 far. Where I disagree is the notion that non-nullable types achieve
 this. I've argued extensively here that they hide errors, not fix them.

 Also, by "safe" I presume you mean "memory safe" which means free of
 memory corruption. Null pointer exceptions are memory safe. A null
 pointer could be caused by memory corruption, but it cannot *cause*
 memory corruption.

 // t.d
 void main()
 {
     int* a;
     a[20000] = 2;
 }

 [C:\Users\Lionello] dmd -run t.d

 [C:\Users\Lionello]

 This code passes on Vista. Granted, needs a big enough offset and some
 luck, but indexing null will never be secure in the current flat memory
 models.

 L.

 
 That is a strong argument. If an object is big enough, modifying it via a 
 null reference may still cause memory corruption. Initializing references to 
 null does not guarantee memory safety.

How is that corruption? These pointers were purposely set to 0x00000002, 
corruption I believe is when memory is modified without the programmer 
being aware of it. For example if the GC was to free memory that is 
still reachable, that would cause corruption.

Corruption is near impossible to trace back, this case is trivial.

Sep 28 2009

Lionello Lunesu <lio lunesu.remove.com> writes:

On 28-9-2009 18:09, Jeremie Pelletier wrote:
 Max Samukha wrote:
 Lionello Lunesu wrote:

 On 27-9-2009 9:20, Walter Bright wrote:
 language_fan wrote:
 The idea behind non-nullable types and other contracts is to catch
 these errors on compile time. Sure, the code is a bit harder to write,
 but it is safe and never segfaults. The idea is to minimize the amount
 of runtime errors of all sorts. That's also how other features of
 statically typed languages work.

 I certainly agree that catching errors at compile time is preferable by
 far. Where I disagree is the notion that non-nullable types achieve
 this. I've argued extensively here that they hide errors, not fix them.

 Also, by "safe" I presume you mean "memory safe" which means free of
 memory corruption. Null pointer exceptions are memory safe. A null
 pointer could be caused by memory corruption, but it cannot *cause*
 memory corruption.

 // t.d
 void main()
 {
 int* a;
 a[20000] = 2;
 }

 [C:\Users\Lionello] dmd -run t.d

 [C:\Users\Lionello]

 This code passes on Vista. Granted, needs a big enough offset and some
 luck, but indexing null will never be secure in the current flat memory
 models.

 L.

 That is a strong argument. If an object is big enough, modifying it
 via a null reference may still cause memory corruption. Initializing
 references to null does not guarantee memory safety.

 How is that corruption? These pointers were purposely set to 0x00000002,
 corruption I believe is when memory is modified without the programmer
 being aware of it. For example if the GC was to free memory that is
 still reachable, that would cause corruption.

 Corruption is near impossible to trace back, this case is trivial.

Uh? What pointer is being set to 0x00000002?

I'm indexing an array that happens to be uninitialized, which means: 
null. The code passes without problems, but modifies a 'random' address, 
with unpredictable consequences.

According to Walter a compile time check is not needed, because at 
run-time it is guaranteed that the program will abort when a null 
pointer is about to be used. But, that's not always the case, see my 
example.
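
The arithmetic behind the example can be checked without touching memory. This is a small sketch, assuming a 4-byte int and a 4096-byte page (the page size is an assumption; the names index, target and pageSize are made up here). It only computes the address that a[20000] would write to when a is null.

```d
import std.stdio : writefln;

void main()
{
    int* a = null;        // what "int* a;" is default-initialized to
    enum index = 20_000;

    // Address that a[index] would touch, computed with integers only;
    // nothing is dereferenced.
    immutable size_t target = cast(size_t) a + index * int.sizeof;

    enum size_t pageSize = 4096;   // assumed page size
    writefln("a[%s] would touch address 0x%X", index, target);

    assert(target == 80_000);
    assert(target > pageSize);     // well past the first page
}
```

The result, 0x13880, lies beyond the first page, so an unmapped page at address zero does not by itself guarantee a fault for this access. Whether the write actually faults depends on what the OS has mapped at that address.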

L.

Sep 28 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Walter Bright wrote:
 The problem with non-nullable references is what do they default to? 
 Some "nan" object? When you use a "nan" object, what should it do? Throw 
 an exception?

This is the mistake. There would be no way to default initialize a non-null 
object. I'm surprised you are still saying this, because we discussed 
how NonNull!T could be implemented by disabling its default constructor.
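
A minimal sketch of what such a wrapper could look like, assuming a D compiler that supports `@disable this();` on structs. The names payload, get and Widget are invented for this illustration, and this is not the design that was actually discussed, only one possible shape of it.

```d
// Hypothetical library type: a class reference that cannot be
// default-constructed and is checked once, at construction.
struct NonNull(T) if (is(T == class))
{
    private T payload;

    @disable this();   // "NonNull!T x;" is rejected at compile time

    this(T value)
    {
        // Checked here only; the check is compiled out with -release.
        assert(value !is null, "NonNull constructed from null");
        payload = value;
    }

    inout(T) get() inout { return payload; }
    alias get this;    // lets a NonNull!T be used where a T is expected
}

class Widget
{
    int size() { return 3; }
}

void main()
{
    auto w = NonNull!Widget(new Widget);
    assert(w.size() == 3);

    // Declaring one without an initializer does not compile.
    static assert(!__traits(compiles, { NonNull!Widget x; }));
}
```

Note that this library approach still leaves loopholes (for example, NonNull!Widget.init has a null payload), which is part of why the thread distinguishes it from support in the type system itself.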

Andrei

Sep 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

Andrei Alexandrescu wrote:
 Walter Bright wrote:
 The problem with non-nullable references is what do they default to? 
 Some "nan" object? When you use a "nan" object, what should it do? 
 Throw an exception?

 
 This is the mistake. There would be no way to default initialize a non-null 
 object. I'm surprised you are still saying this, because we discussed 
 how NonNull!T could be implemented by disabling its default constructor.

Sure, so the user just provides "0" as the argument to the non-default 
constructor. Or he writes:

     C c = c_empty;

using c_empty as his placeholder for an empty object. Now, what happens 
with:

     c.foo();

? Should c_empty throw an exception? To take this a little farther, 
suppose I wish to create an array of C that I will partially fill with 
valid data, and leave some empty slots. Those empty slots I stuff with 
c_empty, to avoid having nulls. What is c_empty's proper behavior if I 
mistakenly try to access its members?

Forcing the user to provide an initializer does not solve the problem. 
The crucial point is the problem is *not* the seg fault, the seg fault 
is the symptom. The problem is the user has not set the object to a 
value that his program's logic requires.


I am also perfectly happy with NonNull being a type constructor, to be 
used where appropriate. My disagreement is with the notion that null 
references should be eliminated at the language level.

Sep 26 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Walter Bright wrote:
 Andrei Alexandrescu wrote:
 Walter Bright wrote:
 The problem with non-nullable references is what do they default to? 
 Some "nan" object? When you use a "nan" object, what should it do? 
 Throw an exception?

 This is the mistake. There would be no way to default initialize a 
 non-null object. I'm surprised you are still saying this, because we 
 discussed how NonNull!T could be implemented by disabling its default 
 constructor.

 
 Sure, so the user just provides "0" as the argument to the non-default 
 constructor. Or he writes:
 
     C c = c_empty;
 
 using c_empty as his placeholder for an empty object. Now, what happens 
 with:
 
     c.foo();
 
 ? Should c_empty throw an exception?

The problem is you keep on insisting on one case "I have a non-null 
reference that I don't have an initializer for, but the compiler forces 
me to find one, so I'll just throw a crappy value in." This focus on one 
situation comes straight from your admitted bad habit of defining 
variables in one place and initializing in another. The situation you 
need to open a curious eye on is "I have a reference that's never 
supposed to be null, but I forgot about initializing it and the compiler 
silently put a useless null in it." The simplest case is what _every_ D 
beginner has done:

T x;
x.fun();

to witness a crash. Why the hell does that crash? It did work when T was 
a struct. (Also this damns generic code to hell.)

So again: focus on the situation when people forget to initialize 
references that are never supposed to be null.

That has happened to me, and I'm supposed to know about this stuff. And 
one thing you don't understand is that on Linux, access violations are 
much more difficult to figure out than others. On a computing cluster it 
gets one order of magnitude more difficult. So spare me of your Windows 
setup that launches your debugger on the line of the crash. For better 
or worse, many don't have that. People sometimes have problems that you 
don't have, and you need to put yourself in their shoes.

 To take this a little farther, 
 suppose I wish to create an array of C that I will partially fill with 
 valid data, and leave some empty slots. Those empty slots I stuff with 
 c_empty, to avoid having nulls. What is c_empty's proper behavior if I 
 mistakenly try to access its members?

You make an array of nullable references. Again you confuse having 
non-null as a default with having non-null as the only option.

 Forcing the user to provide an initializer does not solve the problem. 
 The crucial point is the problem is *not* the seg fault, the seg fault 
 is the symptom. The problem is the user has not set the object to a 
 value that his program's logic requires.
 
 
 I am also perfectly happy with NonNull being a type constructor, to be 
 used where appropriate. My disagreement is with the notion that null 
 references should be eliminated at the language level.

Null references shouldn't be eliminated from the language. They just 
should NOT be the default. I guess I'm going to say that until you tune 
in to my station.


Andrei

Sep 26 2009

Tom S <h3r3tic remove.mat.uni.torun.pl> writes:

Andrei Alexandrescu wrote:
 [snip]
 The problem is you keep on insisting on one case "I have a non-null 
 reference that I don't have an initializer for, but the compiler forces 
 me to find one, so I'll just throw a crappy value in." This focus on one 
 situation comes straight with your admitted bad habit of defining 
 variables in one place and initializing in another. The situation you 
 need to open a curious eye on is "I have a reference that's never 
 supposed to be null, but I forgot about initializing it and the compiler 
 silently put a useless null in it." The simplest case is what _every_ D 
 beginner has done:
 
 T x;
 x.fun();
 
 to witness a crash. Why the hell does that crash? It did work when T was 
 a struct. (Also this damns generic code to hell.)
 
 So again: focus on the situation when people forget to initialize 
 references that are never supposed to be null.
 
 That has happened to me, and I'm supposed to know about this stuff. And 
 one thing you don't understand is that on Linux, access violations are 
 much more difficult to figure than others. On a computing cluster it 
 gets one order of magnitude more difficult. So spare me of your Windows 
 setup that launches your debugger on the line of the crash. For better 
 or worse, many don't have that. People sometimes have problems that you 
 don't have, and you need to put yourself in their shoes.

Quoted for truth.


-- 
Tomasz Stachowiak
http://h3.team0xf.com/
h3/h3r3tic on #D freenode

Sep 26 2009

bearophile <bearophileHUGS lycos.com> writes:

Andrei Alexandrescu:

 The problem is you keep on insisting on one case "I have a non-null 
 reference that I don't have an initializer for, but the compiler forces 
 me to find one, so I'll just throw a crappy value in." This focus on one 
 situation comes straight with your admitted bad habit of defining 
 variables in one place and initializing in another.

Thank you Andrei for your good efforts in trying to add some light on this
topic. I think we are converging :-)

But I think you have to deal with the example shown by Jeremie Pelletier too,
this was my answer:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=96834

(What I have written in the last line is confused. I meant that the type system
doesn't allow you to read or access an object before it's initialized. This
looks like flow analysis, but there are ways to simplify/constrain the
situation enough, for example with that enforce scope block).

Bye,
bearophile

Sep 26 2009

BCS <none anon.com> writes:

Hello Walter,

 The problem with non-nullable references is what do they default to?
 Some "nan" object? When you use a "nan" object, what should it do?
 Throw an exception?
 

They don't have a default. Their semantics would be such that the compiler 
rejects as illegal any code that would require it to supply a default.

As to the user stuffing "c_empty" in just to get the compiler to shut up: 
firstly, that says the variable should not yet be declared, as you don't yet 
know what value to give it; and secondly, either c_empty is a rational value 
or the user is subverting the type system and is on their own.

Sep 26 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Sat, Sep 26, 2009 at 5:29 PM, Jeremie Pelletier <jeremiep gmail.com> wrote:

 I actually side with Walter here. I much prefer my programs to crash on
 using a null reference and fix the issue than add runtime overhead that does
 the same thing. In most cases a simple backtrace is enough to pinpoint the
 location of the bug.

There is NO RUNTIME OVERHEAD in implementing nonnull reference types.
None. It's handled entirely by the type system. Can we please move
past this?

 Null references are useful to implement optional arguments without any
 overhead by an Optional!T wrapper. If you disallow null references what
 would "Object foo;" initialize to then?

It wouldn't. The compiler wouldn't allow it. It would force you to
initialize it. That is the entire point of nonnull references.

Sep 26 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Jarrett Billingsley wrote:
 On Sat, Sep 26, 2009 at 5:29 PM, Jeremie Pelletier <jeremiep gmail.com> wrote:
 
 I actually side with Walter here. I much prefer my programs to crash on
 using a null reference and fix the issue than add runtime overhead that does
 the same thing. In most cases a simple backtrace is enough to pinpoint the
 location of the bug.

 
 There is NO RUNTIME OVERHEAD in implementing nonnull reference types.
 None. It's handled entirely by the type system. Can we please move
 past this?
 
 Null references are useful to implement optional arguments without any
 overhead by an Optional!T wrapper. If you disallow null references what
 would "Object foo;" initialize to then?

 
 It wouldn't. The compiler wouldn't allow it. It would force you to
 initialize it. That is the entire point of nonnull references.

How would you do this then?

void foo(int a) {
	Object foo;
	if(a == 1) foo = new Object1;
	else if(a == 2) foo = new Object2;
	else foo = new Object3;
	foo.doSomething();
}

The compiler would just die on the first line of the method where foo is 
null.

What about "int a;" should this throw an error too? Or "float f;".

What about standard pointers? I can think of so many algorithms that rely 
on pointers possibly being null.

Maybe this could be a case to add in SafeD but leave out in standard D. 
I wouldn't want a nonnull reference type, I use nullables just too often.

Sep 26 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Sun, 27 Sep 2009 01:59:45 +0400, Jeremie Pelletier <jeremiep gmail.com>  
wrote:

 Jarrett Billingsley wrote:
 On Sat, Sep 26, 2009 at 5:29 PM, Jeremie Pelletier <jeremiep gmail.com>  
 wrote:

 I actually side with Walter here. I much prefer my programs to crash on
 using a null reference and fix the issue than add runtime overhead  
 that does
 the same thing. In most cases a simple backtrace is enough to pinpoint  
 the
 location of the bug.

  There is NO RUNTIME OVERHEAD in implementing nonnull reference types.
 None. It's handled entirely by the type system. Can we please move
 past this?

 Null references are useful to implement optional arguments without any
 overhead by an Optional!T wrapper. If you disallow null references what
 would "Object foo;" initialize to then?

  It wouldn't. The compiler wouldn't allow it. It would force you to
 initialize it. That is the entire point of nonnull references.

 How would you do this then?

 void foo(int a) {
 	Object foo;
 	if(a == 1) foo = new Object1;
 	else if(a == 2) foo = new Object2;
 	else foo = new Object3;
 	foo.doSomething();
 }

Let's consider the following example, first:

void foo(int a) {
	Object foo;
	if (a == 1) foo = new Object1;
	else if(a == 2) foo = new Object2;
	else if(a == 3) foo = new Object3;

	foo.doSomething();
}

Do you agree that this program has a bug? It is buggy, because one of the  
paths skips "foo" variable initialization.

Now back to your question. My answer is that the compiler should be smart  
enough to differentiate between the two cases and raise a compile-time  
error only for the latter, so that the first one  
successfully compiles while the second one doesn't.

Until then, non-nullable references are too hard to use to become useful,  
because you'll end up with a lot of initializer functions:

void foo(int a) {
	Object initializeFoo() {
		if (a == 1) return new Object1();
		if (a == 2) return new Object2();
		return new Object3();
	}

	Object foo = initializeFoo();
	foo.doSomething();
}

I actually believe the code is more clear that way, but there are cases  
when you can't do it (initialize a few variables, for example)
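
As a side note on the initializer-function pattern above: in D as it exists today the helper does not need a name. A minimal sketch, assuming a hypothetical class hierarchy (Base and the three ObjectN classes are placeholders for the undefined types in the example), uses a delegate literal that is called on the spot:

```d
// Hypothetical classes standing in for Object1/Object2/Object3 from the thread.
class Base    { string doSomething() { return "base"; } }
class Object1 : Base { override string doSomething() { return "one"; } }
class Object2 : Base { override string doSomething() { return "two"; } }
class Object3 : Base { override string doSomething() { return "three"; } }

string foo(int a)
{
    // The delegate literal is invoked immediately, so `obj` is
    // initialized in a single expression and is never observed as null.
    Base obj = delegate Base() {
        if (a == 1) return new Object1;
        if (a == 2) return new Object2;
        return new Object3;
    }();

    return obj.doSomething();
}
```

Every path through the literal has to return a value, so a forgotten branch becomes a compile-time complaint about the literal rather than a null dereference later. This only covers the single-variable case; it does not help when several variables must be initialized together.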

Sep 26 2009

language_fan <foo bar.com.invalid> writes:

Sun, 27 Sep 2009 02:15:33 +0400, Denis Koroskin thusly wrote:

 Until then, non-nullable references are too hard to use to become
 useful, because you'll end up with a lot of initializer functions:
 
 void foo(int a) {
 	Object initializeFoo() {
 		if (a == 1) return new Object1();
 		if (a == 2) return new Object2();
 		return new Object3();
          }
 
 	Object foo = initializeFoo();
 	foo.doSomething();
 }
 
 I actually believe the code is more clear that way, but there are cases
 when you can't do it (initialize a few variables, for example)

Having a functional switch() helps a lot. I write code like this every 
day:

  val foo = predicate.match {
    case 1 => new Object1
    case 2 => new Object2("foo", "bar")
    case _ => new DefaultObject
  }

  foo.doSomething

I also rarely have runtime bugs these days.

Sep 26 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Sat, Sep 26, 2009 at 5:59 PM, Jeremie Pelletier <jeremiep gmail.com> wrote:
 How would you do this then?

 void foo(int a) {
        Object foo;
        if(a == 1) foo = new Object1;
        else if(a == 2) foo = Object2;
        else foo = Object3;
        foo.doSomething();
 }

 The compiler would just die on the first line of the method where foo is
 null.

Either use Object? (a nullable reference), or factor out the object
creation - use a separate method or something.

 What about "int a;" should this throw an error too? Or "float f;".

Those are not reference types. But actually, the D spec says it's an
error to use an uninitialized variable, so a compliant D compiler
wouldn't be out of line by diagnosing such things as errors if they
are used before they're initialized. Such a compiler would break a lot
of existing D code, but that's what you get for not following the
spec..

 What about standard pointers? I can think of so many algorithms who rely on
 pointers possibly being null.

Again, you have both nonnull (void*) and nullable (void*?) types.

 Maybe this could be a case to add in SafeD but leave out in standard D. I
 wouldn't want a nonnull reference type, I use nullables just too often.

You probably use them far less than you'd think.

Sep 26 2009

bearophile <bearophileHUGS lycos.com> writes:

Jarrett Billingsley:

 Jeremie Pelletier:
 How would you do this then?

 void foo(int a) {
        Object foo;
        if(a == 1) foo = new Object1;
        else if(a == 2) foo = Object2;
        else foo = Object3;
        foo.doSomething();
 }

 The compiler would just die on the first line of the method where foo is
 null.

 
 Either use Object? (a nullable reference), or factor out the object
 creation - use a separate method or something.

Using a separate function to initialize a nonnull reference is a possible
solution, but we can invent nicer solutions too.

You can have a function where inside an object is nullable but returns a
nonnull reference, see the enforce() used by Denis Koroskin. (The compiler also
has to recognize as a possible "enforce" an if (foo is null) {...}).

Another possible solution is to use something like a Python "with" block that
assures something is done when the block is done:

enforce (Object foo) {
    // foo is nonnull, but inside here it's in a limbo
    if(a == 1)
        foo = new Object1;
    else if(a == 2)
        foo = Object2;
    else
        foo = Object3;
} // when the enforce block ends foo must be initialized
foo.doSomething();

Probably there are other possible solutions.

A better solution is to just allow foo to be undefined until it's written over.
To simplify analysis it has to be defined when the scope ends.

Bye,
bearophile

Sep 26 2009

language_fan <foo bar.com.invalid> writes:

Sat, 26 Sep 2009 17:59:45 -0400, Jeremie Pelletier thusly wrote:

 How would you do this then?
 
 void foo(int a) {
 	Object foo;
 	if(a == 1) foo = new Object1;
 	else if(a == 2) foo = Object2;
 	else foo = Object3;
 	foo.doSomething();
 }

I just LOVE to see questions like these ;) You still have SO much to 
learn. Go grab the 'purely functional data structures' by chris okasaki 
from the nearest library and try how many pages you can read before your 
head explodes. No, it is a purely enlightening process actually :)

Sep 26 2009

Yigal Chripun <yigal100 gmail.com> writes:

On 27/09/2009 00:59, Jeremie Pelletier wrote:
 Jarrett Billingsley wrote:
 On Sat, Sep 26, 2009 at 5:29 PM, Jeremie Pelletier
 <jeremiep gmail.com> wrote:

 I actually side with Walter here. I much prefer my programs to crash on
 using a null reference and fix the issue than add runtime overhead
 that does
 the same thing. In most cases a simple backtrace is enough to
 pinpoint the
 location of the bug.

 There is NO RUNTIME OVERHEAD in implementing nonnull reference types.
 None. It's handled entirely by the type system. Can we please move
 past this?

 Null references are useful to implement optional arguments without any
 overhead by an Optional!T wrapper. If you disallow null references what
 would "Object foo;" initialize to then?

 It wouldn't. The compiler wouldn't allow it. It would force you to
 initialize it. That is the entire point of nonnull references.

 How would you do this then?

 void foo(int a) {
 Object foo;
 if(a == 1) foo = new Object1;
 else if(a == 2) foo = Object2;
 else foo = Object3;
 foo.doSomething();
 }

 The compiler would just die on the first line of the method where foo is
 null.

 What about "int a;" should this throw an error too? Or "float f;".

 What about standard pointers? I can think of so many algorithms who rely
 on pointers possibly being null.

 Maybe this could be a case to add in SafeD but leave out in standard D.
 I wouldn't want a nonnull reference type, I use nullables just too often.

with current D syntax this can be implemented as:

void foo(int a) {
   Object foo = (a == 1) ? new Object1
              : (a == 2) ? new Object2
              : new Object3;
   foo.doSomething();
}

The above agrees also with what Denis said about possible uninitialized 
variable bugs.

in D "if" is the same as in C - a procedural statement.
I personally think that it should be an expression like in FP languages 
which is safer.

to reinforce what others have said already:
1) non-null references *by default* does not affect nullable references 
in any way and does not add any overhead. The idea is to make the 
*default* the *safer* option which is one of the primary goals of this 
language.
2) there is no default value for non-nullable references. you must 
initialize it to a correct, logical value *always*. If you resort to 
some "default" value you are doing something wrong.

btw, C++ references implement this idea already. functions that return a 
reference will throw an exception on error (Walter's canary) while the 
same function that returns a pointer will usually just return null on 
error.

segfaults are *NOT* a good mechanism to handle errors. An exception 
trace gives you a whole lot more information about what went wrong and 
where compared to a segfault.

Sep 26 2009

language_fan <foo bar.com.invalid> writes:

Sun, 27 Sep 2009 02:04:06 +0200, Yigal Chripun thusly wrote:

 segfaults are *NOT* a good mechanism to handle errors. An exception
 trace gives you a whole lot more information about what went wrong and
 where compared to a segfault.

Indeed, especially since in the case of D half of the userbase has a 
broken linker (optlink) and the other half has a broken debugger (gdb). I 
much rather write non-segfaulting applications in a language without 
debugger than buggy crap and debug it with the world's best debugger.

Sep 26 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Yigal Chripun wrote:
 On 27/09/2009 00:59, Jeremie Pelletier wrote:
 Jarrett Billingsley wrote:
 On Sat, Sep 26, 2009 at 5:29 PM, Jeremie Pelletier
 <jeremiep gmail.com> wrote:

 I actually side with Walter here. I much prefer my programs to crash on
 using a null reference and fix the issue than add runtime overhead
 that does
 the same thing. In most cases a simple backtrace is enough to
 pinpoint the
 location of the bug.

 There is NO RUNTIME OVERHEAD in implementing nonnull reference types.
 None. It's handled entirely by the type system. Can we please move
 past this?

 Null references are useful to implement optional arguments without any
 overhead by an Optional!T wrapper. If you disallow null references what
 would "Object foo;" initialize to then?

 It wouldn't. The compiler wouldn't allow it. It would force you to
 initialize it. That is the entire point of nonnull references.

 How would you do this then?

 void foo(int a) {
 Object foo;
 if(a == 1) foo = new Object1;
 else if(a == 2) foo = Object2;
 else foo = Object3;
 foo.doSomething();
 }

 The compiler would just die on the first line of the method where foo is
 null.

 What about "int a;" should this throw an error too? Or "float f;".

 What about standard pointers? I can think of so many algorithms who rely
 on pointers possibly being null.

 Maybe this could be a case to add in SafeD but leave out in standard D.
 I wouldn't want a nonnull reference type, I use nullables just too often.

 
 with current D syntax this can be implemented as:
 
 void foo(int a) {
   Object foo = (a == 1) ? new Object1
              : (a == 2) ? Object2
              : Object3;
   foo.doSomething();
 }

 The above agrees also with what Denis said about possible uninitialized 
 variable bugs.
 
 in D "if" is the same as in C - a procedural statement.
 I personally think that it should be an expression like in FP languages 
 which is safer.

 to reinforce what others have said already:
 1) non-null references *by default* does not affect nullable references 
 in any way and does not add any overhead. The idea is to make the 
 *default* the *safer* option which is one of the primary goals of this 
 language.
 2) there is no default value for non-nullable references. you must 
 initialize it to a correct, logical value *always*. If you resort to 
 some "default" value you are doing something wrong.
 
 btw, C++ references implement this idea already. functions that return a 
 reference will throw an exception on error (Walter's canary) while the 
 same function that returns a pointer will usually just return null on 
 error.
 
 segfaults are *NOT* a good mechanism to handle errors. An exception 
 trace gives you a whole lot more information about what went wrong and 
 where compared to a segfault.

This is something for the runtime or the debugger to deal with. My 
runtime converts access violations on windows or segfaults on linux into 
exception objects, which unwind all the way down to main where it 
catches into the unhandled exception handler (or crash handler) and I 
get a neat popup with a "hello, your program crashed at this point, here 
is a backtrace with resolved symbols and filenames along with current 
registers and loaded modules, would you like a cup of coffee while you 
solve the problem?". I sent that crash handler to D.announce last week too.

The compiler won't be able to enforce *every* nonnull reference and 
segfaults are bound to happen, especially with casting. While it may 
prevent most of them, any good programmer would too, I don't remember 
the last time I had a segfault on a null reference actually.

I can see what the point is with nonnull references, but I can also see 
it's not a bulletproof solution. ie "Object foo = cast(Object)null;" 
would easily bypass the nonnull enforcement, resulting in a segfault the 
system is trying to avoid.

What about function parameters, a lot of parameters are optional 
references, which are tested and then used into functions whose 
parameters aren't optional. It would result in a lot of casts, something 
that could easily confuse people and easily generate segfaults.

Alls I'm saying is, nonnull references would just take the issue from 
one place to another. Like Walter said, you can put a gas mask to ignore 
the room full of toxic gas, but that doesn't solve the gas problem in 
itself, you're just denying it exists. Then someday you forget about 
it, remove the mask, and suffocate.

Jeremie

Sep 26 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Sat, Sep 26, 2009 at 10:59 PM, Jeremie Pelletier <jeremiep gmail.com> wrote:

 The compiler won't be able to enforce *every* nonnull reference and
 segfaults are bound to happen, especially with casting. While it may prevent
 most of them, any good programmer would too, I don't remember the last time
 I had a segfault on a null reference actually.

 I can see what the point is with nonnull references, but I can also see it's
 not a bulletproof solution. ie "Object foo = cast(Object)null;" would easily
 bypass the nonnull enforcement, resulting in a segfault the system is trying
 to avoid.

 What about function parameters, a lot of parameters are optional references,
 which are tested and then used into functions whose parameters aren't
 optional. It would result in a lot of casts, something that could easily
 confuse people and easily generate segfaults.

You haven't read my reply to your post yet, have you.

Nullable.



References.



Exist.



Too.

Sep 26 2009

bearophile <bearophileHUGS lycos.com> writes:

Jeremie Pelletier:

 I don't remember 
 the last time I had a segfault on a null reference actually.

I have read that null dereference bugs are among the most common problems in



 I can see what the point is with nonnull references, but I can also see 
 it's not a bulletproof solution. ie "Object foo = cast(Object)null;" 
 would easily bypass the nonnull enforcement, resulting in a segfault the 
 system is trying to avoid.

That's life.


 What about function parameters, a lot of parameters are optional 
 references, which are tested and then used into functions whose 
 parameters aren't optional. It would result in a lot of casts, something 
 that could easily confuse people and easily generate segfaults.

By "optional" I think you mean "nullable" there.

Note that some of those casts can be avoided, because the nonnull nature of a
reference can be implicitly inferred by the compiler:

Foo somefunction(Foo? foo) {
  if (foo is null) {
    ... // do something
  } else {
    // here foo can be implicitly converted to
    // a nonnullable reference, because the compiler
    // can infer that here foo can never be null.
    return foo;
  }
}
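
For comparison, here is roughly what that narrowing looks like when done by hand in D as it stands, where every class reference may be null. This is only a sketch; Foo and nameOrPlaceholder are made-up names for illustration:

```d
// Hypothetical example class.
class Foo { string name() { return "foo"; } }

string nameOrPlaceholder(Foo foo)
{
    if (foo is null)
        return "<none>";   // the "do something" branch

    // Past the check the programmer knows foo is non-null, but the
    // type is still plain Foo: nothing records that knowledge for callees.
    return foo.name();
}
```

The proposal amounts to letting the compiler track the fact established by the `is null` test, so that the second half of the function could hand `foo` to code that demands a non-null reference without a cast.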


 Alls I'm saying is, nonnull references would just take the issue from 
 one place to another. Like Walter said, you can put a gas mask to ignore 
 the room full of toxic gas, but that doesn't solve the gas problem in 
 itself, you're just denying it exists. Then someday you forget about 
 it, remove the mask, and suffocate.

No solution is perfect, so it's a matter of weighing its pros and cons. It's
hard to tell how good a feature is before trying it. That's why I have
half-seriously suggested implementing nonnullables in a branch of D2, testing
it and keeping it only if it turns out to be good.

Bye,
bearophile

Sep 26 2009

Daniel Keep <daniel.keep.lists gmail.com> writes:

Jeremie Pelletier wrote:
 ...
 
 This is something for the runtime or the debugger to deal with. My
 runtime converts access violations on windows or segfaults on linux into
 exception objects, which unwind all the way down to main where it
 catches into the unhandled exception handler (or crash handler) and I
 get a neat popup with a "hello, your program crashed at this point, here
 is a backtrace with resolved symbols and filenames along with current
 registers and loaded modules, would you like a cup of coffee while you
 solve the problem?". I sent that crash handler to D.announce last week too.

See my long explanation that NPEs are only symptoms; very rarely do they
put up a big sign saying "what ho; the problem is RIGHT HERE!"

 The compiler won't be able to enforce *every* nonnull reference and
 segfaults are bound to happen, especially with casting. While it may
 prevent most of them, any good programmer would too, I don't remember
 the last time I had a segfault on a null reference actually.

I do.  It took a day and a half to track it back to the source.

 I can see what the point is with nonnull references, but I can also see
 it's not a bulletproof solution. ie "Object foo = cast(Object)null;"
 would easily bypass the nonnull enforcement, resulting in a segfault the
 system is trying to avoid.

Why lock the door when someone could break the window?

Why have laws when people could break them?

Why build a wall when someone could park a hydrogen bomb next to it?

Why have a typesystem when you could use casting to put the float
representation of 3.14159 into a void* and then dereference it?

Casting is not an argument against non-null references because casting
can BREAK ANYTHING.

"Doctor, it hurts when I hammer nails into my shin."

"So stop doing it."

 What about function parameters, a lot of parameters are optional
 references, which are tested and then used into functions whose
 parameters aren't optional. It would result in a lot of casts, something
 that could easily confuse people and easily generate segfaults.

So what you're saying is: better to never, ever do error checking and
just start fixing things after they've broken?

And why is everything solved via casting?  Look: here's a solution
that's less typing than a cast, AND it's safe.  You could even put
it in object.d!

T notnull(U : T?, T)(U obj)
{
    if( obj is null ) throw new NullException;
    return cast(T) obj;
}

void foo(Quxx o)
{
    o.doStuff;
}

void foo(Quxx? o)
{
    foo(notnull(o));
}
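
The `T?` syntax above is the proposed one and does not compile today. A rough library-level approximation in current D might look like the sketch below; NonNull, notNull, useChecked and useMaybe are names invented here, not Phobos or druntime symbols, and the check happens at run time rather than in the type checker:

```d
import std.exception : enforce;

/// Hypothetical wrapper (not a Phobos type): a class reference checked once.
struct NonNull(T) if (is(T == class))
{
    private T payload;

    @disable this();                 // "NonNull!Quxx q;" does not compile

    private this(T value) { payload = value; }

    /// Read access; `alias this` lets the wrapper be used like a plain T.
    inout(T) get() inout { return payload; }
    alias get this;
}

/// Performs the null check once, at the boundary, and throws on null.
NonNull!T notNull(T)(T obj) if (is(T == class))
{
    enforce(obj !is null, "unexpected null reference");
    return NonNull!T(obj);
}

// Hypothetical user code mirroring the two foo() overloads above.
class Quxx { int doStuff() { return 42; } }

int useChecked(NonNull!Quxx o) { return o.doStuff(); }   // no check needed
int useMaybe(Quxx o)           { return useChecked(notNull(o)); }
```

The point of the design is the same as in the post: the exception is raised where the null enters, not where it is eventually dereferenced. A wrapper like this is weaker than language support, since `NonNull!Quxx.init` can still be spelled out explicitly and would carry a null payload.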

 Alls I'm saying is, nonnull references would just take the issue from
 one place to another.

YES.

THAT'S THE POINT.

It would take the error from a likely unrelated location in the
program's execution and put it RIGHT where the mistake initially occurs!

 Like Walter said, you can put a gas mask to ignore
 the room full of toxic gas, but that doesn't solve the gas problem in
 itself, you're just denying it exists. Then someday you forget about
 it, remove the mask, and suffocate.
 
 Jeremie

That's what NPEs are!  They're a *symptom* of you passing crap in to
fields or functions.  They very, VERY rarely actually point out what the
underlying mistake is.

Sep 26 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Daniel Keep wrote:
 
 Jeremie Pelletier wrote:
 ...

 This is something for the runtime or the debugger to deal with. My
 runtime converts access violations on windows or segfaults on linux into
 exception objects, which unwind all the way down to main where it
 catches into the unhandled exception handler (or crash handler) and I
 get a neat popup with a "hello, your program crashed at this point, here
 is a backtrace with resolved symbols and filenames along with current
 registers and loaded modules, would you like a cup of coffee while you
 solve the problem?". I sent that crash handler to D.announce last week too.

 
 See my long explanation that NPEs are only symptoms; very rarely do they
 put up a big sign saying "what ho; the problem is RIGHT HERE!"
 
 The compiler won't be able to enforce *every* nonnull reference and
 segfaults are bound to happen, especially with casting. While it may
 prevent most of them, any good programmer would too, I don't remember
 the last time I had a segfault on a null reference actually.

 
 I do.  It took a day and a half to track it back to the source.

Happens to me on some issues too, I don't ask for a workaround in the 
compiler, I just learn my lesson and never repeat that error.

 I can see what the point is with nonnull references, but I can also see
 it's not a bulletproof solution. ie "Object foo = cast(Object)null;"
 would easily bypass the nonnull enforcement, resulting in a segfault the
 system is trying to avoid.

 
 Why lock the door when someone could break the window?

Easier to prove someone broke in when the window is shattered than if 
someone just went through the door, stole your stuff and left without 
any traces.

 Why have laws when people could break them?

People break the law, some of them only for the challenge of it, some of 
them to survive, some just don't care. Remove the laws and you remove 
most of these behaviors you're trying to prohibit in the first place. 
Most of the time laws are there so corporate criminals can get rid of 
street criminals legally.

 Why build a wall when someone could park a hydrogen bomb next to it?

They keep most people out, or in. Hydrogen bombs are not something you 
expect the first guy on the street to own.

 Why have a typesystem when you could use casting to put the float
 representation of 3.14159 into a void* and then dereference it?

Because it also allows for countless different optimizations, at the 
price of also being able to shoot your own foot.

There, four similar questions and four completely different answers. My 
point is, there is no perfect all-around solution.

 Casting is not an argument against non-null references because casting
 can BREAK ANYTHING.
 
 "Doctor, it hurts when I hammer nails into my shin."
 
 "So stop doing it."

Why tell him to stop it? The guy will just kill himself at some point 
and raise the collective IQ of mankind in the process. Same for 
programming or anything else, if someone is dumb enough to repeat the 
same mistake over and over, he should find a new domain to work in.

 What about function parameters, a lot of parameters are optional
 references, which are tested and then used into functions whose
 parameters aren't optional. It would result in a lot of casts, something
 that could easily confuse people and easily generate segfaults.

 
 So what you're saying is: better to never, ever do error checking and
 just start fixing things after they've broken?

No, but you shouldn't rule out the fact that they may break, no matter 
what system you're working with.

 And why is everything solved via casting?  Look: here's a solution
 that's less typing than a cast, AND it's safe.  You could even put
 it in object.d!
 
 T notnull(U : T?, T)(U obj)
 {
     if( obj is null ) throw new NullException;
     return cast(T) obj;
 }
 
 void foo(Quxx o)
 {
     o.doStuff;
 }
 
 void foo(Quxx? o)
 {
     foo(notnull(o));
 }

Also slower than a cast if the compiler doesn't use -inline. Debug 
builds are already painful enough as it is with realtime code.

 Alls I'm saying is, nonnull references would just take the issue from
 one place to another.

 
 YES.
 
 THAT'S THE POINT.
 
 It would take the error from a likely unrelated location in the
 program's execution and put it RIGHT where the mistake initially occurs!

That's a case for variable initialization, not nullable/non-null types.

A nonnull type does not guarantee the value will *never* be null, even 
the simplest hack can get around it.

 Like Walter said, you can put a gas mask to ignore
 the room full of toxic gas, but that doesn't solve the gas problem in
 itself, you're just denying it exists. Then someday you forget about
 it, remove the mask, and suffocate.

 Jeremie

 
 That's what NPEs are!  They're a *symptom* of you passing crap in to
 fields or functions.  They very, VERY rarely actually point out what the
 underlying mistake is.

There again, I favor stronger initialization semantics over nonnull 
types. This will get rid of most of these errors and still keep you on 
your toes when a segfault arises; if you only see a segfault once a year 
how will you know how to handle it :)

Most segfaults I have take me at most a few minutes to pinpoint. It's 
finding backdoors to compiler enforcements that's annoying.

Sep 26 2009

Christopher Wright <dhasenan gmail.com> writes:

Jeremie Pelletier wrote:
 There again, I favor stronger initialization semantics over nonnull 
 types. This will get rid of most of these errors

Only for local variables. Not for fields.

 Most segfaults I have take me at most a few minutes to pinpoint. It's 
 finding backdoors to compiler enforcements that's annoying.

You're complaining now because you'd try to cram 'null' down the throat 
of something marked 'not-null' and fear it would be difficult?

Sep 27 2009

Walter Bright <newshound1 digitalmars.com> writes:

Jarrett Billingsley wrote:
 It wouldn't. The compiler wouldn't allow it. It would force you to
 initialize it. That is the entire point of nonnull references.

Initialize it to what?

A user-defined default object? What should happen if that default object 
is accessed? Throw an exception? <g>

How would you define an "empty" slot in a data structure?

Sep 26 2009

grauzone <none example.net> writes:

Walter Bright wrote:
 Jarrett Billingsley wrote:
 It wouldn't. The compiler wouldn't allow it. It would force you to
 initialize it. That is the entire point of nonnull references.

 
 Initialize it to what?
 
 A user-defined default object? What should happen if that default object 
 is accessed? Throw an exception? <g>
 
 How would you define an "empty" slot in a data structure?

You can allow a non-nullable reference to be null, just like you allow 
an immutable object to be mutable during construction.

You just have to make sure the non-nullable reference is definitely 
assigned.

Sep 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

grauzone wrote:
 You just have to make sure the non-nullable reference is definitely 
 assigned.

See my reply to Denis Koroskin on that.

Sep 26 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Sat, Sep 26, 2009 at 6:10 PM, Walter Bright
<newshound1 digitalmars.com> wrote:
 Jarrett Billingsley wrote:
 It wouldn't. The compiler wouldn't allow it. It would force you to
 initialize it. That is the entire point of nonnull references.

 Initialize it to what?

 A user-defined default object? What should happen if that default object is
 accessed? Throw an exception? <g>

The point of using a nonnull type is that you *never expect it to be
null ever*. So you would be initializing it to some useful object. If
you *want* null, you'd use a nullable reference.

 How would you define an "empty" slot in a data structure?

A nullable reference.

Sep 26 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Sun, 27 Sep 2009 01:29:55 +0400, Jeremie Pelletier <jeremiep gmail.com>  
wrote:
 [...] I much prefer my programs to crash on using a null reference and  
 fix the issue than add runtime overhead that does the same thing.

What runtime overhead are you talking about here? Use of non-null pointers  
actually make your program run faster, because you don't have to check  
them against null all the time. Non-null references is a contract, which  
is enforced by a compiler at compile-time, not runtime. It also makes your  
program more consistent and less verbose.

 Null references are useful to implement optional arguments without any  
 overhead by an Optional!T wrapper.

Once again, what overhead are you talking about? Optional!(T) (or  
Nullable!(T)) doesn't have to have any additional bits to store the NULL  
state for reference types.

 If you disallow null references what would "Object foo;" initialize to  
 then?

Nothing. It's a compile-time error. But the following is not:

Object foo = initializer();
Nullable!(Object) foo2; // default-initialized to a null, same as currently
Object? foo3; // a desirable syntax sugar for Nullable!(Object)

Sep 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

Denis Koroskin wrote:
 If you disallow null references what would "Object foo;" initialize to 
 then?

 Nothing. It's a compile-time error.

Should:

    int a;

be disallowed, too? If not (and explain why it should behave 
differently), what about:

    T a;

in generic code?

Sep 26 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Sun, 27 Sep 2009 02:18:15 +0400, Walter Bright  
<newshound1 digitalmars.com> wrote:

 Denis Koroskin wrote:
 If you disallow null references what would "Object foo;" initialize to  
 then?

 Nothing. It's a compile-time error.

 Should:

     int a;

 be disallowed, too? If not (and explain why it should behave  
 differently), what about:

     T a;

 in generic code?

Functional languages don't distinguish between the two (reference or not).  
We were discussing "non-null by default"-references because it's far less  
radical change to a language than "non-null by default" for all types.

Once again, you are taking code out of the context. It is worthless to  
discuss "int a;" on its own.
I'll try to put the context back and show a few concrete examples (where T  
is a generic type):

void foo()
{
     T t;
}

Results in: error (Unused variable 't').

T foo(bool someCondition)
{
     T t;
     if (someCondition) t = someInitializer();

     return t;
}

Results in: error (Use of potentially unassigned variable 't')

T foo(bool someCondition)
{
     T t;
     if (someCondition) t = someInitializer();
     else t = someOtherInitializer();

     return t;
}

Results in: successful compilation
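
For contrast with the diagnostics sketched above, this is what current D does with a declaration that has no initializer: it accepts it and stores the type's default value. The three helper functions and the class C are hypothetical names used only to make the behaviour checkable:

```d
import std.math : isNaN;

class C {}   // hypothetical placeholder class

/// Each helper reports the value a bare declaration receives today.
bool intDefaultIsZero()   { int a;   return a == 0; }    // int.init is 0
bool floatDefaultIsNaN()  { float f; return isNaN(f); }  // float.init is NaN
bool classDefaultIsNull() { C c;     return c is null; } // references start as null
```

So the second example above (`T t;` assigned on only one path) compiles today and returns `T.init` on the other path, which for a class type is exactly the null this thread is arguing about.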

Sep 26 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Sun, 27 Sep 2009 02:43:05 +0400, Denis Koroskin <2korden gmail.com>  
wrote:

 On Sun, 27 Sep 2009 02:18:15 +0400, Walter Bright  
 <newshound1 digitalmars.com> wrote:

 Denis Koroskin wrote:
 If you disallow null references what would "Object foo;" initialize  
 to then?

 Nothing. It's a compile-time error.

 Should:

     int a;

 be disallowed, too? If not (and explain why it should behave  
 differently), what about:

     T a;

 in generic code?

 Functional languages don't distinguish between the two (reference or  
 not). We were discussing "non-null by default"-references because it's  
 far less radical change to a language than "non-null by default" for all  
 types.

 Once again, you are taking code out of the context. It is worthless to  
 discuss "int a;" on its own.
 I'll try to put the context back and show a few concrete examples (where  
 T is a generic type):

 void foo()
 {
      T t;
 }

 Results in: error (Unused variable 't').

 T foo(bool someCondition)
 {
      T t;
      if (someCondition) t = someInitializer();

      return t;
 }

 Results in: error (Use of potentially unassigned variable 't')

 T foo(bool someCondition)
 {
      T t;
      if (someCondition) t = someInitializer();
      else t = someOtherInitializer();

      return t;
 }

 Results in: successful compilation

One more:

T foo(bool someCondition)
{
     T? t;
     if (someCondition) t = someInitializer();
     // ...

     if (t.isNull) { // not initialized yet
         // ...
     }

     // throws if t is not initialized yet, because foo
     // *must* return a valid value by a contract
     return enforce(t);
}

Sep 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

Denis Koroskin wrote:
 One more:
 
 T foo(bool someCondition)
 {
     T? t;
     if (someCondition) t = someInitializer();
     // ...
 
     if (t.isNull) { // not initialized yet
         // ...
     }
 
     // throws if t is not initialized yet, because
     // foo *must* return a valid value by a contract
     return enforce(t);
 }

It seems to me you've got null references there anyway?

What would you do about:

    T[] a;
    a[i] = foo();

where you want to have unused slots be null (or empty, or nothing)?

Sep 26 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Sun, 27 Sep 2009 03:01:48 +0400, Walter Bright  
<newshound1 digitalmars.com> wrote:

 Denis Koroskin wrote:
 One more:
  T foo(bool someCondition)
 {
     T? t;
     if (someCondition) t = someInitializer();
     // ...
      if (t.isNull) { // not initialized yet
         // ...
     }
     // throws if t is not initialized yet, because foo *must* return a
     // valid value by a contract
     return enforce(t);
 }

 It seems to me you've got null references there anyway?

 What would you do about:

     T[] a;
     a[i] = foo();

 where you want to have unused slots be null (or empty, or nothing)?

Easy:

T? foo(); // returns valid object or a null

T?[] a;
a[i] = foo();

Sep 26 2009

downs <default_357-line yahoo.de> writes:

Denis Koroskin wrote:
 On Sun, 27 Sep 2009 03:01:48 +0400, Walter Bright
 <newshound1 digitalmars.com> wrote:
 
 Denis Koroskin wrote:
 One more:
  T foo(bool someCondition)
 {
     T? t;
     if (someCondition) t = someInitializer();
     // ...
      if (t.isNull) { // not initialized yet
         // ...
     }
     // throws if t is not initialized yet, because foo *must* return a
     // valid value by a contract
     return enforce(t);
 }

 It seems to me you've got null references there anyway?

 What would you do about:

     T[] a;
     a[i] = foo();

 where you want to have unused slots be null (or empty, or nothing)?

 
 Easy:
 
 T? foo(); // returns valid object or a null
 
 T?[] a;
 a[i] = foo();

The case of a non-null array is, I think, worthy of some more consideration.

These are the things that would not be possible with a non-nullable array:

- newing it
- setting .length to a greater value
- appending a nullable array of the same base type.

Basically, anything that may fill it with nulls.

The only two allowed instructions would be ~= NonNullable and ~=
NonNullableArray. And it's good that way.
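
A rough sketch of such an append-only array as a library type, under the assumption that the null check is done at run time with enforce (the proposal would reject it at compile time). `NonNullArray` and `Widget` are made-up names for illustration:

```d
import std.exception : enforce;

// Hypothetical wrapper: the only way to grow it is to append a non-null element.
struct NonNullArray(T) if (is(T == class))
{
    private T[] items;

    // Supports `a ~= item`; a null element is rejected before it is stored.
    void opOpAssign(string op)(T item) if (op == "~")
    {
        enforce(item !is null, "cannot append null");
        items ~= item;
    }

    T opIndex(size_t i) { return items[i]; }

    // Read-only length: there is no setter, so the array cannot be
    // resized in a way that would fill it with nulls.
    @property size_t length() const { return items.length; }
}

class Widget
{
    int id;
    this(int id) { this.id = id; }
}

void main()
{
    NonNullArray!Widget a;
    a ~= new Widget(1);
    a ~= new Widget(2);
    assert(a.length == 2 && a[1].id == 2);

    bool rejected = false;
    try { a ~= null; }
    catch (Exception) { rejected = true; }
    assert(rejected);
    assert(a.length == 2);   // the failed append left the array unchanged
}
```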

Sep 27 2009

bearophile <bearophileHUGS lycos.com> writes:

downs:

 Basically, anything that may fill it with nulls.
 
 The only two allowed instructions would be ~= NonNullable and ~=
NonNullableArray. And it's good that way.

I agree.
In such a situation I'd also like to have a default method to insert one or more
non-null items at any point of the array (see the insert method of Python lists,
which can also be expressed as s[i:i]=[x]). Having a few basic default methods
will help keep such safe arrays flexible.

Bye,
bearophile

Sep 27 2009

Walter Bright <newshound1 digitalmars.com> writes:

Andrei Alexandrescu wrote:
 My assessment: the chances of convincing Walter he's wrong are quite 
 slim... Having a rationale for being wrong is very hard to overcome.

Especially when I'm right!

Sep 26 2009

grauzone <none example.net> writes:

Walter Bright wrote:
 It is exactly analogous to a null pointer exception. And it's darned 
 useful.

On Linux, it just generates a segfault. And then you have no idea where 
the program went wrong. dmd outputting incorrect debugging information 
(so you have troubles using gdb or even addr2line) doesn't really help here.

Not so useful.

Sep 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

grauzone wrote:
 Walter Bright wrote:
 It is exactly analogous to a null pointer exception. And it's darned 
 useful.

 
 On Linux, it just generates a segfault. And then you have no idea where 
 the program went wrong. dmd outputting incorrect debugging information 
 (so you have troubles using gdb or even addr2line) doesn't really help 
 here.

Then the problem is incorrect dwarf output, not null pointers.


 Not so useful.

It's *still* far more useful than generating corrupt output and 
pretending all is ok.

Sep 26 2009

grauzone <none example.net> writes:

Walter Bright wrote:
 grauzone wrote:
 Walter Bright wrote:
 It is exactly analogous to a null pointer exception. And it's darned 
 useful.

 On Linux, it just generates a segfault. And then you have no idea 
 where the program went wrong. dmd outputting incorrect debugging 
 information (so you have troubles using gdb or even addr2line) doesn't 
 really help here.

 
 Then the problem is incorrect dwarf output, not null pointers.

Indeed. I was just commenting on how badly the current D implementation 
handles it, and how useless the result is.

 Not so useful.

 
 It's *still* far more useful than generating corrupt output and 
 pretending all is ok.

But nobody argues in favor of that?

Sep 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

grauzone wrote:
 Walter Bright wrote:
 grauzone wrote:
 Not so useful.

 It's *still* far more useful than generating corrupt output and 
 pretending all is ok.

 
 But nobody argues in favor of that?

It's implicit in the argument that some default should be used instead. 
That's what I'm trying to point out.

Even forcing an explicit initializer doesn't actually solve the problem 
- my experience with such features is programmers simply insert any old 
value to get the code to pass the compiler, even programmers who know 
it's a bad idea do it anyway.

It's a lot like why exception-specifications are a failure. See Bruce 
Eckel's essay on it:
http://www.mindview.net/Etc/Discussions/CheckedExceptions

Sep 26 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Walter Bright wrote:
 grauzone wrote:
 Walter Bright wrote:
 grauzone wrote:
 Not so useful.

 It's *still* far more useful than generating corrupt output and 
 pretending all is ok.

 But nobody argues in favor of that?

 
 It's implicit in the argument that some default should be used instead. 
 That's what I'm trying to point out.
 
 Even forcing an explicit initializer doesn't actually solve the problem 
 - my experience with such features is programmers simply insert any old 
 value to get the code to pass the compiler, even programmers who know 
 it's a bad idea do it anyway.

I think you're starting to be wrong at the point where you don't realize 
that many bugs come from references that people have forgotten to 
initialize. Once you acknowledge those, you will start to realize that a 
reference that must compulsively be initialized is valuable.

You think from another perspective: you strongly believe that *most* of 
the time you can't or shouldn't initialize a reference. Your code in 
Phobos reflects that perspective. In the RegExp class, for example, you 
very often define a variable at the top of a long function and 
initialize it halfway through it. I trivially replaced such code with 
the correct code that defines symbols just where they're needed.


Andrei

Sep 26 2009

language_fan <foo bar.com.invalid> writes:

Sat, 26 Sep 2009 18:38:56 -0500, Andrei Alexandrescu thusly wrote:

 Your code in
 Phobos reflects that perspective. In the RegExp class, for example, you
 very often define a variable at the top of a long function and
 initialize it halfway through it. I trivially replaced such code with
 the correct code that defines symbols just where they're needed.

Maybe Walter has not yet transitioned from the good olde Pascal/C style 
programming to the C++/D/Java style?

Sep 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

language_fan wrote:
 Maybe Walter has not yet transitioned from the good olde Pascal/C style 
 programming to the C++/D/Java style?

Heh, there's still a Fortran influence in my code <g>.

Sep 26 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Walter Bright wrote:
 language_fan wrote:
 Maybe Walter has not yet transitioned from the good olde Pascal/C 
 style programming to the C++/D/Java style?

 
 Heh, there's still a Fortran influence in my code <g>.

This may be a good time to ask about how these variables which can be 
declared anywhere in the function scope are implemented.

void bar(bool foo) {
	if(foo) {
		int a = 1;
		...
	}
	else {
		int a = 2;
		...
	}

}

is the stack frame using two ints, or is the compiler seeing only one? I 
never bothered to check it out and just declared 'int a = void;' at the 
beginning of the routine to keep the stack frames as small as possible.

Sep 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

Jeremie Pelletier wrote:
 This may be a good time to ask about how these variables which can be 
 declared anywhere in the function scope are implemented.
 
 void bar(bool foo) {
     if(foo) {
         int a = 1;
         ...
     }
     else {
         int a = 2;
         ...
     }
 
 }
 
 is the stack frame using two ints, or is the compiler seeing only one? I 
 never bothered to check it out and just declared 'int a = void;' at the 
 beginning of the routine to keep the stack frames as small as possible.

They are completely independent variables. One may get assigned to a 
register, and not the other.

Sep 27 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Walter Bright wrote:
 Jeremie Pelletier wrote:
 This may be a good time to ask about how these variables which can be 
 declared anywhere in the function scope are implemented.

 void bar(bool foo) {
     if(foo) {
         int a = 1;
         ...
     }
     else {
         int a = 2;
         ...
     }

 }

 is the stack frame using two ints, or is the compiler seeing only one? 
 I never bothered to check it out and just declared 'int a = void;' at 
 the beginning of the routine to keep the stack frames as small as 
 possible.

 
 They are completely independent variables. One may get assigned to a 
 register, and not the other.

Ok, that's what I thought, so the good old C way of declaring variables 
at the top is not a bad thing yet :)

Sep 27 2009

Rainer Deyke <rainerd eldwood.com> writes:

Jeremie Pelletier wrote:
 Walter Bright wrote:
 They are completely independent variables. One may get assigned to a
 register, and not the other.

 
 Ok, that's what I thought, so the good old C way of declaring variables
 at the top is not a bad thing yet :)

Strange how you can look at the evidence and arrive at exactly the wrong
conclusion.  Declaring variables as close as possible to where they are
used can reduce stack usage, and never increases it.

-- 
Rainer Deyke - rainerd eldwood.com

Sep 27 2009

Rainer Deyke <rainerd eldwood.com> writes:

Jeremie Pelletier wrote:
 void bar(bool foo) {
     if(foo) {
         int a = 1;
         ...
     }
     else {
         int a = 2;
         ...
     }
 
 }
 
 is the stack frame using two ints, or is the compiler seeing only one? I
 never bothered to check it out and just declared 'int a = void;' at the
 beginning of the routine to keep the stack frames as small as possible.

OT, but declaring the variable at the top of the function increases
stack size.

Example with changed variable names:

  void bar(bool foo) {
    if (foo) {
      int a = 1;
    } else {
      int b = 2;
    }
    int c = 3;
  }

In this example, there are clearly three different (and differently
named) variables, but their lifetimes do not overlap.  Only one variable
can exist at a time, therefore the compiler only needs to allocate space
for one variable.  Now, if you move your declaration to the top:

  void bar(bool foo) {
    int a = void;
    if (foo) {
      a = 1;
    } else {
      a = 2; // Reuse variable.
    }
    int c = 3;
  }

You now only have two variables, but both of them coexist at the end of
the function.  Unless the compiler applies a clever optimization, the
compiler is now forced to allocate space for two variables on the stack.


-- 
Rainer Deyke - rainerd eldwood.com

Sep 27 2009

Walter Bright <newshound1 digitalmars.com> writes:

Rainer Deyke wrote:
 OT, but declaring the variable at the top of the function increases
 stack size.
 
 Example with changed variable names:
 
   void bar(bool foo) {
     if (foo) {
       int a = 1;
     } else {
       int b = 2;
     }
     int c = 3;
   }
 
 In this example, there are clearly three different (and differently
 named) variables, but their lifetimes do not overlap.  Only one variable
 can exist at a time, therefore the compiler only needs to allocate space
 for one variable.  Now, if you move your declaration to the top:
 
   void bar(bool foo) {
     int a = void;
     if (foo) {
       a = 1;
     } else {
       a = 2; // Reuse variable.
     }
     int c = 3;
   }
 
 You now only have two variables, but both of them coexist at the end of
 the function.  Unless the compiler applies a clever optimization, the
 compiler is now forced to allocate space for two variables on the stack.

Not necessarily. The optimizer uses a technique called "live range 
analysis" to determine if two variables have non-overlapping ranges. It 
uses this for register assignment, but it could just as well be used for 
minimizing stack usage.

Sep 27 2009

Rainer Deyke <rainerd eldwood.com> writes:

Walter Bright wrote:
   void bar(bool foo) {
     int a = void;
     if (foo) {
       a = 1;
     } else {
       a = 2; // Reuse variable.
     }
     int c = 3;
   }

 You now only have two variables, but both of them coexist at the end of
 the function.  Unless the compiler applies a clever optimization, the
 compiler is now forced to allocate space for two variables on the stack.

 
 Not necessarily. The optimizer uses a technique called "live range
 analysis" to determine if two variables have non-overlapping ranges. It
 uses this for register assignment, but it could just as well be used for
 minimizing stack usage.

That's the optimization I was referring to.  It works for ints, but not
for RAII types.  It also doesn't (necessarily) work if you reorder the
function:

   void bar(bool foo) {
     int a = void;
     int c = 3;
     if (foo) {
       a = 1;
     } else {
       a = 2; // Reuse variable.
     }
   }

Of course, a good optimizer can still reorder the declarations in this
case, or even eliminate the whole function body (since it doesn't do
anything).


-- 
Rainer Deyke - rainerd eldwood.com

Sep 27 2009

bearophile <bearophileHUGS lycos.com> writes:

Rainer Deyke:

 Of course, a good optimizer can still reorder the declarations in this
 case, or even eliminate the whole function body (since it doesn't do
 anything).

LLVM has a good optimizer. If you try the LLVM demo on C code with LTO
activated:
http://llvm.org/demo/index.cgi

This C code:

   void bar(int foo) {
     int a;
     int c = 3;
     if (foo) {
       a = 1;
     } else {
       a = 2;
     }
   }

Produces a useful warning:
/tmp/webcompile/_16254_0.c:3: warning: unused variable 'c'

And an empty function:

define void @bar(i32 %foo) nounwind readnone {
entry:
	ret void
}

Bye,
bearophile

Sep 27 2009

Walter Bright <newshound1 digitalmars.com> writes:

Andrei Alexandrescu wrote:
 Walter Bright wrote:
 Even forcing an explicit initializer doesn't actually solve the 
 problem - my experience with such features is programmers simply 
 insert any old value to get the code to pass the compiler, even 
 programmers who know it's a bad idea do it anyway.

 
 I think you're starting to be wrong at the point where you don't realize 
 that many bugs come from references that people have forgotten to 
 initialize. Once you acknowledge those, you will start to realize that a 
 reference that must compulsively be initialized is valuable.

The problem is it's worse to force people to provide an initializer. 
Most of the time that will work out ok, but the one silent bad value 
producing silent bad output overbalances all of it. Null pointer 
dereferences do not produce bad output that can be overlooked.

It isn't a theoretical problem with providing bad initializers just to 
shut the compiler up. I have seen it in the wild every time some manager 
required that code compile without warnings and the compiler warned 
about no initializer.

I'm very much a fan of increasing D's ability to detect and head off 
common mistakes, but it's really easy to tip into seducing programmers 
into writing bad code in order to avoid an overly nagging compiler.

There's the other problem of how to represent an "empty" value. You have 
to create a special object that then you have to either test for 
explicitly, or that has member functions that throw. You're no better 
off with that, and arguably worse off.


 You think from another perspective: you strongly believe that *most* of 
 the time you can't or shouldn't initialize a reference. Your code in 
 Phobos reflects that perspective. In the RegExp class, for example, you 
 very often define a variable at the top of a long function and 
 initialize it halfway through it. I trivially replaced such code with 
 the correct code that defines symbols just where they're needed.

That style doesn't reflect anything more than my old C habits which 
require declarations before any statements. I know it's bad style and do 
it less and less over time.

Sep 26 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Walter Bright wrote:
 Andrei Alexandrescu wrote:
 Walter Bright wrote:
 Even forcing an explicit initializer doesn't actually solve the 
 problem - my experience with such features is programmers simply 
 insert any old value to get the code to pass the compiler, even 
 programmers who know it's a bad idea do it anyway.

 I think you're starting to be wrong at the point where you don't 
 realize that many bugs come from references that people have forgotten 
 to initialize. Once you acknowledge those, you will start to realize 
 that a reference that must compulsively be initialized is valuable.

 
 The problem is it's worse to force people to provide an initializer. 

You're not forcing. You just change the default. Really, it's *exactly* 
the same deal as with = void that you're so happy about.

Andrei

Sep 26 2009

Christopher Wright <dhasenan gmail.com> writes:

Walter Bright wrote:
 Andrei Alexandrescu wrote:
 Walter Bright wrote:
 Even forcing an explicit initializer doesn't actually solve the 
 problem - my experience with such features is programmers simply 
 insert any old value to get the code to pass the compiler, even 
 programmers who know it's a bad idea do it anyway.

 I think you're starting to be wrong at the point where you don't 
 realize that many bugs come from references that people have forgotten 
 to initialize. Once you acknowledge those, you will start to realize 
 that a reference that must compulsively be initialized is valuable.

 
 The problem is it's worse to force people to provide an initializer. 

You aren't forcing them. They decide for themselves. They determine 
whether it's appropriate for a particular variable to be null.

You can achieve the same goal through contracts. However, this is much 
more verbose -- enough so that you'll only add these contracts when 
hunting down a bug. And if you have an array of things

 It isn't a theoretical problem with providing bad initializers just to 
 shut the compiler up. I have seen it in the wild every time some manager 
 required that code compile without warnings and the compiler warned 
 about no initializer.


often I get such an error? Maybe once for every 100 hours of coding. 
It's mainly for cases where I expect an integer to be initialized to 0 
and it's not. You know how often I provide a bad initializer to shut the 
compiler up? Never.


mostly because:
  - I declare variables where I use them, not beforehand.
  - I often declare variables via IDE commands -- I write the code to 
fetch or calculate a value and assign it to a variable that doesn't 
exist, and the IDE fills in the type and declares it in the correct place.
  - I usually don't have more than four or five local variables in a 
function (often no more than one or two). Out of 300KLOC, there are a 
few dozen functions that break this rule.

DMDFE functions are often long, complex, and have many local variables. 
I see how this would conflict with your coding style. You would have to 
add a few question marks for each function, and then you'd be done. 
DMDFE is ~60KLOC, but you could probably switch it over to this type 
system without structural changes to any function in a couple days.

Sep 27 2009

Yigal Chripun <yigal100 gmail.com> writes:

On 27/09/2009 00:51, Walter Bright wrote:
 grauzone wrote:
 Walter Bright wrote:
 It is exactly analogous to a null pointer exception. And it's darned
 useful.

 On Linux, it just generates a segfault. And then you have no idea
 where the program went wrong. dmd outputting incorrect debugging
 information (so you have troubles using gdb or even addr2line) doesn't
 really help here.

 Then the problem is incorrect dwarf output, not null pointers.


 Not so useful.

 It's *still* far more useful than generating corrupt output and
 pretending all is ok.

An exception trace is *far* better than a segfault and that does not 
require null values.

what's better?
a)
  auto a = new Class; // returns null (out of memory)
  a.foo = 5; // segfault since a is null

OR
b)
  auto a = new Class; // throws an out of memory exception
  a.foo = 5; // doesn't even reach here

no one here argues for option c where a holds garbage and the program 
generates corrupt output.

Sep 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

Yigal Chripun wrote:
 An exception trace is *far* better than a segfault and that does not 
 require null values.

Seg faults are exceptions, too. You can even catch them (on windows)!

Sep 26 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Walter Bright wrote:
 Yigal Chripun wrote:
 An exception trace is *far* better than a segfault and that does not 
 require null values.

 
 Seg faults are exceptions, too. You can even catch them (on windows)!

Walter, check the crash handler I submitted to D.announce, it has signal 
handlers on linux to convert segfaults into D exception objects and 
throw them so the code can unwind properly and even catch it.

It has made my life so much easier, I barely need to run within a 
debugger anymore for most crashes. I don't know enough of phobos and 
druntime to port it, but it's under a public domain license so anyone is 
free to do it!

</shameless plug>

Sep 26 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Jeremie Pelletier wrote:
 Walter Bright wrote:
 Yigal Chripun wrote:
 An exception trace is *far* better than a segfault and that does not 
 require null values.

 Seg faults are exceptions, too. You can even catch them (on windows)!

 
 Walter, check the crash handler I submitted to D.announce, it has signal 
 handlers on linux to convert segfaults into D exception objects and 
 throw them so the code can unwind properly and even catch it.
 
 It has made my life so much easier, I barely need to run within a 
 debugger anymore for most crashes. I don't know enough of phobos and 
 druntime to port it, but it's under a public domain license so anyone is 
 free to do it!
 
 </shameless plug>

I think that's great. Walter, Sean, please let's look into this.

Andrei

Sep 27 2009

Yigal Chripun <yigal100 gmail.com> writes:

On 27/09/2009 03:35, Walter Bright wrote:
 Yigal Chripun wrote:
 An exception trace is *far* better than a segfault and that does not
 require null values.

 Seg faults are exceptions, too. You can even catch them (on windows)!

No, segfaults are *NOT* exceptions. The setup you mention is Windows-only,
as Andrei said, and is irrelevant for *nix. I develop on Unix
(Solaris) and segfaults are a pain to deal with.

Furthermore, even *IF* segfaults were transformed into exceptions in D,
that still wouldn't make them proper exceptions, because true exceptions
are thrown at the place of the error, which is not true for segfaults.


T foo() {
   T t;
   ...logic
   if (error) return null;
   return t;
}

now, foo is buried deep in a lib.

user code has:

T t = someLib.foo();
... logic

t.fubar = 4; //segfault t is null

How is it better to segfault in t.fubar as opposed to throwing an exception
inside foo?
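
To make the second alternative concrete, here is a small sketch, assuming foo reports failure with std.exception.enforce instead of returning null. The `error` parameter and the message text are invented for the example:

```d
import std.exception : enforce;

class T { int fubar; }

// Library side: the failure is reported where it is detected,
// so the caller never receives a null.
T foo(bool error)
{
    enforce(!error, "foo: could not produce a T");
    return new T;
}

void main()
{
    // Success path: the returned reference is usable right away.
    T t = foo(false);
    t.fubar = 4;
    assert(t.fubar == 4);

    // Failure path: the exception originates inside foo, not at some
    // later dereference in user code.
    string where;
    try { foo(true); }
    catch (Exception e) { where = e.msg; }
    assert(where == "foo: could not produce a T");
}
```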

Sep 27 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Sun, 27 Sep 2009 01:08:32 +0400, Walter Bright  
<newshound1 digitalmars.com> wrote:

 Denis Koroskin wrote:
  > On Sat, 26 Sep 2009 22:30:58 +0400, Walter Bright
  > <newshound1 digitalmars.com> wrote:
  >> D has borrowed ideas from many different languages. The trick is to
  >> take the good stuff and avoid their mistakes <g>.
  >
  > How about this one:
  >  
 http://sadekdrobi.com/2008/12/22/null-references-the-billion-dollar-mistake/  
  >
  >
  > :)

 I think he's wrong.

 Getting rid of null references is like solving the problem of dead  
 canaries in the coal mines by replacing them with stuffed toys.

 It all depends on what you prefer a program to do when it encounters a  
 program bug:

 1. Immediately stop and produce an indication that the program failed

 2. Soldier on and silently produce garbage output

 I prefer (1).

 Consider the humble int. There is no invalid value such that referencing  
 the invalid value will cause a seg fault. One case is an uninitialized  
 int is set to garbage, and erratic results follow. Another is that (in  
 D) ints are default initialized to 0. 0 may or may not be what the logic  
 of the program requires, and if it isn't, again, silently bad results  
 follow.

 Consider also the NaN value that floats are default initialized to. This  
 has the nice characteristic of you know your results are bad if they are  
 NaN. But it has the bad characteristic that you don't know where the NaN  
 came from. Don corrected this by submitting a patch that enables the  
 program to throw an exception upon trying to use a NaN. Then, you know  
 exactly where your program went wrong.

 It is exactly analogous to a null pointer exception. And it's darned  
 useful.

I don't understand you. You say you prefer 1, but describe the path D  
currently takes, which is 2!

dchar d; // not initialized
writeln(d); // Soldier on and silently produce garbage output

I don't see at all how it is related to a non-null default.

Non-null default is all about avoiding erroneous situations, enforcing  
program correctness and stability. You solve an entire class of problem:  
NullPointerException.

Sep 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

Denis Koroskin wrote:
 I don't understand you. You say you prefer 1, but describe the path D 
 currently takes, which is 2!
 
 dchar d; // not initialized
 writeln(d); // Soldier on and silently produce garbage output

d is initialized to the "invalid" unicode bit pattern of 0xFFFF. You'll 
see this if you put a printf in. The bug here is in writeln failing to 
recognize the invalid value.

http://d.puremagic.com/issues/show_bug.cgi?id=3347

 I don't see at all how it is related to a non-null default.

Both are attempts to use invalid values.

 Non-null default is all about avoiding erroneous situations, enforcing 
 program correctness and stability. You solve an entire class of problem: 
 NullPointerException.

No, it just papers over the problem. The actual problem is the user 
failed to initialize it to a value that makes sense for his program. 
Setting it to a default value does not solve the problem.

Let's say the language is changed so that:

    int i;

is now illegal, and generates a compile time error message. What do you 
suggest the user do?

    int i = 0;

The compiler now accepts the code. But is 0 the correct value for the 
program? I guarantee you that programmers will simply insert "= 0" to 
get it to pass compilation, even if 0 is an invalid value for i for the 
logic of the program. (I guarantee it because I've seen it over and 
over, and the bugs that result.)

The point is, there really is no correct answer to the question "what 
should variables be default initialized to that will work correctly"? 
The best we can do is default initialize it to a NaN value, and then we 
can track usages of NaNs and know then that we have a program logic bug. 
A null reference is an ideal NaN value because attempts to use it will 
cause an immediate program halt with a findable indication of where the 
program logic went wrong. There's no avoiding it or pretending it didn't 
happen. There's no silently corrupt program output.
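
For reference, a short sketch of the default initializers being compared here, as I understand D's built-in types. The variable names are arbitrary:

```d
import std.math : isNaN;

void main()
{
    int i;       // default-initialized to 0: a valid value, so misuse is silent
    float f;     // default-initialized to NaN
    dchar d;     // default-initialized to 0xFFFF, an invalid code point
    Object o;    // default-initialized to null

    assert(i == 0);
    assert(isNaN(f));
    assert(d == 0xFFFF);
    assert(o is null);

    // NaN propagates through arithmetic, so a bad result stays visibly bad,
    // though by itself it does not say where the NaN first came from.
    float result = f * 2 + 1;
    assert(isNaN(result));
}
```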

Sep 26 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Sun, 27 Sep 2009 02:03:40 +0400, Walter Bright
<newshound1 digitalmars.com> wrote:

 Denis Koroskin wrote:
 I don't understand you. You say you prefer 1, but describe the path D  
 currently takes, which is 2!
  dchar d; // not initialized
 writeln(d); // Soldier on and silently produce garbage output

 d is initialized to the "invalid" unicode bit pattern of 0xFFFF. You'll  
 see this if you put a printf in. The bug here is in writeln failing to  
 recognize the invalid value.

 http://d.puremagic.com/issues/show_bug.cgi?id=3347

Change dchar to float or an int. It's still not initialized (well,
default-initialized to some garbage, which may or may not be okay for a
programmer).

 I don't see at all how it is related to a non-null default.

 Both are attempts to use invalid values.

No.

 Non-null default is all about avoiding erroneous situations, enforcing  
 program correctness and stability. You solve an entire class of  
 problem: NullPointerException.

 No, it just papers over the problem. The actual problem is the user  
 failed to initialize it to a value that makes sense for his program.  
 Setting it to a default value does not solve the problem.

 Let's say the language is changed so that:

     int i;

 is now illegal, and generates a compile time error message. What do you  
 suggest the user do?

     int i = 0;

1) We are talking about non-null *references* here.
2) I'd suggest the user initialize it to a proper value.

"int i;" is not the whole function, is it? All I say is "i" should be
initialized before accessed, and that fact should be statically enforced
by a compiler.

 The compiler now accepts the code. But is 0 the correct value for the  
 program? I guarantee you that programmers will simply insert "= 0" to  
 get it to pass compilation, even if 0 is an invalid value for i for the  
 logic of the program. (I guarantee it because I've seen it over and  
 over, and the bugs that result.)

This is absolutely irrelevant to non-null reference types. A programmer
can't write "Object o = null;" to cheat on the type system.

Sep 26 2009

Jason House <jason.james.house gmail.com> writes:

Walter Bright Wrote:

 Denis Koroskin wrote:
  > On Sat, 26 Sep 2009 22:30:58 +0400, Walter Bright
  > <newshound1 digitalmars.com> wrote:
  >> D has borrowed ideas from many different languages. The trick is to
  >> take the good stuff and avoid their mistakes <g>.
  >
  > How about this one:
  > 
 http://sadekdrobi.com/2008/12/22/null-references-the-billion-dollar-mistake/ 

  >
  >
  > :)

 I think he's wrong.

 Getting rid of null references is like solving the problem of dead 
 canaries in the coal mines by replacing them with stuffed toys.

 It all depends on what you prefer a program to do when it encounters a 
 program bug:

What do you define as a bug? Dereferencing a null pointer? Passing a null
reference into a function that does not expect it? Storing a null reference in
a variable whose later use does not expect one? Unexpectedly getting a null
back from a function? ...

You seem to be using the first definition which is meaningless to me. What I
need to know is how the null ended up where it was unexpected.

 1. Immediately stop and produce an indication that the program failed

By most definitions above, D does not do this. I have more to say, but ran out
of time to type this. I'll add more later...

Sep 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

Jason House wrote:
 Walter Bright Wrote:
 
 Denis Koroskin wrote:
 On Sat, 26 Sep 2009 22:30:58 +0400, Walter Bright 
 <newshound1 digitalmars.com> wrote:
 D has borrowed ideas from many different languages. The trick
 is to take the good stuff and avoid their mistakes <g>.

 
 How about this one:
 

 http://sadekdrobi.com/2008/12/22/null-references-the-billion-dollar-mistake/
 
 
 
 
 :)

 
 I think he's wrong.
 
 Getting rid of null references is like solving the problem of dead
  canaries in the coal mines by replacing them with stuffed toys.
 
 It all depends on what you prefer a program to do when it
 encounters a program bug:

 
 What do you define as a bug?

The program doing something it was not deliberately programmed to do.

Sep 26 2009

bearophile <bearophileHUGS lycos.com> writes:

Walter Bright:

 I used to work at Boeing designing critical flight systems. Absolutely 
 the WRONG failure mode is to pretend nothing went wrong and happily 
 return default values and show lovely green lights on the instrument 
 panel. The right thing is to immediately inform the pilot that something 
 went wrong and INSTANTLY SHUT THE BAD SYSTEM DOWN before it does 
 something really, really bad, because now it is in an unknown state. The 
 pilot then follows the procedure he's trained to, such as engage the backup.

Today we think this design is not the best one, because the pilot suddenly goes
from a situation seen as safe where the autopilot does most things, to a
situation where the pilot has to do everything. It causes panic. A human needs
time to understand the situation and act correctly. So a better solution is to
fail gracefully, giving back the control to the human in a progressive way,
with enough time to understand the situation.
Some of the things you have seen at Boeing today can be done better, there's
some progress in the design of human interfaces too. That's why I suggest you



 You could think of null exceptions like pain - sure it's unpleasant, but 
 people who feel no pain constantly injure themselves and don't live very 
 long. When I went to the dentist as a kid for the first time, he shot my 
 cheek full of novocaine. After the dental work, I went back to school. I 
 found to my amusement that if I chewed on my cheek, it didn't hurt.
 
 Boy was I sorry about that later <g>.

Oh my :-(

Bye,
bearophile

Sep 26 2009

language_fan <foo bar.com.invalid> writes:

Sat, 26 Sep 2009 19:27:51 -0400, bearophile thusly wrote:

 Some of the things you have seen at Boeing
 today can be done better, there's some progress in the design of human

 days.

That is a really good suggestion. To me it seems that several known 
language authors have experimented with various kinds of languages before 
settling down. But Walter has only done assembler/C/C++/D/Java/Pascal? 
There are so many other important languages, such as Self, Eiffel, Scala, 
Scheme, SML, Haskell, Prolog, etc. It is not by any means harmful to know 
about their internals. There are a great many CS concepts to be learned
only by studying the language cores.

Sep 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

bearophile wrote:
 Walter Bright:
 
 I used to work at Boeing designing critical flight systems.
 Absolutely the WRONG failure mode is to pretend nothing went wrong
 and happily return default values and show lovely green lights on
 the instrument panel. The right thing is to immediately inform the
 pilot that something went wrong and INSTANTLY SHUT THE BAD SYSTEM
 DOWN before it does something really, really bad, because now it is
 in an unknown state. The pilot then follows the procedure he's
 trained to, such as engage the backup.

 
 Today we think this design is not the best one, because the pilot
 suddenly goes from a situation seen as safe where the autopilot does
 most things, to a situation where the pilot has to do everything. It
 causes panic.

I've never seen any suggestion that Boeing (or Airbus, or the FAA) has 
changed its philosophy on this. Do you have a reference?

I should also point out that this strategy has been extremely 
successful. Flying is inherently dangerous, yet is statistically 
incredibly safe. Boeing is doing a LOT right, and I would be extremely 
cautious of changing the philosophy that so far has delivered 
spectacular results.

BTW, shutting off the autopilot does not cause the airplane to suddenly 
nosedive. Airliner aerodynamics are designed to be stable and to seek 
straight and level flight if the controls are not touched. Autopilots do 
shut themselves off now and then, and the pilot takes command.

Computers control a lot of systems besides the autopilot, too.


 A human needs time to understand the situation and act
 correctly. So a better solution is to fail gracefully, giving back
 the control to the human in a progressive way, with enough time to
 understand the situation. Some of the things you have seen at Boeing
 today can be done better,

Please give an example. I'll give one. How about that crash in the 
Netherlands recently where the autopilot decided to fly the airplane 
into the ground? As I recall it was getting bad data from the 
altimeters. I have a firm conviction that if there's a fault in the 
altimeters, the pilot should be informed and get control back 
immediately, as opposed to thinking about a sandwich (or whatever) while 
the autopilot soldiered on. An emergency can escalate very, very fast 
when you're going 600 mph.

There have been cases of faults in the autopilot causing abrupt, bizarre 
maneuvers. This is why the autopilot must STOP IMMEDIATELY upon any 
fault which implies that the system is in an unknown state.

Failing gracefully is done by shutting down the failed system and 
engaging a backup, not by trying to convince yourself that a program in 
an unknown state is capable of continuing to function. Software simply 
does not work that way - one bit wrong and anything can happen.


 there's some progress in the design of
 human interfaces too. That's why I suggest you to program in dotnet

Sep 26 2009

Justin Johansson <procode adam-dott-com.au> writes:

Walter Bright Wrote:

 bearophile wrote:
 Walter Bright:

 I used to work at Boeing designing critical flight systems.
 Absolutely the WRONG failure mode is to pretend nothing went wrong
 and happily return default values and show lovely green lights on
 the instrument panel. The right thing is to immediately inform the
 pilot that something went wrong and INSTANTLY SHUT THE BAD SYSTEM
 DOWN before it does something really, really bad, because now it is
 in an unknown state. The pilot then follows the procedure he's
 trained to, such as engage the backup.

 Today we think this design is not the best one, because the pilot
 suddenly goes from a situation seen as safe where the autopilot does
 most things, to a situation where the pilot has to do everything. It
 causes panic.

 I've never seen any suggestion that Boeing (or Airbus, or the FAA) has
 changed its philosophy on this. Do you have a reference?

 I should also point out that this strategy has been extremely
 successful. Flying is inherently dangerous, yet is statistically
 incredibly safe. Boeing is doing a LOT right, and I would be extremely
 cautious of changing the philosophy that so far has delivered
 spectacular results.

 BTW, shutting off the autopilot does not cause the airplane to suddenly
 nosedive. Airliner aerodynamics are designed to be stable and to seek
 straight and level flight if the controls are not touched. Autopilots do
 shut themselves off now and then, and the pilot takes command.

 Computers control a lot of systems besides the autopilot, too.

 A human needs time to understand the situation and act
 correctly. So a better solution is to fail gracefully, giving back
 the control to the human in a progressive way, with enough time to
 understand the situation. Some of the things you have seen at Boeing
 today can be done better,

 Please give an example. I'll give one. How about that crash in the
 Netherlands recently where the autopilot decided to fly the airplane
 into the ground? As I recall it was getting bad data from the
 altimeters. I have a firm conviction that if there's a fault in the
 altimeters, the pilot should be informed and get control back
 immediately, as opposed to thinking about a sandwich (or whatever) while
 the autopilot soldiered on. An emergency can escalate very, very fast
 when you're going 600 mph.

 There have been cases of faults in the autopilot causing abrupt, bizarre
 maneuvers. This is why the autopilot must STOP IMMEDIATELY upon any
 fault which implies that the system is in an unknown state.

 Failing gracefully is done by shutting down the failed system and
 engaging a backup, not by trying to convince yourself that a program in
 an unknown state is capable of continuing to function. Software simply
 does not work that way - one bit wrong and anything can happen.

 there's some progress in the design of
 human interfaces too. That's why I suggest you to program in dotnet

Re:
As I recall it was getting bad data from the
altimeters. I have a firm conviction that if there's a fault in the
altimeters, the pilot should be informed and get control back
immediately, as opposed to thinking about a sandwich (or whatever) while
the autopilot soldiered on.

Walter, in the heat of this thread I hope you haven't missed the correlation
with discussion on "Dispatching on a variant" and noting:

"Further, and worth mentioning given another raging thread on this forum at the
moment, it turns out the ensuring type-safety of my design means that NPE's are
a thing of the past (for me at least). This is due to strong static type
checking together with runtime type validation all for a pretty reasonable
cost."

http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=96847

Regards
Justin Johansson

Sep 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

Justin Johansson wrote:
 Walter, in the heat of this thread I hope you haven't missed the correlation
 with discussion on "Dispatching on a variant" and noting:

Thanks for pointing it out. The facilities in D enable one to construct 
a non-nullable type, and they are appropriate for many designs. I just 
don't see them as a replacement for *all* reference types.

Sep 26 2009

Justin Johansson <procode adam-dott-com.au> writes:

Walter Bright Wrote:

 Justin Johansson wrote:
 Walter, in the heat of this thread I hope you haven't missed the correlation
 with discussion on "Dispatching on a variant" and noting:

 
 Thanks for pointing it out. The facilities in D enable one to construct 
 a non-nullable type, and they are appropriate for many designs. I just 
 don't see them as a replacement for *all* reference types.

What you just said made me think that much of this thread is talking at
cross-purposes.

Perhaps the problem should be re-framed.

The example

T bar;
bar.foo();    // new magic in hypothetical D doesn't kill the canary just yet

is a bad example to base this discussion on.

Something like

T bar;
mar.foo( bar)

is a better example to consider.

Forgetting about reference types for a moment, consider the following
statements:

"An int type is an indiscriminate union of negativeIntegerType,
nonNegativeIntegerType, positiveIntegerType and other range-checked integer
types. Passing around ints to functions that take int arguments, unless the
full 32 bits of int is what you really mean, is akin to passing around an
indiscriminate union value, which is a no-no."

Pondering this might well shed some light and set useful context for the
overall discussion.

In other words, it's not so much an argument about calling a method on a
reference type; it's more about how to treat any type, value or reference, in a
type-safe, discriminating manner.

Just a thought (?)

Sep 26 2009

Michel Fortin <michel.fortin michelf.com> writes:

On 2009-09-26 22:07:00 -0400, Walter Bright <newshound1 digitalmars.com> said:

 [...] The facilities in D enable one to construct a non-nullable type, 
 and they are appropriate for many designs. I just don't see them as a 
 replacement for *all* reference types.

As far as I understand this thread, no one here is arguing that 
non-nullable references/pointers should replace *all* reference/pointer 
types. The argument made is that non-nullable should be the default and 
nullable can be specified explicitly any time you need it.

So if you need a reference you use "Object" as the type, and if you 
want that reference to be nullable you write "Object?". The static 
analysis can then assert that your code properly checks for null prior to
dereferencing a nullable type, and issue a compilation error if not.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Sep 26 2009

Michel Fortin <michel.fortin michelf.com> writes:

On 2009-09-26 23:28:30 -0400, Michel Fortin <michel.fortin michelf.com> said:

 On 2009-09-26 22:07:00 -0400, Walter Bright <newshound1 digitalmars.com> said:
 
 [...] The facilities in D enable one to construct a non-nullable type, 
 and they are appropriate for many designs. I just don't see them as a 
 replacement for *all* reference types.

 
 As far as I understand this thread, no one here is arguing that 
 non-nullable references/pointers should replace *all* reference/pointer 
 types. The argument made is that non-nullable should be the default and 
 nullable can be specified explicitly any time you need it.
 
 So if you need a reference you use "Object" as the type, and if you 
 want that reference to be nullable you write "Object?". The static 
 analysis can then assert that your code properly check for null prior 
 dereferencing a nullable type and issues a compilation error if not.

I just want to add: some people here are suggesting the compiler adds 
code to check for null and throw exceptions... I believe, like you, that
this is the wrong approach because, as you said, it makes people add
dummy try/catch statements to ignore the error. What you want a
programmer to do is check for null and properly handle the situation
before the error occurs, and this is exactly what the static analysis 
approach I suggest forces.

Take this example where "a" is non-nullable and "b" is nullable:

string test(Object a, Object? b)
{
	auto x = a.toString();
	auto y = b.toString();
	
	return x ~ y;
}

This should result in a compiler error on line 4 with a message telling 
you that "b" needs to be checked for null prior to use. The programmer
must then fix his error with an if (or some other control structure), 
like this:

string test(Object a, Object? b)
{
	auto result = a.toString();
	if (b)
		result ~= b.toString();

	return result;
}

And now the compiler will let it pass. This is what I'd like to see. 
What do you think?

I'm not totally against throwing exceptions in some cases, but the 
above approach would be much more useful. Unfortunately, throwing
exceptions is the best you can do with a library type approach.
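
For comparison, a library type approach might look roughly like the sketch below. It is only an illustration under stated assumptions: NonNull and Widget are hypothetical names, not Phobos types, and the null check happens at run time when the wrapper is constructed, not at compile time.

```d
import std.exception : enforce;

// Hypothetical library wrapper for illustration; NonNull is not a Phobos type.
struct NonNull(T) if (is(T == class))
{
    private T payload;

    @disable this(); // forbid default construction, which would leave payload null

    this(T value)
    {
        // Run-time check: a null argument throws an Exception here.
        enforce(value !is null, "NonNull constructed from a null reference");
        payload = value;
    }

    T get() { return payload; }
    alias get this; // lets a NonNull!T be used where a T is expected
}

// Hypothetical class used only to exercise the wrapper.
class Widget
{
    override string toString() { return "Widget"; }
}

void main()
{
    auto w = NonNull!Widget(new Widget);
    assert(w.toString() == "Widget"); // forwarded through alias this

    bool threw = false;
    try
    {
        auto bad = NonNull!Widget(null);
    }
    catch (Exception e)
    {
        threw = true; // the mistake is reported only when this line runs
    }
    assert(threw);
}
```

Note that this wrapper cannot make the compiler reject a missing null check, and code can still get around it (for example through NonNull!Widget.init), which is the gap the static analysis approach is meant to close.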

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Sep 26 2009

Yigal Chripun <yigal100 gmail.com> writes:

On 27/09/2009 05:45, Michel Fortin wrote:
 On 2009-09-26 23:28:30 -0400, Michel Fortin <michel.fortin michelf.com>
 said:

 On 2009-09-26 22:07:00 -0400, Walter Bright
 <newshound1 digitalmars.com> said:

 [...] The facilities in D enable one to construct a non-nullable
 type, and they are appropriate for many designs. I just don't see
 them as a replacement for *all* reference types.

 As far as I understand this thread, no one here is arguing that
 non-nullable references/pointers should replace *all*
 reference/pointer types. The argument made is that non-nullable should
 be the default and nullable can be specified explicitly any time you
 need it.

 So if you need a reference you use "Object" as the type, and if you
 want that reference to be nullable you write "Object?". The static
 analysis can then assert that your code properly check for null prior
 dereferencing a nullable type and issues a compilation error if not.

 I just want to add: some people here are suggesting the compiler adds
 code to check for null and throw exceptions... I believe like you that
 this is the wrong approach because, like you said, it makes people add
 dummy try/catch statements to ignore the error. What you want a
 programmer to do is check for null and properly handle the situation
 before the error occurs, and this is exactly what the static analysis
 approach I suggest forces.

 Take this example where "a" is non-nullable and "b" is nullable:

 string test(Object a, Object? b)
 {
 auto x = a.toString();
 auto y = b.toString();

 return x ~ y;
 }

 This should result in a compiler error on line 4 with a message telling
 you that "b" needs to be checked for null prior use. The programmer must
 then fix his error with an if (or some other control structure), like this:

 string test(Object a, Object? b)
 {
 auto result = a.toString();
 if (b)
 result ~= b.toString();

 return result;
 }

 And now the compiler will let it pass. This is what I'd like to see.
 What do you think?

 I'm not totally against throwing exceptions in some cases, but the above
 approach would be much more useful. Unfortunately, throwing exceptions is
 the best you can do with a library type approach.

If you refer to my posts then I want to clarify:
I fully agree with you that in the above this can and should be 
compile-time checked. This is a stricter approach and might seem 
annoying to some programmers but is far safer.
Using non-null references by default will also restrict writing these 
checks to only the places where it is actually needed.

In my posts I was simply answering Walter's claim.
Walter was saying that returning null is a valid and in fact better way 
to indicate errors instead of returning some "default" value which will 
cause the program to generate bad output.
My response to that was that if there's an error, the function should 
instead throw an exception which provides more information and better 
error handling.

null is a bad way to indicate errors precisely because of the point you
make - the compiler does not force the programmer to explicitly handle
the null case, unlike the Option type in FP languages.

Sep 27 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Michel Fortin wrote:
 On 2009-09-26 23:28:30 -0400, Michel Fortin <michel.fortin michelf.com> 
 said:
 
 On 2009-09-26 22:07:00 -0400, Walter Bright 
 <newshound1 digitalmars.com> said:

 [...] The facilities in D enable one to construct a non-nullable 
 type, and they are appropriate for many designs. I just don't see 
 them as a replacement for *all* reference types.

 As far as I understand this thread, no one here is arguing that 
 non-nullable references/pointers should replace *all* 
 reference/pointer types. The argument made is that non-nullable should 
 be the default and nullable can be specified explicitly any time you 
 need it.

 So if you need a reference you use "Object" as the type, and if you 
 want that reference to be nullable you write "Object?". The static 
 analysis can then assert that your code properly check for null prior 
 dereferencing a nullable type and issues a compilation error if not.

 
 I just want to add: some people here are suggesting the compiler adds 
 code to check for null and throw exceptions... I believe like you that 
 this is the wrong approach because, like you said, it makes people add 
 dummy try/catch statements to ignore the error. What you want a 
 programmer to do is check for null and properly handle the situation 
 before the error occurs, and this is exactly what the static analysis 
 approach I suggest forces.
 
 Take this example where "a" is non-nullable and "b" is nullable:
 
 string test(Object a, Object? b)
 {
     auto x = a.toString();
     auto y = b.toString();
     
     return x ~ y;
 }
 
 This should result in a compiler error on line 4 with a message telling 
 you that "b" needs to be checked for null prior use. The programmer must 
 then fix his error with an if (or some other control structure), like this:
 
 string test(Object a, Object? b)
 {
     auto result = a.toString();
     if (b)
         result ~= b.toString();
 
     return result;
 }
 
 And now the compiler will let it pass. This is what I'd like to see. 
 What do you think?
 
 I'm not totally against throwing exceptions in some cases, but the above 
 approach would be much more useful. Unfortunately, throwing exceptions is
 the best you can do with a library type approach.
 

I don't think this would fly. One good thing about nullable references 
is that they are dynamically checked for validity at virtually zero 
cost. Non-nullable references, therefore, would not add value in that 
respect, but would add value by reducing the cases when programmers 
forgot to initialize references properly.

Andrei

Sep 27 2009

bearophile <bearophileHUGS lycos.com> writes:

Andrei Alexandrescu:

 One good thing about nullable references 
 is that they are dynamically checked for validity at virtually zero 
 cost. Non-nullable references, therefore, would not add value in that 
 respect, but would add value by reducing the cases when programmers 
 forgot to initialize references properly.

Non-nullable references can also reduce the total amount of code a little,
because you don't need to write the null tests often (the points where you use
objects are more than the points where you instantiate them).

Bye,
bearophile

Sep 27 2009

Michel Fortin <michel.fortin michelf.com> writes:

On 2009-09-27 09:41:03 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Michel Fortin wrote:
 On 2009-09-26 23:28:30 -0400, Michel Fortin <michel.fortin michelf.com> said:
 
 On 2009-09-26 22:07:00 -0400, Walter Bright <newshound1 digitalmars.com> said:
 
 [...] The facilities in D enable one to construct a non-nullable type, 
 and they are appropriate for many designs. I just don't see them as a 
 replacement for *all* reference types.

 
 As far as I understand this thread, no one here is arguing that 
 non-nullable references/pointers should replace *all* reference/pointer 
 types. The argument made is that non-nullable should be the default and 
 nullable can be specified explicitly any time you need it.
 
 So if you need a reference you use "Object" as the type, and if you 
 want that reference to be nullable you write "Object?". The static 
 analysis can then assert that your code properly check for null prior 
 dereferencing a nullable type and issues a compilation error if not.

 
 I just want to add: some people here are suggesting the compiler adds 
 code to check for null and throw exceptions... I believe like you that 
 this is the wrong approach because, like you said, it makes people add 
 dummy try/catch statements to ignore the error. What you want a 
 programmer to do is check for null and properly handle the situation 
 before the error occurs, and this is exactly what the static analysis 
 approach I suggest forces.
 
 Take this example where "a" is non-nullable and "b" is nullable:
 
 string test(Object a, Object? b)
 {
     auto x = a.toString();
     auto y = b.toString();
     return x ~ y;
 }
 
 This should result in a compiler error on line 4 with a message telling 
 you that "b" needs to be checked for null prior use. The programmer 
 must then fix his error with an if (or some other control structure), 
 like this:
 
 string test(Object a, Object? b)
 {
     auto result = a.toString();
     if (b)
         result ~= b.toString();
 
     return result;
 }
 
 And now the compiler will let it pass. This is what I'd like to see. 
 What do you think?
 
 I'm not totally against throwing exceptions in some cases, but the 
 above approach would be much more useful. Unfortunately, throwing
 exceptions is the best you can do with a library type approach.

 
 I don't think this would fly.

You want me to add wings? Please explain.

 One good thing about nullable references is that they are dynamically 
 checked for validity at virtually zero cost.

When you say they are dynamically checked, do you mean it throws an 
exception when you assign null? I'm not totally against this idea, but 
I find the above a superior solution because it forces the programmer 
to handle the problem where it occurs and it doesn't require any 
runtime check.

 Non-nullable references, therefore, would not add value in that 
 respect, but would add value by reducing the cases when programmers 
 forgot to initialize references properly.

To me it looks like you're supporting an inferior concept for 
non-nullable references because the better one will not "fly" (whatever 
that means). Well, I support both concepts of non-nullable because they 
both can be very useful, but I believe static checking is a better way 
than throwing exceptions.


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Sep 27 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Michel Fortin wrote:
 On 2009-09-27 09:41:03 -0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> said:
 
 Michel Fortin wrote:
 On 2009-09-26 23:28:30 -0400, Michel Fortin 
 <michel.fortin michelf.com> said:

 On 2009-09-26 22:07:00 -0400, Walter Bright 
 <newshound1 digitalmars.com> said:

 [...] The facilities in D enable one to construct a non-nullable 
 type, and they are appropriate for many designs. I just don't see 
 them as a replacement for *all* reference types.

 As far as I understand this thread, no one here is arguing that 
 non-nullable references/pointers should replace *all* 
 reference/pointer types. The argument made is that non-nullable 
 should be the default and nullable can be specified explicitly any 
 time you need it.

 So if you need a reference you use "Object" as the type, and if you 
 want that reference to be nullable you write "Object?". The static 
 analysis can then assert that your code properly check for null 
 prior dereferencing a nullable type and issues a compilation error 
 if not.

 I just want to add: some people here are suggesting the compiler adds 
 code to check for null and throw exceptions... I believe like you 
 that this is the wrong approach because, like you said, it makes 
 people add dummy try/catch statements to ignore the error. What you 
 want a programmer to do is check for null and properly handle the 
 situation before the error occurs, and this is exactly what the 
 static analysis approach I suggest forces.

 Take this example where "a" is non-nullable and "b" is nullable:

 string test(Object a, Object? b)
 {
     auto x = a.toString();
     auto y = b.toString();
     return x ~ y;
 }

 This should result in a compiler error on line 4 with a message 
 telling you that "b" needs to be checked for null prior use. The 
 programmer must then fix his error with an if (or some other control 
 structure), like this:

 string test(Object a, Object? b)
 {
     auto result = a.toString();
     if (b)
         result ~= b.toString();

     return result;
 }

 And now the compiler will let it pass. This is what I'd like to see. 
 What do you think?

 I'm not totally against throwing exceptions in some cases, but the 
 above approach would be much more useful. Unfortunately, throwing
 exceptions is the best you can do with a library type approach.

 I don't think this would fly.

 
 You want me to add wings? Please explain.

I did explain. You suggest that we replace an automated, no-cost 
checking with a manual, compulsory, conservative, and costly scheme. 
That pretty much summarizes its disadvantages too :o).

Andrei

Sep 27 2009

Christopher Wright <dhasenan gmail.com> writes:

Michel Fortin wrote:
 On 2009-09-26 22:07:00 -0400, Walter Bright <newshound1 digitalmars.com> 
 said:
 
 [...] The facilities in D enable one to construct a non-nullable type, 
 and they are appropriate for many designs. I just don't see them as a 
 replacement for *all* reference types.

 
 As far as I understand this thread, no one here is arguing that 
 non-nullable references/pointers should replace *all* reference/pointer 
 types. The argument made is that non-nullable should be the default and 
 nullable can be specified explicitly any time you need it.
 
 So if you need a reference you use "Object" as the type, and if you want 
 that reference to be nullable you write "Object?". The static analysis 
 can then assert that your code properly check for null prior 
 dereferencing a nullable type and issues a compilation error if not.

I dislike these forced checks.

Let's say you're dealing with a compiler frontend. You have a semantic 
node that just went through some semantic pass and is guaranteed, by 
flow control and contracts, to have a certain property initialized that 
was not initialized prior to that point.

The programmer knows the value isn't null. The compiler shouldn't force 
checks. At most, it should have automated checks that disappear with 
-release.

Also, it introduces more nesting.

Also, unless the compiler's flow analysis is great, it's a nuisance -- 
you can see that the error is bogus and have to insert extra checks.

It should be fine to provide a requireNotNull template and leave it at that.
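
A minimal sketch of what such a helper could look like follows. The name requireNotNull comes from the sentence above; the body, the Node class and its type field are assumptions made purely for illustration. Because the check is an ordinary assert, it is compiled out when building with -release.

```d
// Hypothetical helper: the name is from the post, the body is an assumption.
T requireNotNull(T)(T value) if (is(T == class) || is(T == interface))
{
    // An ordinary assert: checked in normal builds, removed with -release.
    assert(value !is null, "requireNotNull: reference was null");
    return value;
}

// Hypothetical semantic node, standing in for a compiler frontend structure.
class Node
{
    Node type; // stays null until a semantic pass fills it in
}

void main()
{
    auto n = new Node;
    n.type = new Node;               // stands in for the semantic pass
    Node t = requireNotNull(n.type); // no extra if-nesting at the use site
    assert(t is n.type);
}
```

The trade-off is the one argued over in this thread: the check runs only when the code path executes, so a forgotten call is not caught at compile time.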

Sep 27 2009

Michel Fortin <michel.fortin michelf.com> writes:

On 2009-09-27 07:38:59 -0400, Christopher Wright <dhasenan gmail.com> said:

 I dislike these forced checks.
 
 Let's say you're dealing with a compiler frontend. You have a semantic 
 node that just went through some semantic pass and is guaranteed, by 
 flow control and contracts, to have a certain property initialized that 
 was not initialized prior to that point.
 
 The programmer knows the value isn't null. The compiler shouldn't force 
 checks. At most, it should have automated checks that disappear with 
 -release.

If the programmer knows a value isn't null, why not put the value in a 
non-nullable reference in the first place?


 Also, it introduces more nesting.

Yes and no. It introduces an "if" statement for null checking, but only 
for nullable references. If you know your reference can't be null it 
should be non-nullable, and then you don't need to check.


 Also, unless the compiler's flow analysis is great, it's a nuisance -- 
 you can see that the error is bogus and have to insert extra checks.

First, you're right: if the feature is implemented it should be well
implemented. Second, if in a few places you don't want an "if" clause,
you can always cast your nullable reference to a non-nullable one,
explicitly bypassing the safeties. If you write a cast, you are making
a conscious decision not to check for null, which is much better than
the current situation where it's very easy to forget to check for null.


 It should be fine to provide a requireNotNull template and leave it at that.

It's fine to have such a template. But it's not nearly as useful.


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Sep 27 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Michel Fortin wrote:
 On 2009-09-27 07:38:59 -0400, Christopher Wright <dhasenan gmail.com> said:
 
 I dislike these forced checks.

 Let's say you're dealing with a compiler frontend. You have a semantic 
 node that just went through some semantic pass and is guaranteed, by 
 flow control and contracts, to have a certain property initialized 
 that was not initialized prior to that point.

 The programmer knows the value isn't null. The compiler shouldn't 
 force checks. At most, it should have automated checks that disappear 
 with -release.

 
 If the programmer knows a value isn't null, why not put the value in a 
 non-nullable reference in the first place?

It may not be nonnull for the entire lifetime of the reference.

 Also, it introduces more nesting.

 
 Yes and no. It introduces an "if" statement for null checking, but only 
 for nullable references. If you know your reference can't be null it 
 should be non-nullable, and then you don't need to check.

I much prefer explicit null checks to implicit ones I can't control.

 Also, unless the compiler's flow analysis is great, it's a nuisance -- 
 you can see that the error is bogus and have to insert extra checks.

 
 First you're right, if the feature is implemented it should be well 
 implemented. Second, if in a few places you don't want an "if" clause, 
 you can always cast your nullable reference to a non-nullable one, 
 explicitly bypassing the safeties. If you write a cast, you are making a 
 conscious decision of not checking for null, which is much better than 
 the current situation where it's very easy to forget to check for null.

That's just adding useless verbosity to the language.

 It should be fine to provide a requireNotNull template and leave it at 
 that.

 
 It's fine to have such a template. But it's not nearly as useful.

It definitely is, the whole point is about reference initializations, 
not what they can or can't initialize to.

What about non-nan floats? Or non-invalid characters? I fear nonnull 
references are a first step in the wrong direction. The focus should be 
on implementing variable initialization checks in the compiler, since 
this solves the issue with any variable, not just references. The flow 
analysis can also be reused for many other optimizations.

Sep 27 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Sun, Sep 27, 2009 at 2:07 PM, Jeremie Pelletier <jeremiep gmail.com> wrote:

 Yes and no. It introduces an "if" statement for null checking, but only
 for nullable references. If you know your reference can't be null it should
 be non-nullable, and then you don't need to check.

 I much prefer explicit null checks to implicit ones I can't control.

Nonnull types do not create implicit null checks. Nonnull types DO NOT
need to be checked. And nullable types WOULD force explicit null
checks.

 What about non-nan floats? Or non-invalid characters? I fear nonnull
 references are a first step in the wrong direction. The focus should be
 on implementing variable initialization checks in the compiler, since
 this solves the issue with any variable, not just references. The flow
 analysis can also be reused for many other optimizations.

hash_t foo(Object o) { return o.toHash(); }
foo(null); // bamf, I just killed your function.

Forcing initialization of locals does NOT solve all the problems that
nonnull references would.

Sep 27 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Jarrett Billingsley wrote:
 On Sun, Sep 27, 2009 at 2:07 PM, Jeremie Pelletier <jeremiep gmail.com> wrote:
 
 Yes and no. It introduces an "if" statement for null checking, but only
 for nullable references. If you know your reference can't be null it should
 be non-nullable, and then you don't need to check.

 I much prefer explicit null checks to implicit ones I can't control.

 
 Nonnull types do not create implicit null checks. Nonnull types DO NOT
 need to be checked. And nullable types WOULD force explicit null
 checks.

Forcing checks on nullables is just as bad, not all nullables need to be 
checked every time they're used.

 What about non-nan floats? Or non-invalid characters? I fear nonnull
 references are a first step in the wrong direction. The focus should be
 on implementing variable initialization checks in the compiler, since
 this solves the issue with any variable, not just references. The flow
 analysis can also be reused for many other optimizations.

 
 hash_t foo(Object o) { return o.toHash(); }
 foo(null); // bamf, I just killed your function.
 
 Forcing initialization of locals does NOT solve all the problems that
 nonnull references would.

You didn't kill my function, you shot yourself in the foot. Something 
trivial to debug.

Sep 27 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Sun, Sep 27, 2009 at 3:42 PM, Jeremie Pelletier <jeremiep gmail.com> wrote:
 Jarrett Billingsley wrote:
 Nonnull types do not create implicit null checks. Nonnull types DO NOT
 need to be checked. And nullable types WOULD force explicit null
 checks.

 Forcing checks on nullables is just as bad, not all nullables need to be
 checked every time they're used.

You don't get it, do you. If you have a reference that doesn't need to
be checked every time it's used, you make it a *nonnull reference*.
You *only* use nullable variables for references where the nullness of
the reference should change the program logic.

And if you're talking about things like:

Foo? x = someFunc();

if(x is null)
{
    // one path
}
else
{
    // use x here
}

and you're expecting the "use x here" clause to force you to do
(cast(Foo)x) every time you want to use x? That's not the case. The
condition of the if has *proven* x to be nonnull in the else clause,
so no null checks - at compile time or at runtime - have to be
performed there, nor does it have to be cast to a nonnull reference.

And if you have a nullable reference that you know is not null for the
rest of the function? Just put "assert(x !is null)" and everything
that follows will assume it's not null.

 hash_t foo(Object o) { return o.toHash(); }
 foo(null); // bamf, I just killed your function.

 Forcing initialization of locals does NOT solve all the problems that
 nonnull references would.

 You didn't kill my function, you shot yourself in the foot. Something
 trivial to debug.

You're dodging. You claim that forcing variable initialization solves
the same problem that nonnull references do. It doesn't.

Sep 27 2009
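
D has no `Foo?` type, so the narrowing described in the post above cannot be written literally. The closest existing idiom is to declare the variable in the `if` condition. The following is a minimal sketch of that idiom, not the proposed feature; `Foo`, `someFunc` and `describe` are hypothetical names used only for illustration.

```d
class Foo
{
    int value() { return 1; }
}

// Hypothetical factory that may or may not produce an object.
Foo someFunc(bool give)
{
    return give ? new Foo : null;
}

int describe(bool give)
{
    // x is declared only for the branch where the reference is non-null.
    if (auto x = someFunc(give))
    {
        return x.value();
    }
    else
    {
        // x is not in scope here, so it cannot be dereferenced by accident.
        return -1;
    }
}

void main()
{
    assert(describe(true) == 1);
    assert(describe(false) == -1);
}
```

This narrows by scope rather than by type: nothing stops `x` from being passed on to code that receives it as a plain, possibly-null `Foo`, which is the gap the nonnull proposal is meant to close.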

bearophile <bearophileHUGS lycos.com> writes:

Jarrett Billingsley:

 And if you have a nullable reference that you know is not null for the
 rest of the function? Just put "assert(x !is null)" and everything
 that follows will assume it's not null.

Asserts tend to vanish in release mode, so it may be better to use something
different. A possibility is to use the enforce() some people have shown here.

Another possibility is the very strange assume() of Visual C++, that I may
appreciate for other purposes too:
http://msdn.microsoft.com/en-us/library/1b3fsfxw(loband).aspx

Bye,
bearophile

Sep 27 2009
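
The difference between `assert` and `enforce` mentioned above can be sketched as follows, assuming Phobos' `std.exception.enforce`. `Foo` and `useChecked` are hypothetical names. An `assert` is compiled out with -release, while `enforce` stays in and throws an `Exception`.

```d
import std.exception : enforce, collectException;

class Foo
{
    int value() { return 7; }
}

int useChecked(Foo x)
{
    // Unlike assert, this check is still performed in -release builds.
    enforce(x !is null, "x must not be null");
    return x.value();
}

void main()
{
    assert(useChecked(new Foo) == 7);

    // collectException returns the thrown exception, or null if none was thrown.
    auto e = collectException(useChecked(null));
    assert(e !is null);
}
```

The check still happens at run time; it only guarantees that the failure is reported at the point of the check rather than at a later dereference.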

bearophile <bearophileHUGS lycos.com> writes:

Jeremie Pelletier:

 The focus should be 
 on implementing variable initialization checks in the compiler, since 
 this solves the issue with any variable, not just references. The flow 
 analysis can also be reused for many other optimizations.

Are you willing to give your help to implement about 5-10% of this feature? :-)

Bye,
bearophile

Sep 27 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

bearophile wrote:
 Jeremie Pelletier:
 
 The focus should be 
 on implementing variable initialization checks in the compiler, since 
 this solves the issue with any variable, not just references. The flow 
 analysis can also be reused for many other optimizations.

 
 Are you willing to give your help to implement about 5-10% of this feature? :-)
 
 Bye,
 bearophile

Sure, I would love to help implement flow analysis, I don't know enough 
of the current dmd semantic analysis internals yet, but I'm slowly 
getting there.

Jeremie

Sep 27 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Walter Bright wrote:
 Justin Johansson wrote:
 Walter, in the heat of this thread I hope you haven't missed the 
 correlation with discussion
 on "Dispatching on a variant" and noting:

 
 Thanks for pointing it out. The facilities in D enable one to construct 
 a non-nullable type, and they are appropriate for many designs. 

No. There is no means to disable default construction.

 I just 
 don't see them as a replacement for *all* reference types.

Non-nullable references should be the default.


Andrei

Sep 26 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Andrei Alexandrescu wrote:
 Walter Bright wrote:
 Justin Johansson wrote:
 Walter, in the heat of this thread I hope you haven't missed the 
 correlation with discussion
 on "Dispatching on a variant" and noting:

 Thanks for pointing it out. The facilities in D enable one to 
 construct a non-nullable type, and they are appropriate for many designs. 

 
 No. There is no means to disable default construction.
 
 I just don't see them as a replacement for *all* reference types.

 
 Non-nullable references should be the default.
 
 
 Andrei

Like I said in another post of this thread, I believe the issue here is 
more over initializer semantics than null/non-null references. This is 
what's causing most of the errors anyways.

Can't the compiler just throw a warning if a variable is used before 
initialization, and allow "= null" to bypass this ("= void" would still 
be considered uninitialized)? The same goes for fields.

It would be much more convenient than new type variants, both to 
implement and to use.

It could even be used for any type, the default initializer in D is a 
cute idea, but not a performance friendly one. I would much prefer the 
compiler to allow "int a" but warn me if I use it before assigning 
anything to it than assigning it to zero, and then assigning it to the 
value I wanted. "= void" is nice but I'm pretty sure I'm way over a 
thousand uses of it so far.

Jeremie

Sep 26 2009
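
For reference, the default initializers under discussion look like this in D. This is a small sketch; `Foo` is a hypothetical class used only for illustration.

```d
import std.math : isNaN;

class Foo {}

void main()
{
    int i;     // int.init is 0
    float f;   // float.init is NaN
    char c;    // char.init is 0xFF, an invalid UTF-8 code unit
    Foo o;     // class references default to null

    assert(i == 0);
    assert(isNaN(f));
    assert(c == 0xFF);
    assert(o is null);

    // "= void" skips default initialization entirely; the value is
    // undefined until something is assigned to it.
    int raw = void;
    raw = 5;
    assert(raw == 5);
}
```

Floats and chars default to deliberately invalid values, while `int` has no invalid value available and defaults to zero.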

Walter Bright <newshound1 digitalmars.com> writes:

Jeremie Pelletier wrote:
 It could even be used for any type, the default initializer in D is a 
 cute idea, but not a performance friendly one. I would much prefer the 
 compiler to allow "int a" but warn me if I use it before assigning 
 anything to it than assigning it to zero, and then assigning it to the 
 value I wanted. "= void" is nice but I'm pretty sure I'm way over a 
 thousand uses of it so far.

The compiler, when -O is used, should remove nearly all the redundant 
initializations.

Sep 26 2009

BCS <none anon.com> writes:

Hello Walter,

 Jeremie Pelletier wrote:
 
 It could even be used for any type, the default initializer in D is a
 cute idea, but not a performance friendly one. I would much prefer
 the compiler to allow "int a" but warn me if I use it before
 assigning anything to it than assigning it to zero, and then
 assigning it to the value I wanted. "= void" is nice but I'm pretty
 sure I'm way over a thousand uses of it so far.
 

 The compiler, when -O is used, should remove nearly all the redundant
 initializations.
 

Sweet, so you already have a bunch of the logic needed to make sure 
non-null references get initialized.

Sep 27 2009

Walter Bright <newshound1 digitalmars.com> writes:

Andrei Alexandrescu wrote:
 Walter Bright wrote:
 Justin Johansson wrote:
 Walter, in the heat of this thread I hope you haven't missed the 
 correlation with discussion
 on "Dispatching on a variant" and noting:

 Thanks for pointing it out. The facilities in D enable one to 
 construct a non-nullable type, and they are appropriate for many designs. 

 
 No. There is no means to disable default construction.

Ack, I remember we talked about this, I guess I don't remember the 
resolution.

Sep 26 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Walter Bright wrote:
 Andrei Alexandrescu wrote:
 Walter Bright wrote:
 Justin Johansson wrote:
 Walter, in the heat of this thread I hope you haven't missed the 
 correlation with discussion
 on "Dispatching on a variant" and noting:

 Thanks for pointing it out. The facilities in D enable one to 
 construct a non-nullable type, and they are appropriate for many 
 designs. 

 No. There is no means to disable default construction.

 
 Ack, I remember we talked about this, I guess I don't remember the 
 resolution.

The resolution was that the language will allow delete'ing the unwanted 
constructor:

struct NonNull(T) if (is(T == class))
{
     delete this();
     ...
}


Andrei

Sep 27 2009
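
The `delete this();` spelling above is the syntax as proposed in this thread. In later D2 compilers the same idea is written `@disable this();`. The following is a minimal sketch of a library-level wrapper under that assumption; the members `payload` and `get`, the run-time `enforce` check and the class `Foo` are illustrative choices, not the design discussed by the posters.

```d
import std.exception : enforce, collectException;

struct NonNull(T) if (is(T == class))
{
    private T payload;

    @disable this();   // no default construction, so no implicit null

    this(T value)
    {
        // Run-time check at the single point where a null could enter.
        enforce(value !is null, "NonNull constructed from null");
        payload = value;
    }

    inout(T) get() inout { return payload; }
    alias get this;    // lets the wrapper be used like the wrapped class
}

class Foo
{
    int value() { return 42; }
}

void main()
{
    auto a = NonNull!Foo(new Foo);
    assert(a.value() == 42);

    // Declaring one without an initializer is rejected at compile time.
    static assert(!__traits(compiles, { NonNull!Foo b; }));

    // Passing null is caught when the wrapper is constructed.
    assert(collectException(NonNull!Foo(null)) !is null);
}
```

As far as I know, `NonNull!Foo.init` and `= void` can still produce a wrapper holding null, so this is a convention enforced at construction, not a guarantee from the type system.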

Christopher Wright <dhasenan gmail.com> writes:

Andrei Alexandrescu wrote:
 Walter Bright wrote:
 Justin Johansson wrote:
 Walter, in the heat of this thread I hope you haven't missed the 
 correlation with discussion
 on "Dispatching on a variant" and noting:

 Thanks for pointing it out. The facilities in D enable one to 
 construct a non-nullable type, and they are appropriate for many designs. 

 
 No. There is no means to disable default construction.

I looked into this slightly. You'd have to mark non-nullable fields 
as requiring ctor initialization, prevent reallocating arrays of 
non-nullables, and a few other things. At the time I wasn't considering 
struct constructors; without them, you'd have to forbid structs that 
contain non-nullable fields, but with them, it's okay.

 I just don't see them as a replacement for *all* reference types.

 
 Non-nullable references should be the default.
 
 
 Andrei

Sep 27 2009

Yigal Chripun <yigal100 gmail.com> writes:

On 27/09/2009 04:07, Walter Bright wrote:
 Justin Johansson wrote:
 Walter, in the heat of this thread I hope you haven't missed the
 correlation with discussion
 on "Dispatching on a variant" and noting:

 Thanks for pointing it out. The facilities in D enable one to construct
 a non-nullable type, and they are appropriate for many designs. I just
 don't see them as a replacement for *all* reference types.

No one was claiming that.

to reiterate -  non-null references are *not* a replacement for *all* 
reference types, they are just a better, safer *default*.

You can use nullable references when needed (that's what all the T? code 
snippets are about); it just isn't the default.

Sep 26 2009

bearophile <bearophileHUGS lycos.com> writes:

Walter, I can already see a lot of confusion in this thread. Let's think well
about this topic, because it's almost as important as implementing good
multiprocessing in D2. It's among the most important things that may happen to
D (and it's also a breaking change for D2, so it can't be implemented later),
and it's one of the few things where D has a chance to become a little better


I see a lot of failed communication about this topic, so someone may have to
write a text that explains the basics and avoid misunderstandings about basic
things. Maybe even a DEP. I think Andrei has understood this topic, so he may
write it :-) Otherwise I can build a Wiki page...

(A possibility is to implement this idea in a branch of D2, test it for a month
or so, and then if it's seen as good it can be kept. The main problem of this
idea is that implementing such changes requires some work, and wasting time
isn't nice).

Bye,
bearophile

Sep 26 2009

Daniel Keep <daniel.keep.lists gmail.com> writes:

Walter Bright wrote:
 ...
 
 It all depends on what you prefer a program to do when it encounters a
 program bug:
 
 1. Immediately stop and produce an indication that the program failed
 
 2. Soldier on and silently produce garbage output
 
 I prefer (1).
 
 ...

*sigh*  Walter, I really admire you as a programmer.  But this is about
the most blatant strawman argument I've seen in a while.

Firstly, as others have indicated, the whole point of non-null would be
to REQUIRE initialisation to something useful.

"But the user will just assign to something useless to get around that!"

You mean like how everyone wraps every call in try{...}catch(Exception
e){} to shut the damn exceptions up?  Or uses pointer arithmetic and
casts to get at those pesky private members?

If someone is actively trying to break the type system, it's their
goddamn fault!  Honestly, I don't care about the hacks they employ to
defeat the system because they're going to go around blindly shooting
themselves in the foot no matter what they do.

It's like arguing that safety rails are pointless because people can
just jump over them.  BESIDES, if they fall off, you get this really
loud "crunch" followed by a shower of blood; then it's OBVIOUS that
something's wrong.

And what about the people who AREN'T complete idiots, who maybe
sometimes just accidentally trip and would quite welcome a safety rail
there?

Besides which, the non-null proposal is always accompanied by the
proposal to add nullable object references as T? (or whatever; the
syntax is irrelevant at this point).  If a programmer really wants a
null-initialised object reference, which is she more likely to do?

  class NullFoo : Foo
  {
    void member1() { throw new Exception("NULL!"); }
    void member2() { throw new Exception("NULL!"); }
    ...
  }

  Foo bar = new NullFoo;

or

  Foo? bar;

Since the reason they're trying to circumvent the non-null protection is
because of laziness, I assert they're far more likely to go with the
second than the first.

And it's STILL better because you couldn't implicitly cast between Foo?
and Foo.  They would HAVE to insert an explicit cast or check.

  Foo quxx = enforceNN(bar);

Finally, let me re-post something I wrote the last time this came up:

 The problem with null dereference problems isn't knowing that they're
 there: that's the easy part.  You helpfully get an exception to the
 face when that happens. The hard part is figuring out *where* the
 problem originally occurred. It's not when the exception is thrown
 that's the issue; it's the point at which you placed a null reference
 in a slot where you shouldn't have.

 Yes, we have invariants and contracts; but inevitably you're going to
 forget one, and it's that one slip-up that's going to bite you in the
 rear.

 One could use roughly the same argument for non-null references as for
 const: you could document it, but documentation is inevitably wrong or
 out of date.  :P

Sep 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

Daniel Keep wrote:
 "But the user will just assign to something useless to get around that!"
 
 You mean like how everyone wraps every call in try{...}catch(Exception
 e){} to shut the damn exceptions up?

They do just that in Java because of the checked-exceptions thing. I 
have a reference to Bruce Eckel's essay on it somewhere in this thread. 
The observation in the article was it wasn't just moron idiot 
programmers doing this. It was the guru programmers doing it, all the 
while knowing it was the wrong thing to do. The end result was the 
feature actively created the very problems it was designed to prevent.


 Or uses pointer arithmetic and
 casts to get at those pesky private members?

That's entirely different, because privacy is selected by the 
programmer, not the language. I don't have any issue with a user-defined 
type that is non-nullable (Andrei has designed a type constructor for that).


 If someone is actively trying to break the type system, it's their
 goddamn fault!  Honestly, I don't care about the hacks they employ to
 defeat the system because they're going to go around blindly shooting
 themselves in the foot no matter what they do.

True, but it's still not a good idea to design a language feature that 
winds up, in reality, encouraging bad programming practice. It 
encourages bad practice in a way that is really, really hard to detect 
in a code review.

I like programming mistakes to be obvious, not subtle. There's nothing 
subtle about a null pointer exception. There's plenty subtle about the 
wrong default value.


 And what about the people who AREN'T complete idiots, who maybe
 sometimes just accidentally trip and would quite welcome a safety rail
 there?

Null pointer seg faults *are* a safety rail. They keep an errant program 
from causing further damage.


 Finally, let me re-post something I wrote the last time this came up:
 
 The problem with null dereference problems isn't knowing that they're
 there: that's the easy part.  You helpfully get an exception to the
 face when that happens. The hard part is figuring out *where* the
 problem originally occurred. It's not when the exception is thrown
 that's the issue; it's the point at which you placed a null reference
 in a slot where you shouldn't have.


It's a lot harder to track down a bug when the bad initial value gets 
combined with a lot of other data first. The only time I've had a 
problem finding where a null came from (because they tend to fail very 
close to their initialization point) is when the null was caused by 
another memory corruption problem. Non-nullable references won't 
mitigate that.

Sep 26 2009

Ary Borenszweig <ary esperanto.org.ar> writes:

Walter Bright wrote:
 Daniel Keep wrote:
 "But the user will just assign to something useless to get around that!"

 You mean like how everyone wraps every call in try{...}catch(Exception
 e){} to shut the damn exceptions up?

 
 They do just that in Java because of the checked-exceptions thing. I 
 have a reference to Bruce Eckel's essay on it somewhere in this thread. 
 The observation in the article was it wasn't just moron idiot 
 programmers doing this. It was the guru programmers doing it, all the 
 while knowing it was the wrong thing to do. The end result was the 
 feature actively created the very problems it was designed to prevent.
 
 
 Or uses pointer arithmetic and
 casts to get at those pesky private members?

 
 That's entirely different, because privacy is selected by the 
 programmer, not the language. I don't have any issue with a user-defined 
 type that is non-nullable (Andrei has designed a type constructor for 
 that).
 
 
 If someone is actively trying to break the type system, it's their
 goddamn fault!  Honestly, I don't care about the hacks they employ to
 defeat the system because they're going to go around blindly shooting
 themselves in the foot no matter what they do.

 
 True, but it's still not a good idea to design a language feature that 
 winds up, in reality, encouraging bad programming practice. It 
 encourages bad practice in a way that is really, really hard to detect 
 in a code review.
 
 I like programming mistakes to be obvious, not subtle. There's nothing 
 subtle about a null pointer exception. There's plenty subtle about the 
 wrong default value.
 
 
 And what about the people who AREN'T complete idiots, who maybe
 sometimes just accidentally trip and would quite welcome a safety rail
 there?

 
 Null pointer seg faults *are* a safety rail. They keep an errant program 
 from causing further damage.

Null pointer seg faults *not being able to happen* are much more safe. :)

Sep 26 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Ary Borenszweig wrote:
 Walter Bright wrote:
 Daniel Keep wrote:
 "But the user will just assign to something useless to get around that!"

 You mean like how everyone wraps every call in try{...}catch(Exception
 e){} to shut the damn exceptions up?

 They do just that in Java because of the checked-exceptions thing. I 
 have a reference to Bruce Eckel's essay on it somewhere in this 
 thread. The observation in the article was it wasn't just moron idiot 
 programmers doing this. It was the guru programmers doing it, all the 
 while knowing it was the wrong thing to do. The end result was the 
 feature actively created the very problems it was designed to prevent.


 Or uses pointer arithmetic and
 casts to get at those pesky private members?

 That's entirely different, because privacy is selected by the 
 programmer, not the language. I don't have any issue with a 
 user-defined type that is non-nullable (Andrei has designed a type 
 constructor for that).


 If someone is actively trying to break the type system, it's their
 goddamn fault!  Honestly, I don't care about the hacks they employ to
 defeat the system because they're going to go around blindly shooting
 themselves in the foot no matter what they do.

 True, but it's still not a good idea to design a language feature that 
 winds up, in reality, encouraging bad programming practice. It 
 encourages bad practice in a way that is really, really hard to detect 
 in a code review.

 I like programming mistakes to be obvious, not subtle. There's nothing 
 subtle about a null pointer exception. There's plenty subtle about the 
 wrong default value.


 And what about the people who AREN'T complete idiots, who maybe
 sometimes just accidentally trip and would quite welcome a safety rail
 there?

 Null pointer seg faults *are* a safety rail. They keep an errant 
 program from causing further damage.

 
 Null pointer seg faults *not being able to happen* are much more safe. :)

There is no such thing as "not being able to happen" :)

Object thisCannotPossiblyBeNullInAnyWayWhatsoever = cast(Object)null;

I seem to be the only one who sees Walter's side of things in this 
thread :o)

For nonnulls to *really* be enforceable you'd have to get rid of the cast 
system entirely.

Sep 26 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Sat, Sep 26, 2009 at 11:23 PM, Jeremie Pelletier <jeremiep gmail.com> wrote:

 There is no such thing as "not being able to happen" :)

 Object thisCannotPossiblyBeNullInAnyWayWhatsoever = cast(Object)null;

 I seem to be the only one who sees Walter's side of things in this thread
 :o)

Why the hell would the compiler allow that to begin with? Why bother
implementing nonnull references only to allow the entire system to be
broken?

Sep 26 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Jarrett Billingsley wrote:
 On Sat, Sep 26, 2009 at 11:23 PM, Jeremie Pelletier <jeremiep gmail.com> wrote:
 
 There is no such thing as "not being able to happen" :)

 Object thisCannotPossiblyBeNullInAnyWayWhatsoever = cast(Object)null;

 I seem to be the only one who sees Walter's side of things in this thread
 :o)

 
 Why the hell would the compiler allow that to begin with? Why bother
 implementing nonnull references only to allow the entire system to be
 broken?

Because D is a practical language that lets the programmer do whatever he 
wants, even shoot his own foot if he wants to. Doing so just isn't as 
implicit as in C.

Walter understands there are some cases where you want to override the 
type system, that's why casts are in D, too many optimizations rely on it.

Sep 26 2009

downs <default_357-line yahoo.de> writes:

Jeremie Pelletier wrote:
 Jarrett Billingsley wrote:
 On Sat, Sep 26, 2009 at 11:23 PM, Jeremie Pelletier
 <jeremiep gmail.com> wrote:

 There is no such thing as "not being able to happen" :)

 Object thisCannotPossiblyBeNullInAnyWayWhatsoever = cast(Object)null;

 I seem to be the only one who sees Walter's side of things in this
 thread
 :o)

 Why the hell would the compiler allow that to begin with? Why bother
 implementing nonnull references only to allow the entire system to be
 broken?

 
 Because D is a practical language that lets the programmer do whatever he
 wants, even shoot his own foot if he wants to. Doing so just isn't as
 implicit as in C.
 
 Walter understands there are some cases where you want to override the
 type system, that's why casts are in D, too many optimizations rely on it.

Sure, but if you set out to break it the compiler really can't (or shouldn't)
help you. This whole debate, as far as I know, is about defaults, i.e.
preventing *unintentional* nulls.

Sep 27 2009

Tom S <h3r3tic remove.mat.uni.torun.pl> writes:

Jeremie Pelletier wrote:
 Ary Borenszweig wrote:
 Walter Bright wrote:
 Daniel Keep wrote:
 "But the user will just assign to something useless to get around 
 that!"

 You mean like how everyone wraps every call in try{...}catch(Exception
 e){} to shut the damn exceptions up?

 They do just that in Java because of the checked-exceptions thing. I 
 have a reference to Bruce Eckel's essay on it somewhere in this 
 thread. The observation in the article was it wasn't just moron idiot 
 programmers doing this. It was the guru programmers doing it, all the 
 while knowing it was the wrong thing to do. The end result was the 
 feature actively created the very problems it was designed to prevent.


 Or uses pointer arithmetic and
 casts to get at those pesky private members?

 That's entirely different, because privacy is selected by the 
 programmer, not the language. I don't have any issue with a 
 user-defined type that is non-nullable (Andrei has designed a type 
 constructor for that).


 If someone is actively trying to break the type system, it's their
 goddamn fault!  Honestly, I don't care about the hacks they employ to
 defeat the system because they're going to go around blindly shooting
 themselves in the foot no matter what they do.

 True, but it's still not a good idea to design a language feature 
 that winds up, in reality, encouraging bad programming practice. It 
 encourages bad practice in a way that is really, really hard to 
 detect in a code review.

 I like programming mistakes to be obvious, not subtle. There's 
 nothing subtle about a null pointer exception. There's plenty subtle 
 about the wrong default value.


 And what about the people who AREN'T complete idiots, who maybe
 sometimes just accidentally trip and would quite welcome a safety rail
 there?

 Null pointer seg faults *are* a safety rail. They keep an errant 
 program from causing further damage.

 Null pointer seg faults *not being able to happen* are much more safe. :)

 
 There is no such thing as "not being able to happen" :)
 
 Object thisCannotPossiblyBeNullInAnyWayWhatsoever = cast(Object)null;
 
 I seem to be the only one who sees Walter's side of things in this 
 thread :o)
 
 For nonnulls to *really* be enforceable you'd have to get rid of the cast 
 system entirely.

It's a systems programming language. You can screw up the type system if 
you really want to. But then it would still fall back to the lovely 
segfault. If you don't screw with it, you're safe with static checking. 
It's a clean win.


-- 
Tomasz Stachowiak
http://h3.team0xf.com/
h3/h3r3tic on #D freenode

Sep 26 2009

Ary Borenszweig <ary esperanto.org.ar> writes:

Jeremie Pelletier wrote:
 Ary Borenszweig wrote:
 Walter Bright wrote:
 Daniel Keep wrote:
 "But the user will just assign to something useless to get around 
 that!"

 You mean like how everyone wraps every call in try{...}catch(Exception
 e){} to shut the damn exceptions up?

 They do just that in Java because of the checked-exceptions thing. I 
 have a reference to Bruce Eckel's essay on it somewhere in this 
 thread. The observation in the article was it wasn't just moron idiot 
 programmers doing this. It was the guru programmers doing it, all the 
 while knowing it was the wrong thing to do. The end result was the 
 feature actively created the very problems it was designed to prevent.


 Or uses pointer arithmetic and
 casts to get at those pesky private members?

 That's entirely different, because privacy is selected by the 
 programmer, not the language. I don't have any issue with a 
 user-defined type that is non-nullable (Andrei has designed a type 
 constructor for that).


 If someone is actively trying to break the type system, it's their
 goddamn fault!  Honestly, I don't care about the hacks they employ to
 defeat the system because they're going to go around blindly shooting
 themselves in the foot no matter what they do.

 True, but it's still not a good idea to design a language feature 
 that winds up, in reality, encouraging bad programming practice. It 
 encourages bad practice in a way that is really, really hard to 
 detect in a code review.

 I like programming mistakes to be obvious, not subtle. There's 
 nothing subtle about a null pointer exception. There's plenty subtle 
 about the wrong default value.


 And what about the people who AREN'T complete idiots, who maybe
 sometimes just accidentally trip and would quite welcome a safety rail
 there?

 Null pointer seg faults *are* a safety rail. They keep an errant 
 program from causing further damage.

 Null pointer seg faults *not being able to happen* are much more safe. :)

 
 There is no such thing as "not being able to happen" :)
 
 Object thisCannotPossiblyBeNullInAnyWayWhatsoever = cast(Object)null;

Object is not-nullable, Object? (or whatever syntax you like) is 
nullable. So that line is a compile-time error: you can't cast a null to 
an Object (because Object *can't* be null).

You might be the only one here that understands Walter's point. But 
Walter is wrong. ;-)

Sep 26 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Ary Borenszweig wrote:
 Jeremie Pelletier wrote:
 Ary Borenszweig wrote:
 Walter Bright wrote:
 Daniel Keep wrote:
 "But the user will just assign to something useless to get around 
 that!"

 You mean like how everyone wraps every call in try{...}catch(Exception
 e){} to shut the damn exceptions up?

 They do just that in Java because of the checked-exceptions thing. I 
 have a reference to Bruce Eckel's essay on it somewhere in this 
 thread. The observation in the article was it wasn't just moron 
 idiot programmers doing this. It was the guru programmers doing it, 
 all the while knowing it was the wrong thing to do. The end result 
 was the feature actively created the very problems it was designed 
 to prevent.


 Or uses pointer arithmetic and
 casts to get at those pesky private members?

 That's entirely different, because privacy is selected by the 
 programmer, not the language. I don't have any issue with a 
 user-defined type that is non-nullable (Andrei has designed a type 
 constructor for that).


 If someone is actively trying to break the type system, it's their
 goddamn fault!  Honestly, I don't care about the hacks they employ to
 defeat the system because they're going to go around blindly shooting
 themselves in the foot no matter what they do.

 True, but it's still not a good idea to design a language feature 
 that winds up, in reality, encouraging bad programming practice. It 
 encourages bad practice in a way that is really, really hard to 
 detect in a code review.

 I like programming mistakes to be obvious, not subtle. There's 
 nothing subtle about a null pointer exception. There's plenty subtle 
 about the wrong default value.


 And what about the people who AREN'T complete idiots, who maybe
 sometimes just accidentally trip and would quite welcome a safety rail
 there?

 Null pointer seg faults *are* a safety rail. They keep an errant 
 program from causing further damage.

 Null pointer seg faults *not being able to happen* are much more 
 safe. :)

 There is no such thing as "not being able to happen" :)

 Object thisCannotPossiblyBeNullInAnyWayWhatsoever = cast(Object)null;

 
 Object is not-nullable, Object? (or whatever syntax you like) is 
 nullable. So that line is a compile-time error: you can't cast a null to 
 an Object (because Object *can't* be null).
 
 You might be the only one here that understands Walter's point. But 
 Walter is wrong. ;-)

union A {
	Object foo;
	Object? bar;
}

Give me a type system, and I will find backdoors :)

I didn't say Walter was right or wrong, I said I understand his point of 
view. The sweet spot most likely lies in the middle of both arguments 
seen in this thread, and that's not an easy one to pinpoint!

I think we should much rather enforce variable initialization in D than 
nullable/non-nullable types. The error after all is that an uninitialized 
reference triggers a segfault.

What if using 'Object obj;' raises a warning "uninitialized variable" and 
makes everyone wanting non-null references happy, and 'Object obj = 
null;' raises no warning and makes everyone wanting to keep the current 
system (all two of us!) happy.

I believe it's a fair compromise.

Sep 26 2009

Ary Borenszweig <ary esperanto.org.ar> writes:

Jeremie Pelletier wrote:
 Ary Borenszweig wrote:
 Jeremie Pelletier wrote:
 Ary Borenszweig wrote:
 Walter Bright wrote:
 Daniel Keep wrote:
 "But the user will just assign to something useless to get around 
 that!"

 You mean like how everyone wraps every call in 
 try{...}catch(Exception
 e){} to shut the damn exceptions up?

 They do just that in Java because of the checked-exceptions thing. 
 I have a reference to Bruce Eckel's essay on it somewhere in this 
 thread. The observation in the article was it wasn't just moron 
 idiot programmers doing this. It was the guru programmers doing it, 
 all the while knowing it was the wrong thing to do. The end result 
 was the feature actively created the very problems it was designed 
 to prevent.


 Or uses pointer arithmetic and
 casts to get at those pesky private members?

 That's entirely different, because privacy is selected by the 
 programmer, not the language. I don't have any issue with a 
 user-defined type that is non-nullable (Andrei has designed a type 
 constructor for that).


 If someone is actively trying to break the type system, it's their
 goddamn fault!  Honestly, I don't care about the hacks they employ to
 defeat the system because they're going to go around blindly shooting
 themselves in the foot no matter what they do.

 True, but it's still not a good idea to design a language feature 
 that winds up, in reality, encouraging bad programming practice. It 
 encourages bad practice in a way that is really, really hard to 
 detect in a code review.

 I like programming mistakes to be obvious, not subtle. There's 
 nothing subtle about a null pointer exception. There's plenty 
 subtle about the wrong default value.


 And what about the people who AREN'T complete idiots, who maybe
 sometimes just accidentally trip and would quite welcome a safety 
 rail
 there?

 Null pointer seg faults *are* a safety rail. They keep an errant 
 program from causing further damage.

 Null pointer seg faults *not being able to happen* are much more 
 safe. :)

 There is no such thing as "not being able to happen" :)

 Object thisCannotPossiblyBeNullInAnyWayWhatsoever = cast(Object)null;

 Object is not-nullable, Object? (or whatever syntax you like) is 
 nullable. So that line is a compile-time error: you can't cast a null 
 to an Object (because Object *can't* be null).

 You might be the only one here that understands Walter's point. But 
 Walter is wrong. ;-)

 
 union A {
     Object foo;
     Object? bar;
 }
 
 Give me a type system, and I will find backdoors :)

Ah, nice one.

Well, I see you can always break the type system. The point is to break 
it as little as possible while obtaining the most out of it without it 
bothering you.

Sep 26 2009

Christopher Wright <dhasenan gmail.com> writes:

Jeremie Pelletier wrote:
 What if using 'Object obj;' raises a warning "uninitialized variable" and 
 makes everyone wanting non-null references happy, and 'Object obj = 
 null;' raises no warning and makes everyone wanting to keep the current 
 system (all two of us!) happy.
 
 I believe it's a fair compromise.

It's a large improvement, but only for local variables. If your segfault 
has to do with a local variable, unless your function is monstrously 
large, it should be easy to fix, without changing the type system.

The larger use case is when you have an aggregate member that cannot be 
null. This can be solved via contracts, but they are tedious to write 
and ubiquitous.

Sep 26 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Christopher Wright wrote:
 Jeremie Pelletier wrote:
 What if using 'Object obj;' raises a warning "uninitialized variable" 
 and makes everyone wanting non-null references happy, and 'Object obj 
 = null;' raises no warning and makes everyone wanting to keep the 
 current system (all two of us!) happy.

 I believe it's a fair compromise.

 
 It's a large improvement, but only for local variables. If your segfault 
 has to do with a local variable, unless your function is monstrously 
 large, it should be easy to fix, without changing the type system.
 
 The larger use case is when you have an aggregate member that cannot be 
 null. This can be solved via contracts, but they are tedious to write 
 and ubiquitous.

But how would you enforce a nonnull type over an aggregate in the first 
place? If you can, you could also apply the same initializer semantics I 
suggested earlier.

Look at this for example:

struct A {
	Object cannotBeNull;
}

void main() {
	A* a = new A;
}

Memory gets initialized to zero, and you have a broken non-null type. 
You could have the compiler throw an error here, but the compiler cannot 
possibly know about all data creation methods such as malloc, calloc or 
any other external allocator.

You could even do something like:

Object* foo = calloc(Object.sizeof);

and the compiler would let you dereference foo resulting in yet another 
broken nonnull variable.

Non-nulls are a cute idea when you have a type system that is much 
stricter than D's, but there are just way too many workarounds to make 
it crash in D.
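
A minimal sketch of the two allocation paths described above, assuming an illustrative Widget class that is not part of the original post. Note that calloc takes a count and a size and returns void*, so the sketch uses calloc(1, A.sizeof) with a cast:

```d
import core.stdc.stdlib : calloc, free;

class Widget { int id = 42; }

struct A
{
    Widget cannotBeNull; // a class reference; its default value is null
}

// Returns true when the freshly allocated struct holds a null reference.
bool nullAfterNew()
{
    A* a = new A;
    return a.cannotBeNull is null;
}

// Same check for zeroed memory obtained outside the GC.
bool nullAfterCalloc()
{
    auto a = cast(A*) calloc(1, A.sizeof);
    if (a is null) return false; // allocation failed; nothing to show
    scope (exit) free(a);
    return a.cannotBeNull is null;
}
```

Both functions return true with today's semantics, which is the situation a non-null field type would have to rule out or report.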

Sep 26 2009

downs <default_357-line yahoo.de> writes:

Jeremie Pelletier wrote:
 Christopher Wright wrote:
 Jeremie Pelletier wrote:
 What if using 'Object obj;' raises a warning "uninitialized variable"
 and makes everyone wanting non-null references happy, and 'Object obj
 = null;' raises no warning and makes everyone wanting to keep the
 current system (all two of us!) happy.

 I believe it's a fair compromise.

 It's a large improvement, but only for local variables. If your
 segfault has to do with a local variable, unless your function is
 monstrously large, it should be easy to fix, without changing the type
 system.

 The larger use case is when you have an aggregate member that cannot
 be null. This can be solved via contracts, but they are tedious to
 write and ubiquitous.

 
 But how would you enforce a nonnull type over an aggregate in the first
 place? If you can, you could also apply the same initializer semantics I
 suggested earlier.
 
 Look at this for example:
 
 struct A {
     Object cannotBeNull;
 }
 
 void main() {
     A* a = new A;
 }
 
 Memory gets initialized to zero, and you have a broken non-null type.
 You could have the compiler throw an error here, but the compiler cannot
 possibly know about all data creation methods such as malloc, calloc or
 any other external allocator.
 
 You could even do something like:
 
 Object* foo = calloc(Object.sizeof);
 
 and the compiler would let you dereference foo resulting in yet another
 broken nonnull variable.
 
 Non-nulls are a cute idea when you have a type system that is much
 stricter than D's, but there are just way too many workarounds to make
 it crash in D.

"Here are some cases you haven't mentioned yet. This proves that the compiler
can't possibly be smart enough. "

Yeeeeeah.

In the above case, why not implicitly put the cannotBeNull check into the
struct invariant? That's where it belongs, imho.

Regarding your example, it's calloc(size_t.sizeof). And a) we probably can't
catch that case except with in/out null checks on every method, but then again,
how often have you done that? I don't think it's relevant enough to be relevant
to this thread. :)
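
A sketch of what such an invariant looks like when written by hand, assuming illustrative Widget and Holder types; the suggestion above is that the compiler would generate the equivalent check implicitly:

```d
class Widget { int id = 42; }

struct Holder
{
    Widget cannotBeNull;

    this(Widget w) { cannotBeNull = w; }

    // Checked after construction and around public member calls,
    // in builds that keep contracts enabled.
    invariant()
    {
        assert(cannotBeNull !is null, "cannotBeNull is null");
    }

    int id() { return cannotBeNull.id; }
}
```

In such a build, Holder(null) would fail the invariant at the end of the constructor, while a default-initialized Holder is only caught on its first public member call.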

Sep 27 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

downs wrote:
 Jeremie Pelletier wrote:
 Christopher Wright wrote:
 Jeremie Pelletier wrote:
 What if using 'Object obj;' raises a warning "uninitialized variable"
 and makes everyone wanting non-null references happy, and 'Object obj
 = null;' raises no warning and makes everyone wanting to keep the
 current system (all two of us!) happy.

 I believe it's a fair compromise.

 It's a large improvement, but only for local variables. If your
 segfault has to do with a local variable, unless your function is
 monstrously large, it should be easy to fix, without changing the type
 system.

 The larger use case is when you have an aggregate member that cannot
 be null. This can be solved via contracts, but they are tedious to
 write and ubiquitous.

 But how would you enforce a nonnull type over an aggregate in the first
 place? If you can, you could also apply the same initializer semantics I
 suggested earlier.

 Look at this for example:

 struct A {
     Object cannotBeNull;
 }

 void main() {
     A* a = new A;
 }

 Memory gets initialized to zero, and you have a broken non-null type.
 You could have the compiler throw an error here, but the compiler cannot
 possibly know about all data creation methods such as malloc, calloc or
 any other external allocator.

 You could even do something like:

 Object* foo = calloc(Object.sizeof);

 and the compiler would let you dereference foo resulting in yet another
 broken nonnull variable.

 Non-nulls are a cute idea when you have a type system that is much
 stricter than D's, but there are just way too many workarounds to make
 it crash in D.

 
 "Here are some cases you haven't mentioned yet. This proves that the compiler
can't possibly be smart enough. "
 
 Yeeeeeah.

I allocate most structs on the gc, unless I need them only for the scope 
of a function (that includes RVO). All objects are on the gc already, so 
it's a pretty major case. The argument was to protect aggregate fields, 
I'm just pointing out that their usage usually is preventing an easy 
implementation. I'm not saying it's impossible.

Besides, what I said was, if it's possible to enforce these fields to be 
null/non-null, you can enforce them to be properly initialized in such 
case, making nulls/non-nulls nearly useless.

 In the above case, why not implicitly put the cannotBeNull check into the
struct invariant? That's where it belongs, imho.

Exactly, what's the need for null/non-null types then?

 Regarding your example, it's calloc(size_t.sizeof). And a) we probably can't
catch that case except with in/out null checks on every method, but then again,
how often have you done that? I don't think it's relevant enough to be relevant
to this thread. :)

Actually, sizeof currently returns the size of the reference, so it's 
always going to be the same as size_t.sizeof.
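
A small compile-time check of that claim, assuming an illustrative Payload class. It compares against the pointer size, which matches size_t.sizeof on common targets, and reads the instance size through __traits(classInstanceSize, ...):

```d
class Payload
{
    long[8] data; // 64 bytes of instance data
}

// .sizeof on a class type measures the reference, not the instance.
static assert(Payload.sizeof == (void*).sizeof);

// The instance size is available separately and includes the hidden header.
enum instanceSize = __traits(classInstanceSize, Payload);
static assert(instanceSize >= 8 * long.sizeof);
```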

Sep 27 2009

downs <default_357-line yahoo.de> writes:

Jeremie Pelletier wrote:
 downs wrote:
 Jeremie Pelletier wrote:
 Christopher Wright wrote:
 Jeremie Pelletier wrote:
 What if using 'Object obj;' raises a warning "uninitialized variable"
 and makes everyone wanting non-null references happy, and 'Object obj
 = null;' raises no warning and makes everyone wanting to keep the
 current system (all two of us!) happy.

 I believe it's a fair compromise.

 It's a large improvement, but only for local variables. If your
 segfault has to do with a local variable, unless your function is
 monstrously large, it should be easy to fix, without changing the type
 system.

 The larger use case is when you have an aggregate member that cannot
 be null. This can be solved via contracts, but they are tedious to
 write and ubiquitous.

 But how would you enforce a nonnull type over an aggregate in the first
 place? If you can, you could also apply the same initializer semantics I
 suggested earlier.

 Look at this for example:

 struct A {
     Object cannotBeNull;
 }

 void main() {
     A* a = new A;
 }

 Memory gets initialized to zero, and you have a broken non-null type.
 You could have the compiler throw an error here, but the compiler cannot
 possibly know about all data creation methods such as malloc, calloc or
 any other external allocator.

 You could even do something like:

 Object* foo = calloc(Object.sizeof);

 and the compiler would let you dereference foo resulting in yet another
 broken nonnull variable.

 Non-nulls are a cute idea when you have a type system that is much
 stricter than D's, but there are just way too many workarounds to make
 it crash in D.

 "Here are some cases you haven't mentioned yet. This proves that the
 compiler can't possibly be smart enough. "

 Yeeeeeah.

 
 I allocate most structs on the gc, unless I need them only for the scope
 of a function (that includes RVO). All objects are on the gc already, so
 it's a pretty major case. The argument was to protect aggregate fields,
 I'm just pointing out that their usage usually is preventing an easy
 implementation. I'm not saying it's impossible.
 
 Besides, what I said was, if it's possible to enforce these fields to be
 null/non-null, you can enforce them to be properly initialized in such
 case, making nulls/non-nulls nearly useless.
 
 In the above case, why not implicitly put the cannotBeNull check into
 the struct invariant? That's where it belongs, imho.

 
 Exactly, what's the need for null/non-null types then?
 

You're twisting my words.

Checking for null in the struct invariant would be an _implementation_ of
non-nullable types in structs.

Isn't the whole point of defaulting to non-nullable types that we don't have to
check for it manually, i.e. in the user-defined invariant?

I think we should avoid having to build recursive checks for null-ness for
every type we define.

 Regarding your example, it's calloc(size_t.sizeof). And a) we probably
 can't catch that case except with in/out null checks on every method,
 but then again, how often have you done that? I don't think it's
 relevant enough to be relevant to this thread. :)

 
 Actually, sizeof currently returns the size of the reference, so it's
 always going to be the same as size_t.sizeof.

Weird. I remembered that differently. Thanks.

Sep 27 2009

"Nick Sabalausky" <a a.a> writes:

"Jeremie Pelletier" <jeremiep gmail.com> wrote in message 
news:h9mmre$1i8j$1 digitalmars.com...
 Ary Borenszweig wrote:
 Object is not-nullable, Object? (or whatever syntax you like) is 
 nullable. So that line is a compile-time error: you can't cast a null to 
 an Object (because Object *can't* be null).

 union A {
 Object foo;
 Object? bar;
 }

 Give me a type system, and I will find backdoors :)

Unions are nothing more than an alternate syntax for a reinterpret cast. And 
it's an arguably worse syntax because unlike casts, uses of it are 
indistinguishable from normal safe code, there's nothing to grep for. As 
such, unions should never be considered any more safe than cast(x)y. The 
following is just as dangerous as your example above and doesn't even touch 
the issue of nullability/non-nullability:

union A {
int foo;
float bar;
}

Sep 28 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Nick Sabalausky wrote:
 "Jeremie Pelletier" <jeremiep gmail.com> wrote in message 
 news:h9mmre$1i8j$1 digitalmars.com...
 Ary Borenszweig wrote:
 Object is not-nullable, Object? (or whatever syntax you like) is 
 nullable. So that line is a compile-time error: you can't cast a null to 
 an Object (because Object *can't* be null).

 union A {
 Object foo;
 Object? bar;
 }

 Give me a type system, and I will find backdoors :)

 
 Unions are nothing more than an alternate syntax for a reinterpret cast. And 
 it's an arguably worse syntax because unlike casts, uses of it are 
 indistinguishable from normal safe code, there's nothing to grep for. As 
 such, unions should never be considered any more safe than cast(x)y. The 
 following is just as dangerous as your example above and doesn't even touch 
 the issue of nullability/non-nullability:
 
 union A {
 int foo;
 float bar;
 }
 

Yet it's the only way I know of to do bitwise logic on floating points 
in D to extract the exponent, sign and mantissa for example.
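
A minimal sketch of that use, assuming a 32-bit IEEE 754 float (which is what D's float is); the names FloatBits, FloatParts and decompose are illustrative:

```d
// Overlay a float with an unsigned integer of the same size.
union FloatBits
{
    float value;
    uint  bits;
}

struct FloatParts
{
    bool sign;     // bit 31
    uint exponent; // bits 23..30, still biased by 127
    uint mantissa; // bits 0..22, without the implicit leading 1
}

FloatParts decompose(float x)
{
    FloatBits fb;
    fb.value = x; // write as float, read back as raw bits
    return FloatParts(
        (fb.bits >> 31) != 0,
        (fb.bits >> 23) & 0xFF,
        fb.bits & 0x7F_FFFF);
}
```

For example, decompose(-1.5f) yields a set sign bit, a biased exponent of 127 and a mantissa of 0x40_0000.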

And yes they are much, much more than a simple reinterpret cast, a 
simple set of casts will not set the size of the union to its largest 
member. Unions make for elegant types which can have many valid 
representations:

union Vec3F {
	struct { float x, y, z; }
	float[3] v;
}

I just can't picture D without unions :)

Sep 28 2009

Jari-Matti Mäkelä <jmjmak utu.fi.invalid> writes:

Jeremie Pelletier wrote:

 Nick Sabalausky wrote:
 union A {
 int foo;
 float bar;
 }
 

 
 Yet it's the only way I know of to do bitwise logic on floating points
 in D to extract the exponent, sign and mantissa for example.

You could add built-in methods for those operations to the float type:

float bar;

boolean s = bar.sign;
...

Union is very flexible, but unfortunately it's also one of the features that 
can break the type safety in D.

Sep 28 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Jari-Matti Mäkelä wrote:
 Jeremie Pelletier wrote:
 
 Nick Sabalausky wrote:
 union A {
 int foo;
 float bar;
 }

 Yet it's the only way I know of to do bitwise logic on floating points
 in D to extract the exponent, sign and mantissa for example.

 
 You could add built-in methods for those operations to the float type:
 
 float bar;
 
 boolean s = bar.sign;
 ...

That would be so inefficient in some cases, you don't always want to 
shift the data like bar.sign implies.

 Union is very flexible, but unfortunately it's also one of the features that 
 can break the type safety in D.

That's the best thing about systems languages: to have a core set of 
rules, and to be able to purposely break them. Even better, you still 
pass go and still get $200.

I don't want a language that takes me by the hand for a walk in the 
park. I want a language that keeps me on my toes and punches me in the 
face every now and then :)

Sep 28 2009

Jari-Matti Mäkelä <jmjmak utu.fi.invalid> writes:

Jeremie Pelletier wrote:

 Jari-Matti Mäkelä wrote:
 Jeremie Pelletier wrote:
 
 Nick Sabalausky wrote:
 union A {
 int foo;
 float bar;
 }

 Yet it's the only way I know of to do bitwise logic on floating points
 in D to extract the exponent, sign and mantissa for example.

 
 You could add built-in methods for those operations to the float type:
 
 float bar;
 
 boolean s = bar.sign;
 ...

 
 That would be so inefficient in some cases, you don't always want to
 shift the data like bar.sign implies.

It depends on the boolean representation. I see no reason why a built-in 
feature should be slower than some bitwise logic operation in user code. 
After all, the set of operations the language provides for the user is a 
subset of all possible operations the language implementation can do.

Sep 28 2009

bearophile <bearophileHUGS lycos.com> writes:

Jari-Matti M.:

 It depends on the boolean representation. I see no reason why a built-in 
 feature should be slower than some bitwise logic operation in user code. 
 After all, the set of operations the language provides for the user is a 
 subset of all possible operations the language implementation can do.

I agree. One of the best qualities of C++ is that it often allows
programmers to build abstractions with no or minimal cost. A good systems
language lets you define a construct that looks built-in (a function, for
example) and that lets you access and use the parts of a floating point number
as efficiently as C/asm code.

Bye,
bearophile

Sep 28 2009

Yigal Chripun <yigal100 gmail.com> writes:

On 28/09/2009 12:05, Jeremie Pelletier wrote:
 Nick Sabalausky wrote:
 "Jeremie Pelletier" <jeremiep gmail.com> wrote in message
 news:h9mmre$1i8j$1 digitalmars.com...
 Ary Borenszweig wrote:
 Object is not-nullable, Object? (or whatever syntax you like) is
 nullable. So that line is a compile-time error: you can't cast a
 null to an Object (because Object *can't* be null).

 union A {
 Object foo;
 Object? bar;
 }

 Give me a type system, and I will find backdoors :)

 Unions are nothing more than an alternate syntax for a reinterpret
 cast. And it's an arguably worse syntax because unlike casts, uses of
 it are indistinguishable from normal safe code, there's nothing to
 grep for. As such, unions should never be considered any more safe
 than cast(x)y. The following is just as dangerous as your example
 above and doesn't even touch the issue of nullability/non-nullability:

 union A {
 int foo;
 float bar;
 }

 Yet it's the only way I know of to do bitwise logic on floating points
 in D to extract the exponent, sign and mantissa for example.

 And yes they are much, much more than a simple reinterpret cast, a
 simple set of casts will not set the size of the union to its largest
 member. Unions make for elegant types which can have many valid
 representations:

 union Vec3F {
 struct { float x, y, z; }
 float[3] v;
 }

 I just can't picture D without unions :)

here's a type-safe alternative
note: untested

struct Vec3F {
   float[3] v;
   alias v[0] x;
   alias v[1] y;
   alias v[2] z;
}

D provides alignment control for structs, why do we need to have a 
separate union construct if it is just a special case of struct alignment?

IMO the use cases for union are very rare and they all can be redesigned 
in a type safe manner.
When software was small and simple, hand tuning code with low level 
mechanisms (such as unions and even using assembly) made a lot of sense. 
Today's software is typically far more complex and is way too big to risk 
losing safety features for marginal performance gains.

Micro-optimizations simply don't scale.

Sep 28 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Yigal Chripun wrote:
 On 28/09/2009 12:05, Jeremie Pelletier wrote:
 Nick Sabalausky wrote:
 "Jeremie Pelletier" <jeremiep gmail.com> wrote in message
 news:h9mmre$1i8j$1 digitalmars.com...
 Ary Borenszweig wrote:
 Object is not-nullable, Object? (or whatever syntax you like) is
 nullable. So that line is a compile-time error: you can't cast a
 null to an Object (because Object *can't* be null).

 union A {
 Object foo;
 Object? bar;
 }

 Give me a type system, and I will find backdoors :)

 Unions are nothing more than an alternate syntax for a reinterpret
 cast. And it's an arguably worse syntax because unlike casts, uses of
 it are indistinguishable from normal safe code, there's nothing to
 grep for. As such, unions should never be considered any more safe
 than cast(x)y. The following is just as dangerous as your example
 above and doesn't even touch the issue of nullability/non-nullability:

 union A {
 int foo;
 float bar;
 }

 Yet it's the only way I know of to do bitwise logic on floating points
 in D to extract the exponent, sign and mantissa for example.

 And yes they are much, much more than a simple reinterpret cast, a
 simple set of casts will not set the size of the union to its largest
 member. Unions make for elegant types which can have many valid
 representations:

 union Vec3F {
 struct { float x, y, z; }
 float[3] v;
 }

 I just can't picture D without unions :)

 
 here's a type-safe alternative
 note: untested
 
 struct Vec3F {
   float[3] v;
   alias v[0] x;
   alias v[1] y;
   alias v[2] z;
 }
 
 D provides alignment control for structs, why do we need to have a 
 separate union construct if it is just a special case of struct alignment?

These aliases won't compile, and that was only one out of many union uses.
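
For the named-component view specifically, one form that does compile is a set of ref-returning property functions over the array. This is only a sketch of that single use case, under the assumption that plain accessors are acceptable; it says nothing about the other uses of unions:

```d
struct Vec3F
{
    float[3] v = [0.0f, 0.0f, 0.0f];

    // Each accessor hands out a reference into v, so it can be read or assigned.
    @property ref float x() return { return v[0]; }
    @property ref float y() return { return v[1]; }
    @property ref float z() return { return v[2]; }
}

float sumViaNames(Vec3F a)
{
    a.x = 1; // writes v[0] through the returned reference
    a.y = 2;
    a.z = 3;
    return a.v[0] + a.v[1] + a.v[2];
}
```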

 IMO the use cases for union are very rare and they all can be redesigned 
 in a type safe manner.

Not always true.

 when software was small and simple, hand tuning code with low level 
 mechanisms (such as unions and even using assembly) made a lot of sense. 
 Today's software is typically far more complex and is way too big to risk 
 losing safety features for marginal performance gains.
 
 Micro-optimizations simply don't scale.

Again, that's a lazy view on programming. High level constructs are 
useful to isolate small and simple algorithms which are implemented at 
low level.

These aren't just marginal performance gains, they can easily be up to 
15-30% improvements, sometimes 50% and more. If this is too complex or 
the risk is too high for you then don't use a systems language :)

Sep 28 2009

bearophile <bearophileHUGS lycos.com> writes:

Jeremie Pelletier:

 Not always true.

I agree, I'm using D also because it offers unions. Sometimes they are useful.

But besides the normal C-style unions, which I don't want to see removed, it
can also be useful to have the safe automatic tagged unions of Cyclone. They
are safer and cost just a little performance compared to C unions. In D they
may be denoted with "record" or "tunion" or just "safe union" to save keywords.
They always contain an invisible tag (that can be read with a special built-in
union method, like Unioname.tagcheck). Such "safe unions" may even become the
only ones allowed in SafeD modules!

The following is from Cyclone docs:
<<
The C Standard says that if you read out any member of a union other than the
last one written, the result is undefined.
To avoid this problem, Cyclone provides a built-in form of tagged union and
always ensures that the tag is correlated with the last member written in the
union. In particular, whenever a tagged union member is updated, the compiler
inserts code to update the tag associated with the union. Whenever a member is
read, the tag is consulted to ensure that the member was the last one written.
If not, an exception is thrown.

Thus, the aforementioned example can be rewritten in Cyclone like this:

@tagged union U { int i; int *p; };
void pr(union U x) {
  if (tagcheck(x.i))
    printf("int(%d)",x.i);
  else
    printf("ptr(%d)",*x.p);
}

The @tagged qualifier indicates to the compiler that U should be a tagged
union. The operation tagcheck(x.i) returns true when i was the last member
written so it can be used to extract the value.
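
A hand-written D approximation of the same idea, as a sketch only: the struct IntOrPtr and its member names are illustrative, and the tag bookkeeping that Cyclone generates automatically is spelled out manually here.

```d
import std.exception : enforce;

struct IntOrPtr
{
    private enum Tag { holdsInt, holdsPtr }
    private Tag tag;
    private union
    {
        int  i_;
        int* p_;
    }

    // Every write goes through a constructor that also records the tag.
    this(int i)  { tag = Tag.holdsInt; i_ = i; }
    this(int* p) { tag = Tag.holdsPtr; p_ = p; }

    bool isInt() const { return tag == Tag.holdsInt; }

    // Reads consult the tag and throw if the wrong member is requested.
    int asInt() const
    {
        enforce(tag == Tag.holdsInt, "int member was not the last one written");
        return i_;
    }

    int* asPtr()
    {
        enforce(tag == Tag.holdsPtr, "pointer member was not the last one written");
        return p_;
    }
}

int describe(IntOrPtr u)
{
    return u.isInt ? u.asInt : *u.asPtr;
}
```

The cost relative to a bare union is the extra tag field and one comparison per read.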



 Again, that's a lazy view on programming. High level constructs are 
 useful to isolate small and simple algorithms which are implemented at 
 low level.

Software is inherently multi-scale. Probably in 90-95% of the code of a program
micro-optimizations aren't that necessary because those operations are done
only once in a while. But then it often happens that certain loops are done an
enormous amount of times, so even small inefficiencies inside them lead to low
performance. That's why profiling helps.

This can be seen in how HotSpot (and modern dynamic language JITters) work:
usually virtual calls like those you can find in a D program are quick, they
don't slow down code. Yet if a dynamic call prevents the compiler from
performing a critical inlining, or such a dynamic call is left in the middle of
critical code, it may lead to a slower program. That's why I have Java code go
10-30% faster than D code compiled with LDC, not because of the GC and memory
allocations, but just because LDC isn't smart enough to inline certain virtual
methods.

------------------------------------

More quotations from the Cyclone documentation:

In contrast, Cyclone's analysis extends to struct, union members, and pointer
contents to ensure everything is initialized before it is used. This has two
benefits: First, we tend to catch more bugs this way, and second, programmers
don't pay for the overhead of automatic initialization on top of their own
initialization code.<


This is right on-topic:
This requires little effort from the programmer, but the NULL checks slow down
getc. To repair this, we have extended Cyclone with a new kind of pointer,
called a "never-NULL" pointer, and indicated with '@' instead of '*'. For
example, in Cyclone you can declare

int getc(FILE @);
indicating that getc expects a non-NULL FILE pointer as its argument. This
one-character change tells Cyclone that it does not need to insert NULL checks
into the body of getc. If getc is called with a possibly-NULL pointer, Cyclone
will insert a NULL check at the call :<



Goto C's goto statements can lead to safety violations when they are used to
jump into scopes. Here is a simple example:

int z;
{ int x = 0xBAD; goto L; }
{ int *y = &z;
L: *y = 3; // Possible segfault
}

Cyclone's static analysis detects this situation and signals an error. A goto
that does not enter a scope is safe, and is allowed in Cyclone. We apply the
same analysis to switch statements, which suffer from a similar vulnerability
in C.<

Bye,
bearophile

Sep 28 2009

Christopher Wright <dhasenan gmail.com> writes:

bearophile wrote:
 Jeremie Pelletier:
 Again, that's a lazy view on programming. High level constructs are 
 useful to isolate small and simple algorithms which are implemented at 
 low level.

 
 Software is inherently multi-scale. Probably in 90-95% of the code of a
program micro-optimizations aren't that necessary because those operations are
done only once in a while. But then it often happens that certain loops are
done an enormous amount of times, so even small inefficiencies inside them lead
to low performance. That's why profiling helps.
 
 This can be seen in how HotSpot (and modern dynamic language JITters) work:
usually virtual calls like those you can find in a D program are quick, they
don't slow down code. Yet if a dynamic call prevents the compiler from
performing a critical inlining, or such a dynamic call is left in the middle of
a critical
code, it may lead to a slower program. That's why I have Java code go 10-30%
faster than D code compiled with LDC, not because of the GC and memory
allocations, but just because LDC isn't smart enough to inline certain virtual
methods.

Certainly agreed on virtual calls: on my machine, I timed a simple 
example as executing 65 interface calls per microsecond, 85 virtual 
calls per microsecond, and 210 non-member function calls per 
microsecond. So you should almost never worry about the cost of 
interface calls since they're so cheap, but they are 3.5 times slower 
than non-member functions.

In most cases, the body of a method is a lot more expensive than the 
method call, so even when optimizing, it won't often benefit you to use 
free functions rather than class or interface methods.
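
A minimal sketch of how such a measurement could be set up, for anyone who wants to repeat it. This is not the code behind the numbers above; the names `Counter`, `Impl`, `bumpFree` and `measure` are invented for illustration, and an optimizing compiler may inline or devirtualize some of these calls, so treat the output as indicative only.

```d
import core.time : Duration;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

interface Counter { void bump(); }

class Impl : Counter
{
    long n;
    void bump() { ++n; }          // virtual; also reachable through Counter
}

long freeCount;
void bumpFree() { ++freeCount; }  // plain non-member function

struct Timings { Duration viaInterface, viaClass, viaFree; }

// Time the same number of calls through each of the three call paths.
Timings measure(size_t iterations)
{
    auto obj = new Impl;
    Counter iface = obj;
    Timings t;

    auto sw = StopWatch(AutoStart.yes);
    foreach (i; 0 .. iterations) iface.bump();   // interface call
    t.viaInterface = sw.peek();

    sw.reset();
    foreach (i; 0 .. iterations) obj.bump();     // virtual call on the class
    t.viaClass = sw.peek();

    sw.reset();
    foreach (i; 0 .. iterations) bumpFree();     // non-member call
    t.viaFree = sw.peek();

    return t;
}

void main()
{
    auto t = measure(10_000_000);
    writefln("interface: %s  virtual: %s  free: %s",
             t.viaInterface, t.viaClass, t.viaFree);
}
```

Comparing the three durations for the same iteration count gives the calls-per-microsecond ratios; running it with and without optimization flags shows how much of the difference the compiler removes.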

Sep 28 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Christopher Wright wrote:
 bearophile wrote:
 Jeremie Pelletier:
 Again, that's a lazy view on programming. High level constructs are 
 useful to isolate small and simple algorithms which are implemented 
 at low level.

 Software is inherently multi-scale. Probably in 90-95% of the code of 
 a program micro-optimizations aren't that necessary because those 
 operations are done only once in a while. But then it often happens 
 that certain loops are done an enormous amount of times, so even small 
 inefficiencies inside them lead to low performance. That's why 
 profiling helps.

 This can be seen by how HotSpot (and modern dynamic language JITters) 
 work: usually virtual calls like you can find in a D program are 
 quick, they don't slow down code. Yet if a dynamic call prevents the 
 compiler from performing a critical inlining, or such a dynamic call is 
 left in the middle of critical code, it may lead to a slower program. That's 
 why I have Java code go 10-30% faster than D code compiled with LDC, 
 not because of the GC and memory allocations, but just because LDC 
 isn't smart enough to inline certain virtual methods.

 
 Certainly agreed on virtual calls: on my machine, I timed a simple 
 example as executing 65 interface calls per microsecond, 85 virtual 
 calls per microsecond, and 210 non-member function calls per 
 microsecond. So you should almost never worry about the cost of 
 interface calls since they're so cheap, but they are 3.5 times slower 
 than non-member functions.

Thanks for posting these interesting numbers. I seem to recall that 
interface dispatch in D does a linear search in the interfaces list, so 
you may want to repeat your tests with a variable number of interfaces, 
and a variable position of the interface being used.

Andrei

Sep 28 2009

bearophile <bearophileHUGS lycos.com> writes:

Christopher Wright:

 Certainly agreed on virtual calls: on my machine, I timed a simple 
 example as executing 65 interface calls per microsecond, 85 virtual 
 calls per microsecond, and 210 non-member function calls per 
 microsecond. So you should almost never worry about the cost of 
 interface calls since they're so cheap, but they are 3.5 times slower 
 than non-member functions.


The main problem of virtual calls in D is the missed inlining opportunities.

------------

Andrei Alexandrescu:

 I seem to recall that 
 interface dispatch in D does a linear search in the interfaces list, so 
 you may want to repeat your tests with a variable number of interfaces, 
 and a variable position of the interface being used.

The following is a D port of the well known "Richards" benchmark. This specific
version is object oriented, its classes are final (otherwise the code gets
noticeably slower with LDC) and it has getters/setters. It contains an interface:
http://codepad.org/kO3MJK60

You can run it at the command line giving it 10000000.

On a Celeron 2 GHz if you replace the interface with an abstract class the
running time goes from 2.16 to 1.58 seconds, compiled with:
ldc -O5 -release -inline

Compiled with DMD the running time seems about unchanged. I have no idea why.
Maybe some of you can tell me.

In a day or two I'll release many more timings and tests about this Richards
benchmark.

Bye,
bearophile

Sep 28 2009

Michel Fortin <michel.fortin michelf.com> writes:

On 2009-09-28 15:36:05 -0400, bearophile <bearophileHUGS lycos.com> said:

 Compiled with DMD the running time seems about unchanged. I have no 
 idea why. Maybe some of you can tell me.

If I recall correctly, implementing an interface adds a variable to a 
class which contains a pointer to that interface's vtable 
implementation for that particular class. An interface pointer points 
to that variable inside the object (not at the beginning of the 
object's allocated space), and calling a function on it involves 
dereferencing the interface's vtable and calling the right function. 
Obtaining the real "this" pointer for calling the function involves 
looking at the first value in the interface's vtable, which contains an 
offset you can subtract from the interface pointer to get the object 
pointer.

So basically, if I recall well how it works, calling a function on an 
interface reference involves one more subtraction than calling a 
member function on a class reference, which is pretty marginal.
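
The pointer adjustment described above can be observed from ordinary D code. The following is only a sketch: `Shape`, `Triangle`, `addressOf` and `interfaceOffset` are made-up names, and the exact offset depends on the compiler and platform, so the code only checks that the interface reference does not point at the start of the object.

```d
import std.stdio : writefln;

interface Shape { int sides(); }

class Triangle : Shape
{
    int sides() { return 3; }
}

// Raw address held by a class or interface reference.
size_t addressOf(T)(T reference)
{
    return cast(size_t) cast(void*) reference;
}

// Byte distance from the object reference to its Shape interface reference.
size_t interfaceOffset(Triangle t)
{
    Shape s = t;   // the implicit conversion adjusts the pointer
    return addressOf(s) - addressOf(t);
}

void main()
{
    auto t = new Triangle;
    Shape s = t;
    writefln("object %x, interface %x, offset %s bytes",
             addressOf(t), addressOf(s), interfaceOffset(t));
    assert(interfaceOffset(t) > 0);   // the interface slot is not at the start
    assert(cast(Object) s is t);      // casting back recovers the object
    assert(s.sides() == 3);
}
```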


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Sep 28 2009

Walter Bright <newshound1 digitalmars.com> writes:

Andrei Alexandrescu wrote:
 Thanks for posting these interesting numbers. I seem to recall that 
 interface dispatch in D does a linear search in the interfaces list, so 
 you may want to repeat your tests with a variable number of interfaces, 
 and a variable position of the interface being used.

No, it is done with one indirection.

interface IA { void foo(); }

interface IB : IA { }

class C : IA { void foo() { } }

void test(C c)
{
     c.foo();
}

========================================

test:
                 enter   4,0
                 mov     ECX,[EAX]
                 call    dword ptr 014h[ECX]
                 leave
                 ret

Sep 28 2009

bearophile <bearophileHUGS lycos.com> writes:

Walter Bright:

No, it is done with one indirection.<

If even Andrei, a quite intelligent person who has written big books on C++,
may be wrong on such a basic thing, then I think there's a problem.

It would be good to create an HTML page that explains how some basic things of D
are implemented in the front-end. Such a page could also contain box & arrow images
that show how structures and memory are organized for the various data
structures.

Such an HTML page would be useful both for normal programmers who want to understand
what's under the hood, and for people who may want to fix/modify the front-end.

Bye,
bearophile

Sep 29 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

bearophile wrote:
 Walter Bright:
 
 No, it is done with one indirection.<

 
 If even Andrei, a quite intelligent person that has written big books on C++,
may be wrong on such a basic thing, then I think there's a problem.
 
 It can be good to create an html page that explains how some basic things of D
are implemented in the front-end. Such page can also contain box & arrow images
that show how structures and memory are organized for various of such data
structures.
 
 Such html page is useful for both normal programmers that want to understand
what's under the hood, and for people that may want to fix/modify the front-end.
 
 Bye,
 bearophile

I agree, the ABI documentation on digitalmars.com is far from complete; 
I had to learn a lot of it through trial and error. What was especially 
confusing was the interface reference vs the interface info vs the 
interface's classinfo vs the referenced object. I wrote an internal 
wrapper struct to make most of the casts go away:

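// Overlay for the slot inside an object that an interface reference points to.
struct Interface {
	// Step back by the recorded offset to reach the start of the object.
	Object object() const {
		return cast(Object)(cast(void*)&this - interfaceinfo.offset);
	}

	// The first entry of the interface's vtable points to its InterfaceInfo.
	immutable(InterfaceInfo)* interfaceinfo() const {
		return **cast(InterfaceInfo***)&this;
	}

	immutable(ClassInfo) classinfo() const {
		return interfaceinfo.classinfo;
	}
}

// Per-class record describing one implemented interface.
immutable struct InterfaceInfo {
	ClassInfo			classinfo;
	void*[]				vtbl;
	ptrdiff_t			offset;
}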

These two made implementing D internals a whole lot easier! I think only 
InterfaceInfo is in druntime (and it's confusingly named Interface in there).
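
For comparison, the record that druntime itself exposes can be read through `typeid`. This sketch assumes a current druntime, where `typeid(SomeClass).interfaces` is an array of `object.Interface` entries with `classinfo`, `vtbl` and `offset` fields; `Named`, `Widget` and `measuredOffset` are invented for the example.

```d
import std.stdio : writefln;

interface Named { string name(); }

class Widget : Named
{
    int id;
    string name() { return "widget"; }
}

// Offset measured by comparing the interface and object references directly.
size_t measuredOffset(Widget w)
{
    Named n = w;
    return cast(size_t) cast(void*) n - cast(size_t) cast(void*) w;
}

void main()
{
    auto w = new Widget;

    // One entry per interface the class implements directly.
    foreach (entry; typeid(Widget).interfaces)
        writefln("%s: offset %s, %s vtbl slots",
                 entry.classinfo.name, entry.offset, entry.vtbl.length);

    assert(typeid(Widget).interfaces.length == 1);
    // The recorded offset should match what the references show at run time.
    assert(measuredOffset(w) == typeid(Widget).interfaces[0].offset);
}
```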

Sep 29 2009

Walter Bright <newshound1 digitalmars.com> writes:

bearophile wrote:
 If even Andrei, a quite intelligent person that has written big books
 on C++, may be wrong on such a basic thing, then I think there's a
 problem.

Not everyone is an expert on everything, and how vptrs and vtbl[]s and 
casting actually work for multiple inheritance is far from being a basic 
thing.

Furthermore, different compilers implement these things differently. 
Last I heard, Java did it the way Andrei described.

Don Clugston wrote an article a few years ago on this, and found a wide 
variety of implementation strategies. The Digital Mars one was the 
fastest <g>.

Sep 29 2009

"Dejan Lekic" <dejan.lekic gmail.com> writes:

Walter, is that article publicly available?

Oct 02 2009

Don <nospam nospam.com> writes:

Dejan Lekic wrote:
 Walter, is that article publicly available?

http://www.codeproject.com/KB/cpp/FastDelegate.aspx

Oct 02 2009

"Dejan Lekic" <dejan.lekic gmail.com> writes:

Thanks Don! \o/

Oct 02 2009

"Saaa" <empty needmail.com> writes:

bearophile wrote
 Walter Bright:

No, it is done with one indirection.<

 If even Andrei, a quite intelligent person that has written big books on 
 C++, may be wrong on such a basic thing, then I think there's a problem.

 It can be good to create an html page that explains how some basic things 
 of D are implemented in the front-end. Such page can also contain box & 
 arrow images that show how structures and memory are organized for various 
 of such data structures.

 Such html page is useful for both normal programmers that want to 
 understand what's under the hood, and for people that may want to 
 fix/modify the front-end.


?:)
I seem to have requested the very thing you ask for here.
(within 24 hours even)
http://d.puremagic.com/issues/show_bug.cgi?id=3351

Sep 30 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Saaa wrote:
 bearophile wrote
 Walter Bright:

 No, it is done with one indirection.<

 If even Andrei, a quite intelligent person that has written big books on 
 C++, may be wrong on such a basic thing, then I think there's a problem.

 It can be good to create an html page that explains how some basic things 
 of D are implemented in the front-end. Such page can also contain box & 
 arrow images that show how structures and memory are organized for various 
 of such data structures.

 Such html page is useful for both normal programmers that want to 
 understand what's under the hood, and for people that may want to 
 fix/modify the front-end.

 
 
 ?:)
 I seem to have requested the thing you here ask for.
 (within 24 hours even)
 http://d.puremagic.com/issues/show_bug.cgi?id=3351 

I wonder whether this would be a good topic for TDPL. Currently I'm 
thinking it's too low-level. I do plan to insert a short section about 
implementation, just not go deep inside the object model.

Andrei

Sep 30 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Andrei Alexandrescu wrote:
 Saaa wrote:
 bearophile wrote
 Walter Bright:

 No, it is done with one indirection.<

 If even Andrei, a quite intelligent person that has written big books 
 on C++, may be wrong on such a basic thing, then I think there's a 
 problem.

 It can be good to create an html page that explains how some basic 
 things of D are implemented in the front-end. Such page can also 
 contain box & arrow images that show how structures and memory are 
 organized for various of such data structures.

 Such html page is useful for both normal programmers that want to 
 understand what's under the hood, and for people that may want to 
 fix/modify the front-end.


 ?:)
 I seem to have requested the thing you here ask for.
 (within 24 hours even)
 http://d.puremagic.com/issues/show_bug.cgi?id=3351 

 
 I wonder whether this would be a good topic for TDPL. Currently I'm 
 thinking it's too low-level. I do plan to insert a short section about 
 implementation, just not go deep inside the object model.
 
 Andrei

Maybe that's a topic for an appendix of the book. It is really useful to 
know the internals of a language; even if you don't directly use them, 
they can impact design choices.

Right now the best way to learn these internals is still to go hack and 
slash with the compiler's runtime implementation.

Besides, there is no such thing as too low-level :)

Sep 30 2009

bearophile <bearophileHUGS lycos.com> writes:

Andrei Alexandrescu:

 I wonder whether this would be a good topic for TDPL. Currently I'm 
 thinking it's too low-level. I do plan to insert a short section about 
 implementation, just not go deep inside the object model.

It's a very good topic for the book. Any good book about computer languages
teaches not just a language, but also good programming practices and some
general computer science too. In a big book about a system language I want to
see "under the cover" topics too, otherwise I'll need to buy another book to
learn them :-) So it's good for a book about a system language to explain how
some parts of the compiler are implemented, because such parts are code too
(and the level of such code can be the same, if someday someone translates the
D front-end to D).
For example I have appreciated the chapter about the Python dict implementation
in "Beautiful Code".
I think you aren't interested in my help any more, but I hope you will follow
this suggestion of mine (I'll buy your book anyway, but I know what I'd like to
find in it). On the other hand writing about topics you don't know enough about
may be counterproductive; in such a situation, avoiding the topic may be better.

Bye,
bearophile

Sep 30 2009

"Saaa" <empty needmail.com> writes:

Andrei Alexandrescu wrote
 I wonder whether this would be a good topic for TDPL. Currently I'm 
 thinking it's too low-level. I do plan to insert a short section about 
 implementation, just not go deep inside the object model.

 Andrei

I'd really love to see more about implementations as it makes me twitch
to use something I don't really know the impact of.

As for using diagrams and other visual presentations:
Please use them as much as possible;
e.g. Pointers without arrows are like a film without moving pictures :)

Sep 30 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Saaa wrote:
 Andrei Alexandrescu wrote
 I wonder whether this would be a good topic for TDPL. Currently I'm 
 thinking it's too low-level. I do plan to insert a short section about 
 implementation, just not go deep inside the object model.

 Andrei

 
 I'd really love to see more about implementations as it makes me twitch
 to use something I don't really know the impact of.
 
 As for using diagrams and other visual presentations:
 Please use them as much as possible;
 e.g. Pointers without arrows is like a film without moving pictures :)

I do have the classic arrow drawings that illustrate how reference 
semantics works, but by and large I'm not talented with drawing. If 
anyone in this group has such an inclination and would want to 
collaborate with me on the book, let me know. Send me your portfolio :o).

Andrei

Sep 30 2009

Christopher Wright <dhasenan gmail.com> writes:

Andrei Alexandrescu wrote:
 I seem to recall that 
 interface dispatch in D does a linear search in the interfaces list, so 
 you may want to repeat your tests with a variable number of interfaces, 
 and a variable position of the interface being used.

Such numbers are not interesting to me. On average, each class I write 
implements one interface. I rarely use inheritance and interfaces in the 
same class.

But your information is incorrect. Here's what happens:

object of class A
| vtable
|   | classinfo pointer
|   | methods...
| fields...
| interface vtable
|   | struct Interface*
|   | methods

struct Interface
{
    ptrdiff_t this_offset;
    ClassInfo interfaceInfo;
}

There are two ways to implement interface calls with this paradigm. The 
compiler way:

interface I
{
    void doStuff(int arg);
}
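class A : I
{
    void doStuff(int arg) { writefln("do stuff! %s", arg); }

    // Pseudocode for the compiler-generated thunk; this method actually
    // goes into the interface vtable. On entry, `this` is the interface
    // reference, not the object reference.
    ReturnType!doStuff __I_doStuff(ParameterTypeTuple!doStuff args)
    {
       auto iface = cast(Interface*)this.vtable[0];
       // step back from the interface slot to the start of the object
       this = this - iface.this_offset;
       return doStuff(args);
    }
}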


You can also do it with the runtime, but that's a lot harder. It would 
be effectively the same code.

Sep 29 2009

Yigal Chripun <yigal100 gmail.com> writes:

On 28/09/2009 15:28, Jeremie Pelletier wrote:
 here's a type-safe alternative
 note: untested

 struct Vec3F {
 float[3] v;
 alias v[0] x;
 alias v[1] y;
 alias v[2] z;
 }

 D provides alignment control for structs, why do we need to have a
 separate union construct if it is just a special case of struct
 alignment?

 These aliases won't compile, and that was only one out of many union uses.

what other use cases for unions exist that cannot be redesigned in a 
safer way?
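
For the `Vec3F` case specifically, one way to get the named components without a union and without the `alias v[0] x` form (which does not compile) is a set of accessors returning references into the array. This is only a sketch of that one use case, not an answer to the general question about unions.

```d
struct Vec3F
{
    float[3] v = 0;   // start at zero rather than float's default NaN

    // Named views onto the array elements; each returns a reference,
    // so it can be both read and assigned through.
    @property ref float x() return { return v[0]; }
    @property ref float y() return { return v[1]; }
    @property ref float z() return { return v[2]; }
}

void main()
{
    Vec3F p;
    p.x = 1.5f;        // assigns through the returned reference
    p.z = p.x * 2;
    assert(p.v == [1.5f, 0.0f, 3.0f]);
    assert(p.y == 0);
}
```

Both `p.x` and `p.v[0]` name the same storage, which is what the union was being used for here.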

 IMO the use cases for union are very rare and they all can be
 redesigned in a type safe manner.

 Not always true.

 when software was small and simple, hand tuning code with low level
 mechanisms (such as unions and even using assembly) made a lot of
 sense. Today's software is typically far more complex and is way too
 big to risk losing safety features for marginal performance gains.

 micro optimizations simply don't scale.

 Again, that's a lazy view on programming. High level constructs are
 useful to isolate small and simple algorithms which are implemented at
 low level.

One way to define programming is "being lazy". You ask the machine to do 
your work since you are too lazy to do it yourself.

your view above about simple algorithms which are implemented at low 
level is exactly the place where we disagree.

Have you ever heard of Stalin (I'm not talking about the dictator, but about the optimizing Scheme compiler)?

I was pointing to a trade-off at play here:
you can write low-level, hand-optimized code that is hard to maintain and 
reason about (for example, when providing a formal proof of correctness). You 
gain some small, non-scalable performance improvements and lose on other 
fronts like proving correctness of your code.

the other way would be to write high level very regular code that can be 
maintained, easier to reason about and leave optimization to the tools. 
granted, there could be some initial performance hit compared to the 
previous approach but this is more portable:
hardware changes do not affect code, you just need to re-run the tool. 
new optimization techniques can be employed by running a newer version 
of the tool, etc.

I should also note that the second approach is already applied by 
compilers. unless you use inline ASM, the compiler will not use the 
entire ASM instruction set which contains special cases for performance 
tuning.

 These aren't just marginal performance gains, they can easily be up to
 15-30% improvements, sometimes 50% and more. If this is too complex or
 the risk is too high for you then don't use a systems language :)

your approach makes sense if you are implementing say a calculator.
It doesn't scale to larger projects. Even C++ has overhead compared to 
assembly yet you are writing performance critical code in c++, right?

Java had a reputation of being slow yet today performance critical 
servers are written in Java and not in C++ in order to get faster 
execution.

Sep 28 2009

bearophile <bearophileHUGS lycos.com> writes:

Yigal Chripun:

Have you ever heard of Stalin (i'm not talking about the dictator)?<

Stalin accepts only a certain subset of Scheme, you can't use some of the
nicest things.
And while ShedSkin is slow, Stalin is really slow, so slow that compiling
largish programs becomes impractical (I recall times like 100 seconds for
500-line programs; I don't know if such timings have improved in the
meantime, I hope so).


 the other way would be to write high level very regular code that can be 
 maintained, easier to reason about and leave optimization to the tools.

Life is usually a matter of finding a balance. If you care about performance you
don't use Scheme, you use a handy language that doesn't force the compiler to


Bye,
bearophile

Sep 28 2009

Yigal Chripun <yigal100 gmail.com> writes:

On 29/09/2009 00:31, Nick Sabalausky wrote:
 "Yigal Chripun"<yigal100 gmail.com>  wrote in message
 news:h9r37i$tgl$1 digitalmars.com...
 These aren't just marginal performance gains, they can easily be up to
 15-30% improvements, sometimes 50% and more. If this is too complex or
 the risk is too high for you then don't use a systems language :)

 your approach makes sense if you are implementing say a calculator.
 It doesn't scale to larger projects. Even C++ has overhead compared to
 assembly yet you are writing performance critical code in c++, right?

 It's *most* important on larger projects, because it's only on big systems
 where small inefficiencies actually add up to a large performance drain.

 Try writing a competitive real-time graphics renderer or physics simulator
 (especially for a game console where you're severely limited in your choice
 of compiler - if you even have a choice), or something like Pixar's renderer
 without *ever* diving into asm, or at least low-level "unsafe" code. And
 when it inevitably hits some missing optimization in the compiler and runs
 like shit, try explaining to the dev lead why it's better to beg the
 compiler vendor to add the optimization you want and wait around hoping they
 finally do so, instead of just throwing in that inner optimization in the
 meantime.

 You can still leave the safe/portable version in there for platforms for
 which you haven't provided a hand-optimization. And unless you didn't know
 what you were doing, that inner optimization will still be small and highly
 isolated. And since it's so small and isolated, not only can you still throw
 in tests for it, but it's not as much harder as you would think to verify
 correctness. And if/when your compiler finally does get the optimization you
 want, you can just rip out the hand-optimization and revert back to that
 "safe/portable" version that you had still left in anyway as a fallback.

I think you took my post to an extreme, I actually do agree with the 
above description.

what you just said was basically:
1. write portable/safe version
2. profile to find bottlenecks that the tools can't optimize and 
optimize those only while still keeping the portable version.
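
A small sketch of that two-step workflow in D. The names are hypothetical, `HandOptimized` is an invented version identifier, and the four-way unrolled loop merely stands in for a real hand-tuned (e.g. SSE or asm) routine.

```d
// Step 1: portable reference version, always kept in the code base.
float dotPortable(const(float)[] a, const(float)[] b)
{
    assert(a.length == b.length);
    float sum = 0;
    foreach (i; 0 .. a.length)
        sum += a[i] * b[i];
    return sum;
}

// Step 2: isolated optimized version of the profiled bottleneck.
float dotUnrolled(const(float)[] a, const(float)[] b)
{
    assert(a.length == b.length);
    float s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= a.length; i += 4)
    {
        s0 += a[i]     * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }
    float sum = (s0 + s1) + (s2 + s3);
    for (; i < a.length; ++i)      // leftover elements
        sum += a[i] * b[i];
    return sum;
}

// Callers use `dot`; the build decides which implementation it names.
version (HandOptimized) alias dot = dotUnrolled;
else alias dot = dotPortable;

void main()
{
    float[] a = [1, 2, 3, 4, 5, 6, 7];
    float[] b = [7, 6, 5, 4, 3, 2, 1];
    // Small integer inputs keep every intermediate sum exact, so the
    // two versions can be compared with ==.
    assert(dotPortable(a, b) == 84);
    assert(dotUnrolled(a, b) == dotPortable(a, b));
    assert(dot(a, b) == 84);
}
```

Because the portable version stays in place, it doubles as the test oracle for the optimized one and as the fallback if the optimization is later ripped out.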

My objection was to what I feel was Jeremie's description of writing 
code from the get-go in a low-level, hand-optimized way instead of what you 
described in your own words:

 And unless you didn't know
 what you were doing, that inner optimization will still be small and highly
 isolated.

Sep 28 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Yigal Chripun wrote:
 On 29/09/2009 00:31, Nick Sabalausky wrote:
 "Yigal Chripun"<yigal100 gmail.com>  wrote in message
 news:h9r37i$tgl$1 digitalmars.com...
 These aren't just marginal performance gains, they can easily be up to
 15-30% improvements, sometimes 50% and more. If this is too complex or
 the risk is too high for you then don't use a systems language :)

 your approach makes sense if you are implementing say a calculator.
 It doesn't scale to larger projects. Even C++ has overhead compared to
 assembly yet you are writing performance critical code in c++, right?

 It's *most* important on larger projects, because it's only on big 
 systems
 where small inefficiencies actually add up to a large performance drain.

 Try writing a competitive real-time graphics renderer or physics 
 simulator
 (especially for a game console where you're severely limited in your 
 choice
 of compiler - if you even have a choice), or something like Pixar's 
 renderer
 without *ever* diving into asm, or at least low-level "unsafe" code. And
 when it inevitably hits some missing optimization in the compiler and 
 runs
 like shit, try explaining to the dev lead why it's better to beg the
 compiler vendor to add the optimization you want and wait around 
 hoping they
 finally do so, instead of just throwing in that inner optimization in the
 meantime.

 You can still leave the safe/portable version in there for platforms for
 which you haven't provided a hand-optimization. And unless you didn't 
 know
 what you were doing, that inner optimization will still be small and 
 highly
 isolated. And since it's so small and isolated, not only can you still 
 throw
 in tests for it, but it's not as much harder as you would think to 
 verify
 correctness. And if/when your compiler finally does get the 
 optimization you
 want, you can just rip out the hand-optimization and revert back to that
 "safe/portable" version that you had still left in anyway as a fallback.

 
 I think you took my post to an extreme, I actually do agree with the 
 above description.
 
 what you just said was basically:
 1. write portable/safe version
 2. profile to find bottlenecks that the tools can't optimize and 
 optimize those only while still keeping the portable version.
 
 My objection was to what i feel was Jeremie's description of writing 
 code from the get go in low level hand optimized way instead of what you 
 described in your own words:

That wasn't what I said. I don't low-level hand-optimize everything; I 
do profiling first. Only a few parts *known* to me to require 
optimizations (e.g. matrix multiplication) are written in SSE from the 
beginning with a high level fallback; there just happen to be a lot of 
them :)

What I argued about was your view on today's software being too big and 
complex to bother optimizing.

 And unless you didn't know
 what you were doing, that inner optimization will still be small and 
 highly
 isolated.

Sep 29 2009

Yigal Chripun <yigal100 gmail.com> writes:

On 29/09/2009 16:41, Jeremie Pelletier wrote:

 What I argued about was your view on today's software being too big and
 complex to bother optimize it.

that is not what I said.
I was saying that hand optimized code needs to be kept to a minimum and 
only for visible bottlenecks, because the risk of introducing low-level 
unsafe code is bigger in more complex and bigger software.

Sep 29 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Yigal Chripun wrote:
 On 29/09/2009 16:41, Jeremie Pelletier wrote:
 
 What I argued about was your view on today's software being too big and
 complex to bother optimize it.

 
 that is not what I said.
 I was saying that hand optimized code needs to be kept at minimum and 
 only for visible bottlenecks, because the risk of introducing low-level 
 unsafe code is bigger in more complex and bigger software.

What's wrong with taking a risk? If you know what you're doing where is 
the risk, and if not, how will you learn? If you write your software 
correctly, you could add countless assembly optimizations and never 
compromise the security of the entire thing, because these optimizations 
are isolated, so if it crashes there you have only a narrow area to 
debug within.

There are some parts where hand optimizing is almost useless, like 
network I/O: since latency is already so high, having faster code won't 
make a difference.

And sometimes the optimization doesn't even need assembly, it just 
requires using a different high level construct or a different 
algorithm. The first optimization is to get the most efficient data 
structures with the most efficient algorithms for a given task, and THEN 
if you can't optimize it more you dig into assembly.

People seem to think assembly is something magical and incredibly hard, 
it's not.

Jeremie

Sep 30 2009

Don <nospam nospam.com> writes:

Jeremie Pelletier wrote:
 Yigal Chripun wrote:
 On 29/09/2009 16:41, Jeremie Pelletier wrote:

 What I argued about was your view on today's software being too big and
 complex to bother optimize it.

 that is not what I said.
 I was saying that hand optimized code needs to be kept at minimum and 
 only for visible bottlenecks, because the risk of introducing 
 low-level unsafe code is bigger in more complex and bigger software.

 
 What's wrong with taking a risk? If you know what you're doing where is 
 the risk, and if now how will you learn? If you write your software 
 correctly, you could add countless assembly optimizations and never 
 compromise the security of the entire thing, because these optimizations 
 are isolated, so if it crashes there you have only a narrow area to 
 debug within.
 
 There are some parts where hand optimizing is almost useless, like 
 network I/O since latency is already so high having a faster code won't 
 make a difference.
 
 And sometimes the optimization doesn't even need assembly, it just 
 requires using a different high level construct or a different 
 algorithm. The first optimization is to get the most efficient data 
 structures with the most efficient algorithms for a given task, and THEN 
 if you can't optimize it more you dig into assembly.
 
 People seem to think assembly is something magical and incredibly hard, 
 it's not.
 
 Jeremie

Also, if you're using asm on something other than a small, simple loop, 
you're probably doing something badly wrong. Therefore, it should always 
be localised, and easy to test thoroughly. I don't think local extreme 
optimisation is a big risk.

Greater risks come from using more complicated algorithms. Brute-force 
algorithms are always the easiest ones to get right <g>.

Sep 30 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Don wrote:
 Jeremie Pelletier wrote:
 Yigal Chripun wrote:
 On 29/09/2009 16:41, Jeremie Pelletier wrote:

 What I argued about was your view on today's software being too big and
 complex to bother optimize it.

 that is not what I said.
 I was saying that hand optimized code needs to be kept at minimum and 
 only for visible bottlenecks, because the risk of introducing 
 low-level unsafe code is bigger in more complex and bigger software.

 What's wrong with taking a risk? If you know what you're doing where 
 is the risk, and if now how will you learn? If you write your software 
 correctly, you could add countless assembly optimizations and never 
 compromise the security of the entire thing, because these 
 optimizations are isolated, so if it crashes there you have only a 
 narrow area to debug within.

 There are some parts where hand optimizing is almost useless, like 
 network I/O since latency is already so high having a faster code 
 won't make a difference.

 And sometimes the optimization doesn't even need assembly, it just 
 requires using a different high level construct or a different 
 algorithm. The first optimization is to get the most efficient data 
 structures with the most efficient algorithms for a given task, and 
 THEN if you can't optimize it more you dig into assembly.

 People seem to think assembly is something magical and incredibly 
 hard, it's not.

 Jeremie

 
 Also, if you're using asm on something other than a small, simple loop, 
 you're probably doing something badly wrong. Therefore, it should always 
 be localised, and easy to test thoroughly. I don't think local extreme 
 optimisation is a big risk.

That's also how I do it once I find the ideal algorithm, I've never had 
any problems or seen any risk with this technique, I did see some good 
performance gains however.

 Greater risks come from using more complicated algorithms. Brute-force 
 algorithms are always the easiest ones to get right <g>.

I'm not sure I agree with that. Those algorithms are pretty isolated and 
really easy to write unittests for so I don't see where the risk is when 
writing more complex algorithms, it's obviously harder, but not riskier.
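
As a concrete illustration of that point, here is a sketch (function names invented for the example) in which the brute-force version serves as the oracle for the cleverer one inside a unittest:

```d
// Brute force: index of the first element that is >= key.
size_t lowerBoundLinear(const(int)[] sorted, int key)
{
    foreach (i, v; sorted)
        if (v >= key)
            return i;
    return sorted.length;
}

// Binary search: same contract, more ways to get the bounds wrong.
size_t lowerBoundBinary(const(int)[] sorted, int key)
{
    size_t lo = 0, hi = sorted.length;
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2;
        if (sorted[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

unittest
{
    const int[] data = [1, 3, 3, 7, 9, 12];
    const int[] empty = [];
    // Every key below, inside and above the data range must agree.
    foreach (key; -1 .. 14)
    {
        assert(lowerBoundBinary(data, key) == lowerBoundLinear(data, key));
        assert(lowerBoundBinary(empty, key) == 0);
    }
}

void main() {}   // build with -unittest to run the checks
```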

On the other hand, things like GUI libraries are one big package where 
unittests are useless most of the time, that's a much greater risk even 
with straightforward and trivial code.

I read somewhere that the best optimizer is between your ears, and I have 
yet to see someone or something prove that quote wrong! Besides, how are 
you going to get comfortable with "complex" stuff if you never play with 
it? It's really only complex when you're learning it; once it has been 
assimilated by the brain it becomes almost trivial to use.

Sep 30 2009

language_fan <foo bar.com.invalid> writes:

Wed, 30 Sep 2009 12:05:29 -0400, Jeremie Pelletier thusly wrote:

 Don wrote:
 Greater risks come from using more complicated algorithms. Brute-force
 algorithms are always the easiest ones to get right <g>.

 
 I'm not sure I agree with that. Those algorithms are pretty isolated and
 really easy to write unittests for so I don't see where the risk is when
 writing more complex algorithms, it's obviously harder, but not riskier.

Do you recommend writing larger algorithms like a hard real-time 
distributed (let's say e.g. for 100+ processes/nodes) garbage collector 
or even larger stuff like btrfs or ntfs file system drivers in assembly? 
Don't you care about portability? Of course it would be nice to provide 
optimal solution for each platform and for each use case, but 
unfortunately the TCO thinking managers do not often agree.

Sep 30 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

language_fan wrote:
 Wed, 30 Sep 2009 12:05:29 -0400, Jeremie Pelletier thusly wrote:
 
 Don wrote:
 Greater risks come from using more complicated algorithms. Brute-force
 algorithms are always the easiest ones to get right <g>.

 I'm not sure I agree with that. Those algorithms are pretty isolated and
 really easy to write unittests for so I don't see where the risk is when
 writing more complex algorithms, it's obviously harder, but not riskier.

 
 Do you recommend writing larger algorithms like a hard real-time 
 distributed (let's say e.g. for 100+ processes/nodes) garbage collector 
 or even larger stuff like btrfs or ntfs file system drivers in assembly? 
 Don't you care about portability? Of course it would be nice to provide 
 optimal solution for each platform and for each use case, but 
 unfortunately the TCO thinking managers do not often agree.

Why does everyone associate complexity with assembly? You can write a 
more complex algorithm in the same language as the original one and get 
quite a good performance boost (e.g. binary search vs walking an array). 
Assembly is only useful for optimizing once you have found the optimal 
algorithm and want to lower its overhead a step further.

I don't recommend any language anyways, the base algorithm is often 

or assembly its gonna do the same thing at different performance levels.

For example, a simple binary search is already faster in D than in, say, 
JavaScript, and it's faster still in assembly than in D. That doesn't make 
your entire program harder to code, nor does it change the logic.
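
To make the "binary search vs walking an array" comparison concrete, here is a minimal sketch in D. The function names `linearFind` and `binaryFind` are hypothetical and not taken from any library; the point is only that the faster algorithm is written in the same language and is no harder to unit test than the slow one.

```d
/// Walks the array front to back: O(n) comparisons.
ptrdiff_t linearFind(const(int)[] haystack, int needle)
{
    foreach (i, value; haystack)
        if (value == needle)
            return i;
    return -1; // not found
}

/// Halves the search range on every step: O(log n) comparisons.
/// Assumes `haystack` is sorted in ascending order.
ptrdiff_t binaryFind(const(int)[] haystack, int needle)
{
    size_t lo = 0;
    size_t hi = haystack.length; // search the half-open range [lo, hi)
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2; // avoids overflow of lo + hi
        if (haystack[mid] == needle)
            return mid;
        if (haystack[mid] < needle)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1; // not found
}

unittest
{
    immutable int[] sorted = [1, 3, 5, 7, 9, 11];
    assert(linearFind(sorted, 7) == 3);
    assert(binaryFind(sorted, 7) == 3);
    assert(binaryFind(sorted, 4) == -1);
}
```

Both functions return the same index for the same sorted input, so swapping one for the other changes performance but not the logic of the caller.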

Sep 30 2009

language_fan <foo bar.com.invalid> writes:

Wed, 30 Sep 2009 17:05:18 -0400, Jeremie Pelletier thusly wrote:

 language_fan wrote:
 Wed, 30 Sep 2009 12:05:29 -0400, Jeremie Pelletier thusly wrote:
 
 Don wrote:
 Greater risks come from using more complicated algorithms.
 Brute-force algorithms are always the easiest ones to get right <g>.

 I'm not sure I agree with that. Those algorithms are pretty isolated
 and really easy to write unittests for so I don't see where the risk
 is when writing more complex algorithms, it's obviously harder, but
 not riskier.

 
 Do you recommend writing larger algorithms like a hard real-time
 distributed (let's say e.g. for 100+ processes/nodes) garbage collector
 or even larger stuff like btrfs or ntfs file system drivers in
 assembly? Don't you care about portability? Of course it would be nice
 to provide optimal solution for each platform and for each use case,
 but unfortunately the TCO thinking managers do not often agree.

 
 Why does everyone associate complexity with assembly? You can write a
 more complex algorithm in the same language as the original one and get
 quite a good performance boost (ie binary search vs walking an array).
 Assembly is only useful to optimize when you found the optimal algorithm
 and want to lower its overhead a step further.
 
 I don't recommend any language anyways, the base algorithm is often

 or assembly its gonna do the same thing at different performance levels.
 
 For example a simple binary search is already faster in D than in say
 JavaScript, but its even faster in assembly than in D, that doesn't make
 your entire program harder to code, nor does it change the logic.

Well I meant that we can assume the algorithm choice is already optimal.

Porting the high level program to assembly tends to grow the line count 
quite a bit. For instance I have experience converting Java code to 
Scala, and C++ to Haskell. In both cases the LOC will decrease about 
50-90%. If you convert things like foreach, ranges, complex expressions, 
lambdas, and scope() constructs to assembly, it will increase the line 
count at least one order of magnitude. Reading the lower level code is 
much harder. And you lose important safety nets like the type system.

Sep 30 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

language_fan wrote:
 Wed, 30 Sep 2009 17:05:18 -0400, Jeremie Pelletier thusly wrote:
 
 language_fan wrote:
 Wed, 30 Sep 2009 12:05:29 -0400, Jeremie Pelletier thusly wrote:

 Don wrote:
 Greater risks come from using more complicated algorithms.
 Brute-force algorithms are always the easiest ones to get right <g>.

 I'm not sure I agree with that. Those algorithms are pretty isolated
 and really easy to write unittests for so I don't see where the risk
 is when writing more complex algorithms, it's obviously harder, but
 not riskier.

 Do you recommend writing larger algorithms like a hard real-time
 distributed (let's say e.g. for 100+ processes/nodes) garbage collector
 or even larger stuff like btrfs or ntfs file system drivers in
 assembly? Don't you care about portability? Of course it would be nice
 to provide optimal solution for each platform and for each use case,
 but unfortunately the TCO thinking managers do not often agree.

 Why does everyone associate complexity with assembly? You can write a
 more complex algorithm in the same language as the original one and get
 quite a good performance boost (ie binary search vs walking an array).
 Assembly is only useful to optimize when you found the optimal algorithm
 and want to lower its overhead a step further.

 I don't recommend any language anyways, the base algorithm is often

 or assembly its gonna do the same thing at different performance levels.

 For example a simple binary search is already faster in D than in say
 JavaScript, but its even faster in assembly than in D, that doesn't make
 your entire program harder to code, nor does it change the logic.

 
 Well I meant that we can assume the algorithm choice is already optimal.
 
 Porting the high level program to assembly tends to grow the line count 
 quite a bit. For instance I have experience converting Java code to 
 Scala, and C++ to Haskell. In both cases the LOC will decrease about 
 50-90%. If you convert things like foreach, ranges, complex expressions, 
 lambdas, and scope() constructs to assembly, it will increase the line 
 count at least one order of magnitude. Reading the lower level code is 
 much harder. And you lose important safety nets like the type system.

Yeah but I don't rate my code based on the number of lines I write, but 
rather on how well it performs :)

I usually only go into assembly after profiling, or when I know from the 
start it's gonna be faster, such as matrix multiplication.

If lines of code were more important than performance, you'd get entire 
OSes and all their programs written in javascript, and you'd wait 20 
minutes for your computer to boot.

Sep 30 2009

Don <nospam nospam.com> writes:

language_fan wrote:
 Wed, 30 Sep 2009 12:05:29 -0400, Jeremie Pelletier thusly wrote:
 
 Don wrote:
 Greater risks come from using more complicated algorithms. Brute-force
 algorithms are always the easiest ones to get right <g>.

 I'm not sure I agree with that. Those algorithms are pretty isolated and
 really easy to write unittests for so I don't see where the risk is when
 writing more complex algorithms, it's obviously harder, but not riskier.

 
 Do you recommend writing larger algorithms like a hard real-time 
 distributed (let's say e.g. for 100+ processes/nodes) garbage collector 
 or even larger stuff like btrfs or ntfs file system drivers in assembly? 
 Don't you care about portability? Of course it would be nice to provide 
 optimal solution for each platform and for each use case, but 
 unfortunately the TCO thinking managers do not often agree.

You deal with this by ensuring that you have a clear division between 
"simple but needs to be as fast as possible" (which you do low-level 
optimisation on) and "complicated, but less speed critical".
It's a classic problem of separation of concerns: you need to ensure 
that no piece of code has requirements to be fast AND clever at the same 
time.

Incidentally, it's usually not possible to make something optimally fast 
unless it's really simple.
So no, you should never do something complicated in asm.

Oct 01 2009

Don <nospam nospam.com> writes:

Jeremie Pelletier wrote:
 Don wrote:
 Jeremie Pelletier wrote:
 Yigal Chripun wrote:
 On 29/09/2009 16:41, Jeremie Pelletier wrote:

 What I argued about was your view on today's software being too big 
 and
 complex to bother optimize it.

 that is not what I said.
 I was saying that hand optimized code needs to be kept at minimum 
 and only for visible bottlenecks, because the risk of introducing 
 low-level unsafe code is bigger in more complex and bigger software.

 What's wrong with taking a risk? If you know what you're doing where 
 is the risk, and if not, how will you learn? If you write your 
 software correctly, you could add countless assembly optimizations 
 and never compromise the security of the entire thing, because these 
 optimizations are isolated, so if it crashes there you have only a 
 narrow area to debug within.

 There are some parts where hand optimizing is almost useless, like 
 network I/O since latency is already so high having a faster code 
 won't make a difference.

 And sometimes the optimization doesn't even need assembly, it just 
 requires using a different high level construct or a different 
 algorithm. The first optimization is to get the most efficient data 
 structures with the most efficient algorithms for a given task, and 
 THEN if you can't optimize it more you dig into assembly.

 People seem to think assembly is something magical and incredibly 
 hard, it's not.

 Jeremie

 Also, if you're using asm on something other than a small, simple 
 loop, you're probably doing something badly wrong. Therefore, it 
 should always be localised, and easy to test thoroughly. I don't think 
 local extreme optimisation is a big risk.

 
 That's also how I do it once I find the ideal algorithm, I've never had 
 any problems or seen any risk with this technique, I did see some good 
 performance gains however.
 
 Greater risks come from using more complicated algorithms. Brute-force 
 algorithms are always the easiest ones to get right <g>.

 
 I'm not sure I agree with that. Those algorithms are pretty isolated and 
 really easy to write unittests for so I don't see where the risk is when 
 writing more complex algorithms, it's obviously harder, but not riskier.

By "riskier" I mean "more chance of containing an error".

I'm partly basing this on my recent experience with writing BigInt. The 
low-level asm routines are easy to get right, and it's easy to tell when 
you've got them wrong. They do brute-force stuff, like schoolbook O(n^2) 
multiplication, and importantly, _there are no special cases_ because it 
needs to be fast.
But the higher-level O(n^1.3) multiplication algorithms are full of 
special cases, and that's where the bugs are.
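
As an illustration of the "brute-force, no special cases" style described above, here is a minimal sketch of schoolbook multiplication in plain D. It is not the actual BigInt code and uses no asm; the function name `schoolbookMul` and the little-endian 32-bit limb layout are assumptions made for the example.

```d
/// Schoolbook O(n^2) multiplication of two numbers stored as
/// little-endian arrays of 32-bit limbs (least significant limb first).
uint[] schoolbookMul(const(uint)[] a, const(uint)[] b)
{
    auto result = new uint[a.length + b.length]; // D zero-initialises this
    foreach (i, x; a)
    {
        ulong carry = 0;
        foreach (j, y; b)
        {
            // The sum cannot overflow 64 bits:
            // (2^32 - 1)^2 + 2 * (2^32 - 1) == 2^64 - 1
            immutable ulong t = cast(ulong) x * y + result[i + j] + carry;
            result[i + j] = cast(uint) t; // keep the low 32 bits
            carry = t >> 32;              // carry the high 32 bits
        }
        // This limb has not been written by any earlier row.
        result[i + b.length] = cast(uint) carry;
    }
    return result;
}

unittest
{
    // (2^32 - 1)^2 == 0xFFFF_FFFE_0000_0001
    assert(schoolbookMul([0xFFFF_FFFFu], [0xFFFF_FFFFu]) == [1u, 0xFFFF_FFFEu]);
}
```

Every limb pair goes through the same inner statement, which is why this kind of routine is easy to test exhaustively at the boundaries. The sub-quadratic algorithms add splitting thresholds and unequal-length handling on top of it.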

Oct 01 2009

Yigal Chripun <yigal100 gmail.com> writes:

On 30/09/2009 16:53, Jeremie Pelletier wrote:
 Yigal Chripun wrote:
 On 29/09/2009 16:41, Jeremie Pelletier wrote:

 What I argued about was your view on today's software being too big and
 complex to bother optimize it.

 that is not what I said.
 I was saying that hand optimized code needs to be kept at minimum and
 only for visible bottlenecks, because the risk of introducing
 low-level unsafe code is bigger in more complex and bigger software.

 What's wrong with taking a risk? If you know what you're doing where is
 the risk, and if not, how will you learn? If you write your software
 correctly, you could add countless assembly optimizations and never
 compromise the security of the entire thing, because these optimizations
 are isolated, so if it crashes there you have only a narrow area to
 debug within.

 There are some parts where hand optimizing is almost useless, like
 network I/O since latency is already so high having a faster code won't
 make a difference.

 And sometimes the optimization doesn't even need assembly, it just
 requires using a different high level construct or a different
 algorithm. The first optimization is to get the most efficient data
 structures with the most efficient algorithms for a given task, and THEN
 if you can't optimize it more you dig into assembly.

 People seem to think assembly is something magical and incredibly hard,
 it's not.

 Jeremie

When I said optimizing, I meant lowering the implementation level by 
using lower level language constructs (pointers vs. references for 
example) and asm instead of D.
Assume that the choice of algorithm and data structures is optimal.

As language_fan wrote, when you lower the level you increase your LOC 
and you lose all sorts of safety features.
Statistically speaking, there's about a bug per 2000 LOC on average, so you 
also increase the chance of a bug.
All that together means a higher risk.

Your ASM implementation of binary search could be slightly faster than a 
comparable Haskell implementation, but the latter would be much easier to 
formally prove correct.

I don't know about you, but I prefer hospital equipment, airplanes, 
cars, etc, to be correct even if they'll be a couple percent slower.

Sep 30 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Sat, Sep 26, 2009 at 10:41 PM, Walter Bright
<newshound1 digitalmars.com> wrote:

 And what about the people who AREN'T complete idiots, who maybe
 sometimes just accidentally trip and would quite welcome a safety rail
 there?

 Null pointer seg faults *are* a safety rail. They keep an errant program
 from causing further damage.

If you haven't crawled out from under your rock in the last twenty
years, I'd like to point out that the accepted definition of safety
and program correctness has changed since then.

Sep 26 2009

bearophile <bearophileHUGS lycos.com> writes:

Walter Bright:

 The only time I've had a 
 problem finding where a null came from (because they tend to fail very 
 close to their initialization point) is when the null was caused by 
 another memory corruption problem. Non-nullable references won't 
 mitigate that.

There are some ways to reduce the number/probability of memory corruptions
in a C-like language too: memory regions, region analysis, etc. We can
discuss this as well, but it is another topic.

Bye,
bearophile

Sep 26 2009

Daniel Keep <daniel.keep.lists gmail.com> writes:

Walter Bright wrote:
 Daniel Keep wrote:
 "But the user will just assign to something useless to get around that!"

 You mean like how everyone wraps every call in try{...}catch(Exception
 e){} to shut the damn exceptions up?

 
 They do just that in Java because of the checked-exceptions thing. I
 have a reference to Bruce Eckel's essay on it somewhere in this thread.
 The observation in the article was it wasn't just moron idiot
 programmers doing this. It was the guru programmers doing it, all the
 while knowing it was the wrong thing to do. The end result was the
 feature actively created the very problems it was designed to prevent.

Checked exceptions are a bad example: you can't not use them.  No one is
proposing to remove null from the language.  If we WERE, you would be
quite correct.

But we're not.

If someone doesn't want to use non-null references, then they don't use
them.

 Or uses pointer arithmetic and
 casts to get at those pesky private members?

 
 That's entirely different, because privacy is selected by the
 programmer, not the language. I don't have any issue with a user-defined
 type that is non-nullable (Andrei has designed a type constructor for
 that).

Good grief, that's what non-null references are!

  Object foo = new Object;
      // Dear Mr. Compiler, I would like a non-nullable
      // reference to an Object, please!  Here's the object
      // I want you to use.

  Object? bar;
      // Dear Mr. Compiler, I would like a nullable reference
      // to an object, please!  Just initialise with null, thanks.

How is that not selected by the programmer?  The programmer is in
complete control.  We are not asking for the language to unilaterally
declare null to be a sin, we want to be given the choice to say we don't
want it!

Incidentally, on the subject of non-null as a UDT, that would be a
largely acceptable solution for me.  The trouble is that in order to do
it, you'd need to be able to block default initialisation,

  **which is precisely what you're arguing against**

You can't have it both ways.
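
For readers wondering what such a UDT might look like, here is a minimal sketch. It assumes a D compiler that supports `@disable this();` on structs, a feature of later D2 releases that was not available when this thread was written. The name `NonNull` and its members are hypothetical, and this is not the type constructor Andrei designed.

```d
/// Wraps a class reference that is checked for null once, at construction.
struct NonNull(T) if (is(T == class))
{
    private T payload;

    // Blocks default initialisation: `NonNull!Foo x;` no longer compiles.
    @disable this();

    this(T value)
    {
        // Run-time check; note that asserts are removed in -release builds.
        assert(value !is null, "NonNull constructed from a null reference");
        payload = value;
    }

    /// Access to the wrapped reference.
    inout(T) get() inout { return payload; }

    alias get this; // lets a NonNull!T be used where a T is expected
}

unittest
{
    auto obj = NonNull!Object(new Object);
    assert(obj.get !is null);

    // Declaring one without an initialiser is rejected at compile time.
    static assert(!__traits(compiles, { NonNull!Object missing; }));
}
```

This only moves the null check to the point of construction. `NonNull!Object.init` still exists as a loophole, so the sketch is weaker than a non-nullable reference built into the type system.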

 If someone is actively trying to break the type system, it's their
 goddamn fault!  Honestly, I don't care about the hacks they employ to
 defeat the system because they're going to go around blindly shooting
 themselves in the foot no matter what they do.

 
 True, but it's still not a good idea to design a language feature that
 winds up, in reality, encouraging bad programming practice. It
 encourages bad practice in a way that is really, really hard to detect
 in a code review.

Whether or not it encourages it is impossible to determine at this
juncture because I can't think of a language comparable to D that has it.

Things that are "like" it don't count.

Ignoring that, you're correct that if someone decides to abuse non-null
references, it's going to be less than trivial to detect.

 I like programming mistakes to be obvious, not subtle. There's nothing
 subtle about a null pointer exception. There's plenty subtle about the
 wrong default value.

I think this is a fallacy.  You're assuming a person who is actively
going out of their way to misuse the type system.  I'll repeat myself:

  Foo bar = arbitrary_default;

is harder to do than

  Foo? bar;

Which does exactly what they want: it relieves them of the need to
initialise, and gives a relatively safe default value.

I mean, people could abuse a lot of things in D.  Pointers, certainly.
DEFINITELY inline assembler.  But we don't get rid of them because at
some point you have to say "you know what?  If you're going to play with
fire, that's your own lookout."

The only way you're ever going to have a language that's actually safe
no matter how ignorant, stupid or just outright suicidal the programmer
is would be to implement a compiler for SIMPLE:

http://esoteric.voxelperfect.net/wiki/SIMPLE

 And what about the people who AREN'T complete idiots, who maybe
 sometimes just accidentally trip and would quite welcome a safety rail
 there?

 
 Null pointer seg faults *are* a safety rail. They keep an errant program
 from causing further damage.

Really?

"
I used to work at Boeing designing critical flight systems. Absolutely
the WRONG failure mode is to

**pretend nothing went wrong**

and happily return

**default values**

and show lovely green lights on the instrument panel. The right thing is to

**immediately inform the pilot that something went wrong and INSTANTLY
SHUT THE BAD SYSTEM DOWN**

before it does something really, really bad, because now it is in an
unknown state. The pilot then follows the procedure he's trained to,
such as engage the backup.
"

Think of the compiler as the autopilot.

Pretending nothing went wrong is passing a null into a function that
doesn't expect it, or shoving it into a field that's not meant to be null.

Null IS a happy default value that can be passed around without
consequence from the type system.

Immediately informing the pilot is refusing to compile because the code
looks like it's doing something wrong.

A NPE is the thermonuclear option of error handling.  Your program blows
up, tough luck, try again.  Debugging is forensics, just like picking
through a mound of dead bodies and bits of fuselage; if it's come to
that, there's a problem.

Non-nullable references are the compiler (or autopilot) putting up the
red flag and saying "are you really sure you want to do this?  I mean,
it LOOKS wrong to me!"

 Finally, let me re-post something I wrote the last time this came up:

 The problem with null dereference problems isn't knowing that they're
 there: that's the easy part.  You helpfully get an exception to the
 face when that happens. The hard part is figuring out *where* the
 problem originally occurred. It's not when the exception is thrown
 that's the issue; it's the point at which you placed a null reference
 in a slot where you shouldn't have.


 
 It's a lot harder to track down a bug when the bad initial value gets
 combined with a lot of other data first. The only time I've had a
 problem finding where a null came from (because they tend to fail very
 close to their initialization point) is when the null was caused by
 another memory corruption problem. Non-nullable references won't
 mitigate that.

Only when the nulls are assigned and used locally.

I've had code before when a null accidentally snuck into an object
through a constructor that was written before the field existed.

The object gets passed around.  No problem; it's not null. It gets
stored inside other things, pulled out.  The field itself is pulled out
and passed around, put into other things.

And THEN the program blows up.

You can't run a debugger backwards through time, because that's what you
need to do to figure out where the bloody thing came from.  The NPE
tells you there is a problem, but it doesn't tell you WHY or WHERE.

It's your leg dropping off from necrosis and the doctor going "gee, I
guess you're sick."

It's the plane smashing into the ground and killing everyone inside, a
specialised team spending a month analysing the wreckage and saying
"well, this screw came loose but BUGGERED if we can work out why."

Then, after several more crashes, someone finally realises that it
didn't come loose, it was never there to begin with.  "Oh!  THAT'S why
they keep crashing!

"Gee, would've been nice if the plane wouldn't have taken off without it."

Sep 26 2009

Ary Borenszweig <ary esperanto.org.ar> writes:

Daniel Keep wrote:
 
 Walter Bright wrote:
 Daniel Keep wrote:
 "But the user will just assign to something useless to get around that!"

 You mean like how everyone wraps every call in try{...}catch(Exception
 e){} to shut the damn exceptions up?

 They do just that in Java because of the checked-exceptions thing. I
 have a reference to Bruce Eckel's essay on it somewhere in this thread.
 The observation in the article was it wasn't just moron idiot
 programmers doing this. It was the guru programmers doing it, all the
 while knowing it was the wrong thing to do. The end result was the
 feature actively created the very problems it was designed to prevent.

 A NPE is the thermonuclear option of error handling.  Your program blows
 up, tough luck, try again.  Debugging is forensics, just like picking
 through a mound of dead bodies and bits of fuselage; if it's come to
 that, there's a problem.
 
 It's your leg dropping off from necrosis and the doctor going "gee, I
 guess you're sick."
 
 It's the plane smashing into the ground and killing everyone inside, a
 specialised team spending a month analysing the wreckage and saying
 "well, this screw came loose but BUGGERED if we can work out why."
 
 Then, after several more crashes, someone finally realises that it
 didn't come loose, it was never there to begin with.  "Oh!  THAT'S why
 they keep crashing!
 
 "Gee, would've been nice if the plane wouldn't have taken off without it."

I like your analogies. :)

Sep 26 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Ary Borenszweig wrote:
 Daniel Keep wrote:
 Walter Bright wrote:
 Daniel Keep wrote:
 "But the user will just assign to something useless to get around 
 that!"

 You mean like how everyone wraps every call in try{...}catch(Exception
 e){} to shut the damn exceptions up?

 They do just that in Java because of the checked-exceptions thing. I
 have a reference to Bruce Eckel's essay on it somewhere in this thread.
 The observation in the article was it wasn't just moron idiot
 programmers doing this. It was the guru programmers doing it, all the
 while knowing it was the wrong thing to do. The end result was the
 feature actively created the very problems it was designed to prevent.

 A NPE is the thermonuclear option of error handling.  Your program blows
 up, tough luck, try again.  Debugging is forensics, just like picking
 through a mound of dead bodies and bits of fuselage; if it's come to
 that, there's a problem.

 It's your leg dropping off from necrosis and the doctor going "gee, I
 guess you're sick."

 It's the plane smashing into the ground and killing everyone inside, a
 specialised team spending a month analysing the wreckage and saying
 "well, this screw came loose but BUGGERED if we can work out why."

 Then, after several more crashes, someone finally realises that it
 didn't come loose, it was never there to begin with.  "Oh!  THAT'S why
 they keep crashing!

 "Gee, would've been nice if the plane wouldn't have taken off without 
 it."

 
 I like your analogies. :)

I do too, but try to picture a plane sophisticated to the point that it can 
notice missing screws, and ask yourself the following question: what is 
making sure such a screw detection system works correctly?

That's really just taking a problem and sending it to another team to 
solve; at the end of the day, it's still a problem. Besides, explosions 
are cool!

Sep 26 2009

Ary Borenszweig <ary esperanto.org.ar> writes:

Walter Bright wrote:
 Denis Koroskin wrote:
  > On Sat, 26 Sep 2009 22:30:58 +0400, Walter Bright
  > <newshound1 digitalmars.com> wrote:
  >> D has borrowed ideas from many different languages. The trick is to
  >> take the good stuff and avoid their mistakes <g>.
  >
  > How about this one:
  > 
 http://sadekdrobi.com/2008/12/22/null-references-the-billion-dollar-mistake/ 

  >
  >
  > :)

 I think he's wrong.

drop the idea of initializing variables whenever you declare them. Just 
leave them like this:

int i;

and then later initialize them when you need them, for example different 
values depending on some conditions. Then you'll realize how powerful is 
having the compiler stop variables that are not initialized *in the 
context of a function, not necessarily in the same line of their 
declaration*. It's always a win: you get a compile time error, you don't 
have to wait to get an error at runtime.

Until you do that, you won't understand what most people are answering 
to you.

But I know what you'll answer. You'll say "what about pointers?", "what 
about ref parameters?", "what about out parameters?", and then someone 

No point discussing non-null variables without also having the compiler 
stop uninitialized variables.

Sep 26 2009

Ary Borenszweig <ary esperanto.org.ar> writes:

Ary Borenszweig wrote:
 Walter Bright wrote:
 Denis Koroskin wrote:
  > On Sat, 26 Sep 2009 22:30:58 +0400, Walter Bright
  > <newshound1 digitalmars.com> wrote:
  >> D has borrowed ideas from many different languages. The trick is to
  >> take the good stuff and avoid their mistakes <g>.
  >
  > How about this one:
  > 
 http://sadekdrobi.com/2008/12/22/null-references-the-billion-dollar-mistake/ 

  >
  >
  > :)

 I think he's wrong.

 drop the idea of initializing variables whenever you declare them. Just 
 leave them like this:

 int i;

 and then later initialize them when you need them, for example different 
 values depending on some conditions. Then you'll realize how powerful is 
 having the compiler stop

I meant "spot"

Sep 26 2009

bearophile <bearophileHUGS lycos.com> writes:

Ary Borenszweig:


 drop the idea of initializing variables whenever you declare them. Just 
 leave them like this:

[...]
 Until you do that, you won't understand what most people are answering 
 to you.

Something similar happens in other fields too. I have had long discussions with
nonbiologists about evolutionary matters. Later I understood that those
discussions weren't very useful: the best way for them to understand why and
how evolution happens is to do a week of field ethology, studying how insects
on a wild lawn interact, compete, fight and cooperate with each other. If you
have an expert showing you things, in just a week you can see a lot.
At that point you have a common frame of reference that allows you
to understand how evolution happens :-) Practical experience is important.

Bye,
bearophile

Sep 26 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Ary Borenszweig wrote:
 Walter Bright wrote:
 Denis Koroskin wrote:
  > On Sat, 26 Sep 2009 22:30:58 +0400, Walter Bright
  > <newshound1 digitalmars.com> wrote:
  >> D has borrowed ideas from many different languages. The trick is to
  >> take the good stuff and avoid their mistakes <g>.
  >
  > How about this one:
  > 
 http://sadekdrobi.com/2008/12/22/null-references-the-billion-dollar-mistake/ 

  >
  >
  > :)

 I think he's wrong.

 drop the idea of initializing variables whenever you declare them. Just 
 leave them like this:

 int i;

 and then later initialize them when you need them, for example different 
 values depending on some conditions. Then you'll realize how powerful is 
 having the compiler stop variables that are not initialized *in the 
 context of a function, not necessarily in the same line of their 
 declaration*. It's always a win: you get a compile time error, you don't 
 have to wait to get an error at runtime.

 Until you do that, you won't understand what most people are answering 
 to you.

 But I know what you'll answer. You'll say "what about pointers?", "what 
 about ref parameters?", "what about out parameters?", and then someone 

 No point discussing non-null variables without also having the compiler 
 stop uninitialized variables.

All null values are uninitialized, but not all initializers are null, 
especially the void initializer. You can't always rely on initializers 
in your algorithms, you can always rely on null.

Kinda like all pointers are references, but not all references are 
pointers. You can't do pointer arithmetic on references.

Sep 26 2009

Ary Borenszweig <ary esperanto.org.ar> writes:

Jeremie Pelletier wrote:
 Ary Borenszweig wrote:
 Walter Bright wrote:
 Denis Koroskin wrote:
  > On Sat, 26 Sep 2009 22:30:58 +0400, Walter Bright
  > <newshound1 digitalmars.com> wrote:
  >> D has borrowed ideas from many different languages. The trick is to
  >> take the good stuff and avoid their mistakes <g>.
  >
  > How about this one:
  > 
 http://sadekdrobi.com/2008/12/22/null-references-the-billion-dollar-mistake/ 

  >
  >
  > :)

 I think he's wrong.

 drop the idea of initializing variables whenever you declare them. 
 Just leave them like this:

 int i;

 and then later initialize them when you need them, for example 
 different values depending on some conditions. Then you'll realize how 
 powerful is having the compiler stop variables that are not 
 initialized *in the context of a function, not necessarily in the same 
 line of their declaration*. It's always a win: you get a compile time 
 error, you don't have to wait to get an error at runtime.

 Until you do that, you won't understand what most people are answering 
 to you.

 But I know what you'll answer. You'll say "what about pointers?", 
 "what about ref parameters?", "what about out parameters?", and then 

 No point discussing non-null variables without also having the compiler 
 stop uninitialized variables.

 All null values are uninitialized, but not all initializers are null, 
 especially the void initializer.

I don't see your point here. "new Object()" is not a null initializer 
nor "1"... so?

  You can't always rely on initializers
 in your algorithms, you can always rely on null.

Yes, I can always rely on initializers in my algorithm. I can, if the 
compiler lets me safely initialize them whenever I want, not necessarily 
in the line I declare them.

Sep 26 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Ary Borenszweig wrote:
 Jeremie Pelletier wrote:
 Ary Borenszweig wrote:
 Walter Bright wrote:
 Denis Koroskin wrote:
  > On Sat, 26 Sep 2009 22:30:58 +0400, Walter Bright
  > <newshound1 digitalmars.com> wrote:
  >> D has borrowed ideas from many different languages. The trick is to
  >> take the good stuff and avoid their mistakes <g>.
  >
  > How about this one:
  > 
 http://sadekdrobi.com/2008/12/22/null-references-the-billion-dollar-mistake/ 

  >
  >
  > :)

 I think he's wrong.

 drop the idea of initializing variables whenever you declare them. 
 Just leave them like this:

 int i;

 and then later initialize them when you need them, for example 
 different values depending on some conditions. Then you'll realize 
 how powerful is having the compiler stop variables that are not 
 initialized *in the context of a function, not necessarily in the 
 same line of their declaration*. It's always a win: you get a compile 
 time error, you don't have to wait to get an error at runtime.

 Until you do that, you won't understand what most people are 
 answering to you.

 But I know what you'll answer. You'll say "what about pointers?", 
 "what about ref parameters?", "what about out parameters?", and then 

 No point discussing non-null variables without also having the 
 compiler stop uninitialized variables.

 All null values are uninitialized, but not all initializers are null, 
 especially the void initializer.

 I don't see your point here. "new Object()" is not a null initializer 
 nor "1"... so?

Object o = void;

  You can't always rely on initializers
 in your algorithms, you can always rely on null.

 Yes, I can always rely on initializers in my algorithm. I can, if the 
 compiler lets me safely initialize them whenever I want, not necessarily 
 in the line I declare them.

Nope, never got interested in these to tell the truth. I only did C, 
C++, D and x86 assembly in systems programming, I have quite a 
background in PHP and JavaScript also.

I played with a lot of languages, but those are the ones I use on a 
daily basis. I would like to get into Python or Ruby someday, I only 
hear good things about these two. I know LUA has less overhead than 
Python, but it's more of a support language to implement easy scripting 
over C than a standalone language, I already have my LUA bindings for D 
ready to do just that.

I like extremes :)

Sep 26 2009

language_fan <foo bar.com.invalid> writes:

Sun, 27 Sep 2009 00:08:50 -0400, Jeremie Pelletier thusly wrote:

 Ary Borenszweig wrote:


 
 Nope, never got interested in these to tell the truth. I only did C,
 C++, D and x86 assembly in systems programming, I have quite a
 background in PHP and JavaScript also.

So you only know imperative procedural programming + some features of 
hybrid OOP languages that are not even proper OOP languages.

 
 I played with a lot of languages, but those are the ones I use on a
 daily basis. I would like to get into Python or Ruby someday, I only
 hear good things about these two. I know LUA has less overhead than
 Python

Oh, the only difference between LUA and Python is the overhead?! That's 
a... pretty performance oriented view on languages.

 I like extremes :)

If you like extremes, why are you not programming in Haskell or Coq? Too 
scary? You are often arguing against languages and concepts you have 
never used. The other people here who make these suggestions are more 
experienced with various languages.

Sep 27 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

language_fan wrote:
 Sun, 27 Sep 2009 00:08:50 -0400, Jeremie Pelletier thusly wrote:
 
 Ary Borenszweig wrote:


 Nope, never got interested in these to tell the truth. I only did C,
 C++, D and x86 assembly in systems programming, I have quite a
 background in PHP and JavaScript also.

 
 So you only know imperative procedural programming + some features of 
 hybrid OOP languages that are not even proper OOP languages.

This is what I know best, yeah. I did a lot of work in functional 
programming too, but not enough to add them to the above list.

What is proper OOP anyways? It's a feature offered by the language, not 
a critical design that must obey to some strict standard rules.  Be it 
class based or prototype based, supporting single or multiple 
inheritance, using abstract base classes or interfaces, having funny 
syntax for ctors and whatnot or using the class name or even 'this', it's 
still OOP. If you want to call me on not knowing 15 languages like you 
do, I have to call you on not knowing the differences in OOP models.

 I played with a lot of languages, but those are the ones I use on a
 daily basis. I would like to get into Python or Ruby someday, I only
 hear good things about these two. I know LUA has less overhead than
 Python

 
 Oh, the only difference between LUA and Python is the overhead?! That's 
 a... pretty performance oriented view on languages.

Yes, I have a performance oriented view, I write a lot of real time 
code, and I hate unresponsive code in general. Now I didn't say it was 
the only difference, what I said is that it's one influencing a lot of 
companies and people to pick LUA over Python for scripting.

 I like extremes :)

 
 If you like extremes, why are you not programming in Haskell or Coq? Too 
 scary? You are often arguing against languages and concepts you have 
 never used. The other people here who make these suggestions are more 
 experienced with various languages.

I meant extremes as in full machine control / no control whatsoever, not 
in language semantics :)

I just haven't found a use for Haskell or Coq for what I do yet.

Sep 27 2009

language_fan <foo bar.com.invalid> writes:

Sun, 27 Sep 2009 12:35:23 -0400, Jeremie Pelletier thusly wrote:

 language_fan wrote:
 Sun, 27 Sep 2009 00:08:50 -0400, Jeremie Pelletier thusly wrote:
 
 Ary Borenszweig wrote:


 Nope, never got interested in these to tell the truth. I only did C,
 C++, D and x86 assembly in systems programming, I have quite a
 background in PHP and JavaScript also.

 
 So you only know imperative procedural programming + some features of
 hybrid OOP languages that are not even proper OOP languages.

 
 This is what I know best, yeah. I did a lot of work in functional
 programming too, but not enough to add them to the above list.
 
 What is proper OOP anyways? It's a feature offered by the language, not
 a critical design that must obey to some strict standard rules.  Be it
 class based or prototype based, supporting single or multiple
 inheritance, using abstract base classes or interfaces, having funny
 syntax for ctors and whatnot or using the class name or even 'this', it's
 still OOP. If you want to call me on not knowing 15 languages like you
 do, I have to call you on not knowing the differences in OOP models.

I must say I have not studied languages that much, only the concepts and 
theory - starting from formal definitions like operational or 
denotational semantics, and some more informal ones. I can professionally 
write code in only about half a dozen languages, but learning new ones is 
trivial if the task requires it.

Generally the common thing for proper pure OOP languages is 'everything 
is an object' mentality. Because of this property there is no strict 
distinction between primitive non-OOP types and OOP types in pure OOP 
languages. In some languages e.g. number values are objects. In others 
there are no static members and even classes are objects, so called meta-
objects. In some way you can see this purity even in UML. If we go into 
details, various OOP languages have major differences in their semantics.

What I meant above is that I know a lot of developers who have a similar 
background as you do. It is really easy to use all of those languages 
without actually using the OOP features in them, at least properly (for 
instance PHP does not even have a real OOP system, it is a cheap rip-off 
of mainstream languages - just look at the scoping rules). I have seen 
Java code where the developer never constructs new objects and only uses 
static methods because he fears the heap allocation is expensive. 
Discussing OOP and language concepts is really hard if you lack the 
theoretical underpinning. It is sad to say this but the best source for 
this knowledge are academic CS books, but nowadays even wikipedia is 
starting to have good articles on the subject.

Sep 27 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

language_fan wrote:
 Sun, 27 Sep 2009 12:35:23 -0400, Jeremie Pelletier thusly wrote:
 
 language_fan wrote:
 Sun, 27 Sep 2009 00:08:50 -0400, Jeremie Pelletier thusly wrote:

 Ary Borenszweig wrote:


 Nope, never got interested in these to tell the truth. I only did C,
 C++, D and x86 assembly in systems programming, I have quite a
 background in PHP and JavaScript also.

 So you only know imperative procedural programming + some features of
 hybrid OOP languages that are not even proper OOP languages.

 This is what I know best, yeah. I did a lot of work in functional
 programming too, but not enough to add them to the above list.

 What is proper OOP anyways? It's a feature offered by the language, not
 a critical design that must obey to some strict standard rules.  Be it
 class based or prototype based, supporting single or multiple
 inheritance, using abstract base classes or interfaces, having funny
 syntax for ctors and whatnot or using the class name or even 'this', it's
 still OOP. If you want to call me on not knowing 15 languages like you
 do, I have to call you on not knowing the differences in OOP models.

 
 I must say I have not studied languages that much, only the concepts and 
 theory - starting from formal definitions like operational or 
 denotational semantics, and some more informal ones. I can professionally 
 write code in only about half a dozen languages, but learning new ones is 
 trivial if the task requires it.
 
 Generally the common thing for proper pure OOP languages is 'everything 
 is an object' mentality. Because of this property there is no strict 
 distinction between primitive non-OOP types and OOP types in pure OOP 
 languages. In some languages e.g. number values are objects. In others 
 there are no static members and even classes are objects, so called meta-
 objects. In some way you can see this purity even in UML. If we go into 
 details, various OOP languages have major differences in their semantics.
 
 What I meant above is that I know a lot of developers who have a similar 
 background as you do. It is really easy to use all of those languages 
 without actually using the OOP features in them, at least properly (for 
 instance PHP does not even have a real OOP system, it is a cheap rip-off 
 of mainstream languages - just look at the scoping rules). I have seen 
 Java code where the developer never constructs new objects and only uses 
 static methods because he fears the heap allocation is expensive. 
 Discussing OOP and language concepts is really hard if you lack the 
 theoretical underpinning. It is sad to say this but the best source for 
 this knowledge are academic CS books, but nowadays even wikipedia is 
 starting to have good articles on the subject.

I agree, Wikipedia is often the first source I check to learn about 
different concepts, then I search for online papers and documentation, 
dig into source code (Google's code search is a gem), and finally books.

I'm not most programmers, and I'm sure you aren't either. I like to 
learn as much of the semantics and implementation details behind a 
language as I can, only then do I feel I know the language, I like to 
make the best out of everything in the languages I use, not specialize 
in a subset of it.

I don't believe in a perfect programming model, I believe in many 
different models each having their pros and cons that can live in the 
same language forming an all-around solution. That's why I usually stay 
away from 'pure' languages because they impose a single point of view of 
the world, that doesn't mean it's a bad one, I just like to look at the 
world from different angles at the same time.

Sep 27 2009

Jason House <jason.james.house gmail.com> writes:

Walter Bright Wrote:

 Denis Koroskin wrote:
 One more:
 
 T foo(bool someCondition)
 {
     T? t;
     if (someCondition) t = someInitializer();
     // ...
 
     if (t.isNull) { // not initialized yet
         // ...
     }
 
     return enforce(t); // throws if t is not initialized yet, because 
 foo *must* return a valid value by a contract
 }

 
 It seems to me you've got null references there anyway?
 
 What would you do about:
 
     T[] a;
     a[i] = foo();
 
 where you want to have unused slots be null (or empty, or nothing)?

Your example segfaults. A is null.

If T is const or immutable, special care is also required.
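
To make the array question concrete, here is a minimal sketch in today's D (nullable references by default). The class `T` and the function `foo` are placeholders invented for illustration. It shows that a freshly declared `T[]` has no storage at all, while an allocated array of class references starts with every slot null, which is the "unused slots" behaviour being asked about. With bounds checking enabled, indexing the empty slice would normally be reported as a range error rather than as a raw segfault.

```d
// Placeholder class and factory, assumed for illustration only.
class T { int value = 1; }

T foo() { return new T; }

unittest
{
    T[] a;                       // a null (empty) slice: no storage yet
    assert(a is null && a.length == 0);

    a = new T[](4);              // four slots, each default-initialized to null
    foreach (slot; a)
        assert(slot is null);

    a[2] = foo();                // fill one slot; the others stay null
    assert(a[2] !is null && a[0] is null);
}
```

With non-nullable element types, the second step is exactly where a design has to decide what an unfilled slot holds.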

Sep 26 2009

Chad J <chadjoan __spam.is.bad__gmail.com> writes:

Walter Bright wrote:
 ...

Admittedly I didn't read the whole thread.  It is hueg liek xbox.

I'll try and explain this non-nullable by default thing in my own way.

Consider a programmer wanting to define a variable.  I will draw a
decision tree that they would use in a language that has non-nullable
(and nullable) references:

          Programmer needs to declare reference variable...
                               |
                               |
                               |
                      Do they know how to
       yes <--------    initialize it?     --------> no
        |                                            |
        |                                            |
        |                                            |
        v                                            |
 Type t = someExpression();                          |
                                                     v
                                    yes <--------- Brains? ---> no
                                     |                          |
                                     |                          |
                                     v                          v
                                 Type? t;               Type t = dummy;
                         (Explicitly declare)         (Why would anyone)
                         (it to be nullable)             (do this?!?)


So having both kinds of reference types works out like that.

Working with nulls as in current D is as easy as using a nullable type.
 When you need to pass a nullable type to a non-nullable variable or as
a non-nullable function argument, you just manually check for the null
like you should anyways:

Type? t;
... code ...
// If you're lazy.
assert(t);
func(t);

OR, better yet:

Type? t;
... code ...
if ( t )
    func(t);
else
    // Explicitly handle the null value,
    // attempting error recovery if appropriate.

I actually don't know if the syntax would be that nice, but I can dream.
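
The `Type?` syntax above does not exist in D. As a rough approximation, the same "check once, then pass a guaranteed non-null value" idea can be sketched with a library type. `NonNull`, `Type` and `func` below are hypothetical names made up for this sketch, and the check happens at run time on construction, not at compile time as the proposal would have it.

```d
import std.exception : enforce;

class Type { int id = 42; }

/// Hypothetical wrapper: holds a class reference that was checked once.
struct NonNull(T) if (is(T == class))
{
    private T payload;

    @disable this();             // forbid `NonNull!Type x;` with no value

    this(T value)
    {
        enforce(value !is null, "null passed where non-null is required");
        payload = value;
    }

    T get() { return payload; }
    alias get this;              // lets a NonNull!Type be used like a Type
}

/// A function that only accepts checked references.
int func(NonNull!Type t) { return t.id; }

unittest
{
    Type maybe = new Type;       // plays the role of `Type? t`
    if (maybe !is null)
        assert(func(NonNull!Type(maybe)) == 42);
    // else: explicitly handle the null value here
}
```

The conversion from the nullable reference to `NonNull!Type` is the single place where the null check lives; everything downstream of `func` can rely on it.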

But I still haven't addressed the second part of this:
Which is default?  nullable or non-nullable?
Currently nullable is the default.

Let's consult a table.

+---------------------+--------------+--------------+
|                     |  default is  |  default is  |
|                     | non-nullable |   nullable   |
+---------------------+--------------+--------------+
| Programmer DOESN'T  |   Compiler   | Segfault in  |
| initialize the var. |    error.    | distant file |
| ((s)he forgets)     |   Fast fix.  |      *       |
+---------------------+--------------+--------------+
| Programmer DOES     |  Everything  |  Everything  |
| initialize the var. |   is fine.   |   is fine.   |
+---------------------+--------------+--------------+
| Programmer uses     |  They don't. |  They don't. |
|    dummy variable.  |Nullable used.| Segfault in  |
|                     |  segfault**  | distant file*|
+---------------------+--------------+--------------+

* They will have hours of good fun finding where the segfault-causing
null came from.  If the project is non-trivial, the null may have
crossed hands over a number of function calls, ditched the police by
hiding in a static variable or some class until the heat dies down, or
whoops aliasing.  Sometimes stack traces help, sometimes they don't.  We
don't even have stack traces without hacking our D installs :/

** Same as *, but less likely since functions are more likely to reject
possibly null values, and thus head off the null's escape routes at
compile time.


I can see a couple issues with non-nullable by default:
- This:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=96834
- It complicates the language just a bit more.  I'm willing to
grudgingly honor this as a reason for not implementing the feature.

Sep 26 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 26 Sep 2009 17:08:32 -0400, Walter Bright  
<newshound1 digitalmars.com> wrote:

 Denis Koroskin wrote:
  > On Sat, 26 Sep 2009 22:30:58 +0400, Walter Bright
  > <newshound1 digitalmars.com> wrote:
  >> D has borrowed ideas from many different languages. The trick is to
  >> take the good stuff and avoid their mistakes <g>.
  >
  > How about this one:
  >  
 http://sadekdrobi.com/2008/12/22/null-references-the-billion-dollar-mistake/  
  >
  >
  > :)

 I think he's wrong.

Analogies aside, we have 2 distinct problems here, with several solutions  
for each.  I jotted down what I think the solutions being discussed are,  
along with the pros and cons of each.

Problem 1. Developer of a function wants to ensure non-null values are  
passed into his function.

Solution 1:

   Rely on the hardware feature to do the checking for you.

   Pros: Easy to do, simple to implement, optimal performance (hardware's  
going to do this anyways).
   Cons: Runtime error instead of compile-time, Error doesn't always occur  
close to the problem, not always easy to get a stack trace.

Solution 2:

   Check for null once the values come into the function, throw an  
exception.

   Pros: Works with the exception system.
   Cons: Manual implementation required, performance hit for every function  
call, Runtime error instead of compile-time, Error doesn't always occur  
close to the problem.

Solution 3:

   Build the non-null requirement into the function signature (note, the  
requirement is optional, it's still possible to use null references if you  
want).

   Pros: Easy to implement, Compile-time error, hard to "work around" by  
putting a dummy value, sometimes no performance hit, most times very  
little performance hit, allows solution 1 and 2 if you want, runtime  
errors occur AT THE POINT things went wrong not later.
   Cons: Non-zero performance hit (you have to check for null sometimes  
before assignment!)

Solution 4:

   Pros: Works with the exception system, easy to implement.
   Cons: Huge performance hit (except in OS where segfault can be hooked),  
Error doesn't always occur close to the problem.
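
As an illustration of Solution 2 for Problem 1 (check on entry, throw an exception), here is a minimal sketch in D. `Widget` and `area` are made-up names; `enforce` is the Phobos helper from `std.exception`.

```d
import std.exception : enforce;

// Placeholder class, assumed for illustration only.
class Widget { int size = 3; }

/// Solution 2: validate the argument at the function boundary.
int area(Widget w)
{
    enforce(w !is null, "area: w must not be null");
    return w.size * w.size;
}

unittest
{
    import std.exception : assertThrown;

    assert(area(new Widget) == 9);
    assertThrown(area(null));    // an exception instead of a segfault
}
```

The check has to be repeated by hand in every function that cares, which is the "manual implementation" con listed above.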

-----------------------

Problem 2. Developer forgets to initialize a declared reference type, but  
uses it.

Solution 1:

   Assign a default value of null.  Rely on hardware to tell you when you  
use it later that you screwed up.

   Pros: Easy to do, simple to implement, optimal performance (hardware's  
going to do this anyways).
   Cons: Runtime error instead of compile-time, Error doesn't always occur  
close to the problem, not always easy to get a stack trace.

Solution 2:

   Require assignment, even if assignment to null. (The "simple" solution)

   Pros: Easy to implement, forces the developer to clarify his  
requirements -- reminding him that there may be a problem.
   Cons: May be unnecessary, forces the developer to make a decision, may  
result in a dummy value being assigned reducing to solution 1.

Solution 3:

   Build into the type the requirement that it can't be null, therefore  
checking for non-null on assignment.  A default value isn't allowed.  A  
nullable type is still allowed, which reduces to solution 1.

   Pros: Easy to implement, solution 1 is still possible, compile-time  
error on misuse, error occurs at the point things went wrong, no  
performance hit (except when you convert a nullable type to a non-nullable  
type), allows solution 3 for first problem.
   Cons: Non-zero performance hit when assigning nullable to non nullable  
type.

Solution 4:

   Compiler performs flow analysis, giving an error when an unassigned  

   Pros: Compile-time error, with good flow analysis allows correct code  
even when assignment isn't done on declaration.
   Cons: Difficult to implement, sometimes can incorrectly require  
assignment if flow is too complex, can force developer to manually assign  
null or dummy value.

*NOTE* for solution 3 I purposely did NOT include the con that it makes  
people assign a dummy value.  I believe this argument to be invalid, since  
it's much easier to just declare the variable as a nullable equivalent  
type (as other people have pointed out).  That problem is more a factor of  
solutions 2 and 4.
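
For reference, Solution 1 for Problem 2 is what D does today: every variable gets a default value, and for class references that value is null. A minimal sketch (`Node` is a placeholder class):

```d
import std.math : isNaN;

// Placeholder class, assumed for illustration only.
class Node { int value; }

unittest
{
    Node n;                      // class references default to null
    int i;                       // integers default to 0
    double d;                    // floating point defaults to NaN

    assert(n is null);
    assert(i == 0);
    assert(isNaN(d));
    // Using n.value here would compile, and fail only at run time.
}
```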

----------------------

Anything I missed?

After looking at all the arguments, and brainstorming myself, I think I  
prefer the non-nullable defaults (I didn't have a position on this concept  
before this thread, and I had given it some thought).

errors frequently, usually it was something I forgot to initialize or  
return.  It definitely does not cause the "assign dummy value" syndrome as  
Walter has suggested.  Experience with languages that do a good job of  
letting the programmer know when he made an actual mistake makes a huge  
difference.

I think the non-nullable default will result in even less of a temptation  
to assign a dummy value.

-Steve

Sep 27 2009

bearophile <bearophileHUGS lycos.com> writes:

Steven Schveighoffer:

    Build the non-null requirement into the function signature (note, the  
 requirement is optional, it's still possible to use null references if you  
 want).
 
    Pros: Easy to implement, Compile-time error, hard to "work around" by  
 putting a dummy value, sometimes no performance hit, most times very  
 little performance hit, allows solution 1 and 2 if you want, runtime  
 errors occur AT THE POINT things went wrong not later.
    Cons: Non-zero performance hit (you have to check for null sometimes  
 before assignment!)

To implement it well (and I think it has to be implemented well) it's not so
easy to implement. You have to face the problem I've discussed about
multiple object initializations inside various ifs.
Also see what downs and I have said regarding arrays of nonnullables.

Among the cons you also have to consider that there's a little more complexity
in the language (two different kinds of references, and such things must also
be explained in the docs and understood by novice D programmers. It's not a
common feature, so they have to learn it).

Another thing to add to the cons is that every layer of compile-time
constraints you add to a language also adds a small amount of rigidity
that has a cost (because you have to add ? and you sometimes may need casts to
break such rigidity). Dynamic languages show that constraints have a cost.

Bye,
bearophile

Sep 27 2009

Yigal Chripun <yigal100 gmail.com> writes:

On 27/09/2009 17:51, bearophile wrote:
 Steven Schveighoffer:

 Build the non-null requirement into the function signature (note,
 the requirement is optional, it's still possible to use null
 references if you want).

 Pros: Easy to implement, Compile-time error, hard to "work around"
 by putting a dummy value, sometimes no performance hit, most times
 very little performance hit, allows solution 1 and 2 if you want,
 runtime errors occur AT THE POINT things went wrong not later.
 Cons: Non-zero performance hit (you have to check for null
 sometimes before assignment!)

 To implement it well (and I think it has to be implemented well) it's
 not so easy to implement. You have to face the problem I've discussed
 about multiple object initializations inside various ifs. Also
 see what downs and I have said regarding arrays of nonnullables.

I don't accept this argument about nested if statements. D has a 
procedural "if" statement. Of course it doesn't mesh with 
non-nullable references; you're trying to fit a square peg in a round hole.
The solution is to write code in a more functional style. If D ever 
implements true tuples, that would be a perfect use case for them.

(T1 t1, T2 t2) = init();
t1.foo;
t2.bar;
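
The destructuring syntax above is hypothetical. Something close can be sketched today with `std.typecons.Tuple` from Phobos; `T1`, `T2` and `makePair` are placeholder names, and the elements are reached by index instead of by declared names.

```d
import std.typecons : Tuple, tuple;

// Placeholder classes, assumed for illustration only.
class T1 { int foo() { return 1; } }
class T2 { int bar() { return 2; } }

/// Both objects are created together, so no caller ever sees a
/// half-initialized pair.
Tuple!(T1, T2) makePair()
{
    return tuple(new T1, new T2);
}

unittest
{
    auto pair = makePair();
    assert(pair[0].foo() == 1);
    assert(pair[1].bar() == 2);
}
```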

 Among the cons you also have to consider that there's a little more
 complexity in the language (two different kinds of references, and
 such things must also be explained in the docs and understood by
 novice D programmers. It's not a common feature, so they have to
 learn it).

That's true. Not only does this need to be taught and pointed out to newbies, 
it should also be encouraged as the D way so that it will be used by 
default.

 Another thing to add to the cons is that every layer of compile-time
 constraints you add to a language they also add a little amount of
 rigidity that has a cost (because you have to add ? and you sometimes
 may need casts to break such rigidity). Dynamic languages show that
 constraints have a cost.

 Bye, bearophile

Sep 27 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sun, 27 Sep 2009 11:51:27 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Steven Schveighoffer:

    Build the non-null requirement into the function signature (note, the
 requirement is optional, it's still possible to use null references if  
 you
 want).

    Pros: Easy to implement, Compile-time error, hard to "work around" by
 putting a dummy value, sometimes no performance hit, most times very
 little performance hit, allows solution 1 and 2 if you want, runtime
 errors occur AT THE POINT things went wrong not later.
    Cons: Non-zero performance hit (you have to check for null sometimes
 before assignment!)

 To implement it well (and I think it has to be implemented well) it's  
 not so easy to implement. You have to face the problem I've discussed  
 about multiple object initializations inside various ifs.

I think you are referring to a combination of this solution and flow  
analysis?  I didn't mention that solution, but it is possible.  I agree it  
would be more complicated, but I did say that as a con for flow analysis.

 Also see what downs and I have said regarding arrays of nonnullables.

Yes, arrays of non-nullables will be more cumbersome, I should add that as  
a con.  Thanks.

 Among the cons you also have to consider that there's a little more  
 complexity in the language (two different kinds of references, and such  
 things must also be explained in the docs and understood by novice D  
 programmers. It's not a common feature, so they have to learn it).

It's not a common feature, but in practice, one doesn't usually need  
nullable types for most cases, it's only certain cases where it's needed.

For example, no extra docs are needed for:

auto a = new A(); // works, non-nullable is fine

And maybe even for:

A a; // error, must assign non-null  value

because that's a common feature of compilers.

It's similar in my view to shared.  Shared adds a level of complexity that  
needs to be understood if you want to use shared variables, but most of  
the time, your variables are not shared, so no extra thought is required.

 Another thing to add to the cons is that every layer of compile-time  
 constraints you add to a language they also add a little amount of  
 rigidity that has a cost (because you have to add ? and you sometimes  
 may need casts to break such rigidity). Dynamic languages show that  
 constraints have a cost.

The cost needs to be weighed against the cost of the alternatives.  I  
think all the solutions have a cost.  Dynamic languages have a cost too.   
I've been developing in php lately, and I don't know how many times I had  
a bug that I slightly mis-typed a variable name, which still was valid  
code because the language thought I was just declaring a new variable :)   
And to get the IDE to recognize types, I sometimes have to put in a line  
like this:

// uncomment for autocomplete
// x = new ClassType(); printf("Error, please remove line %d\n",  
__LINE__); throw new Exception();

Which I comment out when I'm running, but I uncomment to have the IDE  
recognize that x is a ClassType (for autocomplete).

I think if there was a solution that cost nothing, it would be the clear  
winner.

-Steve

Sep 27 2009

bearophile <bearophileHUGS lycos.com> writes:

Walter Bright:

I've never seen any suggestion that Boeing (or Airbus, or the FAA) has
changed its philosophy on this. Do you have a reference?

I like to read many books; I read about this in the chapter "Coffee Cups in
the Cockpit" of the book "Turn Signals Are the Facial Expressions of
Automobiles" by Donald Norman. It talks about the "strong silent type" of
computer automation, discussed in "In the Age of the Smart Machine: The Future
of Work and Power", by Zuboff.

An example of such problem is explained in:
NTSB 1986 Aircraft Accident report - China Airlines 747-SP, N4522V, 300
nautical miles northwest San Francisco, California, February 19, 1985 (Rapp.
NTSB/AAR-86/03), Washington DC.:
http://libraryonline.erau.edu/online-full-text/ntsb/aircraft-accident-reports/AAR86-03.pdf

It shows a problem that a better design in the autopilot interface can avoid
(probably things have improved since 1985).

They keep improving all the time.

See the example I've shown that shows why what you have said can be dangerous
anyway. A sudden automatic switch off of the autopilot can be dangerous,
because people need time to understand what's happening, when the situation
goes from a mostly automatic one to a mostly manual one.

Of course the pilot must have a button to immediately regain manual control
when she/he wants so. My point was different, that a sudden automatic full
disable of autopilot can be dangerous.

Software used to work that way in the past, but this is not set in stone.
The famous crash of Ariane was caused by ultra-rigid reaction to errors in
software.
Do you know fuzzy logic? One of the purposes of fuzzy logic is to design
control systems (that can be used for washing machines, cameras, missiles, etc)
that work and fail gracefully. They don't work in just two binary ways,
perfect or totally wrong. A graceful failure might have kept the Ariane from
crashing and going boom.
Today people are studying software systems based on fuzzy logic, neural
networks, support vector machines, and more, that are designed to keep working
despite some small problems and faults.

In some situations (like a CT scanner in a hospital) you may want it to
switch off totally if a problem is found, instead of failing gracefully. On the
other hand, if you put such a scanner in a poor hospital in Africa you may want
something that keeps working even if some small trouble is present, because a
less than perfect machine is going to be the standard situation where there is
very little money, and a scanner with reduced functionality may save a lot of
people anyway (if it emits too many X-rays it's better to switch it off).

Bye,
bearophile

Sep 27 2009

BCS <none anon.com> writes:

Hello bearophile,

 Do you know fuzzy logic? One of the purposes of fuzzy logic is to
 design control systems (that can be used for washing machines,
 cameras, missiles, etc) that work and fail gracefully. They don't work
 in two binary ways, perfect/totally wrong. A graceful failure might have
 kept the Ariane from crashing and going boom.
 
 Today people are studying software systems based on fuzzy logic,
 neural networks, support vector machines, and more, that are designed
 to keep working despite some small problems and faults.

But this still assumes some degree of reliability of the code doing the fuzzy 
logic. If I had to guess, I'd expect that the systems you mention are designed 
to function under external faults (some expected input vanishes or some other 
component in a distributed system fails). It would be almost impossible to 
make a program that can work correctly once it has had an internal fault. 
Once that has happened, I think Walter is correct and the only thing to do 
is shut down. In the autopilot case, this could amount to killing off the current 
autopilot process and booting up a very simple fly-straight-and-level program 
to take over while the pilot reacts to a nice loud klaxon.

Sep 27 2009

bearophile <bearophileHUGS lycos.com> writes:

BCS:

 But this still assumes some degree of reliability of the code doing the fuzzy 
 logic.

Fuzzy logic can also be "run" by hardware, fuzzy logic engine chips. Such chips
are usually cheap. You can find them in some consumer products.

The fuzzy logic rules can also be converted to correct programs by automatic
software generators. (Of course mistakes etc are always possible.)

Bye,
bearophile

Sep 27 2009

Don <nospam nospam.com> writes:

Walter Bright wrote:
 Denis Koroskin wrote:
  > On Sat, 26 Sep 2009 22:30:58 +0400, Walter Bright
  > <newshound1 digitalmars.com> wrote:
  >> D has borrowed ideas from many different languages. The trick is to
  >> take the good stuff and avoid their mistakes <g>.
  >
  > How about this one:
  > 
 http://sadekdrobi.com/2008/12/22/null-references-the-billion-dollar-mistake/ 

  >
  >
  > :)

 I think he's wrong.

 Getting rid of null references is like solving the problem of dead 
 canaries in the coal mines by replacing them with stuffed toys.

Let's go back a step. The problem being addressed is this: inadvertent 
null references are an EXTREMELY common bug in D. For example, it's a 
bug which *every* C++ refugee gets hit by. I have experienced it 
ridiculously often in D.

*** The problem of null references is an order of magnitude worse in D 
than in C++, because classes in D use reference semantics. ***
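
A minimal sketch of that failure mode, assuming a made-up `Counter` class (the name is only for illustration). A C++ programmer reads `Counter c;` as constructing an object; in D it only declares a reference, which starts out null:

```d
class Counter
{
    int value;
    void increment() { ++value; }
}

void main()
{
    Counter c;             // declares a reference only; no object is created
    assert(c is null);     // class references default-initialize to null

    // c.increment();      // would typically compile, then dereference null at run time

    c = new Counter;       // the object exists only after an explicit new
    c.increment();
    assert(c.value == 1);
}
```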

Eliminating that category of bug at compile time would have a huge 
benefit for code quality. "Non-nullable references by default" is just a 
proposed solution. Maybe if D had better flow analysis, the demand for 
non-nullable references wouldn't be so great.
(Neither is a pure subset of the other: flow analysis works for all 
variables, while non-nullable references catch more complex logic errors. 
But there is a very significant overlap.)

Interestingly, while working on CTFE, I noticed that the CTFE code has a 
lot in common with flow analysis. I can easily imagine the same code 
being reused.

Sep 29 2009

bearophile <bearophileHUGS lycos.com> writes:

Don:

 Maybe if D had better flow analysis, the demand for 
 non-nullable references wouldn't be so great.


the flow analysis the C# compiler performs, the need for non-nullable references
is not so strong.


 (Neither is a pure subset of the other, flow analysis works for all 
 variables, non-nullable references catches more complex logic errors. 
 But there is a very significant overlap).

I like how you can see things a little more clearly than other people (like me).
Flow analysis helps for all variables, but it's limited in scope.
Nonnullable references are a program-wide contract: their effect extends to
called functions, etc., and helps avoid null tests inside them too.
Flow analysis is probably the more important of the two features. I think
having both is better; they can work in synergy.

Bye,
bearophile

Sep 29 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

bearophile wrote:
 Don:
 
 Maybe if D had better flow analysis, the demand for 
 non-nullable references wouldn't be so great.

 

to the flow analysis the C# compiler performs, the need for non-nullable
references is not so strong.

Which is what I said half a dozen times in this thread :)


 (Neither is a pure subset of the other, flow analysis works for all 
 variables, non-nullable references catches more complex logic errors. 
 But there is a very significant overlap).

 
 I like how you can see things a little more clearly than other people (like
me).
 Flow analysis helps for all variables, but it's limited in the scope.
Nonnullable references are a program-wide contract, their effect extends to
called functions, etc. And helps avoid null tests inside them too.
 Probably flow analysis is the most important among such two features. I think
having both is better, they can work in synergy.
 
 Bye,
 bearophile

Flow analysis must be implemented by the compiler; nonnull references 
can be enforced by a runtime wrapper (much like smart_ptr enforces 
addref and release calls in C++; you don't see smart_ptr being moved into 
the language spec even if half the C++ community would drool over the idea).
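
A rough sketch of what such a runtime wrapper could look like. `NonNull` and `Widget` are hypothetical names invented for this example, not anything in Phobos, and the null check happens once at construction via `std.exception.enforce`:

```d
import std.exception : enforce;

/// Hypothetical wrapper: holds a class reference that was checked
/// against null once, when the wrapper was constructed.
struct NonNull(T) if (is(T == class))
{
    private T payload;

    @disable this();   // no default construction, so no unchecked payload

    this(T value)
    {
        // enforce throws on null and otherwise returns the reference.
        payload = enforce(value, "NonNull constructed from a null reference");
    }

    // Forward member access to the wrapped object.
    T get() { return payload; }
    alias get this;
}

class Widget
{
    int size() { return 3; }
}

void main()
{
    auto w = NonNull!Widget(new Widget);
    assert(w.size() == 3);     // forwarded through alias this

    Widget missing = null;
    bool threw = false;
    try
    {
        auto bad = NonNull!Widget(missing);
    }
    catch (Exception e)
    {
        threw = true;
    }
    assert(threw);             // the null was rejected at the boundary
}
```

The check here is paid at run time, and `NonNull!Widget.init` can still sidestep it, so this is weaker than a guarantee built into the type system.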

The best thing about flow analysis is that we can take away the whole 
default initializer idea, since it was made to make non-initialized 
variable errors easy to pinpoint in the first place, not as a 
convenience to turn "int a = 0;" into "int a;".
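
For reference, a small sketch of what the default initializers give you today (`Node` is a placeholder class made up for the example). Most defaults are chosen to be conspicuous rather than convenient:

```d
import std.math : isNaN;

class Node {}

void main()
{
    int a;        // same as int a = int.init;
    double d;     // floating point defaults to NaN, which poisons later math
    char c;       // 0xFF, an invalid UTF-8 code unit
    Node n;       // class references default to null

    assert(a == 0);
    assert(isNaN(d));
    assert(c == 0xFF);
    assert(n is null);
}
```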

Besides, DMD must have some basic flow analysis already since it does 
notice when a code path does not return; it just needs to be extended to 
include uninitialized variables.

Sep 29 2009

bearophile <bearophileHUGS lycos.com> writes:

Jeremie Pelletier:

 Flow analysis must be implemented by the compiler, nonnull references 
 can be enforced by a runtime wrapper

The point of nonnull references is all in their compile-time enforced constraints.


 Besides, DMD must have some basic flow analysis already since it does 
 notice when a code path does not return; it just needs to be extended to 
 include uninitialized variables.

You have probably missed them, but flow analysis in D was discussed a lot in
the past. I don't think Walter wants to implement it. If you help implement it,
showing that it can be done, he may change his mind.

Bye,
bearophile

Sep 29 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

bearophile wrote:
 Jeremie Pelletier:
 
 Flow analysis must be implemented by the compiler, nonnull references 
 can be enforced by a runtime wrapper

 
 The point of nonnull references is all in its compile-time enforced
constraints.
 
 
 Besides, DMD must have some basic flow analysis already since it does 
 notice when a code path does not return; it just needs to be extended to 
 include uninitialized variables.

 
 You have probably missed them, but flow analysis in D was discussed a lot in
the past. I don't think Walter wants to implement it. If you help implement it,
showing that it can be done, he may change his mind.
 
 Bye,
 bearophile

I'll try and hack at it in a few weeks when I get some free time. It's 
definitely high on my D wishlist.

Jeremie

Sep 29 2009

bearophile <bearophileHUGS lycos.com> writes:

If nonnull class references are added to D, then it can be good to add nonnull
struct pointers too, to avoid similar bugs.


as the default one when the struct reference is null).

Bye,
bearophile

Sep 30 2009

Max Samukha <spambox d-coding.com> writes:

On Wed, 30 Sep 2009 08:53:57 -0400, bearophile
<bearophileHUGS lycos.com> wrote:

If nonnull class references are added to D, then it can be good to add nonnull
struct pointers too, to avoid similar bugs.


as the default one when the struct reference is null).

Bye,
bearophile


value types. You can box them or use pointers to them in an 'unsafe'
context, but you can't directly allocate them on the heap. So there are no
nullable references to structs. When you do:

struct S
{
  public int x;
}

S s = new S(); // x is default initialized to 0

the struct is still allocated on the stack and passed by value.

S s;
s.x = 1; //error, struct is not initialized


control. One example:

struct S
{
   public Object obj;
}

S s = new S();
s.obj.ToString(); // null-reference exception;

Sep 30 2009

Max Samukha <spambox d-coding.com> writes:

On Wed, 30 Sep 2009 18:26:20 +0300, Max Samukha <spambox d-coding.com>
wrote:

On Wed, 30 Sep 2009 08:53:57 -0400, bearophile
<bearophileHUGS lycos.com> wrote:

If nonnull class references are added to D, then it can be good to add nonnull
struct pointers too, to avoid similar bugs.


as the default one when the struct reference is null).

Bye,
bearophile


value types. You can box them, use pointers to them in 'unsafe'
context but you can't directly allocate them on heap. So there is no
nullable references to structs. When you do:

struct S
{
  public int x;
}

S s = new S(); // x is default initialized to 0

Ok. I have rechecked this one and it appears that you don't have to
initialize a struct if it is a POD (Microsoft names such structs
'unmanaged', of course) or you don't access fields that are
references. For example:

struct S
{
   public int x;
   public Object obj;
}

S s;
s.x = 1; // ok,  we can get away without initialization for now
s.obj.ToString(); // compile-time error

Sep 30 2009

bearophile <bearophileHUGS lycos.com> writes:

Max Samukha:


 value types.

Yes, you are right.

But in D structs can be allocated on the heap too, so I think having optional
nonnull struct pointers can be useful. The syntax and usage are similar to
those of normal struct pointers.

Bye,
bearophile

Sep 30 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Wed, Sep 30, 2009 at 12:44 PM, bearophile <bearophileHUGS lycos.com> wrote:
 Max Samukha:


 value types.

 Yes, you are right.

 But in D structs can be allocated on the heap too, so I think having optional
nonnull struct pointers can be useful. The syntax and usage is similar to
normal struct pointers.

I don't know why a struct pointer would be different than any other
pointer. That is, you'd have S* and S*? as well as int* and int*?.

Sep 30 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Wed, 30 Sep 2009 23:08:40 +0400, Jarrett Billingsley  
<jarrett.billingsley gmail.com> wrote:

 On Wed, Sep 30, 2009 at 12:44 PM, bearophile <bearophileHUGS lycos.com>  
 wrote:
 Max Samukha:


 value types.

 Yes, you are right.

 But in D structs can be allocated on the heap too, so I think having  
 optional nonnull struct pointers can be useful. The syntax and usage is  
 similar to normal struct pointers.

 I don't know why a struct pointer would be different than any other
 pointer. That is, you'd have S* and S*? as well as int* and int*?.

Note that C stdlib (and other libraries/bindings) will need to be updated  
to reflect changes, e.g.

extern(C) void*? malloc(size_t size); // may return null!

which is great because it will provide additional safety. I've seen quite  
a lot of code that doesn't test the returned value against null (which is a  
mistake, I believe).

Sep 30 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Wed, Sep 30, 2009 at 3:30 PM, Denis Koroskin <2korden gmail.com> wrote:

 Note that C stdlib (and other libraries/bindings) will need to be updated to
 reflect changes, e.g.

 extern(C) void*? malloc(size_t size); // may return null!

 which is great because it will provide additional safety. I've seen quite a
 lot of code that don't test returned value against null (which is a mistake,
 I believe).

Wonderful. Don't you love self-documenting code that forces you to use
it correctly? :P

Sep 30 2009

Michel Fortin <michel.fortin michelf.com> writes:

On 2009-09-30 15:30:02 -0400, "Denis Koroskin" <2korden gmail.com> said:

 Note that C stdlib (and other libraries/bindings) will need to be 
 updated  to reflect changes, e.g.
 
 extern(C) void*? malloc(size_t size); // may return null!
 
 which is great because it will provide additional safety. I've seen 
 quite  a lot of code that don't test returned value against null (which 
 is a  mistake, I believe).

Which makes me think of this: pointers being non-nullable by default 
will make it easy to make mistakes when writing C bindings. A 
programmer might see this C declaration:

	void* malloc(size_t size);

and naively translate it to D like this:

	extern(C) void* malloc(size_t size);

without noticing the change in semantics.

For pointer arguments it's not much of a problem: the worst that can 
happen is that it blocks you from passing a null value when you should 
(in which case you can update the bindings). For a return value it's 
more troublesome because you're implicitly adding a promise that the 
function will not return null, and you might not realize it's wrong 
until it does indeed return null and your program crashes with a 
segfault.

Not that I think it's worth bothering too much, but it's something to 
keep in mind.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Sep 30 2009

bearophile <bearophileHUGS lycos.com> writes:

Michel Fortin:

 For a return value it's 
 more troublesome because you're implicitly adding a promise that the 
 function will not return null, and you might not realize it's wrong 
 until it does indeed return null and your program crashes with a 
 segfault.

I see.
It's a matter of how much you value safety in your language. If you want a
safer language, like Cyclone tries to be, the compiler may disallow function
signatures like:
extern(C) void* foo(size_t size);

And force you to use:
extern(C) void*? foo(size_t size);

Because the D compiler can't be sure that foo() returns a nonnull. In that
situation you may just use the function as it is, returning a nullable
pointer. This gives no overhead, and no safety.

Or you can add a bit of overhead and use something like an enforce (or an if)
to create a nonnullable pointer from the nullable result of foo().

Finally, if you are very sure your C function never returns a null, and you
really want to pass a nonnullable pointer around in your code, but you don't
want to pay for the little null test in the D code, then you hard cast the
nullable pointer to a nonnullable one, but I hope this is done only in really
uncommon situations.

I think this may solve the problem in a good enough way.
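
The second option can be sketched in D as it exists now. The `T*?` syntax used in this thread is only a proposal, so this sketch uses plain pointers; `checkedMalloc` is a hypothetical helper name, and the check relies on `std.exception.enforce`, which throws when its argument is null and otherwise returns it:

```d
import core.stdc.stdlib : free, malloc;
import std.exception : enforce;

/// Hypothetical helper: turns the possibly-null result of C's malloc into
/// a pointer the rest of the program may treat as valid.
void* checkedMalloc(size_t size)
{
    // One null test at the binding boundary, instead of one at every use.
    return enforce(malloc(size), "malloc returned null");
}

void main()
{
    auto p = cast(int*) checkedMalloc(int.sizeof);
    scope (exit) free(p);

    *p = 42;              // no null test needed here; checkedMalloc did it
    assert(*p == 42);
}
```

The cost is one comparison per allocation, which is the "bit of overhead" mentioned above.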

Bye,
bearophile

Sep 30 2009

Yigal Chripun <yigal100 gmail.com> writes:

On 30/09/2009 18:44, bearophile wrote:
 Max Samukha:


 value types.

 Yes, you are right.

 But in D structs can be allocated on the heap too, so I think having optional
nonnull struct pointers can be useful. The syntax and usage is similar to
normal struct pointers.

 Bye,
 bearophile

why not just use references with structs?

struct S { ... }
S* sPtr = new S;
S sRef = *sPtr; // non-null ref

there is no need for non-null pointers.

Sep 30 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Yigal Chripun wrote:
 On 30/09/2009 18:44, bearophile wrote:
 Max Samukha:


 value types.

 Yes, you are right.

 But in D structs can be allocated on the heap too, so I think having 
 optional nonnull struct pointers can be useful. The syntax and usage 
 is similar to normal struct pointers.

 Bye,
 bearophile

 
 why not just use references with structs?
 
 struct S { ... }
 S* sPtr = new S;
 S sRef = *sPtr; // non-null ref
 
 there is no need for non-null pointers.

Because sRef wouldn't be a reference but a copy.
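
A small sketch of the difference, assuming a one-field struct and a made-up `bump` function. Dereferencing into a local copies the struct, whereas a `ref` parameter aliases the original:

```d
struct S
{
    int x;
}

// A ref parameter aliases the caller's struct instead of copying it.
void bump(ref S s)
{
    s.x += 1;
}

void main()
{
    S* sPtr = new S;
    sPtr.x = 1;

    S sCopy = *sPtr;       // copies the struct; sCopy is independent storage
    sCopy.x = 99;
    assert(sPtr.x == 1);   // the heap struct is unchanged

    bump(*sPtr);           // passes the heap struct itself by reference
    assert(sPtr.x == 2);
}
```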

Sep 30 2009

Justin <no spam.com> writes:

bearophile Wrote:

 Max Samukha:
 

 value types.

 
 Yes, you are right.
 
 But in D structs can be allocated on the heap too, so I think having optional
nonnull
 struct pointers can be useful. The syntax and usage is similar to normal
struct pointers.
 
 Bye,
 bearophile

AYK, in C++ structs are just classes with public protection (for members) by
default, or, if you like, classes are just structs with private protection (for
members) by default. Other than protection rules there ain't any *useful*
difference semantically, and you can have pointers or references to both.

To a certain extent Bjarne would have done C++ a favour by either (1)
not introducing "class" as a separate language entity and simply resigning to

#define class struct

and adding the OO features to struct,
or (2), instead of bleeding OO features into structs, heeding

"teacher, teacher leave the struct alone"

and introducing "class" as the clean addition that C++ brought to C.

With D, the designer has made a rather brilliant distinction between classes
and structs in terms of value vs reference. In doing so he has sorted out the
dot (.) vs arrow (->) mess that is the infamous signature of C++.

Quoting from the D spec:

"Structs and unions are meant as simple aggregations of data, or as a way to
paint a
data structure over hardware or an external type. External types can be defined
by the operating
system API, or by a file format. Object oriented features are provided with the
class data type."

I think it would be a billion dollar mistake to make the last sentence in that
quote obsolete.

But, if people insist, then let's be consistent and apply the same logic (re
optional nonnull pointers being useful) to unions as well (and, for that
matter, to anything that can be pointed to). Ahggg, shock horror.

But then again, out of the shock and horror, maybe something beautiful can
evolve.

Cheers

-- Justin Johansson

Oct 02 2009

Max Samukha <spambox d-coding.com> writes:

On Wed, 30 Sep 2009 18:26:20 +0300, Max Samukha <spambox d-coding.com>
wrote:


control.

I'll probably never learn to proof-read my opuses. It should have been
"flow analysis".

Sep 30 2009
